[Issue 17746] New: Improve BigInt memory usage

August 11, 2017
Posted by hsteoh@quickfur.ath.cx
Permalink
hsteoh@quickfur.ath.cx
Permalink
https://issues.dlang.org/show_bug.cgi?id=17746

          Issue ID: 17746
           Summary: Improve BigInt memory usage
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: phobos
          Assignee: nobody@puremagic.com
          Reporter: hsteoh@quickfur.ath.cx

Currently, operations on BigInt always allocates a new BigInt instance to hold the result, even if it is valid and possible to reuse the space of (one of) its operand(s).  For example, `BigInt x=...; x++;` will always allocate a new BigInt in the course of evaluating `x++`, even though it could have more efficiently updated x in-place.

As far as I can tell, the reason BigInt operations are implemented this way is because it was intended to be a drop-in replacement for fixed-width ints, and so it must replicate equivalent semantics, in particular by-value semantics. Since BigInt.opAssign only copies by reference (probably for efficiency reasons, since otherwise passing BigInt between functions could entail expensive copying), the only way to ensure by-value semantics is to never modify an existing BigInt instance, but always allocate a new instance for holding the result of operations.

The current implementation has the following drawbacks:

- Excessive allocations: every operation on a BigInt involves an allocation, including trivial operations like ++, that only rarely actually requires allocating a new BigInt (i.e., only once every 2^n calls to opUnary!"++", where n is the current size of the BigInt).  This increases GC pressure in BigInt code unnecessarily.

- Suboptimal performance when the result of an operation fits within one of its operations and said operand is not aliased. `x++`, for example, entails allocating a new BigInt and copying the contents of x over, whereas updating in-place would, most of the time, require updating only 1 word of BigInt data or just a few words.


This enhancement request proposes the following change to BigInt's implementation:

1) Add a reference counting scheme to BigInt. Since BigInt is already a wrapper around what's essentially a reference to the actual data, this would not be a huge change.

2) Implement in-place versions of the current BigInt operations, where possible. (Certain operations like * may not make sense as an in-place algorithm, since allocation of temporary working space will be needed anyway.)

3) In BigInt's overloaded operators, whenever the current reference count is 1, and an in-place algorithm is available, use the in-place algorithm to update the BigInt in-place rather than allocate a new BigInt to hold the result.

In addition to better memory usage and better efficiency for trivial operations like ++, the proposed change also opens up the possibility of making BigInt usable in a @nogc context.

--
Forums