[Issue 9092] New: GC.extend allocates less then it reports
November 28, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=9092

           Summary: GC.extend allocates less then it reports
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: DMD
        AssignedTo: nobody@puremagic.com
        ReportedBy: monarchdodra@gmail.com


--- Comment #0 from monarchdodra@gmail.com 2012-11-28 08:53:23 PST ---
Basically, after repeated calls to "GC.extend", the memory actually allocated ends up a few bytes short of the size that GC.extend reports.

In this particular case, it reports having extended the block to 65536 bytes, but accessing any of the last few bytes (here, the last 4 DWORDs, i.e. 16 bytes) causes an access violation.

Here is the test program:
//----
import core.memory;
import std.stdio;

void main()
{
    auto a = new ubyte[] (4000);
    for ( ; ; )
    {
        writefln("The length of a (a.length) is: %s", a.length);
        auto u = GC.extend(a.ptr, 40, 400);
        writefln("GC.extend claims to have extended it %s bytes", u);
        if(u)
        {
            a = a.ptr[0 .. u];
            foreach_reverse(k;1..21)
            {
                writef("Trying to access index at u - %2s (%s)... ", k, u-k);
                stdout.flush();
                a.ptr[u -  k] = 0x01;
                writefln("OK!");
            }
            writefln("OK! GC.Extend didn't lie to us, \"a\" really is %s bytes long.", u);
            writefln("On to the next iteration...\n");
        }
        else
            return;
    }
}
//----

And an excerpt of the last iteration:

//----
...
Trying to access index at u -  4 (61436)... OK!
Trying to access index at u -  3 (61437)... OK!
Trying to access index at u -  2 (61438)... OK!
Trying to access index at u -  1 (61439)... OK!
OK! GC.Extend didn't lie to us, "a" really is 61440 bytes long
On to the next iteration...

The length of a (a.length) is: 61440
GC.extend claims to have extended it 65536 bytes
Trying to access index at u - 20 (65516)... OK!
Trying to access index at u - 19 (65517)... OK!
Trying to access index at u - 18 (65518)... OK!
Trying to access index at u - 17 (65519)... OK!
Trying to access index at u - 16 (65520)... object.Error: Access Violation
//----

Extra tests seem to reveal that this only comes up if the returned value is a multiple of 2^^16: I get an error at the lengths 65536, 131072, 196608, etc., up to 983040.

After adding "if ((u % 0x10000) == 0) continue;", the program continues until it finishes at 1044480 bytes.

Tested on Windows 7 x64 with 2.060 and 2.061alpha.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
November 30, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=9092


Rainer Schuetze <r.sagitario@gmx.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |r.sagitario@gmx.de
         Resolution|                            |INVALID


--- Comment #1 from Rainer Schuetze <r.sagitario@gmx.de> 2012-11-30 01:33:27 PST ---
I very much suspect that this is caused by the behavior of large arrays that need more than 2048 bytes of space: these arrays store the actual allocated length at the beginning of the block and reserve 16 bytes for that, in contrast to smaller arrays, which place this information at the end of the memory block (see druntime/src/rt/lifetime.d for details).

So in your case, a.ptr does not point to the start of the memory block, but 16 bytes into it, leaving a little less than the actual memory size for the array. GC.extend changes the memory block, but doesn't know about the array-semantics, so it reports back the raw size of the memory block.

Your example mixes high-level memory access (arrays) with low-level functions
(GC.extend); I don't think that is a good idea. Instead, use GC.malloc for your
first allocation.
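
A minimal sketch of that suggestion, assuming the core.memory API as documented (GC.malloc, GC.extend, GC.BlkAttr.NO_SCAN). The helper name allocAndExtend is made up for illustration and is not from the report. Because the pointer returned by GC.malloc is the start of the block, the size returned by GC.extend can be used directly as the slice length:

```d
import core.memory;
import std.stdio;

// Hypothetical helper (not from the report): allocate a raw GC block and
// try to grow it in place. Returns a slice over every usable byte.
ubyte[] allocAndExtend(size_t initial, size_t minExtra, size_t maxExtra)
{
    // GC.malloc returns the block base, so no array bookkeeping sits in front.
    auto p = cast(ubyte*) GC.malloc(initial, GC.BlkAttr.NO_SCAN);

    // GC.extend returns the total block size on success, 0 on failure.
    immutable total = GC.extend(p, minExtra, maxExtra);
    return total ? p[0 .. total] : p[0 .. initial];
}

void main()
{
    auto bytes = allocAndExtend(4000, 40, 400);
    bytes[$ - 1] = 0x01; // last byte of the slice is inside the block
    writefln("usable bytes: %s", bytes.length);
}
```

If the block cannot be extended in place, the sketch simply falls back to the size that was originally requested.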

November 30, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=9092


monarchdodra@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |


--- Comment #2 from monarchdodra@gmail.com 2012-11-30 02:26:39 PST ---
(In reply to comment #1)
> I very much suspect that this is caused by the behavior of large arrays that need more than 2048 bytes of space: these arrays store the actual allocated length at the beginning of the block and reserve 16 bytes for that, in contrast to smaller arrays, which place this information at the end of the memory block (see druntime/src/rt/lifetime.d for details).
> 
> So in your case, a.ptr does not point to the start of the memory block, but 16 bytes into it, leaving a little less than the actual memory size for the array. GC.extend changes the memory block, but doesn't know about the array-semantics, so it reports back the raw size of the memory block.
> 
> Your example mixes high-level memory access (arrays) with low-level functions
> (GC.extend); I don't think that is a good idea. Instead, use GC.malloc for your
> first allocation.

How could this possibly be "invalid"? "I don't think it is a good idea" is not the same as "This is wrong and invalid".

I passed a pointer (a.ptr) to extend, and extend promised that the memory location was extended to a certain amount. If a.ptr did not actually point to the beginning of a memory location, then why/how was it extended? And if extend was able to detect that a.ptr was already 16 bytes into my memory block, then why can't it take that into account when replying?

Anyways, here is Appender subject to this bug...
//----
import std.array;
import std.stdio;

void main()
{
    auto a = new char[] (61000);
    auto app = appender!(char[])(a);
    foreach(k; 0..5000)
    {
      write(k, ' ');
      stdout.flush();
      app.put('a');
    }
}
//----
... 4517 4518 4519 4520 object.Error: Access Violation
//----

BTW, both of these tests appear to pass on DPaste, so the bug appears to be Windows-related. I guess such allocations on Linux aren't a bad idea?

December 01, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=9092



--- Comment #3 from Rainer Schuetze <r.sagitario@gmx.de> 2012-12-01 01:45:26 PST ---
> I passed a pointer (a.ptr) to extend, and extend promised that the memory location was extended to a certain amount. If a.ptr did not actually point to the beginning of a memory location, then why/how was it extended? And if extend was able to detect that a.ptr was already 16 bytes into my memory block, then why can't it take that into account when replying?

You expect different stuff from GC.extend than its author: "Attempt to in-place
enlarge the memory block pointed to by p by at least minbytes beyond its
current capacity, up to a maximum of maxsize.  This does not attempt to move
the memory block (like realloc() does).
     * Returns:
     *  0 if could not extend p,
     *  total size of entire memory block if successful." (gcx.d)

At GC level, the actual usage of that memory as an array is unknown. When you access a.ptr[size-of-memory-block - 1] you read/write outside of the memory block which can cause page faults or not, depending on whether the following page was mapped by the OS or not.

You can use capacity(a) to get the number of available array slots in the
current memory block.
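
A small sketch of the difference between the two views, assuming the core.memory function GC.query and the built-in capacity property. The actual numbers depend on the druntime version and platform, so none are shown here:

```d
import core.memory;
import std.stdio;

void main()
{
    auto a = new ubyte[](4000);

    // Raw view: size of the whole GC block containing a.ptr.
    auto info = GC.query(a.ptr);
    writefln("GC block size: %s bytes", info.size);

    // Array view: elements that fit before a reallocation is needed.
    writefln("a.capacity:    %s elements", a.capacity);

    // Stay within the array view when writing through the slice.
    assert(a.length <= a.capacity);
}
```

Code that indexes through the array should stay within the capacity, not within the raw block size.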


> Anyways, here is Appender subject to this bug...

I agree, this is a bug in the appender code that assumes the same things that you do.

December 03, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=9092



--- Comment #4 from monarchdodra@gmail.com 2012-12-03 13:49:53 PST ---
(In reply to comment #3)
> You expect different stuff from GC.extend than its author...

Ok, I think I get it. This leads to 3 questions (if you'd care to educate me):
1. What exactly is inside those 16 bytes? I'd guess something along the lines of
a pointer to the destructor, and currently used length vs. capacity?
2. Why is this issue *only* showing up when my allocation sizes are exact
multiples of 0x10000 ?
3. Would it be possible to somehow have extend detect when it is given an
"array allocated pointer"? In that case, it could either refuse to extend, or
assert, or something?

I was currently looking into appender, so I'll try to fix it in such a way as to not have this problem anymore.

December 03, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=9092



--- Comment #5 from Rainer Schuetze <r.sagitario@gmx.de> 2012-12-03 15:07:31 PST ---
(In reply to comment #4)
> 1. What exactly is inside those 16 bytes? I'd guess something along the lines of a pointer to the destructor, and currently used length vs. capacity?

Currently only the number of entries that are actually used in the memory block
is stored. This allows appending in place if a slice ends at the end of the
used data in the block.
It is 16 bytes in case any struct needs that alignment.
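
One way to observe that prefix is to compare a.ptr with the base address of its block. This is only a sketch: prefixBytes is a made-up helper name, GC.addrOf is the core.memory function that returns the base of the block containing a pointer, and the layout is an implementation detail of rt/lifetime.d that may differ between druntime versions:

```d
import core.memory;
import std.stdio;

// Hypothetical helper: bytes between the start of the GC block and the
// first array element. Returns 0 if the pointer is not GC-managed.
size_t prefixBytes(ubyte[] a)
{
    auto base = cast(ubyte*) GC.addrOf(a.ptr);
    return base is null ? 0 : cast(size_t)(a.ptr - base);
}

void main()
{
    // Small array: bookkeeping is described above as sitting at the end.
    writefln("prefix for 100 bytes:  %s", prefixBytes(new ubyte[](100)));
    // Large array: bookkeeping is described above as sitting at the start.
    writefln("prefix for 4000 bytes: %s", prefixBytes(new ubyte[](4000)));
}
```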

> 2. Why is this issue *only* showing up when my allocation sizes are exact multiples of 0x10000 ?

The problem happens at page boundaries (4kB), but the "commit" size of GC pools is 64kB, so pages are mapped into memory at that granularity. So you'll probably corrupt memory earlier, but the page fault is likely to happen at 64kB boundaries.
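
The arithmetic for the failing iteration in the report can be written out as follows, assuming the 16-byte offset between the block start and a.ptr. The result matches the log, where index 65519 was still accessible and index 65520 faulted:

```d
import std.stdio;

enum size_t reported = 65_536; // total block size returned by GC.extend
enum size_t prefix   = 16;     // bytes between block start and a.ptr

// Highest index i for which a.ptr[i] is still inside the block.
enum size_t lastValid = reported - prefix - 1;
static assert(lastValid == 65_519);

void main()
{
    writefln("last valid index: %s", lastValid);               // 65519
    writefln("first index past the block: %s", lastValid + 1); // 65520
}
```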

> 3. Would it be possible to somehow have extend detect when it is given an "array allocated pointer"? In that case, it could either refuse to extend, or assert, or something?

I don't think it is GC.extend's job to do this. Although slightly less efficient, I think appender should get the new capacity via capacity(array) instead of calculating it from the return value of GC.extend.

January 13, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9092


yebblies <yebblies@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
                 CC|                            |yebblies@gmail.com
         Resolution|                            |INVALID


--- Comment #6 from yebblies <yebblies@gmail.com> 2013-01-13 17:36:16 EST ---
GC.extend is working as designed.  Please reopen as an enhancement if you want to change the behavior or add a new function for this.

March 24, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9092



--- Comment #7 from github-bugzilla@puremagic.com 2013-03-24 09:30:13 PDT ---
Commits pushed to master at https://github.com/D-Programming-Language/druntime

https://github.com/D-Programming-Language/druntime/commit/087752206585a9c8690ebdb79e468646521c05f5
Fixes Issue 9092 - GC.extend allocates less then it reports

Not an actual issue, but required more documentation to be clear: Added a note about how extend and arrays interact to avoid confusion.

Also adds an example section, for both ways of using extend.

https://github.com/D-Programming-Language/druntime/commit/f0c3aa724c7226cb1ad1acbbd95da6fa008af626
Merge pull request #382 from monarchdodra/9092

Fixes Issue 9092 - GC.extend allocates less then it reports
