August 02, 2007
Dave wrote:
> Vladimir Panteleev wrote:
>> On Wed, 01 Aug 2007 17:48:26 +0300, Sean Kelly <sean@f4.ca> wrote:
>>
>>> It turns out this is because GrowBuffer uses a void[] internally to
>>> store data.  The type should probably be changed to byte[].  I'll file a
>>> ticket for it.
>>
>> Cheers, that indeed fixed it. And now it runs much faster than the Phobos version, too!
>> 
> I thought the Phobos and Tango GC's were basically the same -- has that changed over the last 1/2 year or so?

No.  They're still basically the same.  But some of the differences that exist both there and in the runtime could have a noticeable impact on performance in some situations, as this test shows.  Also, recent discussion about this issue has inspired me to make some additional changes that will further affect certain aspects of how memory is managed, etc, and these could have an impact on the performance of some corner cases (or at least that's my hope).

Eventually, I'd like to spend some time developing a new GC which is more oriented towards multithreaded programming.  But that promises to be a fairly large project, and I don't have time for it quite yet.


Sean
August 03, 2007

Sean Kelly wrote:
> kenny wrote:
>> 2. will all the documentation be available offline?
> 
> Yes, but no timetable yet.

You can do this yourself with a copy of wget.  It's not a *perfect* copy, but it does work.  I did this a while back, so forgive the sketchy instructions :)  Also, this is *just* the API reference, and doesn't include the formatted source code.  That said, I found it useful when my dialup was down.

NB: Thunderbird <3s wrapping lines, so be careful with the shell
commands (prefixed with a '$').

First, make a directory for the docs.  Then open up a shell (or command line or whatever) and get into that directory.  Then run these commands (note: these are for bash.  If you don't have bash, just expand the curly braces: "a{b,c}d" ==> "abd acd", and join the lines together at the end-of-line backslashes.)

$ wget -np -nH -m -p -k -E -x --cut-dirs=4 \
http://www.dsource.org/projects/tango/docs/current/

$ wget -np -nH -m -p -k -E -x --cut-dirs=6 \
http://svn.dsource.org/projects/tango/trunk/doc/html/candydoc/img/

$ wget -nH -x --cut-dirs=4 \
http://www.dsource.org/projects/tango/docs/current/js/\
{explorer,tree,util}.js \
http://www.dsource.org/projects/tango/chrome/common/js/\
{dsource,trac}.js

$ wget -nH -x --cut-dirs=3 \
http://www.dsource.org/projects/tango/themeengine/theme.js

That should do it.
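
For anyone without bash, here is a rough sketch of the two things the commands above rely on: the curly-brace expansion and the way -nH plus --cut-dirs shortens the local path. The helper names expand_js_urls and local_path are made up for illustration; local_path only approximates wget's file naming under those two options and is not how wget itself is implemented.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; neither is part of wget.

# Spell out what "<base>{a,b,c}.js" expands to, for shells without
# brace expansion.
expand_js_urls() {
    js_base=$1
    shift
    for js_name in "$@"; do
        printf '%s%s.js\n' "$js_base" "$js_name"
    done
}

# Approximate where wget stores a file when run with -nH --cut-dirs=N:
# drop the scheme and host name, then drop up to N leading directories.
local_path() {
    lp_path=${1#*://}        # remove "http://"
    lp_path=${lp_path#*/}    # remove the host name (what -nH does)
    lp_left=$2
    while [ "$lp_left" -gt 0 ]; do
        case $lp_path in
            */*) lp_path=${lp_path#*/} ;;   # cut one directory component
            *)   break ;;                   # only the file name remains
        esac
        lp_left=$((lp_left - 1))
    done
    printf '%s\n' "$lp_path"
}

expand_js_urls http://www.dsource.org/projects/tango/docs/current/js/ explorer tree util
local_path http://www.dsource.org/projects/tango/docs/current/js/explorer.js 4
local_path http://www.dsource.org/projects/tango/themeengine/theme.js 3
```

The first call prints the three full .js URLs you would pass to wget by hand; the last two print js/explorer.js and theme.js, which is where those files should end up relative to the docs directory.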

Hope that helps.

	-- Daniel
August 03, 2007
Daniel Keep wrote:
> 
> Sean Kelly wrote:
>> kenny wrote:
>>> 2. will all the documentation be available offline?
>> Yes, but no timetable yet.
> 
> You can do this yourself with a copy of wget.  It's not a *perfect* copy, but it does work.  I did this a while back, so forgive the sketchy instructions :)  Also, this is *just* the API reference, and doesn't include the formatted source code.  That said, I found it useful when my dialup was down.
> 
> NB: Thunderbird <3s wrapping lines, so be careful with the shell
> commands (prefixed with a '$').
> 
> First, make a directory for the docs.  Then open up a shell (or command line or whatever) and get into that directory.  Then run these commands (note: these are for bash.  If you don't have bash, just expand the curly braces: "a{b,c}d" ==> "abd acd", and join the lines together at the end-of-line backslashes.)
> 
> $ wget -np -nH -m -p -k -E -x --cut-dirs=4 \
> http://www.dsource.org/projects/tango/docs/current/
> 
> $ wget -np -nH -m -p -k -E -x --cut-dirs=6 \
> http://svn.dsource.org/projects/tango/trunk/doc/html/candydoc/img/
> 
> $ wget -nH -x --cut-dirs=4 \
> http://www.dsource.org/projects/tango/docs/current/js/\
> {explorer,tree,util}.js \
> http://www.dsource.org/projects/tango/chrome/common/js/\
> {dsource,trac}.js
> 
> $ wget -nH -x --cut-dirs=3 \
> http://www.dsource.org/projects/tango/themeengine/theme.js
> 
> That should do it.
> 
> Hope that helps.
> 
> 	-- Daniel

Ah, that was you sucking my server dry, eh?

I do have it on my list of things to do for offline docs... and potentially Gregor's DSSS could help here.

BA
August 06, 2007
Vladimir Panteleev wrote:
> On Wed, 01 Aug 2007 09:08:16 +0300, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
> 
>> I initially wrote it to try to find a memory leak in Tango's GC (which was actually fixed at some point).
> 
> Turns out it's still there, and it's the old "binary data" issue with pointer-searching GCs, which was fixed in D/Phobos 1.001 by making the GC type-aware. Check out the attached sample programs for a simple example - the Tango version can't know there are no pointers in its GrowBuffer's data, and thus leaks like crazy, while the Phobos version stays at 13MB.

i ran into another issue with the phobos gc. i have constant array literals that are used in a constructor to initialize a member. the data isn't always used. each instance uses it at most once. the second time an instance uses that data, it has been overwritten. the problem disappears if i disable the gc.
it looks like the gc frees the memory that holds the constant initializer. unfortunately it's really hard to reproduce this problem in a small program.
August 06, 2007
Vladimir Panteleev wrote:
> On Wed, 01 Aug 2007 09:08:16 +0300, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
> 
>> I initially wrote it to try to find a memory leak in Tango's GC (which was actually fixed at some point).
> 
> Turns out it's still there, and it's the old "binary data" issue with pointer-searching GCs, which was fixed in D/Phobos 1.001 by making the GC type-aware. Check out the attached sample programs for a simple example - the Tango version can't know there are no pointers in its GrowBuffer's data, and thus leaks like crazy, while the Phobos version stays at 13MB.

The cause of this is somewhat an artifact of the OO design in Tango. The underlying buffer being allocated is a byte[], but the reference to it is a void[].  The problem occurs when GrowBuffer grows the buffer by increasing its length, which causes the buffer to be reallocated as a void[].  The reason this is a problem is that neither runtime, Tango or Phobos, preserves memory block attributes during a reallocation--they both simply key off the type being used to perform the reallocation. Obviously, this is a problem, and I've decided to change the behavior in Tango accordingly.  It will take some doing and I'm a bit over-busy at the moment, but before long the Tango runtime will preserve all block attributes on a reallocation.  In essence, this will occur by having the runtime call gc_realloc, but before this will work gc_realloc must be fixed to handle slices.


Sean
August 06, 2007
Jascha Wetzel wrote:
> Vladimir Panteleev wrote:
>> On Wed, 01 Aug 2007 09:08:16 +0300, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
>>
>>> I initially wrote it to try to find a memory leak in Tango's GC (which was actually fixed at some point).
>>
>> Turns out it's still there, and it's the old "binary data" issue with pointer-searching GCs, which was fixed in D/Phobos 1.001 by making the GC type-aware. Check out the attached sample programs for a simple example - the Tango version can't know there are no pointers in its GrowBuffer's data, and thus leaks like crazy, while the Phobos version stays at 13MB.
> 
> i ran into another issue with the phobos gc. i have constant array literals that are used in a constructor to initialize a member. the data isn't always used. each instance uses it at most once. the second time an instance uses that data, it has been overwritten. the problem disappears if i disable the gc.
> it looks like the gc frees the memory that holds the constant initializer. unfortunately it's really hard to reproduce this problem in a small program.

Have you tested this with Tango?  I would expect the same broken behavior but you never know, and any differences may help track down the issue.


Sean
August 06, 2007
Sean Kelly wrote:
> Have you tested this with Tango?  I would expect the same broken behavior but you never know, and any differences may help track down the issue.

i postponed that because it'll take a bit longer to make the D2.0 code compile with 1.018 again. but i will definitely test it.
August 07, 2007
On Mon, 06 Aug 2007 18:20:22 +0300, Sean Kelly <sean@f4.ca> wrote:

> Vladimir Panteleev wrote:
>> On Wed, 01 Aug 2007 09:08:16 +0300, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
>>
>>> I initially wrote it to try to find a memory leak in Tango's GC (which was actually fixed at some point).
>>
>> Turns out it's still there, and it's the old "binary data" issue with pointer-searching GCs, which was fixed in D/Phobos 1.001 by making the GC type-aware. Check out the attached sample programs for a simple example - the Tango version can't know there are no pointers in its GrowBuffer's data, and thus leaks like crazy, while the Phobos version stays at 13MB.
>
> The cause of this is somewhat an artifact of the OO design in Tango. The underlying buffer being allocated is a byte[], but the reference to it is a void[].  The problem occurs when GrowBuffer grows the buffer by increasing its length, which causes the buffer to be reallocated as a void[].  The reason this is a problem is that neither runtime, Tango or Phobos, preserves memory block attributes during a reallocation--they both simply key off the type being used to perform the reallocation. Obviously, this is a problem, and I've decided to change the behavior in Tango accordingly.  It will take some doing and I'm a bit over-busy at the moment, but before long the Tango runtime will preserve all block attributes on a reallocation.  In essence, this will occur by having the runtime call gc_realloc, but before this will work gc_realloc must be fixed to handle slices.

I'd still rather vote towards making the GC not scan void[] - it makes the most sense.

-- 
Best regards,
  Vladimir                          mailto:thecybershadow@gmail.com
August 07, 2007
Vladimir Panteleev wrote:
> On Mon, 06 Aug 2007 18:20:22 +0300, Sean Kelly <sean@f4.ca> wrote:
> 
>> Vladimir Panteleev wrote:
>>> On Wed, 01 Aug 2007 09:08:16 +0300, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
>>>
>>>> I initially wrote it to try to find a memory leak in Tango's GC (which was actually fixed at some point).
>>> Turns out it's still there, and it's the old "binary data" issue with pointer-searching GCs, which was fixed in D/Phobos 1.001 by making the GC type-aware. Check out the attached sample programs for a simple example - the Tango version can't know there are no pointers in its GrowBuffer's data, and thus leaks like crazy, while the Phobos version stays at 13MB.
>> The cause of this is somewhat an artifact of the OO design in Tango.
>> The underlying buffer being allocated is a byte[], but the reference to
>> it is a void[].  The problem occurs when GrowBuffer grows the buffer by
>> increasing its length, which causes the buffer to be reallocated as a
>> void[].  The reason this is a problem is that neither runtime, Tango or
>> Phobos, preserves memory block attributes during a reallocation--they
>> both simply key off the type being used to perform the reallocation.
>> Obviously, this is a problem, and I've decided to change the behavior in
>> Tango accordingly.  It will take some doing and I'm a bit over-busy at
>> the moment, but before long the Tango runtime will preserve all block
>> attributes on a reallocation.  In essence, this will occur by having the
>> runtime call gc_realloc, but before this will work gc_realloc must be
>> fixed to handle slices.
> 
> I'd still rather vote towards making the GC not scan void[] - it makes the most sense.

Others have expressed the same opinion.  I'll withhold my own thoughts but to say that I think the idea behind the current approach is twofold:

1. void[] is the 'any' buffer type for in-program data.  The type of the underlying data could be an array of bytes or it could be an array of structs containing pointers.  The 'any' buffer type for out-of-program data is byte[], because until such data is translated to D types, it's merely a stream of bytes.

2. It is nice to have an in-language option for specifying that an 'any' buffer type may contain pointers.

The most obvious counter-argument is that by assigning special behavior to void[], those who don't like that behavior must use something else and lose the implicit conversion that void[] provides, which is extremely convenient in some cases.  Another is that because void[] does not specify a type, it is appropriate for out-of-program data as well, and scanning a stream of bytes read from a file for pointers is bad.

I personally don't think there is a solution to this that will make everyone happy, and am hoping that by preserving block attributes I will make void[] more usable as a reference type since then only the first allocation must be "new byte[x]".  It is also more consistent than the current behavior, in which some reallocations obtain a new array (and lose block information), while others do not (and preserve block information).


Sean
September 17, 2007
Sean Kelly wrote:
> Jascha Wetzel wrote:
>> Vladimir Panteleev wrote:
>>> On Wed, 01 Aug 2007 09:08:16 +0300, Vladimir Panteleev <thecybershadow@gmail.com> wrote:
>>>
>>>> I initially wrote it to try to find a memory leak in Tango's GC (which was actually fixed at some point).
>>>
>>> Turns out it's still there, and it's the old "binary data" issue with pointer-searching GCs, which was fixed in D/Phobos 1.001 by making the GC type-aware. Check out the attached sample programs for a simple example - the Tango version can't know there are no pointers in its GrowBuffer's data, and thus leaks like crazy, while the Phobos version stays at 13MB.
>>
>> i ran into another issue with the phobos gc. i have constant array literals that are used in a constructor to initialize a member. the data isn't always used. each instance uses it at most once. the second time an instance uses that data, it has been overwritten. the problem disappears if i disable the gc.
>> it looks like the gc frees the memory that holds the constant initializer. unfortunately it's really hard to reproduce this problem in a small program.
> 
> Have you tested this with Tango?  I would expect the same broken behavior but you never know, and any differences may help track down the issue.

Tango has the same problem. I couldn't find a small test program that provokes this problem, but i can reproduce it with a larger one.

The problem doesn't arise the second time the data is used, as i guessed earlier; it arises after the GC has run once. It seems to miss the root in B.data (see below), free that memory and re-assign it to some other block. When the program crashes, the memory pointed to by B.data has been overwritten with values (used as indices) that cause the crash.
It's basically this:

align(1) struct C
{
  uint a, b, c;
  uint[] s;
}

abstract class A
{
  C[] data; // gets initialized in subclass' c'tor - see below

  void foo()
  {
    bar(&data[calcIndex()]);
  }

  void bar(C* a)
  {
    foreach ( s; a.s )
      doSomething(s);  // crashes because s is used as an index
  }
}

class B : A
{
  this()
  {
    data = [
        C(1,2,3,[4,5,6]),
        C(7,8,9,[1,3,6]),
        C(35,5,88,[1234,78,6])
    ];
  }
}