March 31, 2010
Steven Schveighoffer wrote:
> 
> You are correct, setCapacity ensures that *at least* the given number of elements will be available for appending.
> 
> I planned on making the function a property (but a bug would not allow that), the original intended usage was:
> 
> a.capacity = 10000;
> 
> Reserve doesn't work in this context.  Can you come up with a name that does?
> 
> I'll bring up reserve (as a function) as an alternative on the phobos mailing list, and see what people say.  I kind of liked the setter/getter idea, but you make a good point.
> 
> -Steve

Sorry if resurrecting this thread is against netiquette, but it caught my eye, and this is my first newsgroup post in years. ;)

Anyway, is there any compelling reason why setCapacity or modifying a.capacity should allocate a nondeterministic amount of storage?

Depending on the application, programmers might require strict control over memory allocation patterns and strict accounting for allocated memory.  Game programmers, especially console game programmers, tend to strongly prefer deterministic allocation patterns, and nondeterminism is one of the [several] common complaints about the C++ STL (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html is a good resource on these kinds of issues).  In the case of D (which I'm considering learning), this is especially important for dynamic arrays, partly because they're so useful by themselves, and partly because they may form the backbone of custom containers.

Whereas it's easy to add "smart nondeterministic" behavior to a deterministic setCapacity function by providing a wrapper, ordinary language users can't do the opposite.  Because of this, and because dynamic arrays are so central to the D language, a nondeterministic setCapacity function may deter game programmers, especially console programmers, from adopting D.
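
To make the wrapper argument concrete, here is a minimal sketch of layering a growth policy on top of a capacity call. The names smartCapacity and smartReserve are hypothetical, and the built-in reserve stands in for the exact-size function being asked for (the built-in one only promises at least the requested capacity):

```d
import std.stdio : writeln;

// Hypothetical policy: round a request up to the next power of two,
// so that repeated appends trigger few reallocations.
size_t smartCapacity(size_t requested)
{
    size_t cap = 1;
    while (cap < requested)
        cap *= 2;
    return cap;
}

// Hypothetical wrapper: over-allocation becomes something the caller
// opts into, instead of something the primitive decides.
void smartReserve(T)(ref T[] arr, size_t requested)
{
    arr.reserve(smartCapacity(requested));
}

void main()
{
    int[] a;
    a.smartReserve(10_000);
    // The built-in reserve guarantees at least the requested capacity.
    assert(a.capacity >= 16_384);
    writeln(smartCapacity(10_000)); // prints 16384
}
```

Going the other way, from a primitive that over-allocates to one that allocates exactly, is not something a wrapper like this can do.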

Assuming you see this post, what are your thoughts here?
March 31, 2010
Mike S:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html

It's a nice read. I don't see them switching to D soon. If you whisper to them that D is based on a GC, they will run away screaming :-)

Bye,
bearophile
April 01, 2010
bearophile wrote:
> Mike S:
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html
> 
> It's a nice read. I don't see them switching to D soon. If you whisper to them that D is based on a GC, they will run away screaming :-)
> 
> Bye,
> bearophile

Hah...well, there's a reason I'm still just looking into D rather than diving in headfirst! :p  Actually though, I do believe the needs of game programmers should be taken seriously while considering D's evolution: Right now, D hasn't completely found its niche, but it seems to position itself as a sane successor to C++ for systems-level programming.  As it stands, I believe there are only two major kinds of programmers who still use C++, and those are game programmers and masochists. ;)  Odds are, D won't be replacing C anytime soon for operating system kernels and such.  It's too low-level for scripting tasks, and most website designers and non-real-time applications programmers use higher-level languages already (C#, Python, PHP...shudder...etc.), and they're unlikely to go back.  I think D will eventually be used for writing other heavy-duty non-OS frameworks and software systems, but if it's really going to become the successor to C++, it's going to have to absorb most of C++'s user base...and that includes game programmers.

You're right that the garbage collector is a major issue - probably the biggest one inherent to the language design - but I haven't determined it's a dealbreaker, at least not yet.  After all, D also allows manual memory management, and freeing memory early apparently helps speed things up anyway (http://stackoverflow.com/questions/472133/turning-off-the-d-garbage-collector).  That part helps ensure control over how much memory is used/available, and the only other issue with the garbage collector is reconciling its running time with the soft real-time constraint that games have to satisfy.  I can think of a few tactics which should help here:
1.)  In addition to permitting better reasoning about memory allocations, freeing most memory manually should reduce the load on the garbage collector and reduce its runtime, right?
2.)  On the simulation side of the game engine, I believe a constant timestep promotes a more robust design, and that means some frames (relatively idle ones) will have plenty of CPU time left over.  If you can explicitly call the garbage collector to make it run during those times instead of at nondeterministic times (can you?), you can maintain a smooth framerate without any GC-induced spikes.
3.)  Can user code execute simultaneously with the GC in other threads (on other cores), or does the GC halt the entire program for safety reasons?  Assuming simultaneous threaded execution is permitted, it would also dramatically reduce the GC's impact on multi-core systems.
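
On tactics 2 and 3: druntime's core.memory module exposes GC.disable, GC.enable and GC.collect, so collections can be scheduled by hand. A minimal sketch, assuming a hypothetical game loop (simulateAndRender and frameHasSpareTime are made-up placeholders, and the every-60th-frame rule is only for illustration):

```d
import core.memory : GC;

// Hypothetical stand-in for one frame of simulation and rendering.
void simulateAndRender() { }

// Hypothetical idle test: a real engine would compare the time used
// so far against its frame budget.
bool frameHasSpareTime(int frame)
{
    return frame % 60 == 0;
}

// Runs frameCount frames and returns how many explicit collections ran.
int runFrames(int frameCount)
{
    GC.disable();              // suppress automatic collections
    scope (exit) GC.enable();  // restore normal behavior on the way out

    int collections = 0;
    foreach (frame; 0 .. frameCount)
    {
        simulateAndRender();
        if (frameHasSpareTime(frame))
        {
            GC.collect();      // full collection at a moment we chose
            ++collections;
        }
    }
    return collections;
}

void main()
{
    assert(runFrames(600) == 10);
}
```

Two caveats, as far as I can tell from the core.memory documentation: GC.collect may need to suspend all running threads for part of the collection, and GC.disable still allows a collection when the runtime considers it necessary, for example when memory runs out.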

Assuming these strategies work, the garbage collector by itself shouldn't be a showstopper.  In the case of dynamic arrays, resizing capacity deterministically is one of those small things that would be really helpful to anal game programmers, and it probably wouldn't hurt anyone else, either.  Plus, it's easier to implement than "smart nondeterministic" resizing anyway. :)
April 01, 2010
Mike S:

>the needs of game programmers should be taken seriously while considering D's evolution:<

The short D1 history shows that designers of small games are willing to use D. Some game designers seem almost desperate to find a usable language simpler than C++. So I agree with you that D2 can be designed keeping an eye on game designers too. But they are very demanding people; it's not easy to satisfy them even with a mature language + compiler + std lib + dev tools. And currently nothing in D2 is mature. For them, maybe not even the most mature thing you can find in the D world, the back-end of ldc (llvm), is mature enough :-)


>Right now, D hasn't completely found its niche, but it seems to position itself as a sane successor to C++ for systems-level programming.<

I am not able to tell the future. Some parts of D design are already old-style:
- Some early design decisions make it hard to inline D virtual functions (so if you write D code in Java style, you see a significant slowdown compared to similar Java code running on HotSpot). So far no one seems to care about this; we'll see if I am right to see a problem here;
- Some of D's unsafe characteristics are being worked on to improve their safety, but there's still a long road to travel; for example, null-safety and integer-overflow safety are still far away. People are still trying to explain to Walter why null-safety has some importance;
- D2 defaults to mutable data. This can be acceptable, I don't know;
- Currently D2 is not designed from the start to work with an IDE (but I think this problem can be fixed without too much work);
- The built-in unit testing and documentation are not fit for professional usage (but the documentation is easy to extend because it is just comments, so it's a smaller problem);
- etc.

A system language is something that you can use to write very small binaries, that can be used to write a kernel like Linux, device drivers for a smaller computer+CPU, etc. Such things are hard to do in D2, I don't see Linus using D2 to write his kernel, he even thinks C++ is unfit. So I see D2 more like a "low-level application language", on a level located somewhere between C and C#. It can also become a numerics language (see below).


>As it stands, I believe there are only two major kinds of programmers who still use C++, and those are game programmers and masochists. ;)<

There's also an army of legacy programmers who have to update and debug tons of C++ code. Part of human society works thanks to a mountain of C++ code. Online programming competitions are usually won by C++ code.
People will find ways to use C++ for many more years, it will probably outlast us all.


>It's too low-level for scripting tasks,<

I have asked several times to have Python-style array/lazy comprehensions in D :-) They help. I think their introduction can reduce the length of D2 programs by 10-30%.
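
For comparison, and not as the built-in syntax being asked for here, current D can express the same idea through library ranges. A minimal sketch using std.algorithm and std.range (the lambda syntax shown is that of present-day D, which is an assumption about the reader's compiler):

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : iota;

void main()
{
    // Python: [x * x for x in range(10) if x % 2 == 0]
    // The chain below is lazy: nothing is computed until it is consumed.
    auto lazySquares = iota(10).filter!(x => x % 2 == 0).map!(x => x * x);

    // .array forces evaluation into a freshly allocated int[].
    int[] squares = lazySquares.array;
    assert(squares == [0, 4, 16, 36, 64]);
}
```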


>I think D will eventually be used for writing other heavy-duty non-OS frameworks and software systems,<

From what I've seen so far I think D2 will appeal to some numerics folks too, so it can eat a bit of Fortran pie too. Some improvements can make D2 more appealing to them, Don is working on this too. (Some ideas from Chapel language can help here, but I think no one here has taken a serious look at it so far).


>You're right that the garbage collector is a major issue - probably the biggest one inherent to the language design - but I haven't determined it's a dealbreaker, at least not yet.<

The situation with the D GC is interesting.
First of all, the D GC is not refined; Java VM GCs are far more advanced. So D will need a much more modern GC.

Another problem is that the current D GC is quite imprecise; this causes leaks when you use it in real programs that have to run for more than a few minutes. Part of this problem can be solved using a better, more precise GC (this can slow it down a bit, but avoids a good amount of memory leaks).

The other problem is intrinsic to the language, which makes it hard or impossible to invent a fully precise GC for D.

And D makes it hard to use a modern generational moving GC. You can't just adopt a Java VM GC for D. Even the Mono GC (which knows the notion of pinned/unpinned memory) can be unfit (because it's designed for mostly unpinned memory). This is partially caused by D being a low-level language with pointers, and partially caused by the D2 type system being unable to tell apart:
1) hand-managed pointers, to GC memory or C heap memory;
2) GC-managed pointers to pinned memory;
3) GC-managed pointers to unpinned memory.
I think Walter thinks that telling them apart in the language makes D too complex, and he may be right. But the current situation makes it hard to design a very efficient GC for D. So I don't think high-performance game designers will use the D GC for the next few years; they will manage most or all of the memory manually. I am ignorant, but I think D designers have to work a little harder at finding ways to allocate unpinned objects. This (with a refined GC able to move unpinned memory, that keeps a stack-like Eden plus two or three generations of objects) can help a lot for programs written in Java style.

But computer science history has shown that if enough people work on a problem they can often find some partial solution. At the beginning Java was very slow. So there's a bit of hope still. Of course, enough GC experts will work on the D GC only if D has some success.


>In the case of dynamic arrays, resizing capacity deterministically is one of those small things that would be really helpful to anal game programmers, and it probably wouldn't hurt anyone else, either.  Plus, it's easier to implement than "smart nondeterministic" resizing anyway. :)<

Don't fix your mind only on that problem; it's only one of many problems.
Dynamic arrays may look simple, but from what I've seen there's nothing simple about D dynamic arrays. People have found a way to improve them a little only now, after years of discussion, and I am not yet fully sure the recent changes are worth it: I am not sure that the current D2 arrays are better than the older ones + an appender helper struct. There is no good benchmarking suite yet to test whether they are an improvement.
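
For readers who haven't met the helper-struct approach, Phobos ships one as std.array.appender. A minimal sketch of how it is typically used (the element count of 1_000 is arbitrary):

```d
import std.array : appender;

void main()
{
    // The appender keeps its own capacity bookkeeping, separate from
    // the slice it eventually hands back.
    auto app = appender!(int[])();
    app.reserve(1_000);          // ask for room up front

    foreach (i; 0 .. 1_000)
        app.put(i);

    int[] result = app.data;     // the accumulated array
    assert(result.length == 1_000);
    assert(result[999] == 999);
}
```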

Bye,
bearophile
April 01, 2010
bearophile wrote:
> The short D1 history shows that designers of small games are willing to use D. Some game designers seem almost desperate to find a usable language simpler than C++. So I agree with you that D2 can be designed keeping an eye on game designers too. But they are very demanding people; it's not easy to satisfy them even with a mature language + compiler + std lib + dev tools. And currently nothing in D2 is mature. For them, maybe not even the most mature thing you can find in the D world, the back-end of ldc (llvm), is mature enough :-)
> 
Yeah, you're right about the demanding tool and maturity requirements that game studios have, but assuming people continue working on D and other people adopt it and enhance the tools, those things will flesh out over time.  I'm young enough that I look forward to seeing it overtake C++ in the game world someday.

> 
> 
> I am not able to tell the future. Some parts of D design are already old-style:
> - Some early design decisions make it hard to inline D virtual functions (so if you write D code in Java style, you see a significant slowdown compared to similar Java code running on HotSpot). So far no one seems to care about this; we'll see if I am right to see a problem here;

Well, writing code Java-style is certainly no problem for game devs, considering they already minimize virtual function usage, at least in lower code layers. ;)

> <snip>
> 
> A system language is something that you can use to write very small binaries, that can be used to write a kernel like Linux, device drivers for a smaller computer+CPU, etc. Such things are hard to do in D2, I don't see Linus using D2 to write his kernel, he even thinks C++ is unfit. So I see D2 more like a "low-level application language", on a level located somewhere between C and C#. It can also become a numerics language (see below).

This is true, but I do recall seeing an executable size comparison somewhere, and the D version of a program (hello world?) beat out the C++ version by about a factor of two.  The C version killed both, but still, perhaps D might not be eternally unfit even if C++ is. ;)  Then again, maybe the C++ program was just including a superfluous amount of library code, and maybe D programs are generally larger than their C++ equivalents.  Plus, if it was hello world, it obviously wasn't using a lot of higher-level features.

Either way though, even if D does become fit for kernel/driver code someday, it'll still be a long time before someone actually starts from scratch to write a new kernel using it anyway.

> 
>> As it stands, I believe there are only two major kinds of programmers who still use C++, and those are game programmers and masochists. ;)<
> 
> There's also an army of legacy programmers who have to update and debug tons of C++ code. Part of human society works thanks to a mountain of C++ code. Online programming competitions are usually won by C++ code.
> People will find ways to use C++ for many more years, it will probably outlast us all.

You're right, and I actually realized I misspoke here a little bit ago while I was eating.  The legacy code just might keep people using C++ until the sun dies.  Still, I maintain that game developers and masochists probably comprise a large portion of programmers starting completely new projects in C++. :D

> 
>> It's too low-level for scripting tasks,<
> 
> I have asked several times to have Python-style array/lazy comprehensions in D :-) They help. I think their introduction can reduce the length of D2 programs by 10-30%.

How difficult do you think that would be for the compiler devs to implement in the semantic sense?  Assuming it can be done without major hardship or compromising the design of the language, that would be really cool.  Syntactically speaking, Python list comprehensions make the source so much more compact, expressive, and clean that a statically compiled language using them would really stand out.  If they're implemented correctly, I can't see any reasons why the syntactic sugar would be any slower than spelling everything out explicitly, either. The syntax would have to be a bit different to feel at home in D, but the idea itself probably isn't too foreign.

I also noticed a discussion about Python tuples from October 2009 I think, and native tuples in D would also be useful...more useful than in Python, in fact.  After all, Python lists can contain mixed types (unlike arrays in D), so they make tuples largely redundant except for their different conventional meanings (and except for the ability to use named tuples).  In comparison, built-in tuples in D with similarly elegant syntax would fill in a much larger gap.  I suppose they'd work something like implicitly generated struct types, which could be hastily constructed as lvalues or rvalues, returned from functions, packed/unpacked and passed to functions, etc.  Honestly, I think there's a lot to be learned from the expressiveness of scripting languages (especially Python, given its elegant syntax without all of the Perl/PHP @#$%^&crazysymbols@#$%^&), and I think a lot of it can be readily applied to statically compiled languages without speed hits or design compromises.

> From what I've seen so far I think D2 will appeal to some numerics folks too, so it can eat a bit of Fortran pie too. Some improvements can make D2 more appealing to them, Don is working on this too. (Some ideas from Chapel language can help here, but I think no one here has taken a serious look at it so far).

It interested me when you mentioned numerics above, because game engine design is also moving more in the direction of dataflow-oriented programming, where the object hierarchy is more streamlined and there's more of a focus on transforming one type of data to another.  This helps with both cache efficiency and exposing data-level parallelism.  With all of the vector and matrix math involved in those transformation steps (physics engines, etc.), the same improvements that appeal to FORTRAN/numerics programmers might also find some use cases here.  Of course, I could be talking out of my ass, since I don't quite know what D 2.0 improvements you're referring to, but I imagine they might apply.

> The situation with the D GC is interesting.
> First of all, the D GC is not refined; Java VM GCs are far more advanced. So D will need a much more modern GC.
> 
> Another problem is that the current D GC is quite imprecise; this causes leaks when you use it in real programs that have to run for more than a few minutes. Part of this problem can be solved using a better, more precise GC (this can slow it down a bit, but avoids a good amount of memory leaks).
> 
> The other problem is intrinsic to the language, which makes it hard or impossible to invent a fully precise GC for D.
> 
> And D makes it hard to use a modern generational moving GC. You can't just adopt a Java VM GC for D. Even the Mono GC (which knows the notion of pinned/unpinned memory) can be unfit (because it's designed for mostly unpinned memory). This is partially caused by D being a low-level language with pointers, and partially caused by the D2 type system being unable to tell apart:
> 1) hand-managed pointers, to GC memory or C heap memory;
> 2) GC-managed pointers to pinned memory;
> 3) GC-managed pointers to unpinned memory.
> I think Walter thinks that telling them apart in the language makes D too complex, and he may be right. But the current situation makes it hard to design a very efficient GC for D. So I don't think high-performance game designers will use the D GC for the next few years; they will manage most or all of the memory manually. I am ignorant, but I think D designers have to work a little harder at finding ways to allocate unpinned objects. This (with a refined GC able to move unpinned memory, that keeps a stack-like Eden plus two or three generations of objects) can help a lot for programs written in Java style.

Yeah, that's really a shame about the current state of the garbage collector.  Any memory leaks at all are the death knell of console games, and they're also the death of pretty much any long-running application.  Are the memory leaks eternal and irrevocable, or are we just talking about memory that takes a long time for the garbage collector to figure out it should free?

That said, my interest in D is less about its state today and more about its state tomorrow.  I'm not planning on dying anytime soon, so I imagine I'll be coding for a long time...and I hope it's not just going to be C++, C++, and more C++ for the rest of my life!

> 
> But computer science history has shown that if enough people work on a problem they can often find some partial solution. At the beginning Java was very slow. So there's a bit of hope still. Of course, enough GC experts will work on the D GC only if D has some success.

That's pretty much the way I look at it too.  Assuming people don't just abandon D, it's only a matter of time before the genius programmers of the world fix the rough spots.

> 
> Don't fix your mind only on that problem; it's only one of many problems.
> Dynamic arrays may look simple, but from what I've seen there's nothing simple about D dynamic arrays. People have found a way to improve them a little only now, after years of discussion, and I am not yet fully sure the recent changes are worth it: I am not sure that the current D2 arrays are better than the older ones + an appender helper struct. There is no good benchmarking suite yet to test whether they are an improvement.

You're right:  It's just that when I saw this thread, I figured I could bring up this one problem of many which happens to be an easy fix. :)

> Bye,
> bearophile
April 01, 2010
On Wed, 31 Mar 2010 17:57:07 -0400, Mike S <mikes@notarealaddresslololololol.com> wrote:

> Steven Schveighoffer wrote:
>>  You are correct, setCapacity ensures that *at least* the given number of elements will be available for appending.
>>  I planned on making the function a property (but a bug would not allow that), the original intended usage was:
>>  a.capacity = 10000;
>>  Reserve doesn't work in this context.  Can you come up with a name that does?
>>  I'll bring up reserve (as a function) as an alternative on the phobos mailing list, and see what people say.  I kind of liked the setter/getter idea, but you make a good point.
>>  -Steve
>
> Sorry if resurrecting this thread is against netiquette, but it caught my eye, and this is my first newsgroup post in years. ;)
>
> Anyway, is there any compelling reason why setCapacity or modifying a.capacity should allocate a nondeterministic amount of storage?

What do you mean by nondeterministic?  It's very deterministic, just not always easy to determine ;)  However, given enough context, it's really easy to determine.

> Depending on the application, programmers might require strict control over memory allocation patterns and strict accounting for allocated memory.  Game programmers, especially console game programmers, tend to strongly prefer deterministic allocation patterns, and nondeterminism is one of the [several] common complaints about the C++ STL (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html is a good resource on these kinds of issues).  In the case of D (which I'm considering learning), this is especially important for dynamic arrays, partly because they're so useful by themselves, and partly because they may form the backbone of custom containers.

The amount of memory given is determined by the GC, and ultimately by the OS.  The currently supported OSes allocate in Page-sized chunks, so when you allocate any memory from the OS, you are allocating a page (4k).  Most likely, you may not need a whole page for the data you are allocating, so the GC gives you more finely sized chunks by breaking up a page into smaller pieces.  This strategy works well in some cases, and can be wasteful in others.  The goal is to strike a balance that is "good enough" for everyday programming, but can be specialized when you need it.

If you want to control memory allocation yourself, you can always do that by allocating page-sized chunks and doing the memory management on those chunks yourself.  I do something very similar in dcollections to speed up allocation/destruction.

> Whereas it's easy to add "smart nondeterministic" behavior to a deterministic setCapacity function by providing a wrapper, ordinary language users can't do the opposite.  Because of this, and because dynamic arrays are so central to the D language, a nondeterministic setCapacity function may deter game programmers, especially console programmers, from adopting D.
>
> Assuming you see this post, what are your thoughts here?

I think D has deterministic allocation, and better ability than C++ to make custom types that look and act like builtins.  Therefore, you can make an array type that suits your needs and is almost exactly the same syntax as a builtin array (except for some things reserved for builtins, like literals).  Such a thing is certainly possible, even with using the GC for your allocation.

BTW, I made the change to the runtime renaming the function previously known as setCapacity to reserve.  It won't be a property, even if that bug is fixed.
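
A minimal sketch of how the renamed function reads in use, assuming a plain int[] and the same 10000-element request as in the earlier example:

```d
void main()
{
    int[] a;

    // reserve returns the resulting capacity, which may exceed the request.
    size_t got = a.reserve(10_000);
    assert(got >= 10_000);
    assert(a.capacity >= 10_000);
    assert(a.length == 0);        // reserving does not change the length

    // Appending within the reserved capacity should not move the array.
    auto before = a.ptr;
    foreach (i; 0 .. 10_000)
        a ~= i;
    assert(a.ptr is before);
    assert(a.length == 10_000);
}
```

The capacity getter stays a read-only property; only the setter side became the reserve function.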

-Steve
April 01, 2010
Steven Schveighoffer wrote:
> What do you mean by nondeterministic?  It's very deterministic, just not always easy to determine ;)  However, given enough context, it's really easy to determine.

When I say deterministic, I'm referring to determinism from the user's point of view, where the allocation behavior is affected solely by the parameter (the size request, e.g. 10000 objects) and not by some kind of  internal state, hidden context, or arcane black magic. :p

> The amount of memory given is determined by the GC, and ultimately by the OS.  The currently supported OSes allocate in Page-sized chunks, so when you allocate any memory from the OS, you are allocating a page (4k).  Most likely, you may not need a whole page for the data you are allocating, so the GC gives you more finely sized chunks by breaking up a page into smaller pieces.  This strategy works well in some cases, and can be wasteful in others.  The goal is to strike a balance that is "good enough" for everyday programming, but can be specialized when you need it.

That's understandable, and it makes sense that the actual memory being allocated would correspond to some chunk size.  It's really just opaque black box behavior that poses a problem; if users are given well-defined guidelines and chunk sizes, that would work just fine.  For instance, a spec like, "reserve a multiple of 512 bytes and that's exactly what you will be given," would allow users to minimize wastefulness and know precisely how much memory they're allocating.

> 
> If you want to control memory allocation yourself, you can always do that by allocating page-sized chunks and doing the memory management on those chunks yourself.  I do something very similar in dcollections to speed up allocation/destruction.
> 
> <snip>
> 
> I think D has deterministic allocation, and better ability than C++ to make custom types that look and act like builtins.  Therefore, you can make an array type that suits your needs and is almost exactly the same syntax as a builtin array (except for some things reserved for builtins, like literals).  Such a thing is certainly possible, even with using the GC for your allocation.

That parallels what game devs do in C++:  They tend to use custom allocators a lot, and they're likely to follow the same basic strategy in D too, if/when it becomes a suitable replacement.  I'm still just browsing though, and I'm not all that familiar with D.  If you can't actually use the built-in dynamic arrays for this purpose, how difficult would it be to reimplement a contiguously stored dynamic container using custom allocation?  I suppose you'd have to build it from the ground up using a void pointer to a custom allocated block of memory, right?  Do user-defined types in D have any/many performance disadvantages compared to built-ins?

> 
> BTW, I made the change to the runtime renaming the function previously known as setCapacity to reserve.  It won't be a property, even if that bug is fixed.
> 
> -Steve

That's a bit of a downer, since a capacity property would have nice symmetry with the length property.  I suppose there were good reasons though.  Considering the name change, does that mean reserve can only reserve new space, i.e. it can't free any that's already been allocated?  (That makes me wonder:  Out of curiosity, how does the garbage collector know how much space is allocated to a dynamic array or especially to a void pointer?  I suppose it's registered somewhere?)
April 01, 2010
Mike S:

>Well, writing code Java-style is certainly no problem for game devs,<

Right. Sorry, here I was talking about D usage more in general, for example by young programmers coming out of university.


>it'll still be a long time before someone actually starts from scratch to write a new kernel using it anyway.<

People will try to use D2 for this purpose too, for example for teaching purposes.


>How difficult do you think that would be for the compiler devs to implement in the semantic sense?  Assuming it can be done without major hardship or compromising the design of the language, that would be really cool.<

They are easy to implement. Even the lazy ones. See ShedSkin "compiler".


>I also noticed a discussion about Python tuples from October 2009 I think, and native tuples in D would also be useful...<

We can talk about them again for D3. At the moment D2 needs fewer new features and better implementation/debugging of the features already present.


>I think there's a lot to be learned from the expressiveness of scripting languages<

That was one of the original goals of D.


>Are the memory leaks eternal and irrevocable, or are we just talking about memory that takes a long time for the garbage collector to figure out it should free?<

I am mostly talking about false pointers: values that the GC thinks are pointers when they are not. They can keep blocks of memory alive.


>Assuming people don't just abandon D, it's only a matter of time before the genius programmers of the world fix the rough spots.<

But there are limits to what smart people can invent/solve. So the language designers have to work to allow them to find solutions.

Bye,
bearophile
April 01, 2010
On Thu, 01 Apr 2010 01:41:02 -0400, Mike S <mikes@notarealaddresslololololol.com> wrote:

> Steven Schveighoffer wrote:
>> What do you mean by nondeterministic?  It's very deterministic, just not always easy to determine ;)  However, given enough context, it's really easy to determine.
>
> When I say deterministic, I'm referring to determinism from the user's point of view, where the allocation behavior is affected solely by the parameter (the size request, e.g. 10000 objects) and not by some kind of   internal state, hidden context, or arcane black magic. :p

It's abstracted to the GC, but the current GC is well defined.  If you request to allocate blocks with length of a power of 2 under a page, you will get exactly that length, all the way down to 16 bytes.  If you request to allocate a page or greater, you get a contiguous block of memory that is a multiple of a page.
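
As a rough model of that rule, here is a sketch of the block size a given request would map to. It assumes a 4k page, treats requests that are not exact powers of 2 as rounding up to the next one, and ignores any bookkeeping bytes the runtime may keep inside the block; blockSizeFor is a hypothetical helper, not a runtime function:

```d
enum size_t pageSize = 4096;   // assumed page size (4k)
enum size_t minBlock = 16;     // smallest block size mentioned above

// Hypothetical model of the size classes described above.
size_t blockSizeFor(size_t request)
{
    // A page or more: round up to a whole number of pages.
    if (request >= pageSize)
        return (request + pageSize - 1) / pageSize * pageSize;

    // Under a page: next power of two, starting at 16 bytes.
    size_t size = minBlock;
    while (size < request)
        size *= 2;
    return size;
}

void main()
{
    assert(blockSizeFor(10) == 16);
    assert(blockSizeFor(2_048) == 2_048);
    assert(blockSizeFor(40_000) == 40_960);  // 10 pages
}
```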

With that definition, is the allocator deterministic enough for your needs?

>
>> The amount of memory given is determined by the GC, and ultimately by the OS.  The currently supported OSes allocate in Page-sized chunks, so when you allocate any memory from the OS, you are allocating a page (4k).  Most likely, you may not need a whole page for the data you are allocating, so the GC gives you more finely sized chunks by breaking up a page into smaller pieces.  This strategy works well in some cases, and can be wasteful in others.  The goal is to strike a balance that is "good enough" for everyday programming, but can be specialized when you need it.
>
> That's understandable, and it makes sense that the actual memory being allocated would correspond to some chunk size.  It's really just opaque black box behavior that poses a problem; if users are given well-defined guidelines and chunk sizes, that would work just fine.  For instance, a spec like, "reserve a multiple of 512 bytes and that's exactly what you will be given," would allow users to minimize wastefulness and know precisely how much memory they're allocating.

I think in the interest of allowing innovative freedom, such requirements should be left up to the GC implementor, not the spec or runtime.  Anyone who wants to closely control memory usage should just understand how the GC they are using works.

>
>>  If you want to control memory allocation yourself, you can always do that by allocating page-sized chunks and doing the memory management on those chunks yourself.  I do something very similar in dcollections to speed up allocation/destruction.
>>  <snip>
>>  I think D has deterministic allocation, and better ability than C++ to make custom types that look and act like builtins.  Therefore, you can make an array type that suits your needs and is almost exactly the same syntax as a builtin array (except for some things reserved for builtins, like literals).  Such a thing is certainly possible, even with using the GC for your allocation.
>
> That parallels what game devs do in C++:  They tend to use custom allocators a lot, and they're likely to follow the same basic strategy in D too, if/when it becomes a suitable replacement.  I'm still just browsing though, and I'm not all that familiar with D.  If you can't actually use the built-in dynamic arrays for this purpose, how difficult would it be to reimplement a contiguously stored dynamic container using custom allocation?  I suppose you'd have to build it from the ground up using a void pointer to a custom allocated block of memory, right?  Do user-defined types in D have any/many performance disadvantages compared to built-ins?

No, you would most likely use templates, not void pointers.  D's template system is far more advanced than C++'s, and I used it to implement my custom allocators.  It works great.

User-defined types are as high performance as builtins as long as the compiler inlines properly.
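As a rough illustration of the template approach (not the dcollections code), here is a minimal sketch of a contiguous container that allocates exactly what it is asked for.  ExactArray and its members are names invented for this example, it uses the C heap via core.stdc.stdlib rather than the GC, and it is only suitable for plain-old-data element types since it never runs element constructors or destructors.

```d
import core.stdc.stdlib : free, realloc;

// Hypothetical container: typed through a template, no void pointers needed.
struct ExactArray(T)
{
    private T* ptr;
    private size_t len;
    private size_t cap;

    @disable this(this);        // forbid implicit copies, so no double free

    ~this() { free(ptr); }      // free(null) is a no-op

    // Grow storage to exactly n elements; never shrinks.
    void reserve(size_t n)
    {
        if (n <= cap) return;
        auto p = cast(T*) realloc(ptr, n * T.sizeof);
        assert(p !is null, "allocation failed");
        ptr = p;
        cap = n;
    }

    // Append one element, doubling only when the reserved space runs out.
    void put(T value)
    {
        if (len == cap) reserve(cap ? cap * 2 : 4);
        ptr[len++] = value;
    }

    ref T opIndex(size_t i) { assert(i < len); return ptr[i]; }
    T[] opSlice() { return ptr[0 .. len]; }
    size_t length() const { return len; }
    size_t capacity() const { return cap; }
}

void main()
{
    ExactArray!int a;
    a.reserve(10);
    assert(a.capacity == 10);   // exactly what was requested
    foreach (i; 0 .. 10) a.put(i);
    assert(a.capacity == 10);   // filling the reserved space did not grow it
    assert(a[3] == 3);
    assert(a[].length == 10);
}
```

Indexing and slicing read the same as with a builtin array; the growth policy in put is the only place where "smart" behavior lives, so it can be swapped out without touching reserve.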

>
>>  BTW, I made the change to the runtime renaming the function previously known as setCapacity to reserve.  It won't be a property, even if that bug is fixed.
>>  -Steve
>
> That's a bit of a downer, since a capacity property would have nice symmetry with the length property.  I suppose there were good reasons though.  Considering the name change, does that mean reserve can only reserve new space, i.e. it can't free any that's already been allocated?

Capacity still exists as a read-only property.  I did like the symmetry, but the point was well taken that the act of setting the capacity was not exact.  It does mean that reserving space can only grow, not shrink.  In fact, the capacity property calls the same runtime function as reserve, just passing 0 as the amount requested to get the currently reserved space.

You can't use capacity to free space because that could result in dangling pointers.  Freeing space is done through the delete keyword.  We do not want to make it easy to accidentally free space.
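A short sketch of how reserve and the read-only capacity property fit together on a builtin array, assuming a runtime that already has the renamed function.  The sizes are arbitrary.

```d
void main()
{
    int[] a;
    a.reserve(10_000);               // ask for room for at least 10,000 ints
    assert(a.length == 0);           // reserving does not change the length
    assert(a.capacity >= 10_000);    // at least, not necessarily exactly

    auto cap = a.capacity;
    a.reserve(100);                  // a smaller request cannot shrink the block
    assert(a.capacity >= cap);

    auto before = a.ptr;
    foreach (i; 0 .. 10_000)
        a ~= i;                      // appends fit in the reserved space
    assert(a.ptr is before);         // so the data was never moved
}
```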

> (That makes me wonder:  Out of curiosity, how does the garbage collector know how much space is allocated to a dynamic array or especially to a void pointer?  I suppose it's registered somewhere?)

The GC can figure out what page an interior pointer belongs to, and therefore how much memory that block uses.  There is a GC function to get the block info of an interior pointer, which returns a struct that contains the pointer to the block, the length of the block, and its flags (whether it contains pointers or not).  This function is what the array append feature uses to determine how much capacity can be used.  I believe this lookup is logarithmic in complexity.
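For the curious, a minimal sketch of that lookup from user code, assuming GC.query from core.memory is the function in question.  The 100-byte allocation and the offset of 10 are arbitrary.

```d
import core.memory : GC;
import std.stdio : writefln;

void main()
{
    auto base = cast(ubyte*) GC.malloc(100);
    auto interior = base + 10;            // points into the middle of the block

    auto info = GC.query(interior);       // block info for an interior pointer
    assert(info.base is cast(void*) base);   // start of the containing block
    assert(info.size >= 100);                // size of the whole block

    // The attribute bits record, among other things, whether the block
    // is scanned for pointers.
    bool noScan = (info.attr & GC.BlkAttr.NO_SCAN) != 0;
    writefln("block of %s bytes, NO_SCAN = %s", info.size, noScan);
}
```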

-Steve
April 01, 2010
bearophile wrote:
>> How difficult do you think that would be for the compiler devs to implement in the semantic sense?  Assuming it can be done without major hardship or compromising the design of the language, that would be really cool.<
> 
> They are easy to implement. Even the lazy ones. See ShedSkin "compiler".

I figured the eager ones wouldn't be a problem, but I wondered whether the lazy ones might be a pain.  Guess not, so cool. :)

> We can talk about them again for D3. At the moment D2 needs fewer new features and better implementation/debugging of the already present features.

That's very true...I'm looking forward to Andrei's book, but I can't imagine how he's finishing it on schedule, considering how quickly both the language itself and the compiler are evolving.  If the language specification and reference compiler are both as incomplete, volatile, and partially implemented as they are now, a June release might do some real damage to D's reputation.

As far as D3 goes though:  Obviously nothing about it has really been discussed at length, but is the general idea that it would be another backwards-incompatible overhaul, or is the plan to make D2 the target for backwards compatibility from here on out?