August 13, 2004
All good points. As long as there is some easy means of avoiding unwanted initialization overhead (particularly vis-a-vis the stack, since it is by far the fastest place to allocate locally-scoped storage), then I'd certainly be all for it.

Regarding the stack in particular, the declaration:

>      // allocate stack storage
>      char[] workspace = stackalloc char[1024*32];

... would likely map to the alloca() function (which does not waste valuable time initializing all of the requested storage space). However, as Walter has described it, the GC is apparently incompatible with usage of alloca() also ~ because potentially old GC references will suddenly appear back in the "live" stack area again. If this is truly the case, then you should expect the same pushback on this notion as I've had with the notion of an "uninitialized" qualifier, or Ben's alternative "=void". After all, your "stackalloc" keyword is a variation on the theme.

On the other hand, if the GC turns out to be compatible with alloca (as it is today), then we can all sing and dance with joy when Walter wraps that function with some "more accessible" syntax <g>. I prefer Ben's approach though, because it's potentially applicable in more places than just the stack.

Bon Chance.


"Andy Friesen" <andy@ikagames.com> wrote in message news:cfivtl$vcr$1@digitaldaemon.com...
> van eeshan wrote:
> > With respect Andy, the programmer was quite explicit about where the
memory
> > should come from ... it's a locally-declared array, and the language documentation points out such declarations are place on the stack (just
like
> > C/C++) because there are some very good reasons to do so. If you
abstract
> > that notion away, then we're not discussing D anymore. Note that the declaration does not have static, const, or any other qualifier either.
>
> One of the greatest things about D is that we have a chance to change it.  Therefore, we are discussing D even if we abstract that notion away, because D itself may follow suit if we can convince Walter. :)
>
> The declaration, taken purely at face value:
>
>      char[1024 * 32] workspace;
>
> declares 32768 consecutive chars under the name 'workspace'.  Since the 'new' keyword is not used, this storage is only guaranteed to exist within the function's scope.
>
> It doesn't necessarily have to say any more than that: the D language
> (not the D compiler implementation) is much more than spec for a
> portable assembler, (as C was) and it makes perfect sense to describe it
> in conceptual terms where possible because concepts are by far the most
> common use case. (code is rarely optimized before it is written :) )
>
> Ideally, the compiler knows what the output has to do, but is completely free to achieve that any way it wants. (this is the main reason O'Caml is so fast, as I understand it)
>
> However, there's no denying that D is a systems language, so it goes without saying that there absolutely must be a convenient way to say that you *do* care precisely where that storage comes from and how you want it to be handled.
>
> I like how C# handles this: it has a 'stackalloc' keyword which, true to its name, allocates stack storage.  By being a language primitive, it presents plenty of information to the code optimizer.
>
> So, hypothetically, we would be able to write
>
>      // want chars.  Don't care where they came from.
>      char[1024*32] workspace;
>
>      // get 'new' storage from the GC.
>      char[] workspace = new char[1024*32];
>
>      // allocate stack storage
>      char[] workspace = stackalloc char[1024*32];
>
>   -- andy


August 13, 2004
van eeshan wrote:
> All good points. As long as there is some easy means of avoiding unwanted
> initialization overhead (particularly vis-a-vis the stack, since it is by
> far the fastest place to allocate locally-scoped storage), then I'd
> certainly be all for it.
> 
> Regarding the stack in particular, the declaration:
> 
> 
>>     // allocate stack storage
>>     char[] workspace = stackalloc char[1024*32];
> 
> 
> ... would likely map to the alloca() function (which does not waste valuable
> time initializing all of the requested storage space). However, as Walter
> has described it, the GC is apparently incompatible with usage of alloca()
> also ~ because potentially old GC references will suddenly appear back in
> the "live" stack area again. If this is truly the case, then you should
> expect the same pushback on this notion as I've had with the notion of an
> "uninitialized" qualifier, or Ben's alternative "=void". After all, your
> "stackalloc" keyword is a variation on the theme.

I actually didn't mean to address implicit initialization directly at all so much as a need for some way to let the compiler decide what's faster.

I think the same reasoning applies, though: as a language which allows the programmer to hit the metal when he needs to, D needs to offer some mechanism that communicates that the programmer knows better than the compiler.

Using the 'void' keyword for this purpose sounds a lot like the Perl practice of assigning arbitrary meaning to symbols for the sake of terseness.  That being said, I can't think of anything better off the top of my head, so I'll shut up now. :)

 -- andy
August 13, 2004
> Using the 'void' keyword for this purpose sounds a lot like the Perl practice of assigning arbitrary meaning to symbols for the sake of terseness.  That being said, I can't think of anything better off the top of my head, so I'll shut up now. :)

Perl! ouch! ;-)
Void currently means "no type" so it seems very natural to me to extend that
to mean "no initial value" when used as an initializer.

On a related note, I think the only way to get uninitialized memory from the heap should be through malloc.


August 13, 2004
"Andy Friesen"  wrote ..
> I actually didn't mean to address implicit initialization directly at all so much as a need for some way to let the compiler decide what's
faster.
>
> I think the same reasoning applies, though: as a language which allows the programmer to hit the metal when he needs to, D needs to offer some mechanism that communicates that the programmer knows better than the compiler.

In light of what we now know (but still don't quite know :-) I think it's worth recapturing the initial request. I hope you folks will correct and/or embellish the following, as my knowledge of the internal GC operation is meager:

a) there are two things that a programmer uses the stack for: 1) nonchalant
locally-scoped variables, and 2) a deliberate and explicit means of
allocating temporary storage very quickly ( ~50x faster than the heap).

b) the default initialization scheme works great for most variables, and even arrays, where they belong to the nonchalant-usage group.

c) performance issues arise when the programmer attempts #2 above, where the storage requested is of anything other than a truly trivial size (~5 bytes). These issues apparently (still to be clarified) are borne due to the nature and behavior of GC stack-scans. That is, the GC apparently performs more efficiently where the "live" stack does not contain "old" things that look like GC references. The performance penalty incurred for #2 is through an attempt to eliminate such references from the stack content (wipe everything to a default value).

This begs the question, "... how often does the GC run?" And also, "... how much impact do "old" GC references have on its performance?"

Let's make an assertion here first: If a programmer is explicitly and deliberately using the stack for fast storage access (#2), one can safely assume said storage will be given up very soon. After all, if the execution time is long-lived then one might as well allocate on the heap instead; right? The core issue brought up by this thread/topic is about those cases where execution-time is very short-lived, such that the overhead of initialization is enough to be noticeably detrimental. Let's give this short execution-window a name:  X ...

Okay; so given this assertion, how many times will the GC become active during X? One might expect something close to zero ... especially so because the currently executing thread will not be making any heap requests. If it did, then the benefits of the original stack-allocation would likely be drowned out anyway.

The problem remains that /another/ thread may cause the GC to scan the (now multiple) stacks during X. So, how often does that happen? Within this short window? One might expect "very few".

Another approach would presumably be to disable GC stack-walks during X; that is, inhibit the GC from scanning a specific stack (the one belonging to the X-thread) for the duration of X.

Either way, the upshot is this: the programmer wants to take full control over what's going on; and D needs a decent mechanism to enable such control. Placing blinkers over the eyes of the programmer is not a good approach to this.







August 13, 2004
Ahh, I hadn't thought about expression temps. Very interesting stuff. I think I may order that CD.

Thanks very much for the information.


>> Just out of curiosity, how does alloca() work? I guess it just manipulates
>the
>> Stack and Base Pointers?
>>
>> My apologies for going a bit off topic, I've just never seen alloca()
>before.
>
>It manipulates the ESP register. There is some complexity in it due to needing to 'touch' each 4k page allocated, and to relocate expression temporaries that are pushed on the stack. The source for it comes on the DMC++ CD.
>
>


August 16, 2004
Pretty ignorant of some of this thread, I would like to make some points anyway.

Firstly, I would consider the ability to request storage on the stack without it
being initialised a rather important feature for a system programming language.
Yes, I absolutely want this piece of storage on the stack, and no, my dear
compiler, I do not want you to waste my time by filling it with zeroes. ;)
There have been some fine suggestions that address this point, yet they appear
to be less accessible (like alloca, or even asm which is not even portable)
than, or bring the semantics of the language too far away from, the way C or C++
do it.
I recognise that D need not be another iteration of C++, yet I consider the ease
of adoption of D for folks used to C'ish languages to be an important strength.
If someone as simple as int[2048] foo; could have a meaning (`` allocate GC'ed
heap memory '') `` totally '' different from that in C (`` subq %rsp, 8192 ''),
it would probably appear to be unintuitive.

But that is just my unbased personal opinion.

Therefore I consider some simple qualifier with obvious semantics like `` uninitialised '' to be worth the possible danger. This does not mean that everybody should default to using uninitialised memory unless they absolutely need initialisation, for methological reasons that were well stated by Walter. But if you want it, here it is, no need for alloca or asm hacks.

This might confuse the garbage collector, yet I do not see how that is much of a
problem. Some of the uses of uninitialised arrays demonstrated in this thread, I
think, quickly overwrote the array with memcpy'd data, and in others the array
might not even live long enough to `` block '' memory from being free'd for any
timespan that would matter.
Of course the problem exists, but `` false positives '' are not exactly fatal.


As another approach to this problem, I think it might be a worthwhile hack to
the semantics of the language itself to have all pointers' memory be zeroed when
it goes out of scope.
This basically means that pointers do the opposite of what arrays do, being ``
initialised '' once we do not need them anymore instead of being initialised
when they are created.
So when the memory occupied by them goes back in scope due to any means of
allocating uninitialised memory on the stack, what used to be used for pointers
does now not confuse the garbage collector.
Considering we already have a lot of initialisation going on, I do not think
that it would be a terrible overhead to have deinitialisation, but I might be
wrong, of course.


Please excuse my amateurish intervention. I certainly have no qualifications at all to point at anything you folks have said and say `` NO YOU ARE WRONG! '', yet I hope that I might have helped.


August 16, 2004
On Mon, 16 Aug 2004 04:43:43 +0000 (UTC), ben@0x539.de wrote:

> Pretty ignorant of some of this thread, I would like to make some points anyway.
> 
> Firstly, I would consider the ability to request storage on the stack without it
> being initialised a rather important feature for a system programming language.
> Yes, I absolutely want this piece of storage on the stack, and no, my dear
> compiler, I do not want you to waste my time by filling it with zeroes. ;)
> There have been some fine suggestions that address this point, yet they appear
> to be less accessible (like alloca, or even asm which is not even portable)
> than, or bring the semantics of the language too far away from, the way C or C++
> do it.
> I recognise that D need not be another iteration of C++, yet I consider the ease
> of adoption of D for folks used to C'ish languages to be an important strength.
> If someone as simple as int[2048] foo; could have a meaning (`` allocate GC'ed
> heap memory '') `` totally '' different from that in C (`` subq %rsp, 8192 ''),
> it would probably appear to be unintuitive.
> 
> But that is just my unbased personal opinion.
> 
> Therefore I consider some simple qualifier with obvious semantics like `` uninitialised '' to be worth the possible danger. This does not mean that everybody should default to using uninitialised memory unless they absolutely need initialisation, for methological reasons that were well stated by Walter. But if you want it, here it is, no need for alloca or asm hacks.
> 
> This might confuse the garbage collector, yet I do not see how that is much of a
> problem. Some of the uses of uninitialised arrays demonstrated in this thread, I
> think, quickly overwrote the array with memcpy'd data, and in others the array
> might not even live long enough to `` block '' memory from being free'd for any
> timespan that would matter.
> Of course the problem exists, but `` false positives '' are not exactly fatal.
> 
> As another approach to this problem, I think it might be a worthwhile hack to
> the semantics of the language itself to have all pointers' memory be zeroed when
> it goes out of scope.
> This basically means that pointers do the opposite of what arrays do, being ``
> initialised '' once we do not need them anymore instead of being initialised
> when they are created.
> So when the memory occupied by them goes back in scope due to any means of
> allocating uninitialised memory on the stack, what used to be used for pointers
> does now not confuse the garbage collector.
> Considering we already have a lot of initialisation going on, I do not think
> that it would be a terrible overhead to have deinitialisation, but I might be
> wrong, of course.
> 
> Please excuse my amateurish intervention. I certainly have no qualifications at all to point at anything you folks have said and say `` NO YOU ARE WRONG! '', yet I hope that I might have helped.

This idea is stilling making sense to me, even when taking Walter's objections into consideration.

Okay, so an uninitialized stack array might be a likely source of bugs. But if its not the default behaviour, and the coder explicitly takes responsibility of it by coding the keyword 'uninitialized' or similar, then why shouldn't D say "Okay if that's what you want, that's what you'll get!"

D allows 'goto' which is probably a more common source of bugs and high maintenance costs.

-- 
Derek
Melbourne, Australia
16/Aug/04 3:26:36 PM
August 30, 2004
So, the resolution on this has apparently been implemented. If you write:

{
    char[] x;
    const int size = 2048;

    x = cast(char[]) alloca(size)[0..size];
}

... then you will get the equivalent of a C/C++ stack array, without all the prior initialization overhead involved. While this is a VeryGoodThing, it could use some syntactic sugar ...

There were a number of suggestions made throughout this topic; here's a brief recap of three:

{
    uninitialized char[2048] x;

    char[2048] x = void;

    char[] x = stackalloc char[2048];
}

The first two may find use outside of stack-allocation, whilst the third is clearly stack-based. The second suggestion avoids creating an additional language 'qualifier'.

How about it?



"Walter" <newshound@digitalmars.com> wrote in message news:cfhhbe$3jf$1@digitaldaemon.com...
>
> "van eeshan" <vanee@hotmail.net> wrote in message news:cfh14k$2ouk$1@digitaldaemon.com...
> > I misread your code somewhat, Water, but what you suggest has a fatal
flaw
> > in it:  you are still initializing the original stack-based array, and you're making the assumption that the typical size is very small.
>
> I'm confused - you wrote: "The allocated (temporary) workspace is
typically
> used in only a small amounts, but sporadically will get used more fully."
My
> suggestion was tuned to that.
>
> > Further,
> > you are then extending the array via the heap which will, I imagine,
> > initialize the extended part of the array.
>
> That's correct.
>
> > That just doesn't work because:
> > a) there's a significant amount of effort to convert all the "workers"
to
> do
> > what you suggest.
>
> Correct, but resizing dynamic arrays in D is pretty easy. It's not as
klunky
> and error-prone as in C.
>
> > Plus, the code is less efficient,
>
> Since the resize will only happen in atypical cases, its efficiency should not impact overall performance.
>
> > there's more of it, and
> > there's possible heap-fragmentation involved.
>
> Heap fragmentation is much less of an issue than it used to be, with large memory pools and virtual memory. I know how to build a gc that will
compact
> memory to deal with fragmentation when it does arise, so although this is not in the current D it is not a technical problem to do it.
>
> > b) the initial size needs to be around the 2 to 3KB mark. Given that,
your
> > example is hardly faster than the original code. In fact, the
> initialization
> > cost is ~57000% slower than the equivalent C/C++ array allocation.
That's
> > better than ~88000%, but who's to quibble at these levels?
>
> If your app typically is going to fill 2 to 3 Kb, I suggest using malloc()
> (or alloca()) for it. The handful of extra instructions to allocate via
> malloc as opposed to the stack will be lost in the thousands of
instructions
> needed to fill 2 to 3 Kb.
>
> > I wish it were so simple as your suggestion conveyed. Conversely, if
there
> > were a syntactic means of optionally inhibiting the array
initialization,
> > there would be no related issue at all. We could all be radiantly happy,
> and
> > have a beer or something.
>
> Allocating large, uninitialized arrays on the stack has its own problems:
>
> 1) The uninitialized data that is on the stack will get scanned by the garbage collector looking for any references to allocated memory. Since
the
> uninitialized data consists of old D stack frames, it is highly likely
that
> some of that garbage will look like references into the gc heap, and the
gc
> memory will not get freed. This problem really does happen, and can be pretty frustrating to track down. Some of my C code that uses GC gets a memset() on the stack frame for that reason.
>
> 2) It's possible for a function to pass out of it a reference to data on that function's stack frame. By then allocating a new stack frame over the old data, and not initializing, the reference to the old data may still appear to be valid. The program will then behave erratically. Initializing all data on the stack frame will greatly increase the probability of
forcing
> that bug into the open in a repeatable manner.
>
> 3) Large arrays on the stack bring up the specter of stack exhaustion with even modest amounts of recursion.
>
> That said, there are things I can do to generate faster code for the initialization.
>
>


1 2 3 4 5 6
Next ›   Last »