Jump to page: 1 2
Thread overview
alloca is slow and dangerous
Jan 01, 2021
welkam
Jan 01, 2021
IGotD-
Jan 01, 2021
Patrick Schluter
Jan 03, 2021
welkam
Jan 03, 2021
IGotD-
Jan 03, 2021
welkam
Jan 03, 2021
welkam
Jan 04, 2021
Iain Buclaw
Jan 04, 2021
Johan
Jan 04, 2021
Max Haughton
Jan 06, 2021
Jacob Carlborg
January 01, 2021
Over the years I saw several people asking why D doesn't have alloca. I just want to share a video where it states that alloca is slower than static array so we have one more arguments against it.
https://youtu.be/FY9SbqTO5GQ?t=468

In summary alloca is:
1. Hard to implement
2. security problem
3. slower than static array on the stack
January 01, 2021
On Friday, 1 January 2021 at 14:48:11 UTC, welkam wrote:
> Over the years I saw several people asking why D doesn't have alloca. I just want to share a video where it states that alloca is slower than static array

13% slower is nothing compared to the alternative which is malloc or running out of stack (which will happen real fast if you keep piling up worst case fixed size).

January 01, 2021
On Friday, 1 January 2021 at 14:48:11 UTC, welkam wrote:
> Over the years I saw several people asking why D doesn't have alloca. I just want to share a video where it states that alloca is slower than static array so we have one more arguments against it.
> https://youtu.be/FY9SbqTO5GQ?t=468
>
> In summary alloca is:
> 1. Hard to implement

Why is it hard to implement?

> 2. security problem

Not entirely true. You need to check so that the amount elements is within reasonable limits. You can have stack overflows but most operating systems have guard pages detecting such cases. In kernels it might not be the case but in kernels you need pay attention more than in user space programs. Keep in mind static arrays on the stack are prone to overflows just as much as a VLA.

In D static arrays have an additional problem and that is that D will initialize the array by default. For stability this is great but performance this takes more time. VLA can be a better option here as you initialize exactly the amount elements you need.

> 3. slower than static array on the stack

The reason was strangely a lot of code for just allocating on the stack. He said he wasn't even using -O2 optimization and with -O2 it would be smaller. In general it shouldn't be that bad.


The fear of alloca in my opinion is exaggerated.

Doesn't D implement alloca for those who want it?
https://dlang.org/phobos/rt_alloca.html

January 01, 2021
On Friday, 1 January 2021 at 15:19:07 UTC, IGotD- wrote:
> On Friday, 1 January 2021 at 14:48:11 UTC, welkam wrote:
>> Over the years I saw several people asking why D doesn't have alloca. I just want to share a video where it states that alloca is slower than static array so we have one more arguments against it.
>> https://youtu.be/FY9SbqTO5GQ?t=468
>>
>> In summary alloca is:
>> 1. Hard to implement
>
> Why is it hard to implement?
>
>> 2. security problem
>
> Not entirely true. You need to check so that the amount elements is within reasonable limits. You can have stack overflows but most operating systems have guard pages detecting such cases. In kernels it might not be the case but in kernels you need pay attention more than in user space programs. Keep in mind static arrays on the stack are prone to overflows just as much as a VLA.

I think it is even worse than for VLA, because for the VLA you will have used a computed length depending on the parameters. So a real size. With static arrays, one will generally use a "worst case" size which value is much more prone to errors than the real length required.
Anecdote: recent gcc versions (since version 7) have for each version better and better heuristic to check the potential size of a buffers with string combination functions (sprintf, strcpy, strcat etc.). The number of potential worst case buffer overflows in legacy code is staggering.

>
> In D static arrays have an additional problem and that is that D will initialize the array by default. For stability this is great but performance this takes more time. VLA can be a better option here as you initialize exactly the amount elements you need.

yes

>
>> 3. slower than static array on the stack
>
> The reason was strangely a lot of code for just allocating on the stack. He said he wasn't even using -O2 optimization and with -O2 it would be smaller. In general it shouldn't be that bad.

afaik VLA get in the way of clever frame pointer optimisations. Local variables and parameters will have to be accessed with slightly more costly code than if the code knows at compile time exactly where they are. Better optimizers and code generation can alleviate it in a lot of cases.

>
>
> The fear of alloca in my opinion is exaggerated.
>
> Doesn't D implement alloca for those who want it?
> https://dlang.org/phobos/rt_alloca.html

While I'm neither afraid nor sceptical towards VLA/alloca, I have to say after 10 years of use that there are really very few cases where it was really a substantial improvement. It often spared a malloc/free pair but more often than expected added memory copies. YMMV.
January 01, 2021
On 1/1/21 9:48 AM, welkam wrote:
> Over the years I saw several people asking why D doesn't have alloca. I just want to share a video where it states that alloca is slower than static array so we have one more arguments against it.
> https://youtu.be/FY9SbqTO5GQ?t=468
> 
> In summary alloca is:
> 1. Hard to implement
> 2. security problem
> 3. slower than static array on the stack

D has alloca. It's in core.std.stdlib

https://dlang.org/phobos/core_stdc_stdlib.html#.alloca

In my experience, using it isn't extremely beneficial -- since it's in stdc, it requires the c library, which means you have malloc/free, which is much safer/usable.

I'd much rather use a static array, or malloc/scope(exit) free, or a combination between the two that uses the stack when it can, and expands to malloc when needed.

Or just use the GC...

-Steve
January 01, 2021
On Friday, 1 January 2021 at 17:55:33 UTC, Steven Schveighoffer wrote:
> In my experience, using it isn't extremely beneficial -- since it's in stdc, it requires the c library, which means you have malloc/free, which is much safer/usable.

I never use alloca directly, but in C you can just use an int variable for the dynamic array size. I find that useful for things like building a zero terminated path that only will be used with one function call. Or for a FFT buffer that is only known at runtime and I want to use hot memory (already in cache).

To do that fixed size would take maybe 200 KB, could easily be 100 times more than needed... And the next stack frame would be cold as hell...

January 03, 2021
Im going to respond to both messages

On Friday, 1 January 2021 at 15:19:07 UTC, IGotD- wrote:
> Why is it hard to implement?

Because it requires backend help. alloca is a special function

On Friday, 1 January 2021 at 17:48:22 UTC, Patrick Schluter wrote:
> On Friday, 1 January 2021 at 15:19:07 UTC, IGotD- wrote:
>> In D static arrays have an additional problem and that is that D will initialize the array by default. For stability this is great but performance this takes more time. VLA can be a better option here as you initialize exactly the amount elements you need.
>
> yes

Usually this forum has excellent technical responses but this time you dropped ball on this one. Behold

byte[128] = void;

Walter did a talk on this one.
DConf 2019 Day 1 Keynote: Allocating Memory with the D Programming Language -- Walter Bright
https://www.youtube.com/watch?t=2210&v=_PB6Hdi4R7M

On Friday, 1 January 2021 at 15:19:07 UTC, IGotD- wrote:
>> 3. slower than static array on the stack
>
> The reason was strangely a lot of code for just allocating on the stack. He said he wasn't even using -O2 optimization and with -O2 it would be smaller. In general it shouldn't be that bad.

Later he said that when people replaced VLA in kernel they found 13% speed up. While it was not stated kernel should be benchmarked with optimizations enabled.

January 03, 2021
On Friday, 1 January 2021 at 17:55:33 UTC, Steven Schveighoffer wrote:
> D has alloca. It's in core.std.stdlib
>
> https://dlang.org/phobos/core_stdc_stdlib.html#.alloca
>
Does it work on all compilers and all platforms?
January 03, 2021
On Sunday, 3 January 2021 at 19:12:47 UTC, welkam wrote:
>
> Usually this forum has excellent technical responses but this time you dropped ball on this one. Behold
>
> byte[128] = void;
>

I think the default initialization in D is an excellent feature. MS has already stated that they want to do this in C++. Sure your example circumvents that but still the default initialization help. My point was that more that you will allocate exactly the elements you need on the stack. For example you have have a maximum of 1024 elements but usually you only use one or two elements. Then static allocation of 1024 elements become crazy big and also it generate more cache misses.

I'm not sure why the stack allocation needs to be so enormously complicated like the seminar. In general you just need to subtract the stack pointer.


January 03, 2021
On Sunday, 3 January 2021 at 19:33:10 UTC, IGotD- wrote:
> I think the default initialization in D is an excellent feature.
Agree. Also Walter put an escape hatch because he knew that some times you dont want it for performance reasons.

> MS has already stated that they want to do this in C++.
Not only MS but also linux people want default initialize everything so they both put effort in compilers to improve optimizations so they dont pay performance penalty where they should not. And in the end we benefit too.

> My point was that more that you will allocate exactly the elements you need on the stack. For example you have have a maximum of 1024 elements but usually you only use one or two elements. Then static allocation of 1024 elements become crazy big and also it generate more cache misses.
If that's the case then you do like Walter showed. Common case on the stack and uncommon big allocations on heap.

> I'm not sure why the stack allocation needs to be so enormously complicated like the seminar. In general you just need to subtract the stack pointer.
I'm not following you

« First   ‹ Prev
1 2