dynamically allocating on the stack (page 3) - D Programming Language Discussion Forum


April 22, 2018

Re: dynamically allocating on the stack

Posted by Adam D. Ruppe
in reply to Uknown

On Sunday, 22 April 2018 at 01:26:09 UTC, Uknown wrote:
> It's a special case for classes. Makes them usable without the GC.

The intention was actually just to have deterministic destruction: a scope class's destructor runs at the end of the scope (and it used to be that you could force a class to always be scope, too). The stack allocation is simply an optimization enabled by the scope lifetime. That optimization can apply elsewhere too.
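To illustrate the deterministic destruction described here, a minimal sketch (not from the original post). The names `Resource`, `destroyed` and `destroyedAfterBlock` are made up for the example; the counter only exists to make the destructor call observable.

```d
import std.stdio : writeln;

int destroyed; // hypothetical thread-local counter, bumped by the destructor

class Resource
{
    ~this() { ++destroyed; }
}

int destroyedAfterBlock()
{
    destroyed = 0;
    {
        // `scope` ties the object's lifetime to this block, which is
        // what allows the compiler to place it on the stack.
        scope r = new Resource();
    } // the destructor has already run when control gets here
    return destroyed;
}

void main()
{
    writeln(destroyedAfterBlock()); // prints 1
}
```

Without `scope`, the same object would be left to the GC and the counter would normally still be 0 when the function returns.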

April 21, 2018

Re: dynamically allocating on the stack

Posted by Steven Schveighoffer
in reply to Mike Franklin

On 4/21/18 7:47 PM, Mike Franklin wrote:
> On Saturday, 21 April 2018 at 19:06:52 UTC, Steven Schveighoffer wrote:
> 
>> alloca is an intrinsic, and part of the language technically -- it has to be.
> 
> From what I can tell, `alloca` is only available in the platform's C standard library (actually, for Linux it appears to be part of libgcc as `__builtin_alloca`; `alloca` is just an alias for it). Of course I can use third-party libraries like the C library to do this, but it seems like something useful to have in the language for certain use cases and optimizations. Also, my immediate use case is bare-metal microcontroller programming, where I'm intentionally avoiding C and looking for a way to do this in idiomatic D.

As Nick says, it's an intrinsic. The call never gets to the C library. And it can't anyway -- if you give it some thought, you will realize, there's no way for a function to add stack space in the calling function. The compiler has to deal with that new space somehow, and make sure it puts it in the right place.
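For reference, a minimal sketch of how `alloca` is typically used from D (not from the original post). It assumes the declaration in `core.stdc.stdlib`; `fillOnStack` is a made-up name, and treating a null result as failure is an assumption for the example.

```d
import core.stdc.stdlib : alloca;
import std.stdio : writeln;

size_t fillOnStack(size_t len)
{
    // alloca has to be called from the function whose frame should grow;
    // the memory is released automatically when this function returns.
    auto p = cast(char*) alloca(len);
    if (p is null)
        return 0; // assumption: treat a null result as "no stack space"

    char[] buf = p[0 .. len]; // slice the raw pointer; never return this slice
    buf[] = 'x';

    size_t count;
    foreach (c; buf)
        if (c == 'x')
            ++count;
    return count;
}

void main()
{
    writeln(fillOnStack(32)); // prints 32
}
```

The slice must not outlive the call, since the memory belongs to `fillOnStack`'s own frame.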

So you don't need a C library. I frankly don't know why it's in core.stdc, except maybe to be consistent with C?

-Steve

April 21, 2018

Re: dynamically allocating on the stack

Posted by Jonathan M Davis
in reply to Giles Bathgate

On Sunday, April 22, 2018 01:07:44 Giles Bathgate via Digitalmars-d-learn wrote:
> On Saturday, 21 April 2018 at 19:06:52 UTC, Steven Schveighoffer wrote:
> > alloca is an intrinsic, and part of the language technically -- it has to be.
>
> Why does:
>
> scope c = new C();       // allocate c on stack
> scope a = new char[len]; // allocate a via gc?

Because prior to DIP 1000 being introduced, scope did absolutely nothing to parameters if they weren't delegates, and it did nothing to other variables if they weren't classes. So, without -dip1000, scope is ignored on the second line. scope on parameters was to tell the compiler that the delegate didn't escape so that it didn't have to allocate a closure. scope on classes was to force them to be allocated on the stack and have a deterministic lifetime, since classes normally live on the heap. Without -dip1000, it's completely unsafe to do, because the compiler makes no guarantees about references to the class not escaping, which is why std.typecons.scoped was introduced to replace it, and scope on classes was going to be deprecated.
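For readers who haven't seen std.typecons.scoped, a minimal sketch of the library alternative mentioned here (not from the original post). `Point` and `sumOnStack` are made-up names for the example.

```d
import std.typecons : scoped;
import std.stdio : writeln;

class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
    int sum() const { return x + y; }
}

int sumOnStack(int a, int b)
{
    // scoped!Point constructs the object inside a struct that lives in
    // this function's frame. Keep the result in an `auto` variable;
    // assigning it to a plain `Point` would let a reference outlive it.
    auto p = scoped!Point(a, b);
    return p.sum(); // members are reachable through alias this
}

void main()
{
    writeln(sumOnStack(2, 3)); // prints 5
}
```

The object is destroyed when `p` goes out of scope, so no reference to it may be kept past that point.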

However, when Walter was looking into improving @safety with ref and adding some sort of ref-counting mechanism into the language, he found that he needed better guarantees about the lifetime of some objects in @safe code in order to be able to guarantee @safety. So, he came up with DIP 1000, which expands scope to do a lot more. The result is that scope has been greatly expanded with DIP 1000 and its meaning has subtly changed. It becomes closer to what it meant for parameters that were delegates in that it makes guarantees about no references or pointers or whatnot to an object escaping the scope. The effect on classes is then retained, but it becomes an optimization that the compiler can do as a result of what scope allows it to guarantee rather than an instruction by the programmer to tell the compiler to put the object on the stack. As such, the optimization could theoretically be expanded to affect any call to new where the result is assigned to a variable that's marked with scope rather than just classes, but I don't think that it has been yet. Also, because scope is then forced to be used in enough places to allow the compiler to make guarantees about the lifetime of the object and not allow it to escape, the @safety issues with scope on classes are fixed, which more or less negates the need for std.typecons.scoped, and the deprecation for scope on classes has therefore been nixed.

However, unless -dip1000 is used, the old status quo is still in effect, making scope on classes very much unsafe, and until -dip1000 is the default behavior (which will probably be a while), we really can't take advantage of the improvements that it's supposed to provide (and currently, because -dip1000 makes ABI-incompatible changes, you have to have compiled your entire software stack, Phobos included, with -dip1000 to use it, meaning that it's mostly unusable right now).

- Jonathan M Davis

April 22, 2018

Re: dynamically allocating on the stack

Posted by Mike Franklin
in reply to Nicholas Wilson

On Sunday, 22 April 2018 at 00:41:34 UTC, Nicholas Wilson wrote:

> You're not using the C library version of it, the compiler does the stack space reservation inline for you. There is no way around this.

I'm not convinced.  I did some no-runtime testing and eventually found the implementation in druntime here:  https://github.com/dlang/druntime/blob/master/src/rt/alloca.d

Mike

April 22, 2018

Re: dynamically allocating on the stack

Posted by Cym13
in reply to Mike Franklin

On Sunday, 22 April 2018 at 05:29:30 UTC, Mike Franklin wrote:
> On Sunday, 22 April 2018 at 00:41:34 UTC, Nicholas Wilson wrote:
>
>> You're not using the C library version of it, the compiler does the stack space reservation inline for you. There is no way around this.
>
> I'm not convinced.  I did some no-runtime testing and eventually found the implementation in druntime here:  https://github.com/dlang/druntime/blob/master/src/rt/alloca.d
>
> Mike

The first assertion ("the C library isn't called") is easily apparent from
that assembly dump. The second is interesting but not so evident.

It might be clearer looking at actual assembly.

The doSomething function starts as such:

; sym._D4test11doSomethingFmZv (int arg_1h);
    ; prologue, puts the old stack pointer on the stack
      0x563d809095ec      55             push rbp
      0x563d809095ed      488bec         mov rbp, rsp
    ; allocate stack memory
      0x563d809095f0      4883ec20       sub rsp, 0x20
    ; setup arguments for the alloca call
    ; rcx gets the address of a local holding 0x20, the size of the current stack allocation
      0x563d809095f4      48c745e82000.  mov qword [local_18h], 0x20 ; 32
      0x563d809095fc      48ffc7         inc rdi
      0x563d809095ff      48897de0       mov qword [local_20h], rdi
      0x563d80909603      488d4de8       lea rcx, [local_18h]
    ; calls alloca
      0x563d80909607      e830010000     call sym.__alloca

The alloca function works as such:

;-- __alloca:
    ; Note how we don't create a stack frame by "push rbp;mov rbp,rsp"
    ; Those instructions could be inlined, it's not a function per se
    ;
    ; At that point rcx points to the size of the calling function's stack frame
    ; and rdi holds how much we want to add
      0x563d8090973c      4889ca         mov rdx, rcx
      0x563d8090973f      4889f8         mov rax, rdi
    ; Round rax up to 16 bytes
      0x563d80909742      4883c00f       add rax, 0xf
      0x563d80909746      24f0           and al, 0xf0
      0x563d80909748      4885c0         test rax, rax
  ,=< 0x563d8090974b      7505           jne 0x563d80909752
  |   0x563d8090974d      b810000000     mov eax, 0x10
  `-> 0x563d80909752      4889c6         mov rsi, rax
    ; Do the subtraction in rax which holds the new address
      0x563d80909755      48f7d8         neg rax
      0x563d80909758      4801e0         add rax, rsp
    ; Check for overflows
  ,=< 0x563d8090975b      7321           jae 0x563d8090977e
  | ; Replace the old stack pointer by the new one
  |   0x563d8090975d      4889e9         mov rcx, rbp
  |   0x563d80909760      4829e1         sub rcx, rsp
  |   0x563d80909763      482b0a         sub rcx, qword [rdx]
  |   0x563d80909766      480132         add qword [rdx], rsi
  |   0x563d80909769      4889c4         mov rsp, rax
  |   0x563d8090976c      4801c8         add rax, rcx
  |   0x563d8090976f      4889e7         mov rdi, rsp
  |   0x563d80909772      4801e6         add rsi, rsp
  |   0x563d80909775      48c1e903       shr rcx, 3
  |   0x563d80909779      f348a5         rep movsq qword [rdi], qword ptr [rsi]
 ,==< 0x563d8090977c      eb02           jmp 0x563d80909780
 |`-> 0x563d8090977e      31c0           xor eax, eax
 |  ; Done!
 `--> 0x563d80909780      c3             ret

So as you can see, alloca isn't really a function in that it doesn't create a
stack frame. It also needs help from the compiler to set up its arguments,
since the current allocation size is needed (rcx at the beginning of alloca),
which isn't a parameter known by the programmer. The compiler has to detect
that __alloca call and set up an additional argument by itself. Alloca then
just ("just") modifies the calling frame.
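As a side note, the size rounding done at the top of `__alloca` in the dump can be modelled in a few lines of D. This is only a sketch of the arithmetic performed by the `add rax, 0xf` / `and al, 0xf0` / `mov eax, 0x10` sequence; `roundedAllocaSize` is a hypothetical helper name, not something druntime exports, and overflow for huge sizes is ignored.

```d
import std.stdio : writeln;

// Mirrors the arithmetic in the dump: round the request up to a multiple
// of 16 bytes, and never reserve less than 16 bytes.
size_t roundedAllocaSize(size_t n)
{
    // add rax, 0xf ; and al, 0xf0  (clears the low four bits)
    size_t r = (n + 15) & ~cast(size_t) 15;
    // test rax, rax ; jne ... ; mov eax, 0x10
    return r == 0 ? 16 : r;
}

void main()
{
    foreach (n; [0, 1, 16, 17, 33])
        writeln(n, " -> ", roundedAllocaSize(n));
}
```

So a request for 17 bytes moves the stack pointer by 32, and even a zero-byte request reserves 16.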


(I really hope I didn't mess something up)

April 22, 2018

Re: dynamically allocating on the stack

Posted by Giles Bathgate
in reply to Mike Franklin

On Sunday, 22 April 2018 at 05:29:30 UTC, Mike Franklin wrote:
> I'm not convinced.  I did some no-runtime testing and eventually found the implementation in druntime here

It seems to me that it's an odd thing to have what looks like a function call be an intrinsic part of the language. (I realise it's probably needed to be compatible with C programs.)

I appreciate that allocating arbitrarily large arrays on the stack is probably not a good idea. (Can the availability of sufficient stack space be calculated before making the call?)

That said, I still think that `scope` should just mean that the lifetime of the variable is limited to the function and that it cannot be returned as a dangling reference.

So, like I said before, it would be nice if you could have a `push` keyword to explicitly state that you want to allocate on the stack. Presumably, that is not a simple change (and is it a commonly needed use case anyway?).

--------

scope c = new C();       // allocate class c, lifetime limited to local scope
scope a = new char[len]; // allocate array a, lifetime limited to local scope

scope c = push C();       // allocate class c on the stack
scope a = push char[len]; // allocate array a on the stack

auto c = push C();       // Error: can't assign a stack allocation to a non-scope variable
auto a = push char[len]; // Error: ditto

--------

April 22, 2018

Re: dynamically allocating on the stack

Posted by Steven Schveighoffer
in reply to Cym13

On 4/22/18 3:17 AM, Cym13 wrote:
> On Sunday, 22 April 2018 at 05:29:30 UTC, Mike Franklin wrote:
>> On Sunday, 22 April 2018 at 00:41:34 UTC, Nicholas Wilson wrote:
>>
>>> You're not using the C library version of it, the compiler does the stack space reservation inline for you. There is no way around this.
>>
>> I'm not convinced.  I did some no-runtime testing and eventually found the implementation in druntime here: https://github.com/dlang/druntime/blob/master/src/rt/alloca.d
>>
>> Mike
> 
> So as you can see, alloca isn't really a function in that it doesn't create a
> stack frame. It also needs help from the compiler to set up its arguments,
> since the current allocation size is needed (rcx at the beginning of alloca),
> which isn't a parameter known by the programmer. The compiler has to detect
> that __alloca call and set up an additional argument by itself. Alloca then
> just ("just") modifies the calling frame.
>
> (I really hope I didn't mess something up)

Thanks, I didn't realize there was an implementation outside the compiler. I had thought the compiler did all this for you.

I also didn't realize there was an actual function (stack frame or no stack frame, you are calling and returning).

Literally, I thought alloca just bumped the stack pointer and loaded the result into your target. Seems really complex for what it's doing, but maybe that's because it's a function call that's not really normal.

-Steve

Copyright © 1999-2021 by the D Language Foundation