Cleaned up C++ (page 4)

On 4/24/2015 12:23 AM, John Colvin wrote:
> Except of course that alloca is a lot cheaper than malloc/free.

That's not necessarily true. But in any case, go ahead and use it if you like. Just prepare to benchmark and be disappointed :-)

On Friday, 24 April 2015 at 08:16:40 UTC, Walter Bright wrote:
> On 4/24/2015 12:23 AM, John Colvin wrote:
>> Except of course that alloca is a lot cheaper than malloc/free.
>
> That's not necessarily true. But in any case, go ahead and use it if you like. Just prepare to benchmark and be disappointed :-)

Do you have a guess as to why, and when, it might fail to be faster than malloc?
I have some difficulty imagining a reason (yet I have sometimes found malloc faster than aligned_malloc, which is another odd thing).

April 24, 2015

Re: Cleaned up C++

Posted by John Colvin
in reply to ponce

On Friday, 24 April 2015 at 12:34:19 UTC, ponce wrote:
> On Friday, 24 April 2015 at 08:16:40 UTC, Walter Bright wrote:
>> On 4/24/2015 12:23 AM, John Colvin wrote:
>>> Except of course that alloca is a lot cheaper than malloc/free.
>>
>> That's not necessarily true. But in any case, go ahead and use it if you like. Just prepare to benchmark and be disappointed :-)
>
> Do you have a guess as to why, and when, it might fail to be faster than malloc?
> I have some difficulty imagining a reason (yet I have sometimes found malloc faster than aligned_malloc, which is another odd thing).

one reason why it might be faster is that e.g. gcc can produce code like this:

#include<alloca.h>

void bar(char* a);

void foo(unsigned int n)
{
  char *a = (char*)alloca(n);
  bar(a);
}

foo:
	movl	%edi, %eax
	pushq	%rbp
	addq	$46, %rax
	movq	%rsp, %rbp
	shrq	$4, %rax
	salq	$4, %rax
	subq	%rax, %rsp
	leaq	31(%rsp), %rdi
	andq	$-32, %rdi
	call	bar
	leave
	ret

which is neat. Now of course a bump-the-pointer malloc/free implementation could perhaps be (in theory) optimised to be as small as this, but is that ever actually the case?

On Friday, 24 April 2015 at 08:16:40 UTC, Walter Bright wrote:
> On 4/24/2015 12:23 AM, John Colvin wrote:
>> Except of course that alloca is a lot cheaper than malloc/free.
>
> That's not necessarily true. But in any case, go ahead and use it if you like. Just prepare to benchmark and be disappointed :-)

It is, unless one goes the bump-the-pointer/never-free road.
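
For readers unfamiliar with the term, here is a minimal C sketch of what a bump-the-pointer/never-free allocator might look like. The `Arena` type and the `arena_init`/`arena_alloc` names are hypothetical and not taken from any post in this thread; it only illustrates why an allocation on this path can be about as cheap as adjusting a stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size arena: allocations only move `used` forward,
   and nothing is ever freed individually. */
typedef struct {
    unsigned char *base;   /* start of the backing memory */
    size_t capacity;       /* total bytes available */
    size_t used;           /* bytes handed out so far, including padding */
} Arena;

void arena_init(Arena *a, void *backing, size_t capacity)
{
    a->base = backing;
    a->capacity = capacity;
    a->used = 0;
}

/* Returns `size` bytes aligned to `align` (a power of two),
   or NULL when the arena is exhausted. */
void *arena_alloc(Arena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t cur = (uintptr_t)a->base + a->used;
    uintptr_t aligned = (cur + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t padding = (size_t)(aligned - cur);
    size_t remaining = a->capacity - a->used;

    /* Check padding and size separately so the sum cannot overflow. */
    if (padding > remaining || size > remaining - padding)
        return NULL;

    a->used += padding + size;   /* the "bump" */
    return (void *)aligned;
}
```

The fast path is an add, a mask and a compare, which is roughly the work in the alloca listing. The price is that memory is only reclaimed by discarding the whole arena.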

On 4/24/2015 5:59 AM, John Colvin wrote:
> one reason why it might be faster is that e.g. gcc can produce code like this:
>
> #include<alloca.h>
>
> void bar(char* a);
>
> void foo(unsigned int n)
> {
>   char *a = (char*)alloca(n);
>   bar(a);
> }
>
> foo:
> 	movl	%edi, %eax
> 	pushq	%rbp
> 	addq	$46, %rax
> 	movq	%rsp, %rbp
> 	shrq	$4, %rax
> 	salq	$4, %rax
> 	subq	%rax, %rsp
> 	leaq	31(%rsp), %rdi
> 	andq	$-32, %rdi
> 	call	bar
> 	leave
> 	ret
>
> which is neat.

It's a cowboy implementation that's fine until someone tries a largish value of n.

On 4/24/2015 10:27 AM, deadalnix wrote:
> On Friday, 24 April 2015 at 08:16:40 UTC, Walter Bright wrote:
>> On 4/24/2015 12:23 AM, John Colvin wrote:
>>> Except of course that alloca is a lot cheaper than malloc/free.
>>
>> That's not necessarily true. But in any case, go ahead and use it if you like.
>> Just prepare to benchmark and be disappointed :-)
>
> It is, unless one goes the bump-the-pointer/never-free road.

I wouldn't assume that. A large array on the stack will have memory caching issues.

On 25 Apr 2015 01:25, "Walter Bright via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>
> On 4/24/2015 5:59 AM, John Colvin wrote:
>>
>> one reason why it might be faster is that e.g. gcc can produce code like this:
>>
>> #include<alloca.h>
>>
>> void bar(char* a);
>>
>> void foo(unsigned int n)
>> {
>>   char *a = (char*)alloca(n);
>>   bar(a);
>> }
>>
>> foo:
>> 	movl	%edi, %eax
>> 	pushq	%rbp
>> 	addq	$46, %rax
>> 	movq	%rsp, %rbp
>> 	shrq	$4, %rax
>> 	salq	$4, %rax
>> 	subq	%rax, %rsp
>> 	leaq	31(%rsp), %rdi
>> 	andq	$-32, %rdi
>> 	call	bar
>> 	leave
>> 	ret
>>
>> which is neat.
>
> It's a cowboy implementation that's fine until someone tries a largish value of n.

I wonder just how large... IIRC I think the limit on ubyte arrays is 1M?

On 4/24/2015 11:51 PM, Iain Buclaw via Digitalmars-d wrote:
> I wonder just how large... IIRC I think the limit on ubyte arrays is 1M?

A large enough value can not just cause stack overflow, but can cause the stack pointer to be anywhere in the address space. I don't know of a limit on ubyte arrays.
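
To make the "largish n" concern concrete, here is a minimal C sketch, assuming glibc's `<alloca.h>`, that caps the stack request and falls back to malloc above the cap. The names `STACK_ALLOC_LIMIT` and `sum_scratch` and the 4096-byte threshold are made up for illustration; nobody in the thread proposed this exact code or this limit.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap: requests above this go to the heap, so an
   unchecked n can never move the stack pointer by a huge amount. */
#define STACK_ALLOC_LIMIT 4096u

/* Fills an n-byte scratch buffer with `fill` and returns the sum of
   its bytes. Returns -1 if the heap fallback was needed and failed. */
long sum_scratch(size_t n, unsigned char fill)
{
    int on_heap = n > STACK_ALLOC_LIMIT;
    unsigned char *buf;

    /* alloca has to be called in this frame: its memory is released
       automatically when this function returns. */
    if (on_heap)
        buf = malloc(n);
    else
        buf = alloca(n ? n : 1);

    if (buf == NULL)
        return -1;

    memset(buf, fill, n);

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];

    if (on_heap)
        free(buf);   /* only the heap path needs an explicit free */
    return total;
}
```

The guard adds a branch, and the fallback path costs at least as much as a plain malloc/free pair, so whether this ends up faster than always using the heap is exactly the kind of thing that needs benchmarking.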

On 4/22/15 1:36 PM, John Colvin wrote:
>
> Is it even possible to contrive a case where
> 1) The default initialisation stores are technically dead and
> 2) Modern compilers can't tell they are dead and elide them and
> 3) Doing the initialisation has a significant performance impact?
>
> The boring example is "extra code causes instruction cache misses".

I've seen statically-sized arrays causing problems. -- Andrei
