November 15, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to deadalnix | On Tuesday, 15 November 2016 at 22:50:49 UTC, deadalnix wrote:
> On Tuesday, 15 November 2016 at 01:35:42 UTC, Stefan Koch wrote:
>> However, there is a bug inside the code that does bounds-checking for array assignment.
>> In rare cases it can trigger an out-of-bounds error on newly created arrays.
>>
>
> This raises all kinds of red flags for me. What are the design decisions that led to this?
I am still figuring out when and why the bug occurs.
The bytecode for slice allocation is rather complex, because it has to deal with resizing slices as well.
I suspect that somewhere the heapPtr is not bumped or the length is not set correctly.
|
November 16, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Tuesday, 15 November 2016 at 23:46:51 UTC, Stefan Koch wrote:
> On Tuesday, 15 November 2016 at 22:50:49 UTC, deadalnix wrote:
>> On Tuesday, 15 November 2016 at 01:35:42 UTC, Stefan Koch wrote:
>>> However, there is a bug inside the code that does bounds-checking for array assignment.
>>> In rare cases it can trigger an out-of-bounds error on newly created arrays.
>>>
>>
>> This raises all kinds of red flags for me. What are the design decisions that led to this?
>
> I am still figuring out when and why the bug occurs.
> The bytecode for slice allocation is rather complex, because it has to deal with resizing slices as well.
> I suspect that somewhere the heapPtr is not bumped or the length is not set correctly.
Indeed, the length was not set on a code path meant for resizing.
The problem is fixed :)
The HeapLimit has been raised to a more reasonable limit of 2 ^^ 24 Addresses. (Which means that you'll have 2 ^^ 24 Bytes in practice.)
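To illustrate the kind of bug described here, the following is a minimal sketch of a bump-allocated heap with a slice resize path. It is not the actual newCTFE code: apart from heapPtr and the heap limit mentioned in the posts, every name (SliceDescriptor, BumpHeap, allocate, resize) and the reserved low addresses are assumptions made for the example.

```d
// Hypothetical model of a bump heap; not newCTFE's real data structures.
struct SliceDescriptor
{
    uint base;   // heap address of the first element
    uint length; // number of elements
}

struct BumpHeap
{
    enum uint heapLimit = 1u << 24; // 2 ^^ 24 addresses, as in the post
    uint[] memory;                  // backing store, grown on demand
    uint heapPtr = 4;               // next free address; low addresses reserved (assumption)

    // Reserve `size` addresses and bump the pointer past them.
    uint allocate(uint size)
    {
        assert(size <= heapLimit - heapPtr, "heap limit exceeded");
        immutable base = heapPtr;
        heapPtr += size;
        if (memory.length < heapPtr)
            memory.length = heapPtr;
        return base;
    }

    // Grow (or shrink) a slice. Growing allocates a fresh region
    // and copies the old elements over.
    SliceDescriptor resize(SliceDescriptor old, uint newLength)
    {
        SliceDescriptor result = old;
        if (newLength > old.length)
        {
            result.base = allocate(newLength);
            memory[result.base .. result.base + old.length] =
                memory[old.base .. old.base + old.length];
        }
        // Forgetting this store on the resize path leaves a stale length,
        // so later bounds checks reject valid indices.
        result.length = newLength;
        return result;
    }
}

void main()
{
    BumpHeap heap;
    auto arr = heap.resize(SliceDescriptor.init, 4);
    foreach (i; 0 .. arr.length)
        heap.memory[arr.base + i] = i + 3;

    arr = heap.resize(arr, 8);
    assert(arr.length == 8);
    assert(heap.memory[arr.base .. arr.base + 4] == [3u, 4, 5, 6]);
    assert(heap.heapPtr == 4 + 4 + 8);
}
```

If the store to result.length is left out, the slice returned by resize keeps its old length, and a bounds check against it would flag an out-of-bounds access on a freshly grown array, which matches the symptom reported earlier in the thread.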
|
November 16, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Wednesday, 16 November 2016 at 09:22:01 UTC, Stefan Koch wrote:
> On Tuesday, 15 November 2016 at 23:46:51 UTC, Stefan Koch wrote:
>
>> I suspect that somewhere the heapPtr is not bumped or the length is not set correctly.
>
> Indeed, the length was not set on a code path meant for resizing.
> The problem is fixed :)
>
> The HeapLimit has been raised to a more reasonable limit of 2 ^^ 24 Addresses. (Which means that you'll have 2 ^^ 24 Bytes in practice.)
In fact, there is a single design decision that fosters these kinds of problems, and that is to go with a low-level IR.
Although it is a tough route, it is also the only solution (the only one I could think of) that will really enable CTFE to scale gracefully.
My latest measurements show a 2x speedup with the interpreter backend, even for relatively small arrays (2 ^^ 15 bytes).
As soon as my own JIT backend is in place, the performance will improve by another factor of 6, making newCTFE 12x faster than the current engine, even on small one-shot functions!
|
November 16, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | Here is a small demonstration of the performance increase:

[root@localhost dmd]# time src/dmd -c testSettingArrayLength.d > x 2> x

real 0m0.199s
user 0m0.180s
sys 0m0.017s

[root@localhost dmd]# time src/dmd -c testSettingArrayLength.d -bc-ctfe > x 2> x

real 0m0.072s
user 0m0.050s
sys 0m0.020s

Please note that newCTFE only spends 15 ms inside the evaluation; most of the time is spent clearing the 2^^24 bytes of heap memory to zero.

The source code of testSettingArrayLength is:

uint[] MakeAndInitArr(uint length)
{
    uint[] arr;
    arr.length = length;

    foreach(i; 0 .. length)
    {
        arr[i] = i + 3;
    }
    return arr;
}

static assert(MakeAndInitArr(ushort.max).length == ushort.max);
|
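For readers who want to try the test case themselves, here is a minimal sketch, assuming a stock D compiler. It shows that the same function can be run both at compile time (forced by `enum` and `static assert`) and at run time. The lowercase name makeAndInitArr and the small length of 8 are choices made for this example only; the -bc-ctfe switch belongs to the newCTFE branch and is not needed here.

```d
import std.stdio : writeln;

// Same shape as the test function above, renamed for the example.
uint[] makeAndInitArr(uint length)
{
    uint[] arr;
    arr.length = length;
    foreach (i; 0 .. length)
        arr[i] = i + 3;
    return arr;
}

// An enum initializer must be a compile-time constant, so this call goes through CTFE.
enum ctfeResult = makeAndInitArr(8);
static assert(ctfeResult.length == 8);
static assert(ctfeResult[0] == 3 && ctfeResult[7] == 10);

void main()
{
    // The same function, evaluated at run time.
    auto runtimeResult = makeAndInitArr(8);
    assert(runtimeResult == ctfeResult);
    writeln(runtimeResult); // prints [3, 4, 5, 6, 7, 8, 9, 10]
}
```

Since the static asserts are checked during compilation, a failure in the compile-time evaluator shows up as a compile error rather than a run-time assertion.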
November 16, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Wednesday, 16 November 2016 at 09:45:24 UTC, Stefan Koch wrote:
> Here is a small demonstration of the performance increase:
>
> [root@localhost dmd]# time src/dmd -c testSettingArrayLength.d > x 2> x
>
> real 0m0.199s
> user 0m0.180s
> sys 0m0.017s
> [root@localhost dmd]# time src/dmd -c testSettingArrayLength.d -bc-ctfe > x 2> x
>
> real 0m0.072s
> user 0m0.050s
> sys 0m0.020s
>
> Please note that newCTFE only spends 15 ms inside the evaluation, most time is spent clearing the 2^^24 Bytes of heap-memory to zero.
>
> The source code of testSettingArrayLength is
> uint[] MakeAndInitArr(uint length)
> {
>     uint[] arr;
>     arr.length = length;
>
>     foreach(i; 0 .. length)
>     {
>         arr[i] = i + 3;
>     }
>     return arr;
> }
>
> static assert(MakeAndInitArr(ushort.max).length == ushort.max);
A more accurate breakdown :
Initializing Heap: 18.6 ms
Generating Bytecode: 1.2 ms
Executing Bytecode: 13.2 ms
Converting to CTFE-EXp: 9.1 ms
For a second execution of the same function with the same arguments within the same file, the numbers look like this:
Initializing Heap: 16.7 ms
Generating Bytecode: 0.6 ms
Executing Bytecode: 13.2 ms
Converting to CTFE-EXp: 9.3 ms
|
November 16, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Wednesday, 16 November 2016 at 10:07:06 UTC, Stefan Koch wrote:
>
> A more accurate breakdown :
>
> Initializing Heap: 18.6 ms
> Generating Bytecode: 1.2 ms
> Executing Bytecode: 13.2 ms
> Converting to CTFE-EXp: 9.1 ms
>
> For a second execution of the same function with the same arguments within the same file the numbers look like :
>
> Initializing Heap: 16.7 ms
> Generating Bytecode: 0.6 ms
> Executing Bytecode: 13.2 ms
> Converting to CTFE-EXp: 9.3 ms
The above numbers were obtained using a debug build made with dmd.
The following numbers are from an optimized build made with ldmd2.
First Execution (cold cache) :
Initializing Heap: 17.4 ms
Generating Bytecode: 0.7 ms
Executing Bytecode: 5.3 ms
Converting to CTFE-EXp: 5.1 ms
Second run (warmer cache) :
Initializing Heap: 16.9 ms
Generating Bytecode: 0.3 ms
Executing Bytecode: 5.3 ms
Converting to CTFE-EXp: 4.9 ms
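To put these numbers in proportion, here is a small sketch that sums the phases of the first optimized run and prints each phase's share of the total. The Phase struct and the total function are names made up for this example; the timings are the ones listed above.

```d
import std.stdio : writefln;

// Hypothetical helper type for the example.
struct Phase
{
    string name;
    double ms;
}

// Sum the duration of all phases in milliseconds.
double total(const Phase[] phases)
{
    double sum = 0;
    foreach (p; phases)
        sum += p.ms;
    return sum;
}

void main()
{
    // First execution, optimized build (numbers from the post above).
    const phases = [
        Phase("Initializing Heap", 17.4),
        Phase("Generating Bytecode", 0.7),
        Phase("Executing Bytecode", 5.3),
        Phase("Converting to CTFE-EXp", 5.1),
    ];

    immutable sum = total(phases);
    foreach (p; phases)
        writefln("%-24s %5.1f ms  %5.1f%%", p.name, p.ms, 100 * p.ms / sum);
    writefln("%-24s %5.1f ms", "Total", sum);
}
```

The phases add up to roughly 28.5 ms, of which heap initialization accounts for about 61%, so in the optimized build clearing the heap costs more than bytecode generation, execution and result conversion combined.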
|
November 16, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Wednesday, 16 November 2016 at 10:25:30 UTC, Stefan Koch wrote:
> First Execution (cold cache) :
>
> Initializing Heap: 17.4 ms
> Generating Bytecode: 0.7 ms
> Executing Bytecode: 5.3 ms
> Converting to CTFE-EXp: 5.1 ms
>
> Second run (warmer cache) :
>
> Initializing Heap: 16.9 ms
> Generating Bytecode: 0.3 ms
> Executing Bytecode: 5.3 ms
> Converting to CTFE-EXp: 4.9 ms
And again, a bit of bad news.
Due to problems in the lowering of function arguments, the implementation of strcat is delayed again.
|
November 17, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Wednesday, 16 November 2016 at 14:44:06 UTC, Stefan Koch wrote:
>
> And again, a bit of bad news.
> Due to problems in the lowering of function arguments, the implementation of strcat is delayed again.
The bug does not affect strings, since strings are not built up out of multiple sub-expressions.
strcat is on its way.
I have begun the process of macrofication.
|
November 17, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stefan Koch | On Thursday, 17 November 2016 at 05:35:33 UTC, Stefan Koch wrote:
> On Wednesday, 16 November 2016 at 14:44:06 UTC, Stefan Koch wrote:
>>
>> And again, a bit of bad news.
>> Due to problems in the lowering of function arguments, the implementation of strcat is delayed again.
>
> The bug does not affect strings, since strings are not built up out of multiple sub-expressions.
> strcat is on its way.
>
> I have begun the process of macrofication.
I follow this thread every day. I hope you'll write an article on the dlang blog when the work is completed :)
|
November 17, 2016 Re: CTFE Status | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrea Fontana | On Thursday, 17 November 2016 at 08:39:57 UTC, Andrea Fontana wrote:
>
> I follow this thread every day. I hope you'll write an article on the dlang blog when the work is completed :)
Mike Parker is going to write a short article about it, based on information I gave him via email.
I am afraid my blog-writing skills are a bit under-developed.
On the topic of CTFE :
My attempts at semi-automatically generating a string-concat macro have yet to succeed.
I am currently busy fixing bugs :)
I apologize for the seemingly slow progress.
However, such is the nature of low-level code.
There is really no way around it if we are aiming for performance.
|