[Issue 6498] [CTFE] copy-on-write is slow and causes huge memory usage
June 28, 2014
https://issues.dlang.org/show_bug.cgi?id=6498

Per Nordlöw <per.nordlow@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |per.nordlow@gmail.com

--- Comment #3 from Per Nordlöw <per.nordlow@gmail.com> ---
Don: Is there a GitHub PR or branch for your changes, or are these things normally kept secret because this issue has a bounty?

--
June 28, 2014
https://issues.dlang.org/show_bug.cgi?id=6498

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibuclaw@gdcproject.org

--- Comment #4 from Iain Buclaw <ibuclaw@gdcproject.org> ---
FYI, all PRs have been merged in.

I won't bother listing them all (a lot was done over 2012/2013). There has been no work on this since June 2013 IIRC.

https://github.com/D-Programming-Language/dmd/pull/1778#issuecomment-19964496


What should be focused on (thanks to Walter's idea of allocating but not freeing memory) is limiting just how much memory is allocated from CTFE, possibly by finding ways to reuse rather than reallocate memory, or maybe by giving CTFE its own allocator (it is a backend in its own right, after all).
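
To illustrate the "own allocator" idea, here is a minimal sketch of a region (bump) allocator, written in Python for brevity. It is not DMD's code: the names `Region`, `mark`, `release`, and `interpret_call` are hypothetical, and a real interpreter would hand out raw memory rather than offsets. The point is only that everything allocated inside one interpreter scope can be dropped at once, so the same memory is reused by the next call instead of being allocated again.

```python
class Region:
    """Bump allocator over one preallocated buffer (illustrative only)."""

    def __init__(self, capacity):
        self.buffer = bytearray(capacity)
        self.top = 0  # offset of the next free byte

    def alloc(self, size):
        """Reserve `size` bytes and return the offset where they start."""
        if self.top + size > len(self.buffer):
            raise MemoryError("region exhausted")
        start = self.top
        self.top += size
        return start

    def mark(self):
        """Remember the current allocation point."""
        return self.top

    def release(self, mark):
        """Drop everything allocated since `mark` in one step."""
        self.top = mark


def interpret_call(region, temp_sizes):
    """Stand-in for one interpreter scope: allocate temporaries, then drop them."""
    saved = region.mark()
    for size in temp_sizes:
        region.alloc(size)
    peak = region.top          # high-water mark reached inside this scope
    region.release(saved)      # the temporaries' memory is reusable again
    return peak
```

With this shape, two consecutive calls that each need the whole buffer both succeed, because the second call reuses the space the first one released.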

--
March 17, 2015
https://issues.dlang.org/show_bug.cgi?id=6498

NCrashed@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |NCrashed@gmail.com

--
June 09, 2015
https://issues.dlang.org/show_bug.cgi?id=6498

Andrei Alexandrescu <andrei@erdani.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|D1 & D2                     |D2

--
June 09, 2022
https://issues.dlang.org/show_bug.cgi?id=6498

RazvanN <razvan.nitu1305@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |razvan.nitu1305@gmail.com

--- Comment #5 from RazvanN <razvan.nitu1305@gmail.com> ---
This seems to have been fixed. On my machine it takes 5 seconds to run this and it appears to use 2-3% of my 16 GB RAM. Should we close this?

--
June 10, 2022
https://issues.dlang.org/show_bug.cgi?id=6498

mhh <maxhaton@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maxhaton@gmail.com

--- Comment #6 from mhh <maxhaton@gmail.com> ---
The memory usage has improved a lot but this is still ridiculously slow.

Compare with the soon-to-be-upstreamed -preview=newCTFE: https://asciinema.org/a/zTHuVmXbsZ4ryWGfCd2bXoJG5 (roughly 10x faster)

SDC does this in about 0.04 sec on my machine, so 50x to 80x faster.

--
June 10, 2022
https://issues.dlang.org/show_bug.cgi?id=6498

--- Comment #7 from Iain Buclaw <ibuclaw@gdcproject.org> ---
Metrics for the code in this report, run with v2.080:
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 6.44
System time (seconds): 0.29
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.75
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1104116
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 274715
Voluntary context switches: 1
Involuntary context switches: 256
Swaps: 0
File system inputs: 246
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---

As of v2.085.0 - when most of dinterpret had been converted over to returning UnionExp on the stack.
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 6.64
System time (seconds): 0.19
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.84
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 636044
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 157878
Voluntary context switches: 1
Involuntary context switches: 231
Swaps: 0
File system inputs: 386
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---

As of v2.089.0 - when a ctfeRegion allocator was introduced to free memory after exiting an interpret "scope".
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 6.88
System time (seconds): 0.14
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.03
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 637204
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 158019
Voluntary context switches: 1
Involuntary context switches: 17
Swaps: 0
File system inputs: 474
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---

As of v2.100.0
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c"
User time (seconds): 7.13
System time (seconds): 0.07
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.22
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 482504
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 119238
Voluntary context switches: 1
Involuntary context switches: 223
Swaps: 0
File system inputs: 833
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---


With -lowmem.
---
Command being timed: "./generated/linux/release/64/dmd issue6498.d -c -lowmem"
User time (seconds): 7.64
System time (seconds): 0.05
Percent of CPU this job got: 103%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.42
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 28760
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 5679
Voluntary context switches: 2376
Involuntary context switches: 774
Swaps: 0
File system inputs: 833
File system outputs: 6
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
---

--
June 10, 2022
https://issues.dlang.org/show_bug.cgi?id=6498

--- Comment #8 from Iain Buclaw <ibuclaw@gdcproject.org> ---
(In reply to Iain Buclaw from comment #7)
> v2.080:
> Maximum resident set size (kbytes): 1104116
> v2.085.0:
> Maximum resident set size (kbytes): 636044
> v2.089.0:
> Maximum resident set size (kbytes): 637204
> v2.100.0:
> Maximum resident set size (kbytes): 482504
> -lowmem (as of v2.090):
> Maximum resident set size (kbytes): 28760
It's still nearly 500MB, so only 2x better than where we were 4 years ago, and still a far cry from the possible 30MB we could instead be managing with.

I also note that the compiler has slowed down by 1 second since v2.080 as well, so CTFE is not getting faster at all...
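
For reference, the ratios behind these remarks can be recomputed from the numbers in comment #7. The snippet below is only a convenience sketch; the dictionary names are made up here, and the values are copied from the timing output above. It gives roughly a 2.3x reduction in peak memory from v2.080 to v2.100.0, a further gap of roughly 17x down to the -lowmem figure, and a wall-clock difference of about half a second between v2.080 and v2.100.0 (about 0.7 seconds in user time).

```python
# Maximum resident set size in kbytes, copied from comment #7.
max_rss_kb = {
    "v2.080": 1104116,
    "v2.085.0": 636044,
    "v2.089.0": 637204,
    "v2.100.0": 482504,
    "-lowmem": 28760,
}

# Elapsed (wall clock) and user time in seconds, copied from comment #7.
elapsed_s = {"v2.080": 6.75, "v2.100.0": 7.22}
user_s = {"v2.080": 6.44, "v2.100.0": 7.13}

# How much smaller the peak is in v2.100.0 than in v2.080.
rss_improvement = max_rss_kb["v2.080"] / max_rss_kb["v2.100.0"]

# How far v2.100.0 without -lowmem still is from the -lowmem figure.
lowmem_gap = max_rss_kb["v2.100.0"] / max_rss_kb["-lowmem"]

# Slowdown between v2.080 and v2.100.0.
wall_slowdown_s = elapsed_s["v2.100.0"] - elapsed_s["v2.080"]
user_slowdown_s = user_s["v2.100.0"] - user_s["v2.080"]

print(f"peak memory, v2.080 vs v2.100.0: {rss_improvement:.2f}x")
print(f"peak memory, v2.100.0 vs -lowmem: {lowmem_gap:.1f}x")
print(f"wall-clock slowdown: {wall_slowdown_s:.2f} s")
print(f"user-time slowdown: {user_slowdown_s:.2f} s")
```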

--