Thread overview
Refactoring a "simple" function in dmd source
Dec 31, 2023
Walter Bright
Dec 31, 2023
Johan
Dec 31, 2023
Siarhei Siamashka
Dec 31, 2023
Alexandru Ermicioi
Dec 31, 2023
Walter Bright
Dec 31, 2023
Hipreme
Dec 31, 2023
Walter Bright
December 31, 2023
1. return an InputRange rather than an array of lines
2. no longer do any memory allocation (yay!)
3. no more leaking memory
4. stops processing the source text when the desired line is found
5. support Unicode line endings
6. uses Voldemort return value
7. is pure, @safe, @nogc and all that good stuff
8. much faster
9. completely self-contained

https://github.com/dlang/dmd/pull/15972

and that doesn't even include an earlier refactor:

https://github.com/dlang/dmd/pull/15969

I eyeballed std.string.splitLines() when doing this refactor, and noticed that it also allocated memory unnecessarily.

It's more lines of code, but the lines are simpler, and there's not a clever thing in it!
December 31, 2023
I've been watching the DFeed ticker on OFTC, very neat!
December 31, 2023
On Sunday, 31 December 2023 at 08:25:33 UTC, Walter Bright wrote:
> 1. return an InputRange rather than an array of lines
> 2. no longer do any memory allocation (yay!)
> 3. no more leaking memory
> 4. stops processing the source text when the desired line is found
> 5. support Unicode line endings
> 6. uses Voldemort return value
> 7. is pure, @safe, @nogc and all that good stuff
> 8. much faster
> 9. completely self-contained
>
> https://github.com/dlang/dmd/pull/15972
>
> and that doesn't even include an earlier refactor:
>
> https://github.com/dlang/dmd/pull/15969
>
> I eyeballed std.string.splitLines() when doing this refactor, and noticed that it also allocated memory unnecessarily.
>
> It's more lines of code, but the lines are simpler, and there's not a clever thing in it!

Nice. Even better if you'd have used std.string.splitLines !
That would have solved your point 9; self-contained == not maintained (as proven by the long standing Unicode line ending bug).

-Johan

December 31, 2023

On Sunday, 31 December 2023 at 11:07:53 UTC, Johan wrote:

>

On Sunday, 31 December 2023 at 08:25:33 UTC, Walter Bright wrote:

>
  1. return an InputRange rather than an array of lines
  2. no longer do any memory allocation (yay!)
  3. no more leaking memory
  4. stops processing the source text when the desired line is found
  5. support Unicode line endings
  6. uses Voldemort return value
  7. is pure, @safe, @nogc and all that good stuff
  8. much faster
  9. completely self-contained

https://github.com/dlang/dmd/pull/15972

and that doesn't even include an earlier refactor:

https://github.com/dlang/dmd/pull/15969

I eyeballed std.string.splitLines() when doing this refactor, and noticed that it also allocated memory unnecessarily.

It's more lines of code, but the lines are simpler, and there's not a clever thing in it!

Nice. Even better if you'd have used std.string.splitLines !

Or more like https://dlang.org/library/std/string/line_splitter.html for having no allocations and @nogc compatibility.

The listed benefits are nice and very much welcome, but it's a copy-paste job from Phobos into the DMD codebase. Raising a question again about the reasons why DMD itself isn't using Phobos like any normal applications do.

Also whenever game developers are complaining in this forum about poor @nogc support, they are being told that the GC is totally fine and they are just unreasonably biased against it. All of this while the DMD compiler itself is doing fancy hackish stunts with memory allocation. Games are latency sensitive. Compilers are not.

>

That would have solved your point 9; self-contained == not maintained (as proven by the long standing Unicode line ending bug).

Regarding copy-pasting code into multiple places and maintaining it separately, LDC developers fixed https://issues.dlang.org/show_bug.cgi?id=24290 for themselves, but haven't contributed the fix back to DMD.

December 31, 2023

On Sunday, 31 December 2023 at 13:02:27 UTC, Siarhei Siamashka wrote:

>

The listed benefits are nice and very much welcome, but it's a copy-paste job from Phobos into the DMD codebase. Raising a question again about the reasons why DMD itself isn't using Phobos like any normal applications do.

Chicken and egg problem, should've been discussed before here a lot.

>

Also whenever game developers are complaining in this forum about poor @nogc support, they are being told that the GC is totally fine and they are just unreasonably biased against it. All of this while the DMD compiler itself is doing fancy hackish stunts with memory allocation. Games are latency sensitive. Compilers are not.

Imho, this PR is nowhere imbued with @nogc fanciness here, just the property of D arrays that, now are properly taken advantage of, if you're talking about PR and not how compiler itself is working.

December 31, 2023
On 12/31/2023 3:07 AM, Johan wrote:
> Nice. Even better if you'd have used std.string.splitLines !
> That would have solved your point 9; self-contained == not maintained (as proven by the long standing Unicode line ending bug).

std.string.splitLines() unnecessarily allocates memory, to fix that requires changing its interface. Therefore, an improved version would be a candidate for Phobos 3.

I suspect there's a fair amount of memory allocation in Phobos that wouldn't be needed if we changed our point of view.

December 31, 2023
On 12/31/2023 5:02 AM, Siarhei Siamashka wrote:
> Or more like https://dlang.org/library/std/string/line_splitter.html for having no allocations and `@nogc` compatibility.

I missed the existence of that. Oops!

> The listed benefits are nice and very much welcome, but it's a copy-paste job from Phobos into the DMD codebase. Raising a question again about the reasons why DMD itself isn't using Phobos like any normal applications do.

It's good to question decisions now and then, but we just had a conversation about it.

> Also whenever game developers are complaining in this forum about poor `@nogc` support, they are being told that the GC is totally fine and they are just unreasonably biased against it. All of this while the DMD compiler itself is doing fancy hackish stunts with memory allocation. Games are latency sensitive. Compilers are not.

GC is just another tool. There are places where it is appropriate, and places where it is not. Because some functions in Phobos use GC does not mean anyone using Phobos must use GC, not at all.

Problems also exist if Phobos was to use malloc(). In particular, who owns the memory thus allocated?

Finding a method that avoids allocation entirely resolves this conundrum rather neatly.
December 31, 2023
On Sunday, 31 December 2023 at 19:09:15 UTC, Walter Bright wrote:
> On 12/31/2023 5:02 AM, Siarhei Siamashka wrote:
>> Or more like https://dlang.org/library/std/string/line_splitter.html for having no allocations and `@nogc` compatibility.
>
> I missed the existence of that. Oops!
>
>> The listed benefits are nice and very much welcome, but it's a copy-paste job from Phobos into the DMD codebase. Raising a question again about the reasons why DMD itself isn't using Phobos like any normal applications do.
>
> It's good to question decisions now and then, but we just had a conversation about it.
>
>> Also whenever game developers are complaining in this forum about poor `@nogc` support, they are being told that the GC is totally fine and they are just unreasonably biased against it. All of this while the DMD compiler itself is doing fancy hackish stunts with memory allocation. Games are latency sensitive. Compilers are not.
>
> GC is just another tool. There are places where it is appropriate, and places where it is not. Because some functions in Phobos use GC does not mean anyone using Phobos must use GC, not at all.
>
> Problems also exist if Phobos was to use malloc(). In particular, who owns the memory thus allocated?
>
> Finding a method that avoids allocation entirely resolves this conundrum rather neatly.


Not on a completely unrelated note. But I wanted to say that in newer versions we could have memory reserve/preAllocate inside range interfaces. I've been using that on output ranges since it could decrease the number of allocations required on output ranges, thus increasing performance.

For example: when converting a integer to a string, I calculate first how many digits the string will need, after that, I preallocate the digits count and only after that I start outputting it.