June 01, 2019
On Saturday, 1 June 2019 at 14:18:25 UTC, Stefanos Baziotis wrote:
> Moreover, let me stress that malloc(), free().. will be available as well.

Do you mean you're planning to allow the stdlib's allocation backend to be switched completely to libc-style malloc() and free(), or just that developers can always import core.stdc.stdlib and call malloc() if they like?  (The second option won't be enough.)

One option is to design D's allocation so that users can link with wrapped versions of tcmalloc, etc.  However, it's important that this be designed properly so that it doesn't require a custom compiler toolchain, otherwise it'll just be a theoretical thing that no one actually does.  Preferably it would work with LD_PRELOAD.

I like the idea of moving beyond libc's API, but please consider and test this use case.  A lot of smart people outside D are working on allocators, and it would be a major disadvantage if D can't use them.
June 01, 2019
On Saturday, 1 June 2019 at 22:45:40 UTC, sarn wrote:
>
> Do you mean you're planning to allow the stdlib's allocation backend to be switched completely to libc-style malloc() and free()

Currently, it is using malloc() and free(). Maybe you mean moving away from them?

> or just that developers can always import core.stdc.stdlib and call malloc() if they like?  (The second option won't be enough.)

They will be able to, because libc is not going anywhere. The purpose is to create
an allocator _for the D Runtime_. Of course this allocator will be available
for users as well; it's just that the focus will be on the runtime.
Our initial plan was to make a D version of malloc() and free().
But, as Mike first suggested, we have the chance to create a more D-style
allocator. And fortunately, the foundation has already been built
in std.experimental.allocator.
As a personal opinion, the interface of malloc() and free() is not ideal
for an allocator. From what I know, a lot of people working on allocators
seem to have the same opinion.
Just to disambiguate again, the purpose is that the D Runtime won't depend on libc.

> One option is to design D's allocation so that users can link with wrapped versions of tcmalloc, etc.  However, it's important that this be designed properly so that it doesn't require a custom compiler toolchain, otherwise it'll just be a theoretical thing that no one actually does.  Preferably it would work with LD_PRELOAD.
>

Well, the thing is that to wrap an allocator, you first have to either write
the allocator in D, or create a dependency on that allocator.
Our choice is roughly the first. Meaning, I won't port any existing allocator,
but the allocator I will write will of course be inspired by the work of others.
Now, the important thing here is that I only have so much time.
It's only a summer, which is not even completely devoted to the allocator (it's
about half the time). So, hopefully, either I or other people will continue
the work post-GSoC.

> I like the idea of moving beyond libc's API, but please consider and test this use case.  A lot of smart people outside D are working on allocators, and it would be a major disadvantage if D can't use them.

As I said, it will be able to use them. The purpose is not to replace them
in general, but specifically in the D Runtime.
Be sure to check the starting post in this thread again for why we're doing this,
and if there are any questions, please ask.

- Stefanos

June 02, 2019
On Saturday, 1 June 2019 at 02:40:10 UTC, sarn wrote:

> Here's something to consider if you're replacing malloc() et al: it's popular (especially with large server deployments) to tune application memory allocation performance by replacing libc malloc() with alternatives such as tcmalloc and jemalloc.  That works because they use the same libc malloc() API but with a different implementation, injected at link or load time (using LD_PRELOAD or something).
>
> It would be great if D code can still take advantage of alternative allocators developed by third-parties who may or may not be writing for D.

std.experimental.allocator (https://dlang.org/phobos/std_experimental_allocator.html) supports an `IAllocator` interface (https://dlang.org/phobos/std_experimental_allocator.html#IAllocator).

The way I envision this playing out is that when std.experimental.allocator is ported to druntime, callers would use the `IAllocator` interface.  Therefore, any allocator conforming to that interface could potentially serve as druntime's allocator.  In order to swap the allocator, one would only have to implement the `IAllocator` interface, potentially even using the `Mallocator` (https://dlang.org/phobos/std_experimental_allocator_mallocator.html), and make the swap.
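As a rough illustration of that kind of swap with today's Phobos (not druntime, where none of this exists yet), the sketch below installs a `Mallocator`-backed object as the thread-local `theAllocator` and allocates through it. It assumes a recent Phobos in which `allocatorObject` returns a reference-counted wrapper around the interface; how druntime would expose the equivalent hook is an open question.

```d
import std.experimental.allocator : allocatorObject, dispose, makeArray, theAllocator;
import std.experimental.allocator.mallocator : Mallocator;

void main()
{
    // Remember the current thread-local allocator and restore it on exit.
    auto previous = theAllocator;
    scope (exit) theAllocator = previous;

    // Wrap the libc-backed Mallocator behind the dynamic allocator interface.
    theAllocator = allocatorObject(Mallocator.instance);

    // Callers only ever see the interface, not the concrete allocator.
    int[] buf = theAllocator.makeArray!int(4, 7);
    scope (exit) theAllocator.dispose(buf);
    assert(buf == [7, 7, 7, 7]);
}
```

Any other type that satisfies the allocator primitives could be passed to `allocatorObject` in place of `Mallocator.instance`.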

Providing the machinery to make that convenient (compiler switches, runtime configuration, etc.) should probably not be in the scope of the GSoC project as it is already pressed for time, but that should only be a PR away for anyone who considers it a priority.

That being said, we recognize that change needs to happen gradually to not rock the boat.  Therefore, even when this project is complete, it should probably still default to libc, with a `-preview` switch or something like that to allow users to opt in to the D allocator.  Once there is sufficient experience in the real world with the D allocator, the defaults can potentially be swapped.

This GSoC project will attempt to remove libc as a hard, intrinsic dependency in druntime, and reduce it to a platform implementation detail.  In other words, druntime will not depend on libc, but a specific platform's port of druntime might.

Mike
June 02, 2019
On Friday, 31 May 2019 at 21:40:11 UTC, Stefanos Baziotis wrote:
> On Friday, 31 May 2019 at 21:01:01 UTC, Stefanos Baziotis wrote:
>>     Important clarifications:
>
> Forgot that it targets x86_64.

And which OS?

--
/Jacob Carlborg
June 02, 2019
On 6/1/2019 7:24 AM, Stefanos Baziotis wrote:
> Do we have any benchmarks?

No.
June 02, 2019
On Sunday, 2 June 2019 at 00:10:51 UTC, Mike Franklin wrote:
> On Saturday, 1 June 2019 at 02:40:10 UTC, sarn wrote:
>
>> Here's something to consider if you're replacing malloc() et al: it's popular (especially with large server deployments) to tune application memory allocation performance by replacing libc malloc() with alternatives such as tcmalloc and jemalloc.
>>  That works because they use the same libc malloc() API but with a different implementation, injected at link or load time (using LD_PRELOAD or something).
>>
>> It would be great if D code can still take advantage of alternative allocators developed by third-parties who may or may not be writing for D.
>
> std.experimental.allocator (https://dlang.org/phobos/std_experimental_allocator.html) supports an `IAllocator` interface (https://dlang.org/phobos/std_experimental_allocator.html#IAllocator).
>
> The way I envision this playing out is that when std.experimental.allocator is ported to druntime, callers would use the `IAllocator` interface.  Therefore, any allocator conforming to that interface could potentially serve as druntime's allocator.  In order to swap the allocator, one would only have to implement the `IAllocator` interface, potentially even using the `Mallocator` (https://dlang.org/phobos/std_experimental_allocator_mallocator.html), and make the swap.

Thanks, that makes sense.  It sounds like a version spec that switches to Mallocator (or whatever) could do it, as long as it doesn't force a recompilation of the whole runtime library.  (Even more convenient would be a runtime flag like --DRT-gcopt, but I'm guessing you'd want to make it happen at compile time.)
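A minimal sketch of what such a version switch might look like in user code, assuming a made-up version identifier `UseMallocator` and a made-up alias `RuntimeAllocator` (neither exists in druntime or Phobos). It does not address the recompilation concern; it only shows the compile-time selection itself:

```d
import std.experimental.allocator.gc_allocator : GCAllocator;
import std.experimental.allocator.mallocator : Mallocator;

// `UseMallocator` and `RuntimeAllocator` are hypothetical names for this sketch.
// Build with `dmd -version=UseMallocator app.d` to select the libc-backed allocator.
version (UseMallocator)
    alias RuntimeAllocator = Mallocator;
else
    alias RuntimeAllocator = GCAllocator;

void main()
{
    // The choice is resolved at compile time, so there is no interface indirection.
    void[] block = RuntimeAllocator.instance.allocate(64);
    assert(block.length == 64);
    RuntimeAllocator.instance.deallocate(block);
}
```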
June 02, 2019
On Sunday, 2 June 2019 at 00:10:51 UTC, Mike Franklin wrote:
> The way I envision this playing out is that when std.experimental.allocator is ported to druntime

You probably don't need or want to port the whole of std.experimental.allocator to druntime. I recently looked at the GC in druntime and it has its own pools etc. If it didn't, then the mark phase would be a lot harder and slower. (according to my understanding...)

Therefore, for normal D programs, the only thing that makes sense is to implement the allocator that underlies the GC (an mmap or sbrk allocator). And be sure to make it pluggable.

What I am trying to say is that you can avoid porting the whole thing.

> use the `IAllocator` interface.  Therefore, any allocator conforming to that interface could potentially serve as druntime's allocator.

I am not a big fan of the IAllocator interface since it introduces a layer of indirection. There is no simple solution to avoid the indirection and get a pluggable allocator. Well, maybe a combination of ldc's @weak and LTO. Dunno...

https://wiki.dlang.org/LDC-specific_language_changes#.40.28ldc.attributes.weak.29
http://johanengelen.github.io/ldc/2016/11/10/Link-Time-Optimization-LDC.html
June 03, 2019
On Saturday, 1 June 2019 at 14:29:03 UTC, Stefanos Baziotis wrote:
> I'll post an update about where my experimentation will be visible.

https://github.com/baziotis/Dmemcpy

You can follow this repo for memcpy. In the future, probably
I will merge all the string.h functions in one repo, but in the development
stage I think it's better to have them on their own.

Any feedback is greatly appreciated!
June 03, 2019
On Sunday, 2 June 2019 at 11:19:20 UTC, Sebastiaan Koppe wrote:
> On Sunday, 2 June 2019 at 00:10:51 UTC, Mike Franklin wrote:
>> [...]
>
> You probably don't need or want to port the whole of std.experimental.allocator to druntime. I recently looked at the GC in druntime and it has its own pools etc. If it didn't, then the mark phase would be a lot harder and slower. (according to my understanding...)
>
> [...]

Sebastiaan, I don't have a good answer for you right now. std.experimental.allocator
is quite new to me. I hope Mike can give you more insight until I start working
on this part.
June 03, 2019
On 6/3/19 10:17 AM, Stefanos Baziotis wrote:
> On Saturday, 1 June 2019 at 14:29:03 UTC, Stefanos Baziotis wrote:
>> I'll post an update about where my experimentation will be visible.
> 
> https://github.com/baziotis/Dmemcpy
> 
> You can follow this repo for memcpy. In the future, probably
> I will merge all the string.h functions in one repo, but in the development
> stage I think it's better to have them on their own.
> 
> Any feedback is greatly appreciated!

At 512 lines including tests, it seems on the involved side. The benchmarks ought to show a hefty improvement to match. Are there benchmark results available?

Quoting the rationale from the motivation in another thread:

1) C’s implementations are not type-safe and memory-safe.
2) C’s implementations have accumulated a lot of cruft over the years.
3) Cross-compiling is more difficult as now one should have available and configured a C runtime and toolchain apart from the D runtime. This makes it difficult for D to create freestanding software.

And then the listed advantages of using D for implementation (renumbered):

4) Type-safety and memory safety (bounds-checking etc.)
5) Templates to branch to an optimal implementation at compile-time.
6) Inlining, as the branching in C happens at runtime.
7) Compile-Time Function Execution (CTFE) and introspection (type info).
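To make (5) and (6) concrete for readers, here is a minimal sketch of compile-time branching on a size that is known statically. `copyFixed` is a hypothetical name, the threshold of 16 is arbitrary, and this is not the code from the Dmemcpy repository:

```d
// Hypothetical helper: picks a copy strategy from the static length `n`.
void copyFixed(size_t n)(ref ubyte[n] dst, ref const ubyte[n] src)
{
    static if (n <= 16) // arbitrary threshold, chosen only for illustration
    {
        // Small copies: unrolled at compile time, no loop and no runtime branch.
        static foreach (i; 0 .. n)
        {
            dst[i] = src[i];
        }
    }
    else
    {
        // Larger copies: defer to the built-in array copy.
        dst[] = src[];
    }
}

void main()
{
    ubyte[4] small = [1, 2, 3, 4];
    ubyte[4] smallCopy;
    copyFixed(smallCopy, small);
    assert(smallCopy == small);

    ubyte[64] big = 9; // every element initialized to 9
    ubyte[64] bigCopy;
    copyFixed(bigCopy, big);
    assert(bigCopy == big);
}
```

Whether this kind of specialization actually beats a libc call is exactly what benchmarks would have to show.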

My view on formulating motivation is simple: do it like a scientist. Argue the facts. If facts are not available, argue fundaments and universal principles. If such are not available, the motivation is too weak.

(1) checks the "facts" box but has the obvious comeback "then how about a 2-line trusted wrapper over memcpy?" that needs to be explained. Related, obviously people who reach for memcpy() are often not looking for a safe primitive. a[] = b[] is safe, syntactically simple, and could lower to anything including memcpy.
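For reference, the kind of wrapper meant here could look like the sketch below. `trustedCopy` is a hypothetical name; the checks are what make the `@trusted` annotation defensible, so in practice it grows a little past two lines. The built-in slice copy is shown next to it:

```d
import core.stdc.string : memcpy;
import std.traits : hasIndirections;

// Hypothetical wrapper: a checked, @trusted front end over libc's memcpy.
void trustedCopy(T)(T[] dst, const(T)[] src) @trusted
if (!hasIndirections!T)
{
    // assert(0) is kept even with -release, so these checks always run.
    if (dst.length != src.length)
        assert(0, "trustedCopy: length mismatch");
    const bytes = src.length * T.sizeof;
    if (bytes == 0)
        return;
    const d = cast(size_t) dst.ptr;
    const s = cast(size_t) src.ptr;
    if (d < s + bytes && s < d + bytes)
        assert(0, "trustedCopy: overlapping slices");
    memcpy(dst.ptr, src.ptr, bytes);
}

void main() @safe
{
    int[] a = [1, 2, 3, 4];
    auto b = new int[4];
    trustedCopy(b, a);
    assert(b == a);

    // The built-in slice copy is already @safe.
    auto c = new int[4];
    c[] = a[];
    assert(c == a);
}
```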

(2) is quite specious and really needs some evidence. Is cruft in memcpy really an issue? I looked at memcpy() implementations a while ago but didn't save bookmarks. Did a google search just now and found https://github.com/gcc-mirror/gcc/blob/master/libgcc/memcpy.c, which is very far from cruft-ridden. I do remember elaborate implementations of memcpy but so are (somewhat ironically) the 512 lines of the proposed implementation. I found one here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/lib/memcpy_64.S?id=HEAD

No idea of its level of cruftiness, where it's used etc. The right way to argue (2) is to provide links to implementations that people can look at and decide without doubt, "yep, crufty".

(3) is... odd. Doesn't every machine ever come with a C implementation including a ready-to-link standard library? If not, isn't that a rarity? Again, that should be argued preemptively by the motivation section.

(4) brings again the wrapper argument
(5) is nice if and only if confirmed by benchmarks
(6) is also nice under the same conditions as (5)
(7) again... what's wrong with a wrapper that does if (__ctfe)
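A sketch of such a wrapper, with `copyBytes` and `roundTrip` as hypothetical names: during CTFE it takes a plain loop, and at run time it forwards to libc's memcpy.

```d
import core.stdc.string : memcpy;

// Hypothetical wrapper: usable in CTFE, forwards to libc at run time.
ubyte[] copyBytes(ubyte[] dst, const(ubyte)[] src) @trusted
{
    if (dst.length != src.length)
        assert(0, "copyBytes: length mismatch");
    if (__ctfe)
    {
        // The interpreter cannot call the extern(C) memcpy, so copy element-wise.
        foreach (i, b; src)
            dst[i] = b;
    }
    else if (src.length != 0)
    {
        // Overlap is not checked in this sketch; a real wrapper would need to.
        memcpy(dst.ptr, src.ptr, src.length);
    }
    return dst;
}

ubyte[4] roundTrip()
{
    ubyte[4] src = [1, 2, 3, 4];
    ubyte[4] dst;
    copyBytes(dst[], src[]);
    return dst;
}

enum ubyte[4] expected = [1, 2, 3, 4];

// Evaluated by the compiler through the __ctfe branch.
static assert(roundTrip() == expected);

void main()
{
    // Evaluated at run time through memcpy.
    assert(roundTrip() == expected);
}
```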

These considerations are built with memcpy() in mind. With malloc() we're looking at a completely different ballgame. Implementing malloc() from scratch is a very serious project that needs almost overwhelming motivation. The goal of std.experimental.allocator was to offer a flexible framework for implementing general and specialized allocators, but simply replacing malloc() is more difficult to argue. Also, achieving comparable performance will be difficult.