Thread overview
Can't implement conformant memset/memcpy without compiler -ffreestanding support
Jul 25, 2018
Zheng (Vic) Luo
Jul 25, 2018
Zheng (Vic) Luo
Jul 25, 2018
rikki cattermole
Jul 25, 2018
Zheng (Vic) Luo
Jul 25, 2018
rikki cattermole
Jul 25, 2018
Zheng (Vic) Luo
Jul 25, 2018
rikki cattermole
Jul 25, 2018
Zheng (Vic) Luo
Jul 25, 2018
rikki cattermole
Jul 25, 2018
Mike Franklin
July 25, 2018
Current compilers assume that a libc implementation is available. This leads to an infinite loop when we implement primitives like memset in our own code, because the optimizer replaces a run of consecutive stores with a call to "memset", so our memset ends up calling itself. This suggests that we cannot write a freestanding program without support from the compiler. ldc runs into this issue even with the "-betterC" flag, and the same problem applies to C/C++ [1] and Rust [2][3][4].

gcc and clang provide an option, "-ffreestanding", to disable optimizations that need libc support. Although we can hack around this issue by making our implementation complicated enough, or by using assembly, to get past the optimizer, it would be better to have a standard flag like "-ffreestanding" in all compilers that disables such optimizations, so that developers don't have to hack around each compiler implementation separately.

[1] https://godbolt.org/g/5gVWeN
[2] https://play.rust-lang.org/?gist=64f2acafa8cec112893633a5f2e12a9a&version=stable&mode=release&edition=2015
[3] https://github.com/rust-lang/rust/issues/10116
[4] https://github.com/thestinger/rust-core#freestanding
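
For readers who cannot open the links, here is a minimal sketch of the pattern described above. The names `fillBytes` and `my_memset` are hypothetical and chosen so the snippet links against a normal runtime; in a real freestanding build the exported function would be named `memset`. The assumption is that an optimizer with loop-idiom recognition may replace the loop in the helper with a call to `memset`, which would then recurse into itself.

```d
// Hypothetical helper: a plain byte-fill loop.
// An optimizer may recognize this loop as a "memset idiom"
// and replace it with a call to memset.
void fillBytes(ubyte* dest, ubyte value, size_t count)
{
    foreach (i; 0 .. count)
        dest[i] = value;
}

// Hypothetical stand-in for a freestanding memset.
// If this were named memset and the loop above were rewritten
// into "call memset", the function would call itself forever.
extern(C) void* my_memset(void* dest, int val, size_t count)
{
    fillBytes(cast(ubyte*) dest, cast(ubyte) val, count);
    return dest;
}
```

Whether the rewrite actually happens depends on the compiler, its version, and the optimization level, so the generated assembly is the only reliable check.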
July 25, 2018
Minimal example in D: https://run.dlang.io/is/EYVTzb. Affects at least dmd and ldc.

July 25, 2018
On 25/07/2018 8:59 PM, Zheng (Vic) Luo wrote:
> Minimal example in D: https://run.dlang.io/is/EYVTzb. Affects at least dmd and ldc.
> 

https://run.dlang.io/is/8tPOVX

Note that switching void* to ubyte* won't matter once it's extern(C)'d.

My version (_memcpy_impl2) is just the regular old memcpy without optimizations. Your example is syntax sugar for a function call.
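
A minimal sketch of the distinction being drawn here, assuming the first example used a slice copy (the linked code is not reproduced in this thread, so that is an assumption). The function names are hypothetical. In D a slice assignment is an array copy, which compilers typically lower to a memcpy-style call, while the element loop only becomes a call if an optimizer decides to rewrite it.

```d
// Hypothetical name: copy expressed as a slice assignment.
// This is an explicit array copy; compilers typically lower it
// to a memcpy-style call rather than emitting a loop.
void copyWithSlice(ubyte* dst, const(ubyte)* src, size_t n)
{
    dst[0 .. n] = src[0 .. n];
}

// Hypothetical name: copy expressed as an element-by-element loop.
// No call is requested in the source; one only appears if the
// optimizer recognizes the loop as a copy idiom.
void copyWithLoop(ubyte* dst, const(ubyte)* src, size_t n)
{
    foreach (i; 0 .. n)
        dst[i] = src[i];
}
```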
July 25, 2018
On Wednesday, 25 July 2018 at 09:16:19 UTC, rikki cattermole wrote:
> On 25/07/2018 8:59 PM, Zheng (Vic) Luo wrote:
>> Minimal example in D: https://run.dlang.io/is/EYVTzb. Affects at least dmd and ldc.
>> 
>
> https://run.dlang.io/is/8tPOVX
>
> Note that switching void* to ubyte* won't matter once it's extern(C)'d.
>
> My version (_memcpy_impl2) is just the regular old memcpy without optimizations. Your example is syntax sugar for a function call.

There is no guarantee that a compiler (in a future version, or with some optimization flags enabled) will not optimize _memcpy_impl2 into "call memcpy". It may simply be that the optimizer is not smart enough to do this today, because nothing in -betterC prohibits the compiler from doing so.
July 25, 2018
On Wednesday, 25 July 2018 at 09:16:19 UTC, rikki cattermole wrote:
> On 25/07/2018 8:59 PM, Zheng (Vic) Luo wrote:
>> Minimal example in D: https://run.dlang.io/is/EYVTzb. Affects at least dmd and ldc.
>> 
>
> https://run.dlang.io/is/8tPOVX
>
> Note that switching void* to ubyte* won't matter once it's extern(C)'d.
>
> My version (_memcpy_impl2) is just the regular old memcpy without optimizations. Your example is syntax sugar for a function call.

A naive implementation of memset also leads to "call memset": https://run.dlang.io/is/k3Hl04
July 25, 2018
On 25/07/2018 9:23 PM, Zheng (Vic) Luo wrote:
> On Wednesday, 25 July 2018 at 09:16:19 UTC, rikki cattermole wrote:
>> On 25/07/2018 8:59 PM, Zheng (Vic) Luo wrote:
>>> Minimal example in D: https://run.dlang.io/is/EYVTzb. Affects at least dmd and ldc.
>>>
>>
>> https://run.dlang.io/is/8tPOVX
>>
>> Note that switching void* to ubyte* won't matter once it's extern(C)'d.
>>
>> My version (_memcpy_impl2) is just the regular old memcpy without optimizations. Your example is syntax sugar for a function call.
> 
> There is no guarantee that a compiler (in a future version, or with some optimization flags enabled) will not optimize _memcpy_impl2 into "call memcpy". It may simply be that the optimizer is not smart enough to do this today, because nothing in -betterC prohibits the compiler from doing so.

You misunderstand.
It isn't optimizing anything.
You explicitly requested the call to memcpy when you said 'I want this copied ASAP'.

By the looks of it, the spec doesn't explain this clearly.
July 25, 2018
On 25/07/2018 9:32 PM, Zheng (Vic) Luo wrote:
> On Wednesday, 25 July 2018 at 09:16:19 UTC, rikki cattermole wrote:
>> On 25/07/2018 8:59 PM, Zheng (Vic) Luo wrote:
>>> Minimal example in D: https://run.dlang.io/is/EYVTzb. Affects at least dmd and ldc.
>>>
>>
>> https://run.dlang.io/is/8tPOVX
>>
>> Note that switching void* to ubyte* won't matter once it's extern(C)'d.
>>
>> My version (_memcpy_impl2) is just the regular old memcpy without optimizations. Your example is syntax sugar for a function call.
> 
> A naive implementation of memset also leads to "call memset": https://run.dlang.io/is/k3Hl04

Okay yup, that is an optimization.

This won't optimize:

extern(C) void* memset(ubyte* dest, int val, size_t count) {
    immutable c = cast(ubyte)val;
    foreach(i; 0..count) {
        dest[i] = c;
    }
    return dest;
}

And yes the name is important :)
July 25, 2018
On Wednesday, 25 July 2018 at 08:57:41 UTC, Zheng (Vic) Luo wrote:

> gcc and clang provide an option, "-ffreestanding", to disable optimizations that need libc support. Although we can hack around this issue by making our implementation complicated enough, or by using assembly, to get past the optimizer, it would be better to have a standard flag like "-ffreestanding" in all compilers that disables such optimizations, so that developers don't have to hack around each compiler implementation separately.

I ran into this with LDC and discussed it at https://forum.dlang.org/post/kchsryntrrnfaohjfqfw@forum.dlang.org. The solution for me was to compile with `-disable-simplify-libcalls`.

I never ran into this issue with GDC when I was compiling with `-nophoboslib -nostdinc -nodefaultlibs -nostdlib`.  If I remember correctly, GDC would generate calls to `memset`, `memcpy`, and friends but seemed to be smart enough not to rewrite my own implementations of those functions, so as long as I implemented those functions everything worked fine.

I'll have more to say in response to one of your other posts.

Mike

July 25, 2018
On Wednesday, 25 July 2018 at 09:37:56 UTC, rikki cattermole wrote:

> You misunderstand.
> It isn't optimizing anything.
> You explicitly requested the call to memcpy when you said 'I want this copied ASAP'.
> 
> By the looks of it, the spec doesn't explain this clearly.

Well, it seems that this is not a good example to show. I didn't notice its semantics :)

>> 
>> A naive implementation of memset also leads to "call memset": https://run.dlang.io/is/k3Hl04
>
> Okay yup, that is an optimization.
>
> This won't optimize:
>
> extern(C) void* memset(ubyte* dest, int val, size_t count) {
>     immutable c = cast(ubyte)val;
>     foreach(i; 0..count) {
>         dest[i] = c;
>     }
>     return dest;
> }
>
> And yes the name is important :)

First, IIRC, the name hack is a technique for bypassing the LLVM optimizers; I'm not sure whether it also applies to gdc. Moreover, I think this *is* a hack around the compiler, because it forces the memset implementer to write all of the code inside that one function.
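
To illustrate the kind of compiler workaround under discussion, here is a minimal sketch that uses volatile stores instead of relying on the function name. The name `my_memset_volatile` is hypothetical. The sketch assumes a druntime that provides `core.volatile` (older releases exposed `volatileStore` through `core.bitop`), and it assumes that the optimizer does not merge volatile stores into a library call, which is the expected behaviour but is not guaranteed by the language.

```d
// Assumption: recent druntime; older releases exposed this via core.bitop.
import core.volatile : volatileStore;

// Hypothetical name; a freestanding build would export this as memset.
extern(C) void* my_memset_volatile(void* dest, int val, size_t count)
{
    auto p = cast(ubyte*) dest;
    immutable c = cast(ubyte) val;
    foreach (i; 0 .. count)
        volatileStore(p + i, c); // each byte is written with a volatile store
    return dest;
}
```

The cost is that every byte is written individually, so this trades speed for predictability and is still a per-compiler hack rather than a substitute for a proper flag.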

July 25, 2018
On 25/07/2018 9:48 PM, Zheng (Vic) Luo wrote:
> 
> First, IIRC, the name hack is a technique for bypassing the LLVM optimizers; I'm not sure whether it also applies to gdc. Moreover, I think this *is* a hack around the compiler, because it forces the memset implementer to write all of the code inside that one function.

I wouldn't call it a hack, because memset obviously can't go on to call memset.

Realistically, the default implementation written in D itself won't need more than a single function. For optimized versions you must use assembly, where the problem won't manifest anyway.

However, this would be a wonderful candidate for a pragma that disables the generated calls to memset and friends.