February 01, 2017
On 1/31/2017 10:43 PM, Tobias Müller wrote:
> Using an FFI function to compare D vs Rust doesn't tell you much. Foreign
> functions are usually not used directly in Rust, they are used to build
> safe wrappers that will give you *all* possible guarantees, including type
> safety.
> As a consequence it's not necessary to augment the C declaration with
> additional information.

I'm not very familiar with Rust. Can you post what a Rust declaration for memcpy would look like with all the guarantees?


> Marking the function as safe would be wrong in Rust, because dereferencing
> raw pointers is unsafe. Raw pointers are not necessarily valid, even in
> safe code. You need references for that guarantee. But again, raw pointers
> are usually only used for FFI and to build safe abstractions.

memcpy() isn't marked safe in D, either.
February 01, 2017
On Tuesday, 31 January 2017 at 23:30:04 UTC, Walter Bright wrote:
> On 1/31/2017 3:00 PM, Richard Delorme wrote:
> The thing about memcpy is compilers build in a LOT of information about it that simply is not there in the declaration. I suggest retrying your example for gcc/clang, but use your own memcpy, i.e.:
>
>    void* mymemcpy(void * restrict s1, const void * restrict s2, size_t n);
>
> Let us know what the results are!

//-----8<-------------------------------------------------------
#include <string.h>
#include <stdio.h>

void* mymemcpy(void* restrict dest, const void* restrict src, size_t n) {
	const char *s = src;
	char *d = dest;
	for (size_t i = 0; i < n; ++i) d[i] = s[i];
	return d;
}

void *copy(const void *c, size_t n) {
	char d[16];
	return mymemcpy(d, c, n);
}	

int main(void) {
	char a[16] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
	char *b = copy(a, 8);

	for (int i = 0; i < 16; ++i) printf("%d ", b[i]);
	putchar('\n');
}
//-----8<-------------------------------------------------------
$ gcc mymemcpy.c -O2 -W
mymemcpy.c: In function 'copy':
mymemcpy.c:13:9: warning: function returns address of local variable [-Wreturn-local-addr]
  return mymemcpy(d, c, n);
         ^~~~~~~~~~~~~~~~~
memcpy4.c:12:7: note: declared here
  char d[16];

clang (version 3.8.1) failed to find error in this code.


February 01, 2017
On Wednesday, 1 February 2017 at 10:05:49 UTC, Richard Delorme wrote:
> On Tuesday, 31 January 2017 at 23:30:04 UTC, Walter Bright wrote:
>> On 1/31/2017 3:00 PM, Richard Delorme wrote:
>> The thing about memcpy is compilers build in a LOT of information about it that simply is not there in the declaration. I suggest retrying your example for gcc/clang, but use your own memcpy, i.e.:
>>
>>    void* mymemcpy(void * restrict s1, const void * restrict s2, size_t n);
>>
>> Let us know what the results are!
>
> //-----8<-------------------------------------------------------
> #include <string.h>
> #include <stdio.h>
>
> void* mymemcpy(void* restrict dest, const void* restrict src, size_t n) {
> 	const char *s = src;
> 	char *d = dest;
> 	for (size_t i = 0; i < n; ++i) d[i] = s[i];
> 	return d;
> }
>
> void *copy(const void *c, size_t n) {
> 	char d[16];
> 	return mymemcpy(d, c, n);
> }	
>
> int main(void) {
> 	char a[16] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
> 	char *b = copy(a, 8);
>
> 	for (int i = 0; i < 16; ++i) printf("%d ", b[i]);
> 	putchar('\n');
> }
> //-----8<-------------------------------------------------------
> $ gcc mymemcpy.c -O2 -W
> mymemcpy.c: In function 'copy':
> mymemcpy.c:13:9: warning: function returns address of local variable [-Wreturn-local-addr]
>   return mymemcpy(d, c, n);
>          ^~~~~~~~~~~~~~~~~
> memcpy4.c:12:7: note: declared here
>   char d[16];
>
> clang (version 3.8.1) failed to find error in this code.

You have to define mymemcpy() in another source file and put only the prototype in this module. If the compiler sees the code, it can do the complete data flow analysis. With only the declaration it can't, and that is Walter's point. The annotations give the declaration the information that the compiler cannot deduce by itself, because the code is in another module (object file, library).

February 01, 2017
On Wednesday, 1 February 2017 at 08:17:45 UTC, Walter Bright wrote:
> I'm not very familiar with Rust. Can you post what a Rust declaration for memcpy would look like with all the guarantees?

The memcpy you have linked [1] is just a wrapper around the LLVM intrinsic [2] function. This is not stabilized and therefore not part of the standard library, as Rust doesn't want to force a permanent dependence on LLVM (or on emulating LLVM in other future backends).

The _traditional_ C-like memcpy [3] is in the stdlib. It is unsafe, and carries no side effects for the src buffer. It enforces type safety, but it cannot enforce memory safety, as you can blow past the allocation size of your dst buffer (hence why it is unsafe).

The simplest _safe_ memcpy [4] just does a range check before calling the unsafe memcpy in the stdlib. This ensures type and memory safety (returning Err on non-equal length buffers). While this may seem limiting, one can still achieve non-aligned copies via the Rust sub-slice operator. Example: memcpy(&src[0..4], &mut dst[20..24]);

This would copy the first 4 bytes of src (indices 0 to 3) into the bytes at indices 20 to 23 of dst.
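
To make the range-check idea concrete, here is a minimal sketch of such a wrapper. It is not the code from the gist [4]; the name `safe_memcpy`, the `&'static str` error type, and the `T: Copy` bound are my own assumptions for illustration.

```rust
/// Hypothetical safe wrapper (not the gist's code): checks the lengths,
/// then delegates to the unsafe std::ptr::copy_nonoverlapping.
fn safe_memcpy<T: Copy>(src: &[T], dst: &mut [T]) -> Result<(), &'static str> {
    // Range check: refuse to copy when the buffers differ in length.
    if src.len() != dst.len() {
        return Err("source and destination lengths differ");
    }
    // SAFETY: both slices are valid for `src.len()` elements, and a shared
    // slice cannot overlap a mutable slice that is live at the same time.
    unsafe {
        std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
    Ok(())
}
```

Called as `safe_memcpy(&src[0..4], &mut dst[20..24])`, it copies four elements; with mismatched ranges it returns `Err` instead of writing out of bounds.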


[1] https://doc.rust-lang.org/1.14.0/libc/fn.memcpy.html

[2] http://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic

[3] https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html

[4] https://gist.github.com/1f34331b2cae6ba9e624c5f9f4f2a458
February 01, 2017
On Wed, 01 Feb 2017 10:20:45 +0000, Patrick Schluter wrote:
> You have to define mymemcpy() in another source file and put only the prototype in this module. If the compiler sees the code, it can do the complete data flow analysis. With only the declaration it can't, and that is Walter's point. The annotations give the declaration the information that the compiler cannot deduce by itself, because the code is in another module (object file, library).

OTOH I haven't seen anyone distribute a D library with .di files, and even extern(D) is pretty rare. That means whole program analysis is a lot more feasible.

The exceptions are virtual dispatch and functional programming, which are leaky enough when compiling the whole program at once and intractable with any level of incremental compilation. The most obvious form of incremental compilation is dub compiling each package you depend on into a static library.

But all whole-program analysis problems can be solved with a custom linker and object format, right?
February 01, 2017
On Tuesday, 31 January 2017 at 18:21:02 UTC, Jack Stouffer wrote:
> On Tuesday, 31 January 2017 at 01:30:48 UTC, Walter Bright wrote:
>> 2. The return value is derived from s1.
>> 4. Copies of s1 or s2 are not saved.
>
> Actually I didn't know either of those things from looking at the signature because DIP25 and DIP1000 have marketing problems, in that the only way to get info on them is on the DIP pages. I'd be willing to bet money that 80% of the people who use D don't know about the -dip25 flag.
>
> Is there anywhere that gives a simple explanation of both of these DIPs' safety checks?

DIP1000 is not stable yet AFAICT. I documented return ref around December:
https://dlang.org/spec/function.html#return-ref-parameters
February 01, 2017
Walter Bright <newshound2@digitalmars.com> wrote:
> I'm not very familiar with Rust. Can you post what a Rust declaration for memcpy would look like with all the guarantees?

You wouldn't use memcpy but just assign the slices. Assignment is always just memcpy in Rust because of move-semantics:

a[m1..n1] = b[m2..n2];

It will panic if sizes don't match.
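
As written, the assignment above reads more like pseudocode: as far as I know, rustc rejects a direct assignment between two slice ranges because `[T]` is unsized. A minimal sketch of the same idea uses `copy_from_slice`, which does panic when the two lengths differ. The helper name `copy_range` and its parameters are made up for illustration.

```rust
/// Hypothetical helper: copies b[m2..n2] into a[m1..n1].
/// Panics if a range is out of bounds or if the two ranges differ in length.
fn copy_range(a: &mut [u8], m1: usize, n1: usize, b: &[u8], m2: usize, n2: usize) {
    // Indexing checks the bounds; copy_from_slice checks that lengths match.
    a[m1..n1].copy_from_slice(&b[m2..n2]);
}
```

For `Copy` element types this boils down to a plain memory copy once the checks have passed.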

But if you still wanted a memcpy it would probably look like this:

fn memcpy<'a, T>(dest: &'a mut [T], src: &[T]) -> &'a mut [T]
February 01, 2017
On Wednesday, 1 February 2017 at 14:39:15 UTC, Cody Laeder wrote:
> [4] https://gist.github.com/1f34331b2cae6ba9e624c5f9f4f2a458

That example code won't even typecheck, and the minimal fix (make dest a slice) leaves it unsafe (T needs to be Copy). If you just want to quickly bang out code like this, you should probably use play.rust-lang.org to make sure it works.

But, anyway, let's use the version of that function that's actually in the standard library: https://github.com/rust-lang/rust/blob/master/src/libcore/slice.rs#L531

// This trait is implemented for slices,
// so it can be invoked like this:
// dest.copy_from_slice(src)
pub trait SliceExt {
    type Item;

    // [other slice methods redacted]

    #[stable(feature = "copy_from_slice", since = "1.9.0")]
    fn copy_from_slice(&mut self, src: &[Self::Item]) where Self::Item: Copy;
}

To list the guarantees in the OP:

1. The signature doesn't say anything about side effects. This will probably be a const function, once those exist.

2. Since this function returns nothing, there is nothing to say about the return value. Because of how &mut pointers work in Rust, returning pointers like that is not ergonomic.

3. Nothing src points to, directly or transitively, can be mutated, unless T contains a cell (the compiler can and already does determine this on an as-needed basis, and a human reader can usually ignore interior mutability because it's used for semantically meaningless things like reference counts).

4. self and src can't be saved, because they don't outlive the function invocation. The items behind them can't be saved, either, because Self::Item is a generic that might not live long enough (that feels like cheating, though, because it only works for generics or if the data type is deliberately engineered to not be 'static).

Unlike libc's memcpy (which is directly exposed, as part of the stable standard library as copy_nonoverlapping), the slice abstraction expresses that the length of the slice is within bounds of the underlying allocation. But D has slices, too, and probably has a version of this function, so that also feels like cheating.

This function signature *does* guarantee that src and self don't overlap, unlike the C and D versions. Personally, I think that's at least as important as whether the function's pure or not.

Here's a version of memcpy that's blatantly unidiomatic, but gets the same score on 1, 2, 3, and 4 as the slice version https://play.rust-lang.org/?gist=1f3a07987258500b8afd5a30e589457b:

unsafe fn copy_nonoverlapping_ref<T>(src: &T, dest: &mut T, len: usize) {
  std::ptr::copy_nonoverlapping(src, dest, len)
}

Again, it doesn't guarantee no side effects, it may guarantee that src isn't mutated, it does guarantee that they aren't stored away somewhere, and it guarantees that src and dest don't overlap. It's still unsafe, because it doesn't do anything about len being possibly out of bounds, and I left out the Copy bound for the sake of flexibility.
February 01, 2017
Tobias Müller <troplin@bluewin.ch> wrote:
> But if you still wanted a memcpy it would probably look like this:
> 
> fn memcpy<'a, T>(dest: &'a mut [T], src: &[T]) -> &'a mut [T]

No, sorry:

fn memcpy<'a, T: Copy>(dest: &'a mut [T], src: &[T]) -> &'a mut [T]

Also, mutable references can never alias, so you have the same guarantees as with restrict, statically checked even at the call site.
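
For completeness, a minimal sketch of a body for that signature, assuming it simply forwards to the standard `copy_from_slice` (so it panics on a length mismatch rather than returning an error):

```rust
/// Sketch of the signature above; the body is an assumption.
/// The returned reference is tied to `dest` by the lifetime 'a.
fn memcpy<'a, T: Copy>(dest: &'a mut [T], src: &[T]) -> &'a mut [T] {
    // Panics if the two slices differ in length.
    dest.copy_from_slice(src);
    dest
}

// A call such as memcpy(&mut buf[0..4], &buf[2..6]) is rejected at compile
// time, because buf would be borrowed mutably and immutably at once.
// Disjoint halves of one buffer can be obtained with split_at_mut instead.
```

With `let (head, tail) = buf.split_at_mut(4);`, the call `memcpy(tail, head)` is accepted, since the borrow checker can see that the two halves are disjoint.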

February 01, 2017
On Wednesday, 1 February 2017 at 17:28:28 UTC, Michael Howell wrote:
> This function signature *does* guarantee that src and self don't overlap, unlike the C and D versions. Personally, I think that's at least as important as whether the function's pure or not.

Oops, forgot the "restrict" keyword. It is there in the C and D versions.