swap (page 2)

5 days ago

Re: swap

Posted by Dennis
in reply to Dom DiSc

On Wednesday, 28 May 2025 at 06:06:25 UTC, Dom DiSc wrote:

> This is because for some objects it may be very expensive or even not possible to create a temporary copy. This is avoided by the xor-construct.

Okay, but that's a separate problem. I'm just going to discuss the @safe interface of swap for simple pointer types, because if we throw performance and special type considerations in the mix it only gets more confusing.

> But why is this limitation necessary? What can possibly go wrong?

Memory corruption in @safe code.

import std.stdio, std.algorithm;

void main() @safe
{
    string x = "hello";

    void replace(ref string x)
    {
        immutable(char)[4] buf = "bye";
        string y = buf[]; // y contains a reference to local variable `buf`
        swap(y, x); // y is assigned to x which has a longer lifetime than `buf`
    }

    writeln(x); // hello
    replace(x);
    writeln(x); // prints garbage: `buf` has been freed, so `x` points to dead stack memory
}

Let me clarify a few things.

Ignoring bugs, D without dip1000 / scope pointers is already memory safe if you stick to the @safe subset. You can freely swap the values of any combination of local and global variables without problem. It's all @safe because pointers always point to global memory or GC managed heap memory, preventing use-after-free scenarios.
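As a small illustration of that point, the sketch below swaps a module-level string with a local one in plain @safe code, without -preview=dip1000. Both strings refer to static or GC-managed memory, so nothing can dangle. The names `greeting` and `demo` are made up for the example; `swap` is the existing one from std.algorithm.mutation.

```d
import std.algorithm.mutation : swap;
import std.stdio : writeln;

string greeting = "hello"; // module-level (thread-local) variable

void demo() @safe
{
    // `local` refers to GC-allocated memory, not to this stack frame.
    string local = "bye".idup;
    swap(greeting, local); // fine: neither payload is freed when demo returns
}

void main() @safe
{
    demo();
    writeln(greeting); // bye
}
```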

Now there is one scenario where memory gets freed but not because the GC or the end of the program triggered it: local variables. At the end of a function, the stack frame gets cleaned up, freeing all memory used for local variables in that function. Since you can only refer to a local variable name inside the function, direct access is always safe. A problem arises however when you keep a reference to the local variable alive either by returning a nested function, or creating a pointer to the variable.

To solve the case with nested functions, the compiler creates a closure with the GC so the stack gets promoted to heap memory. For pointers however, the compiler doesn't do that. It could, but stack pointers are usually created specifically for performance reasons. If you want GC memory you might as well use new int[] or an array literal instead of slicing a static array.
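A minimal sketch of the closure case, assuming a made-up function `makeCounter`: the returned delegate keeps using `count` after the function has returned, which works because the compiler moves the captured variable into a GC-allocated closure.

```d
import std.stdio : writeln;

// Returns a delegate that still uses `count` after makeCounter has returned.
auto makeCounter() @safe
{
    int count = 0; // captured below, so it is placed in a GC-allocated closure
    return () => ++count;
}

void main() @safe
{
    auto next = makeCounter();
    writeln(next()); // 1
    writeln(next()); // 2
}
```

No such promotion happens for `&count` or for a slice of a static array, which is why those need the scope rules discussed here.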

This does mean that suddenly we're dealing with pointers that can't freely be assigned to anything anymore, but only to things that remain in the same scope as the variable you took the address of. So all this talk about ref-counting, borrow-checking, scope pointers and whatnot is about allowing a new scenario where you have a variable that frees itself when it goes out of scope, but you want to allow creating a pointer or 'reference' to that variable which may not outlive the variable it was created from.

Again, D is already safe if you simply don't allow creating such references to objects that destruct themselves. You can already do safe manual memory management by allocating in the constructor, implementing a copy constructor, disabling field access, and freeing in the destructor, but that severely restricts use of the allocated payload.

For example, you can have a @safe reference counted string, but you can't pass the underlying char[] to writeln, you have to index every char manually and pass that. Why? Because writeln might assign the char[] to a global or something else that lives longer than the reference counted string.
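To make that restriction concrete, here is a rough sketch of such a type. `OwnedString` is hypothetical, and for brevity it forbids copying instead of counting references; it also assumes the struct lives in its own module, so that `private` really hides the fields from other @safe code.

```d
import core.stdc.stdlib : free, malloc;
import std.stdio : write, writeln;

// Hypothetical owning string: frees its payload in the destructor and
// never hands out the underlying char[].
struct OwnedString
{
    private char* ptr;
    private size_t len;

    this(string s) @trusted
    {
        if (s.length == 0) return;
        ptr = cast(char*) malloc(s.length);
        if (ptr is null) assert(0, "out of memory");
        ptr[0 .. s.length] = s[];
        len = s.length;
    }

    @disable this(this); // assumption: forbid copies instead of ref-counting

    ~this() @trusted
    {
        free(ptr); // free(null) is a no-op
        ptr = null;
        len = 0;
    }

    size_t length() const @safe { return len; }

    // Returns a char by value, so no reference to the payload escapes.
    char opIndex(size_t i) const @trusted
    {
        if (i >= len) assert(0, "index out of range");
        return ptr[i];
    }
}

void main() @safe
{
    auto s = OwnedString("hello");
    // The char[] itself is never exposed, so pass one char at a time.
    foreach (i; 0 .. s.length)
        write(s[i]);
    writeln();
}
```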

> Yes, but why does the compiler need to know that the two have been swapped?

It doesn't for regular pointers or non-pointers. swap(ref T x, ref T y) is always safe when T = int. But when T = string, x points to char bytes that are stack-allocated, reference-counted, or otherwise not GC-managed, and y is a variable that is still accessible after those bytes have been destroyed, you get a dangling pointer, which is obviously not @safe.

5 days ago

Re: swap

Posted by Patrick Schluter
in reply to Dom DiSc

On Thursday, 22 May 2025 at 09:44:39 UTC, Dom DiSc wrote:

> Someone said without ref-counting or borrow-check it is not possible to create a @safe swap function.
>
> I think this is not true, so I propose that phobos should provide the following swap-function which is usable in @safe code:
>
> /// Generic swap function without temporary variables
> /// Types with indirections need to define their own specific swap functions.
> void swap(T)(ref T x, ref T y) @trusted if(is(typeof(x ^= y)) || !hasIndirections!T || isPointer!T || isArray!T)
> {
>    static if(is(typeof(x ^= y))) { x ^= y; y ^= x; x ^= y; } // numeric types
>    else // flat structs, simple pointers and arrays
>    {
>       ubyte[] ax = (cast(ubyte*)&x)[0 .. T.sizeof];
>       ubyte[] ay = (cast(ubyte*)&y)[0 .. T.sizeof];
>       ax[] ^= ay[]; ay[] ^= ax[]; ax[] ^= ay[];
>    }
> }

Never, ever use XOR swap. It is buggy and slower than a temp variable.

> @system unittest
> {
> // type with xor operator
> int a = 100;
> int b = -123_456;
> swap(a,b);
> assert(a == -123_456);
> assert(b == 100);
>
> // basic array
> int[] c = [5,4,3,2,1];
> int[] d = [6,6,6,6];
> swap(c,d);
> assert(c == [6,6,6,6]);
> assert(d == [5,4,3,2,1]);

Try swap(c, c). The result is not what you expect. Instead of getting the original content, your xor swap erases the content.

The xor swap is also slower because each instruction depends on the preceding one which means that it will always take 3 cycles to execute one swap. With a temp variable, 2 of the instructions can be executed together making it possible to execute in 2 cycles.
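The aliasing problem can be reproduced with plain ints. This is a minimal sketch; `xorSwap` and `tmpSwap` are illustrative helpers, not the proposed Phobos function.

```d
import std.stdio : writeln;

// XOR swap: when x and y are the same variable, the first step zeroes it.
void xorSwap(ref int x, ref int y) @safe { x ^= y; y ^= x; x ^= y; }

// Swap through a temporary: harmless when x and y alias each other.
void tmpSwap(ref int x, ref int y) @safe { int t = x; x = y; y = t; }

void main() @safe
{
    int a = 42;
    xorSwap(a, a);
    writeln(a); // 0

    int b = 42;
    tmpSwap(b, b);
    writeln(b); // 42
}
```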

5 days ago

Re: swap

Posted by Paul Backus
in reply to Dom DiSc

On Tuesday, 27 May 2025 at 09:40:04 UTC, Dom DiSc wrote:

> After thinking a lot about this, I would say the third and fourth assertions only make sense if by "local" it is meant "allocated on the stack", because swapping anything on the heap is always ok, no matter what lifetime the objects have.
> And for taking the address of local objects I would say: if this is done, the object must be allocated on the heap, else it is always possible to run into problems, not only with swap.

Yes, if you simply forbid taking the address of a local object in the first place, then there is no problem, and swap can easily be made @safe, without requiring any weird tricks.

The people who said that a @safe swap function was impossible without borrow-checking or ref-counting were speaking specifically in the context of -preview=dip1000, which allows taking the address of a local variable in @safe code. It seems like you were not aware of that context.
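For readers without that context, a minimal sketch of what the flag changes (the names `global` and `readThroughPointer` are made up for the example). It has to be built with -preview=dip1000; without the flag, taking `&local` in @safe code is rejected outright.

```d
// Build with: dmd -preview=dip1000
int* global; // outlives every stack frame

int readThroughPointer() @safe
{
    int local = 7;
    int* p = &local; // accepted with -preview=dip1000; `p` is inferred `scope`
    // global = p;   // would be rejected: a scope pointer may not be stored in `global`
    return *p;
}

void main() @safe
{
    assert(readThroughPointer() == 7);
}
```

A swap between `p` and `global` would have the same effect as the rejected assignment, which is the case the compiler has to catch at the call site.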

4 days ago

Re: swap

Posted by Dom DiSc
in reply to Paul Backus

On Wednesday, 28 May 2025 at 16:29:52 UTC, Paul Backus wrote:

> Yes, if you simply forbid taking the address of a local object in the first place, then there is no problem, and swap can easily be made @safe, without requiring any weird tricks.

I know. I thought the new method to allow taking the address of a local variable was implemented by simply allocating it on the heap.
So whenever the compiler sees & used on a local, it would not allocate the affected variable on the stack.
But now I have learned that this is not true for local pointers. I see that this may be necessary for performance, but ok, then other measures like dip1000 are necessary to make this "safe".

> It seems like you were not aware of that context.

I was, but obviously I didn't read the details of the implementation correctly.
Sorry.

4 days ago

Re: swap

Posted by Dom DiSc
in reply to Patrick Schluter

On Wednesday, 28 May 2025 at 14:47:23 UTC, Patrick Schluter wrote:

> Never, ever use XOR swap. It is buggy and slower than a temp variable
> Try swap(c, c). The result is not what you expect. Instead of getting the original content, your xor swap erases the content.

I did expect this, but ok, I should forbid using the same object twice - doesn't make sense anyway.

This may be true for the built-in types, but for the interesting cases (like structures with complicated constructors) making a temporary copy may be very expensive or not possible at all.

4 days ago

Re: swap

Posted by Dom DiSc
in reply to Paul Backus

On Wednesday, 28 May 2025 at 16:29:52 UTC, Paul Backus wrote:

> The people who said that a @safe swap function was impossible without borrow-checking or ref-counting were speaking specifically in the context of -preview=dip1000

Even in this context there is only one problematic case:
swapping stack pointers with heap pointers.
But this will always need special handling, and detecting this doesn't require ref-counting or borrow-checking. For the moment it should simply be forbidden. In such strange use cases a manual workaround would be recommended anyway.

4 days ago

Re: swap

Posted by Richard (Rikki) Andrew Cattermole
in reply to Dom DiSc

I've been meaning to reply for a few days now.

Implementation-wise, the body of the swap function is irrelevant, as others have stated. It can be @trusted code.

The issue in terms of safety is for calls to the function, not within the function.

Library code, especially core functions like swap and move, does not need to be proven safe; it only needs to provide a safe interface.

These two functions have the same problem: it's the by-ref parameters.

By-ref parameters function as two separate parameters within lifetime tracking, an input and an output.

```d
T move(ref T input) => input;
```

Actually looks something more akin to:

```d
T move(ref T input, ref T output) {
	scope(exit)
		output = T.init;
	return input;
}
```

So what we need to track is if the parameter has been modified to contain a different object.

```d
T move(@same ref T input) => input;
```

Whereas what we need it to be is:

```d
T move(@different ref T input) {
	scope(exit)
		input = T.init;
	return input;
}
```

This is used in the caller:

```d
int* ptr = ...;
// variable `ptr` is not null and contains object `1`

int* another = move(ptr);
// variable `ptr` is null
// variable `another` is not null and contains object `1`
```

The way the compiler knows about the ``another`` variable containing the old value of ``ptr`` is due to escape analysis. This is what DIP1000 is meant to solve (it doesn't for swap, but does for move).

So an updated signature: ``T move(@different return scope ref T input);``

Important to note that in escape analysis attributes like ``return scope`` have the relationship: input "contributes to" output, not: input "becomes" output.

I hope that clarifies things a bit.
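As a rough illustration of the "contributes to" relationship, here is a sketch with a hypothetical helper `larger`; it is not part of the proposal and has to be built with -preview=dip1000. Only one of the two pointers is returned at run time, but the compiler treats the result as if it could come from either argument.

```d
// Build with: dmd -preview=dip1000
// Hypothetical helper: both parameters may contribute to the return value.
int* larger(return scope int* a, return scope int* b) @safe
{
    return *a > *b ? a : b;
}

void main() @safe
{
    int x = 1;
    int y = 2;
    // `r` is inferred `scope`: it must not outlive `x` or `y`,
    // even though only `&y` is actually returned here.
    int* r = larger(&x, &y);
    assert(*r == 2);
}
```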

3 days ago

Re: swap

Posted by Dom DiSc
in reply to Richard (Rikki) Andrew Cattermole

On Thursday, 29 May 2025 at 21:03:37 UTC, Richard (Rikki) Andrew Cattermole wrote:
> The way the compiler knows about the ``another`` variable containing the old value of ``ptr`` is due to escape analysis. This is what DIP1000 is meant to solve (it doesn't for swap, but does for move).

Yes, dip1000 cannot check the correctness of functions with two (or more) writable ref-parameters. But the funny fact is: It doesn't need to.
If both are heap-objects or both stack-objects, nothing bad can happen.

So we only need to mark the swap function @trusted to prevent dip1000 from analysing it, and guarantee by an assert within the swap function that the two objects are not one on the heap and the other on the stack (and that they are distinct and not null).
That's all we need to do to make the interface @safe.

So, swap is one of the cases where we need @trusted, because the compiler cannot prove it's @safe, but it is.

3 days ago

Re: swap

Posted by Dom DiSc
in reply to Patrick Schluter

On Wednesday, 28 May 2025 at 14:47:23 UTC, Patrick Schluter wrote:

> Never, ever use XOR swap. It is buggy and slower than a temp variable

This depends heavily on the time the copy constructor needs.
My final version now looks like this:

/// Generic swap function.
/// Needs to be @trusted, because the current escape analysis cannot prove it is @safe.
/// Swap for objects with indirections needs to handle all self-references separately
/// (if they have none, this swap function would work correctly on them,
/// but there is no easy way to check for that).
void swap(T)(ref T x, ref T y) @trusted if(!hasIndirections!T || isPointer!T || isArray!T)
{
   assert(&x !is &y, "cannot swap something with itself"); // compare addresses, not values
   static if(!isScalarType!T)
   {
      if(!x || !y) assert(0, "can swap only constructed objects");
      assert(!(__traits(isStackobject, x) ^ __traits(isStackobject, y)),
             "cannot swap heap-objects with stack-objects");
   }
   static if(!hasElaborateCopyConstructor!T) // standard implementation
   {
      T tmp = x;
      x = y;
      y = tmp;
   }
   else // avoid temporary copy, as this may be expensive (or not possible at all)
   {
      ubyte[] ax = (cast(ubyte*)&x)[0 .. T.sizeof];
      ubyte[] ay = (cast(ubyte*)&y)[0 .. T.sizeof];
      ax[] ^= ay[];
      ay[] ^= ax[];
      ax[] ^= ay[];
   }
}
