Implicit conversion of unique objects to mutable and immutable
June 21, 2011
(Note: I have a feeling that this must be related to the old 'unique' discussions, which I had somehow managed to stay out of. If so, I apologize for repeating an old story.)

It is most useful for a function to return the most mutable type unless there is good reason not to. Do you agree? foo() makes a fresh result and returns it:

char[] foo()
{
    char[] result;
    return result;
}

The caller can get the result as a char[] and modify it:

    char[] result = foo();
    result[0] = 'A';         // assume a valid element

But what if the caller wants to treat the returned value as immutable:

    string result = foo();

That's currently a compilation error: "Error: cannot implicitly convert expression (foo()) of type char[] to string"

When it makes sense, we have std.exception.assumeUnique() for the caller to use:

import std.exception;

// ...

    char[] result = foo();
    string immutable_result = assumeUnique(result);

But that's the wrong thing to do, as foo() may be changed in the future to return a non-unique result. I think the language should have the 'unique' qualifier so that foo() could use it on the return type:

unique char[] foo()
{
    // ...
}

and the unique objects could be converted to immutable or mutable implicitly:

    string immutable_result = foo();   // fine
    char[] mutable_result = foo();     // fine too

Is this whole thing the same as that old 'unique' discussion? :)

But anyway... How do you solve the problem of deciding the return type of such functions?

Thank you,
Ali
June 21, 2011
Ali Çehreli:

>     char[] result = foo();
>     string immutable_result = assumeUnique(result);
> 
> But that's the wrong thing to do, as foo() may be changed in the future to return a non-unique result.

I think assumeUnique is meant to be used inside foo(), at its end.

Somewhere in Bugzilla (I don't remember the number, sorry, suggestions welcome) there is a bug report (that I think Walter has vaguely accepted, but it's not implemented yet) that asks for the results of pure functions to be implicitly castable to immutable.

This surely doesn't solve the whole situation, but it's a start, and it's a simple but useful thing.

Bye,
bearophile
June 21, 2011
On 2011-06-21 15:25, Ali Çehreli wrote:
> (Note: I have a feeling that this must be related to the old 'unique' discussions, which I had somehow managed to stay out of. If so, I apologize for repeating an old story.)
> 
> It is most useful for a function to return the most mutable type unless there is good reason not to. Do you agree?

No. In a lot of cases, what you generally want is immutable, not mutable. In particular, if you're talking about strings, we favor string, wstring, and dstring over char[], wchar[], or dchar[] unless you actually need to alter the string. The default is to use immutable. You save memory, it works well with threading, and it increases the opportunities for optimization. There's generally little benefit to having strings be mutable (which is why string is an alias to immutable(char)[], not char[]). char[], wchar[], and dchar[] are there if you need them (and obviously sometimes you do), but in general, immutable is preferred, because it's generally more efficient.

Now, outside of strings, it varies a lot more as to whether immutable, const or mutable is preferred, since there are plenty of other types which get mutated all of the time. But there are still huge advantages to const and immutable in the general case, so there are plenty of situations where defaulting to the least mutable type possible is the best solution. It does vary depending on the situation, however.

> foo() makes a fresh result and
> returns it:
> 
> char[] foo()
> {
>     char[] result;
>     return result;
> }
> 
> The caller can get the result as a char[] and modify it:
> 
>     char[] result = foo();
>     result[0] = 'A';         // assume a valid element
> 
> But what if the caller wants to treat the returned value as immutable:
> 
> string result = foo();
> 
> That's currently a compilation error: "Error: cannot implicitly convert
> expression (foo()) of type char[] to string"
> 
> When it makes sense, we have std.exception.assumeUnique() for the caller
> to use:
> 
> import std.exception;
> 
> // ...
> 
>     char[] result = foo();
>     string immutable_result = assumeUnique(result);
> 
> But that's the wrong thing to do, as foo() may be changed in the future to return a non-unique result. I think the language should have the 'unique' qualifier so that foo() could use it on the return type:
> 
> unique char[] foo()
> {
>     // ...
> }
> 
> and the unique objects could be converted to immutable or mutable implicitly:
> 
>     string immutable_result = foo();   // fine
>     char[] mutable_result = foo();     // fine too
> 
> Is this whole thing the same as that old 'unique' discussion? :)

It relates - particularly if foo is strongly pure, because then it's guaranteed that the result of foo is either unique or it's immutable and therefore it doesn't matter.

> But anyway... How do you solve the problem of deciding the return type of such functions?

Generally, you pick. Normally, we favor string over char[]. Immutability is generally favored over mutability when it comes to strings. The major elements which affect what a string function returns though are the function's arguments and what the function actually does. If the function is going to allocate a string, then it's generally going to return immutable. But if it's going to return a slice, then its return type is going to depend on what you passed in (in which case, the function is templated or it just uses const).

Of course, you can always templatize the function and have it default to whatever string type you want to be the default, and then pass the type that you want if you want a different one.

string s = foo!string();
char[] t = foo!(char[])();

But I don't think that that's a particularly normal thing to do. Phobos generally either just uses string, or the type of string that it returns depends on its arguments (generally because it's going to return a slice rather than a new string).
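
For illustration only, here is a minimal sketch of what such a templated function might look like. The name makeGreeting, its body, and the constraint are assumptions made up for this example; they are not taken from Phobos. The cast is only defensible here because the buffer is created inside the function and no other reference to it escapes.

```d
// Hypothetical example: the caller picks the string type, string by default.
S makeGreeting(S = string)()
    if (is(S == string) || is(S == char[]))
{
    char[] result = "hello".dup;   // fresh buffer, not referenced anywhere else
    result ~= " world";
    // Only acceptable because 'result' is unique at this point.
    return cast(S) result;
}

void main()
{
    string s = makeGreeting();            // default: immutable
    char[] t = makeGreeting!(char[])();   // explicitly mutable
    t[0] = 'H';                           // mutating the caller's own copy
    assert(s == "hello world");
    assert(t == "Hello world");
}
```

The drawback is that every function that builds fresh data needs this kind of boilerplate, and the uniqueness guarantee still lives only in the implementer's head.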

- Jonathan M Davis
June 21, 2011
On Tue, 21 Jun 2011 23:02:43 +0000, Jonathan M Davis wrote:

> On 2011-06-21 15:25, Ali Çehreli wrote:
>> (Note: I have a feeling that this must be related to the old 'unique' discussions, which I had somehow managed to stay out of. If so, I apologize for repeating an old story.)
>> 
>> It is most useful for a function to return the most mutable type unless there is good reason not to. Do you agree?
> 
> No. In a lot of cases, what you generally want is immutable, not mutable. In particular, if you're talking about strings, we favor string, wstring, and dstring over char[], wchar[], or dchar[] unless you actually need to alter the string.

I agree with all of that, but it should not be the function that restricts the caller. The function should return immutable only when the result is really immutable.

If the returned object is mutable and unique, it is pretentious of the function to evangelically return immutable. :) Besides, the caller can't know whether it was actually mutable but was returned as immutable because it was the right thing to do. Otherwise the caller could cast to mutable to save a copy.

> immutable is preferred, because it's generally more efficient.

Not in this case because the caller must make a copy to mutate further. If the function could return 'unique mutable' and if that could be cast to immutable implicitly, then all would be fine.

Ali
June 22, 2011
On Tue, 21 Jun 2011 19:04:11 -0400, bearophile wrote:

> Ali Çehreli:
> 
>>     char[] result = foo();
>>     string immutable_result = assumeUnique(result);
>> 
>> But that's the wrong thing to do, as foo() may be changed in the future to return a non-unique result.
> 
> I think assumeUnique is meant to be used inside foo(), at its end.

You're right: even the documentation example of assumeUnique does that.

But I think the problem here is at the interface. The type conversions that foo() applies internally don't change the original question of whether to return mutable or immutable. My point is that foo() should not care, as the data is already mutable. Why force the caller to make a copy? But I understand that the language doesn't help here.

I wonder whether a UniqueRef object could be returned, which could allow a single casting of its data to mutable or immutable at the call site. Further casts could throw, but that would be a runtime solution. :-/

Wait! :) I just grepped and learned about std.typecons.Unique. Ok, that's the idea but it is not fully implemented yet or doesn't work... Comments say "doesn't work yet". :-/

Ali
June 22, 2011
On 2011-06-21 16:50, Ali Çehreli wrote:
> On Tue, 21 Jun 2011 23:02:43 +0000, Jonathan M Davis wrote:
> > On 2011-06-21 15:25, Ali Çehreli wrote:
> >> (Note: I have a feeling that this must be related to the old 'unique' discussions, which I had somehow managed to stay out of. If so, I apologize for repeating an old story.)
> >> 
> >> It is most useful for a function to return the most mutable type unless there is good reason not to. Do you agree?
> > 
> > No. In a lot of cases, what you generally want is immutable, not mutable. In particular, if you're talking about strings, we favor string, wstring, and dstring over char[], wchar[], or dchar[] unless you actually need to alter the string.
> 
> I agree with all of that, but it should not be the function that restricts the caller. The function should return immutable only when the result is really immutable.
> 
> If the returned object is mutable and unique, it is pretentious of the function to evangelically return immutable. :) Besides, the caller can't know whether it was actually mutable but was returned as immutable because it was the right thing to do. Otherwise the caller could cast to mutable to save a copy.
> 
> > immutable is preferred, because it's generally more efficient.
> 
> Not in this case because the caller must make a copy to mutate further. If the function could return 'unique mutable' and if that could be cast to immutable implicitly, then all would be fine.

As I said, in Phobos, when a new string must be allocated, the type returned is generally string (so it's immutable), whereas if the result of the function is likely to be a slice of the string passed in, then the function is templated to return the same type of string as was passed to it. There's no extraneous copying going on in order to return string. The _only_ case where it's arguably forced on the caller is in the case where a copy _must_ be made, in which case the function returns string rather than char[].

- Jonathan M Davis
June 22, 2011
On Wed, 22 Jun 2011 00:31:19 +0000, Jonathan M Davis wrote:

> As I said, in Phobos, when a new string must be allocated, the type returned is generally string (so it's immutable), whereas if the result of the function is likely to be a slice of the string passed in, then the function is templated to return the same type of string as was passed to it.

I am talking about a freshly manufactured string. There is no parameter that is passed to the function in my example. If there were any, the function would probably take them as const char[] so that any type of string can be used.

Let's assume that this function just uses a and b to produce the result:

char[] foo(const char[] a, const char[] b)
{
    // ...
    return fresh_mutable_result;
}

Taking 'const char[]' is the most useful because mutable and immutable can be passed. The function is not insisting on immutability because there is no reason for it to require that.
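
A minimal sketch of that point about parameters, using a hypothetical helper named totalLength that is not part of the example above:

```d
// Hypothetical helper: const parameters accept both mutable and immutable arrays.
size_t totalLength(const char[] a, const char[] b)
{
    return a.length + b.length;
}

void main()
{
    char[] m = "abc".dup;   // mutable
    string s = "de";        // immutable
    assert(totalLength(m, s) == 5);   // both convert to const implicitly
    assert(totalLength(s, m) == 5);
}
```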

The return type is similar: The function should not insist on immutability of the result if the result is not really immutable.

> There's no extraneous copying going on in order to return string.

Agreed: No copy when returning. But there will be a copy when the caller wants to further mutate the result:

import std.exception;

string foo()
{
    char[] result;
    return assumeUnique(result);    // <-- no copy here
}

void main()
{
    char[] s = foo().dup;    // <-- COPY through .dup here
}

The caller cannot safely cast the string to char[] because there is no guarantee other than documentation (which may be outdated as the function is modified in the future).

> The _only_ case where it's arguably forced on the caller is in the case where a copy _must_ be made, in which case the function returns string rather than char[].

No. In the above code there is no need for a copy because the returned string is actually mutable. But the caller must make a copy because it cannot be certain whether the returned string is really mutable or immutable.

> - Jonathan M Davis

Ali
June 22, 2011
On Wed, 22 Jun 2011 00:02:55 +0000, Ali Çehreli wrote:

> I wonder whether a UniqueRef object could be returned, which could allow a single casting of its data to mutable or immutable at the call site. Further casts could throw, but that would be a runtime solution. :-/

An extremely rough five-minute attempt just to show the idea:

import std.stdio;
import std.exception;

struct UniqueMutable(T)
{
    T data;
    bool is_used;

    this(ref T data)
    {
        this.is_used = false;
        this.data = data;
        data = null;
    }

    T as_mutable()
    {
        return as_impl!(T)();
    }

    immutable(T) as_immutable()
    {
        return as_impl!(immutable(T))();
    }

    private ConvT as_impl(ConvT)()
    {
        enforce(!is_used);
        ConvT result = cast(ConvT)(data);
        data = null;
        is_used = true;
        return result;
    }
}

UniqueMutable!T unique_mutable(T)(ref T data)
{
    return UniqueMutable!T(data);
}

/* Now foo() clearly documents that the result is unique and mutable */
UniqueMutable!(char[]) foo()
{
    char[] result = "hello".dup;
    result ~= " world";
    return unique_mutable(result);
}

void main()
{
    /* The user can treat it as mutable because it is mutable ... */
    char[] mutable_result = foo().as_mutable;
    mutable_result[0] = 'H';

    /* ... or as immutable from this point on */
    string immutable_result = foo().as_immutable;
}

Ali
June 22, 2011
On Tue, 21 Jun 2011 20:31:19 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On 2011-06-21 16:50, Ali Çehreli wrote:
>> On Tue, 21 Jun 2011 23:02:43 +0000, Jonathan M Davis wrote:
>> > On 2011-06-21 15:25, Ali Çehreli wrote:
>> >> (Note: I have a feeling that this must be related to the old 'unique'
>> >> discussions, which I had somehow managed to stay out of. If so, I
>> >> apologize for repeating an old story.)
>> >>
>> >> It is most useful for a function to return the most mutable type unless
>> >> there is good reason not to. Do you agree?
>> >
>> > No. In a lot of cases, what you generally want is immutable, not
>> > mutable. In particular, if you're talking about strings, we favor
>> > string, wstring, and dstring over char[], wchar[], or dchar[] unless you
>> > actually need to alter the string.
>>
>> I agree with all of that, but it should not be the function that
>> restricts the caller. The function should return immutable only when the
>> result is really immutable.
>>
>> If the returned object is mutable and unique, it is pretentious of the
>> function to evangelically return immutable. :) Besides, the caller can't
>> know whether it was actually mutable but was returned as immutable
>> because it was the right thing to do. Otherwise the caller could cast to
>> mutable to save a copy.
>>
>> > immutable is preferred, because it's generally more efficient.
>>
>> Not in this case because the caller must make a copy to mutate further.
>> If the function could return 'unique mutable' and if that could be cast
>> to immutable implicitly, then all would be fine.
>
> As I said, in Phobos, when a new string must be allocated, the type returned
> is generally string (so it's immutable), whereas if the result of the function
> is likely to be a slice of the string passed in, then the function is templated
> to return the same type of string as was passed to it. There's no extraneous
> copying going on in order to return string. The _only_ case where it's
> arguably forced on the caller is in the case where a copy _must_ be made, in
> which case the function returns string rather than char[].

The issue is that newly-allocated data can be considered unique.  It would be advantageous for this to implicitly cast to mutable *or* immutable.  In other words, imposing the immutable limitation is artificial, and in some cases makes things less usable.

Think about dup and idup.  Wouldn't it be more straightforward if you just did dup and assigned it to whatever type you wanted?  Wouldn't it be more universal if all functions that create new data didn't have to have a mutable and immutable version?
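
For readers who have not used the two properties side by side, a small sketch of the current split (the variable names are only for illustration):

```d
void main()
{
    char[] source = "hello".dup;    // .dup: mutable copy of the literal
    string frozen = source.idup;    // .idup: immutable copy
    char[] scratch = frozen.dup;    // back to mutable requires another copy

    source[0] = 'J';
    scratch[0] = 'Y';

    // Each copy has its own storage, so the writes do not interfere.
    assert(source == "Jello");
    assert(frozen == "hello");
    assert(scratch == "Yello");
}
```

Each direction needs its own property, even though both do the same thing: allocate fresh, unshared data.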

If it's a slice of the input, it should be using inout, but inout is broken.

I think the clear path here is to use strong-pure functions to implicitly cast to immutable.  Such functions would have to return char[] because if it returns const(char)[] or immutable(char)[], it's possible the result is a slice of the input.

For example, the following function can only return a newly-allocated char[] (without using casts):

pure char[] foo(immutable(char)[] arg1, const(char)[] arg2) {...}

So it is safe to assume the return is unique.
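
A sketch of how that would look from the caller's side. The function name concat is hypothetical, and the example deliberately takes only immutable parameters so that the function is strongly pure in the strictest sense. As far as I know, compilers released after this discussion accept the implicit conversion shown in main; treat that as an assumption about your compiler version rather than a guarantee.

```d
// Hypothetical strongly pure function: immutable inputs, fresh mutable output.
char[] concat(string a, string b) pure
{
    char[] result;
    result ~= a;   // appending allocates new storage owned by 'result'
    result ~= b;
    return result;
}

void main()
{
    // The result cannot alias anything the caller can reach,
    // so it may be taken as immutable or as mutable.
    string s = concat("hello", " world");
    char[] m = concat("hello", " world");
    m[0] = 'H';

    assert(s == "hello world");
    assert(m == "Hello world");
}
```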

To answer Ali, we can add unique to the language, but this adds yet another type of const (unique has to do with const).  This might not go over well in a language with already 4 types of const (const, mutable, immutable, inout).  Where it does help quite a bit is for sharing mutable data, but of course, we can start building library constructs that allow that.

-Steve
June 22, 2011
On Wed, 22 Jun 2011 00:02:55 +0000, Ali Çehreli wrote:

> I wonder whether a UniqueRef object could be returned, which could allow a single casting of its data to mutable or immutable at the call site. Further casts could throw, but that would be a runtime solution. :-/

Further casts should return null.
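
A rough sketch of that variant, loosely modeled on the UniqueMutable attempt earlier in the thread. The name UniqueOnce and its members are made up for this example. Note that it does nothing to prevent copies of the wrapper itself, so it is an illustration of the idea and not a real guarantee of uniqueness.

```d
// Hypothetical wrapper: hands out its data once, then yields null.
struct UniqueOnce(T)
{
    private T data;

    this(ref T data)
    {
        this.data = data;
        data = null;      // clear the caller's reference
    }

    T asMutable()
    {
        T result = data;
        data = null;      // later requests get null
        return result;
    }

    immutable(T) asImmutable()
    {
        auto result = cast(immutable(T)) data;
        data = null;      // later requests get null
        return result;
    }
}

void main()
{
    char[] buf = "hello".dup;
    auto u = UniqueOnce!(char[])(buf);
    assert(buf is null);               // the original reference was cleared

    char[] m = u.asMutable();
    assert(m == "hello");

    assert(u.asMutable() is null);     // no exception, just null
    assert(u.asImmutable() is null);
}
```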


Timon