November 06, 2012
On Tuesday, 6 November 2012 at 04:49:45 UTC, Tommi wrote:
> On Tuesday, 6 November 2012 at 04:31:56 UTC, H. S. Teoh wrote:
>> The problem is that you can't do this in generic code, because generic code by definition doesn't know how to copy an arbitrary type.
>
> I'm not familiar with that definition of generic code. But I do feel that there's a pretty big problem with a language design if the language doesn't provide a generic way to make a copy of a variable. To be fair, e.g. C++ doesn't provide that either.



C++ as a language doesn't, but if you follow the convention that C++ establishes in its libraries (where copy assignment & copy construction == deep copy), it always works out correctly.

D doesn't have that convention so that's why we're running into trouble.
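
To illustrate the difference, here is a minimal sketch in D. The type names Buffer and DeepBuffer are hypothetical and not from any library. A plain struct copy shares its array, and a deep copy only happens if the type opts in, here via a postblit:

```d
// Default struct copy in D is shallow: the slice is shared.
struct Buffer
{
    int[] data;
}

// Opting in to deep-copy semantics with a postblit (hypothetical type).
struct DeepBuffer
{
    int[] data;
    this(this) { data = data.dup; } // runs after the bitwise copy
}

void main()
{
    auto a = Buffer([1, 2, 3]);
    auto b = a;          // copies only the slice (pointer + length)
    b.data[0] = 99;      // also visible through a.data

    auto c = DeepBuffer([1, 2, 3]);
    auto d = c;          // postblit duplicates the array
    d.data[0] = 99;      // c.data is unaffected
}
```

Generic code cannot tell from the outside which of the two behaviors a given type has.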
November 06, 2012
On 11/6/12 4:36 AM, H. S. Teoh wrote:
> Hmm. Another idea just occurred to me. The basic problem here is that we
> are conflating two kinds of values, transient and persistent, under a
> single name .front. What if we explicitly name them? Say, .copyFront for
> the non-transient value and .refFront for the transient value (the
> names are unimportant right now, let's consider the semantics of it).

We could transfer that matter to the type of .front itself, i.e. define a function copy(x) that returns e.g. x for string and x.dup for char[] etc. There would be problems with e.g. defining copy for structs with pointer and class reference fields etc.

One quite simple approach would be to define (on the contrary) .peekFront, which means "yeah, I'd like to take a peek at the front but I don't plan to store it anywhere". That would entail we define eachLine etc. to return string from .front and char[] from .peekFront, and deprecate byLine.
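
A rough sketch of what such a copy(x) could look like, assuming it only needs to handle arrays and plain values. The function is hypothetical, and it deliberately leaves out the hard cases mentioned above (structs with pointer or class reference fields):

```d
import std.traits : isDynamicArray;

// Hypothetical helper: returns a value that is safe to store.
auto copy(T)(T x)
{
    static if (isDynamicArray!T)
    {
        static if (is(typeof(x[0]) == immutable))
            return x;       // immutable elements (e.g. string): sharing is safe
        else
            return x.dup;   // mutable or const elements (e.g. char[]): duplicate
    }
    else
    {
        // Value types pass through unchanged. Types with indirections
        // are NOT handled here; that is the open problem.
        return x;
    }
}

void main()
{
    string s = "abc";
    auto s2 = copy(s);      // same memory as s

    char[] buf = "abc".dup;
    auto kept = copy(buf);  // independent of buf
    buf[0] = 'x';           // does not affect kept
}
```

Note that .dup is itself shallow, so an array of arrays would still share its inner arrays.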


Andrei
November 06, 2012
On 11/6/12 6:49 AM, Tommi wrote:
> On Tuesday, 6 November 2012 at 04:31:56 UTC, H. S. Teoh wrote:
>> The problem is that you can't do this in generic code, because generic
>> code by definition doesn't know how to copy an
>> arbitrary type.
>
> I'm not familiar with that definition of generic code. But I do feel
> that there's a pretty big problem with a language design if the language
> doesn't provide a generic way to make a copy of a variable. To be fair,
> e.g. C++ doesn't provide that either.

Languages commonly have trouble defining comparison and copying generically. More often than not user intervention is needed (e.g. see Java's clone, Lisp's many comparison operators etc).

Andrei

November 06, 2012
On 11/6/12 7:48 AM, H. S. Teoh wrote:
> On Tue, Nov 06, 2012 at 05:49:44AM +0100, Tommi wrote:
>> On Tuesday, 6 November 2012 at 04:31:56 UTC, H. S. Teoh wrote:
>>> The problem is that you can't do this in generic code, because
>>> generic code by definition doesn't know how to copy an arbitrary
>>> type.
>>
>> I'm not familiar with that definition of generic code. But I do feel
>> that there's a pretty big problem with a language design if the
>> language doesn't provide a generic way to make a copy of a variable.
>> To be fair, e.g. C++ doesn't provide that either.
>
> OK I worded that poorly. All I meant was that currently, there is no
> generic way to make a copy of something. It could be construed to be a
> bug or a language deficiency, but that's how things are currently.
>
> One *could* introduce a new language construct for making a copy of
> something, of course, but that leads to all sorts of issues about
> implicit allocation, how to eliminate unnecessary implicit copying,
> etc.. It's not a simple problem, in spite of the simplicity of stating
> the problem.

Here's where user defined @tributes would help a lot. We'd then define @owned to mention that a class reference field inside an object must be duplicated upon copy:

class A { ... }

struct B
{
   @owned A payload;
   A another;
   ...
}

That way a generic clone() routine could be written.
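
For illustration only, here is one way such a clone() might be sketched with user-defined attributes and std.traits.hasUDA. The marker type owned, the dup() method on A, and the concrete fields are all assumptions made up for this example; the elided parts of A and B above are not filled in:

```d
import std.traits : hasUDA;

struct owned {}   // hypothetical marker attribute

class A
{
    int value;
    this(int value) { this.value = value; }
    A dup() { return new A(value); }   // assumed per-class duplication hook
}

struct B
{
    @owned A payload;   // duplicated on clone
    A another;          // shared on clone
}

// Shallow-copies the struct, then duplicates every class field marked @owned.
T clone(T)(T src) if (is(T == struct))
{
    T result = src;
    foreach (i, ref field; result.tupleof)
    {
        static if (hasUDA!(T.tupleof[i], owned) && is(typeof(field) == class))
        {
            if (field !is null)
                field = field.dup();
        }
    }
    return result;
}

void main()
{
    auto common = new A(2);
    auto b = B(new A(1), common);
    auto c = clone(b);   // c.payload is a new object, c.another is shared
}
```

This sketch only looks one level deep; owned fields nested inside owned objects would need clone() to recurse.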


Andrei
November 06, 2012
On Tuesday, November 06, 2012 08:49:26 Andrei Alexandrescu wrote:
> On 11/6/12 4:36 AM, H. S. Teoh wrote:
> > Hmm. Another idea just occurred to me. The basic problem here is that we are conflating two kinds of values, transient and persistent, under a single name .front. What if we explicitly name them? Say, .copyFront for the non-transient value and .refFront for the transient value (the names are unimportant right now, let's consider the semantics of it).
> 
> We could transfer that matter to the type of .front itself, i.e. define a function copy(x) that returns e.g. x for string and x.dup for char[] etc. There would be problems on e.g. defining copy for structs with pointer and class reference fields etc.
> 
> One quite simple approach would be to define (on the contrary) .peekFront, which means "yeah, I'd like to take a peek at the front but I don't plan to store it anywhere". That would entail we define eachLine etc. to return string from .front and char[] from .peekFront, and deprecate byLine.

peekFront would work better than copy, because whether front needs to be copied or not doesn't necessarily have much to do with its type. For instance, byLine/eachLine can return char[] from front just fine and still have it be a new array every time. So, while in some cases, you can tell from the type that no copy is needed (e.g. string), you can't tell in the general case and would be forced to make needless copies in a number of cases. For instance, every range of class objects would end up having to make a copy, because they'd be mutable reference types, and without knowing what the range is doing, you have no way of knowing whether it keeps replacing the objects referred to by front or not (it's not particularly likely that it would be, but you can't tell for sure just the same).

If we defined peekFront via UFCS as a wrapper which calls front, then anything wanting to use peekFront could use peekFront regardless of whether the type defined it or not. So, that would reduce the impact caused by its introduction, but it would still impact a lot of range types ultimately, because we'd have to create appropriate wrappers for peekFront in most of them, or we'd end up making unnecessary copies.
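
A minimal sketch of that fallback, under the assumption that peekFront is a free function visible at the call site (UFCS only finds it if it is imported there). The Words range and totalLength are hypothetical and exist only to show a type that defines its own peekFront next to one that does not:

```d
import std.range.primitives : isInputRange, front, empty, popFront;

// Fallback for ranges without their own peekFront: just forward to front.
auto peekFront(R)(auto ref R r) if (isInputRange!R)
{
    return r.front;
}

// Hypothetical range that reuses one buffer for peekFront.
struct Words
{
    private string[] source;
    private char[] buffer;

    this(string[] source)
    {
        this.source = source;
        this.buffer = new char[](64); // assumes every word fits in 64 bytes
    }

    @property bool empty() const { return source.length == 0; }
    void popFront() { source = source[1 .. $]; }

    // Transient: the returned slice is overwritten by later calls.
    @property char[] peekFront()
    {
        auto n = source[0].length;
        buffer[0 .. n] = source[0][];
        return buffer[0 .. n];
    }

    // Persistent: safe to store.
    @property string front() { return peekFront.idup; }
}

// Generic code that only inspects elements can always ask for peekFront.
size_t totalLength(R)(R r) if (isInputRange!R)
{
    size_t n;
    for (; !r.empty; r.popFront())
        n += r.peekFront.length;
    return n;
}

void main()
{
    auto viaMember = totalLength(Words(["ab", "cde"])); // member peekFront
    auto viaFallback = totalLength(["ab", "cde"]);      // free function
}
```

Here front and peekFront return different types (string and char[]), which is exactly the complication raised further down.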

I don't like how much this impacts, but as H. S. Teoh points out, we don't exactly have very many options with minimal impact beyond banning transient fronts entirely (which I'd honestly like to do just the same).

At least this avoids the need to create more traits to test ranges for, since if we create a free function peekFront, all range types can just assume that it's there and create wrappers for it without caring whether the wrapped range defines it itself or uses the free function. And it's less complicated than the .transient suggestion. Though it _does_ introduce the possibility of front and peekFront returning completely different types, which could complicate things a bit. It might be better to require that they be identical to avoid that problem.

For better or worse though, this approach would mean that byLine (or eachLine or whatever) wouldn't be reusing the buffer with foreach like they do now, though I suppose that you could make them have opApply which does the same thing as now (meaning that it effectively uses peekFront), and then any range-based functions would use front until they were updated to use peekFront if appropriate. But then again, maybe we want byLine/eachLine to copy by default, since that's safer, much as it's less efficient, since then we have safe by default but still have an explicit means to be more efficient. That fits in well with our general approach.

peekFront may be the way to go, but I think that we need to think through the consequences (like the potential problems caused by front and peekFront returning different types) before we decide on this.

- Jonathan M Davis
November 06, 2012
On 06/11/2012 07:56, Andrei Alexandrescu wrote:
> On 11/6/12 7:48 AM, H. S. Teoh wrote:
>> On Tue, Nov 06, 2012 at 05:49:44AM +0100, Tommi wrote:
>>> On Tuesday, 6 November 2012 at 04:31:56 UTC, H. S. Teoh wrote:
>>>> The problem is that you can't do this in generic code, because
>>>> generic code by definition doesn't know how to copy an arbitrary
>>>> type.
>>>
>>> I'm not familiar with that definition of generic code. But I do feel
>>> that there's a pretty big problem with a language design if the
>>> language doesn't provide a generic way to make a copy of a variable.
>>> To be fair, e.g. C++ doesn't provide that either.
>>
>> OK I worded that poorly. All I meant was that currently, there is no
>> generic way to make a copy of something. It could be construed to be a
>> bug or a language deficiency, but that's how things are currently.
>>
>> One *could* introduce a new language construct for making a copy of
>> something, of course, but that leads to all sorts of issues about
>> implicit allocation, how to eliminate unnecessary implicit copying,
>> etc.. It's not a simple problem, in spite of the simplicity of stating
>> the problem.
>
> Here's where user defined @tributes would help a lot. We'd then define
> @owned to mention that a class reference field inside an object must be
> duplicated upon copy:
>
> class A { ... }
>
> struct B
> {
> @owned A payload;
> A another;
> ...
> }
>
> That way a generic clone() routine could be written.
>

You once told me that AOP was useless in D.
November 06, 2012
On 06/11/2012 07:49, Andrei Alexandrescu wrote:
> On 11/6/12 4:36 AM, H. S. Teoh wrote:
>> Hmm. Another idea just occurred to me. The basic problem here is that we
>> are conflating two kinds of values, transient and persistent, under a
>> single name .front. What if we explicitly name them? Say, .copyFront for
>> the non-transient value and .refFront for the transient value (the
>> names are unimportant right now, let's consider the semantics of it).
>
> We could transfer that matter to the type of .front itself, i.e. define
> a function copy(x) that returns e.g. x for string and x.dup for
> char[] etc. There would be problems on e.g. defining copy for structs
> with pointer and class reference fields etc.
>
> One quite simple approach would be to define (on the contrary)
> .peekFront, which means "yeah, I'd like to take a peek at the front but
> I don't plan to store it anywhere". That would entail we define eachLine
> etc. to return string from .front and char[] from .peekFront, and
> deprecate byLine.
>

This has the same issue as .transient with regard to transformer ranges (BTW, what is the correct terminology for that?).
November 06, 2012
On 06/11/2012 08:17, Jonathan M Davis wrote:
> If we defined peekFront via UFCS as a wrapper which calls front, then anything
> wanting to use peekFront could use peekFront regardless of whether the type
> defined it or not. So, that would reduce the impact caused by its introduction,
> but it would still impact a lot of range types ultimately, because we'd have
> to create appropriate wrappers for peekFront in most of them, or we'd end up
> making unnecessary copies.
>

Assuming you call front on the source range, you don't need to copy: you are already working on a clean copy.

With that one, you end up duplicating all the code that uses front on a wrapper range into a version that uses peekFront as well. I'm not sure this is better than .transient, but maybe.

> I don't like how much this impacts, but as H. S. Teoh points out, we don't
> exactly have very many options with minimal impact beyond banning transient
> fronts entirely (which I'd honestly like to do just the same).
>

If we really want to go simple, this seems like the way to go.

> At least this avoids the need to create more traits to test ranges for, since
> if we create a free function peekFront, all range types can just assume that
> it's there and create wrappers for it without caring whether the wrapped range
> defines it itself or uses the free function. And it's less complicated than the
> .transient suggestion. Though it _does_ introduce the possibility of front and
> peekFront returning completely different types, which could complicate things a
> bit. It might be better to require that they be identical to avoid that
> problem.
>

I'm not yet sure this is simpler.

> For better or worse though, this approach would mean that byLine (or eachLine
> or whatever) wouldn't be reusing the buffer with foreach like they do now,
> though I suppose that you could make them have opApply which does the same
> thing as now (meaning that it effectively uses peekFront), and then any range-
> based functions would use front until they were updated to use peekFront if
> appropriate. But then again, maybe we want byLine/eachLine to copy by default,
> since that's safer, much as it's less efficient, since then we have safe by
> default but still have an explicit means to be more efficient. That fits in well
> with our general approach.
>

Safe by default seems like the way to go.

> peekFront may be the way to go, but I think that we need to think through the
> consequences (like the potential problems caused by front and peekFront
> returning different types) before we decide on this.
>

Yes, especially since string/char[] has already been identified as a potential problem.
November 06, 2012
On 11/6/12 3:50 PM, deadalnix wrote:
> On 06/11/2012 07:56, Andrei Alexandrescu wrote:
>> On 11/6/12 7:48 AM, H. S. Teoh wrote:
>>> On Tue, Nov 06, 2012 at 05:49:44AM +0100, Tommi wrote:
>>>> On Tuesday, 6 November 2012 at 04:31:56 UTC, H. S. Teoh wrote:
>>>>> The problem is that you can't do this in generic code, because
>>>>> generic code by definition doesn't know how to copy an arbitrary
>>>>> type.
>>>>
>>>> I'm not familiar with that definition of generic code. But I do feel
>>>> that there's a pretty big problem with a language design if the
>>>> language doesn't provide a generic way to make a copy of a variable.
>>>> To be fair, e.g. C++ doesn't provide that either.
>>>
>>> OK I worded that poorly. All I meant was that currently, there is no
>>> generic way to make a copy of something. It could be construed to be a
>>> bug or a language deficiency, but that's how things are currently.
>>>
>>> One *could* introduce a new language construct for making a copy of
>>> something, of course, but that leads to all sorts of issues about
>>> implicit allocation, how to eliminate unnecessary implicit copying,
>>> etc.. It's not a simple problem, in spite of the simplicity of stating
>>> the problem.
>>
>> Here's where user defined @tributes would help a lot. We'd then define
>> @owned to mention that a class reference field inside an object must be
>> duplicated upon copy:
>>
>> class A { ... }
>>
>> struct B
>> {
>> @owned A payload;
>> A another;
>> ...
>> }
>>
>> That way a generic clone() routine could be written.
>>
>
> You once told me that AOP was useless in D.

What's the connection?

Andrei
November 06, 2012
On 06/11/2012 15:44, Andrei Alexandrescu wrote:
>>> Here's where user defined @tributes would help a lot. We'd then define
>>> @owned to mention that a class reference field inside an object must be
>>> duplicated upon copy:
>>>
>>> class A { ... }
>>>
>>> struct B
>>> {
>>> @owned A payload;
>>> A another;
>>> ...
>>> }
>>>
>>> That way a generic clone() routine could be written.
>>>
>>
>> You once told me that AOP was useless in D.
>
> What's the connection?
>

This is OT, but this @owned stuff, coupled with code that is generated depending on its presence, IS AOP.