October 30, 2012
On Tuesday, October 30, 2012 15:29:17 H. S. Teoh wrote:
> The problem, of course, is how to check of a range has transient .front or not. I proposed adding a .isTransient member to the range, but this depends on the range implementor to remember to mark the range with .isTransient=true, and we know that coding by convention is not scalable. So here's another idea:
> 
> template isTransient(R) if (isInputRange!R)
> {
> static if (is(typeof(R.front) : immutable(typeof(R.front))))
> enum isTransient = false;
> else
> enum isTransient = true;
> }
> 
> The idea is that value types are implicitly convertible to immutable, and value types are non-transient. Reference types, if they can be made immutable, ensures that calling popFront() will never invalidate them (by definition of immutable).

This won't work at all. All that's required is for you to have a range of class objects (e.g Object[]), and suddenly it would be claiming that the range was transient even though it isn't. Whether the return type is immutable really doesn't have anything to do with whether or not the result of front will be valid after popFront is called - or at least the lack of immutable doesn't say anything about it.

I still posit that this sort of range is extremely rare and abnormal, so if we do anything to support them, I think that it's perfectly fair that they have to do something to mark themselves as transient rather than trying to determine it externally - especially because I don't think that it's really possible to determine it externally.

- Jonathan M davis
October 31, 2012
On Wed, Oct 31, 2012 at 12:00:50AM +0100, deadalnix wrote: [...]
> It appears to me that invalidating .front is done for performances reasons most of the time. As this is the unsafe face of the coin, I strongly think that this shouldn't be the default behavior.
> 
> To me the best solution is to provide a function or a property on ranges that provide the unsafe version of the range.
> 
> It is easy to provide a default implementation as UFCS that is a NOOP so no code is being broken.
> 
> With that design, it is possible to provide an always safe interface while allowing algorithm that can handle non persistent .front to run at full speed.

Hmm. I like this idea! This is much better than my proposal. So let's say we change byLine() to always return a copy of the buffer, so that .front is never invalidated. Then for algorithms that don't care about .front being invalidated, they can do something like this:

	auto myAlgo(R)(R inputRange) {
		auto r = inputRange.fastRange;
		while (!r.empty) {
			auto x = r.front;
			// make use of x
			r.popFront();	// this invalidates x
		}
		return result;
	}
	void main() {
		myAlgo(stdin.byLine());
	}

We can use UFCS so that fastRange is always defined:

	// Default implementation
	auto fastRange(R)(R range) {
		return range;
	}

For byLine(), then, we have:

	struct ByLine(...) {
		// safe implementation

		@property auto fastRange() {
			struct UnsafeByLine(...) {
				...
			}
			return UnsafeByLine(this);
		}
	}

I like this. Very clean, and doesn't break backward compatibility, and allows select algorithms to be optimized for speed without affecting everything else.


[...]
> If I use your permuting example, it should be done as follow :
> 
> struct InvalidatingAllPermutations(T) {
> 	T[] current, first;
> 	bool done;
> 
> 	this(T[] initialBuf) {
> 		first = initialBuf;
> 		current = first.dup;
> 	}
> 
> 	void popFront() {
> 		nextPermutation(current);
> 		if (equal(current, first))
> 			done = true;
> 	}
> 
> 	@property bool empty() {
> 		return !done;
> 	}
> 
> 	@property T[] front() {
> 		return current;
> 	}
> 
> 	auto save() {
> 		AllPermutations!T copy;
> 		copy.front = this.front;
> 		copy.first = this.first;
> 		copy.done = this.done;
> 		return copy;
> 	}
> }
> 
> struct AllPermutations(T) {
> 	InvalidatingAllPermutations!T innerRange;
> 	alias innerRange this;
> 
> 	T[] current;
> 
> 	this(T[] initialBuf) {
> 		innerRange = InvalidatingAllPermutations!T(initialBuf);
> 		current = innerRange.front.dup;
> 	}
> 
> 	@property T[] front() {
> 		return current;
> 	}
> 
> 	void popFront() {
> 		innerRange.popFront();
> 		current = innerRange.front.dup;
> 	}
> }
> 
> I do think this is the way that conciliate safe as default but still allow to be fast when needed, without adding much on most code.

+1. I like this better than my proposal.


T

-- 
Doubtless it is a good thing to have an open mind, but a truly open mind should be open at both ends, like the food-pipe, with the capacity for excretion as well as absorption. -- Northrop Frye
October 31, 2012
Le 31/10/2012 01:56, H. S. Teoh a écrit :
>
> +1. I like this better than my proposal.
>

The obvious drawback is that this make invalidating ranges harder to write. But they seems to be more the exception than the rule.

October 31, 2012
On Wed, Oct 31, 2012 at 02:18:55PM +0100, deadalnix wrote:
> Le 31/10/2012 01:56, H. S. Teoh a écrit :
> >
> >+1. I like this better than my proposal.
> >
> 
> The obvious drawback is that this make invalidating ranges harder to write. But they seems to be more the exception than the rule.

Actually, there is another problem: many algorithms' output will be transient or not depending on the input range. For example, we could write map to use the transient version of byLine, say, but then the range that map returns will also be transient (no longer safe).

IOW, transience of .front is transitive in some cases. This again makes things complicated: as soon as you use a single transient range, it makes downstream ranges transient as well. So we're back to the problem of how to mark the range as transient in a transparent way. :-(


T

-- 
VI = Visual Irritation
October 31, 2012
Le 31/10/2012 18:19, H. S. Teoh a écrit :
> On Wed, Oct 31, 2012 at 02:18:55PM +0100, deadalnix wrote:
>> Le 31/10/2012 01:56, H. S. Teoh a écrit :
>>>
>>> +1. I like this better than my proposal.
>>>
>>
>> The obvious drawback is that this make invalidating ranges harder to
>> write. But they seems to be more the exception than the rule.
>
> Actually, there is another problem: many algorithms' output will be
> transient or not depending on the input range. For example, we could
> write map to use the transient version of byLine, say, but then the
> range that map returns will also be transient (no longer safe).
>
> IOW, transience of .front is transitive in some cases. This again makes
> things complicated: as soon as you use a single transient range, it
> makes downstream ranges transient as well. So we're back to the problem
> of how to mark the range as transient in a transparent way. :-(
>

No it isn't always transitive. But that is true that it is in some cases.

This is why the code should default to the safe behavior. It is up to the programmer to ensure correctness when opting for an alternate behavior.
November 01, 2012
On Thu, Nov 01, 2012 at 12:05:18AM +0100, deadalnix wrote:
> Le 31/10/2012 18:19, H. S. Teoh a écrit :
[...]
> >Actually, there is another problem: many algorithms' output will be transient or not depending on the input range. For example, we could write map to use the transient version of byLine, say, but then the range that map returns will also be transient (no longer safe).
> >
> >IOW, transience of .front is transitive in some cases. This again makes things complicated: as soon as you use a single transient range, it makes downstream ranges transient as well. So we're back to the problem of how to mark the range as transient in a transparent way. :-(
> >
> 
> No it isn't always transitive. But that is true that it is in some cases.
> 
> This is why the code should default to the safe behavior. It is up to the programmer to ensure correctness when opting for an alternate behavior.

But then can the alternate behaviour be used at all? For example, joiner will return a range which will be transient if the original range is transient. But since this is "unsafe" behaviour, joiner's implementation can't take advantage of the faster behaviour at all, since otherwise the user, not knowing that it is switching to, say, byLine.fast instead of the safe version of byLine, may assume that the resulting range is non-transient, but it is actually transient.

So this nullifies the usefulness of having .fast, since many Phobos algorithms that would have been able to take advantage of .fast can't, because their result will be unsafe. In which case, is it even worth the bother to implement such a thing?

I think we still haven't addressed the fundamental issue, that is, we need a way to differentiate between transient and non-transient .front values. Being able to work with transient values is the best, because it automatically also works for non-transient values, but some algos need to be able to assume non-transience. This core problem is still unsolved.


T

-- 
Windows: the ultimate triumph of marketing over technology. -- Adrian von Bidder
November 01, 2012
Le 01/11/2012 05:30, H. S. Teoh a écrit :
> On Thu, Nov 01, 2012 at 12:05:18AM +0100, deadalnix wrote:
>> Le 31/10/2012 18:19, H. S. Teoh a écrit :
> [...]
>>> Actually, there is another problem: many algorithms' output will be
>>> transient or not depending on the input range. For example, we could
>>> write map to use the transient version of byLine, say, but then the
>>> range that map returns will also be transient (no longer safe).
>>>
>>> IOW, transience of .front is transitive in some cases. This again
>>> makes things complicated: as soon as you use a single transient
>>> range, it makes downstream ranges transient as well. So we're back to
>>> the problem of how to mark the range as transient in a transparent
>>> way. :-(
>>>
>>
>> No it isn't always transitive. But that is true that it is in some
>> cases.
>>
>> This is why the code should default to the safe behavior. It is up to
>> the programmer to ensure correctness when opting for an alternate
>> behavior.
>
> But then can the alternate behaviour be used at all? For example, joiner
> will return a range which will be transient if the original range is
> transient. But since this is "unsafe" behaviour, joiner's implementation
> can't take advantage of the faster behaviour at all, since otherwise the
> user, not knowing that it is switching to, say, byLine.fast instead of
> the safe version of byLine, may assume that the resulting range is
> non-transient, but it is actually transient.
>
> So this nullifies the usefulness of having .fast, since many Phobos
> algorithms that would have been able to take advantage of .fast can't,
> because their result will be unsafe. In which case, is it even worth the
> bother to implement such a thing?
>
> I think we still haven't addressed the fundamental issue, that is, we
> need a way to differentiate between transient and non-transient .front
> values. Being able to work with transient values is the best, because it
> automatically also works for non-transient values, but some algos need
> to be able to assume non-transience. This core problem is still
> unsolved.
>
>

Would joiner be able to take advantage of this if it was known or not ?
November 01, 2012
On Thu, Nov 01, 2012 at 03:18:13PM +0100, deadalnix wrote:
> Le 01/11/2012 05:30, H. S. Teoh a écrit :
> >On Thu, Nov 01, 2012 at 12:05:18AM +0100, deadalnix wrote:
> >>Le 31/10/2012 18:19, H. S. Teoh a écrit :
> >[...]
> >>>Actually, there is another problem: many algorithms' output will be transient or not depending on the input range. For example, we could write map to use the transient version of byLine, say, but then the range that map returns will also be transient (no longer safe).
> >>>
> >>>IOW, transience of .front is transitive in some cases. This again makes things complicated: as soon as you use a single transient range, it makes downstream ranges transient as well. So we're back to the problem of how to mark the range as transient in a transparent way. :-(
> >>>
> >>
> >>No it isn't always transitive. But that is true that it is in some cases.
> >>
> >>This is why the code should default to the safe behavior. It is up to the programmer to ensure correctness when opting for an alternate behavior.
> >
> >But then can the alternate behaviour be used at all? For example, joiner will return a range which will be transient if the original range is transient. But since this is "unsafe" behaviour, joiner's implementation can't take advantage of the faster behaviour at all, since otherwise the user, not knowing that it is switching to, say, byLine.fast instead of the safe version of byLine, may assume that the resulting range is non-transient, but it is actually transient.
> >
> >So this nullifies the usefulness of having .fast, since many Phobos algorithms that would have been able to take advantage of .fast can't, because their result will be unsafe. In which case, is it even worth the bother to implement such a thing?
> >
> >I think we still haven't addressed the fundamental issue, that is, we need a way to differentiate between transient and non-transient .front values. Being able to work with transient values is the best, because it automatically also works for non-transient values, but some algos need to be able to assume non-transience. This core problem is still unsolved.
> >
> >
> 
> Would joiner be able to take advantage of this if it was known or not ?

I have a working version of joiner that doesn't assume the persistence of .front. The thing is, if the range you hand it is transient, then the resulting joined range is also transient. This shows up in one of my unittests: I can't use writeln() because it assumes persistence of .front, and I can't use array() either for the same reason. I have to explicit write a foreach to .dup the elements of the range into an array before it can be safely used in writeln(), etc..

This makes it impossible for joiner to transparently switch to a fast implementation (call .fast on the original range), because in this case the transience is transitive -- the user will have to explicitly call joiner with .fast. I don't like that, because then how would you know which algos are safe to call with .fast? It's only by documentation, i.e., by convention, and is bound to be a hiding place for bugs later on.


T

-- 
Those who don't understand Unix are condemned to reinvent it, poorly.
November 01, 2012
This is up to the consumer to choose if .front can be oblitered on popFront, not to an intermediate algorithm.

joiner isn't a consumer, this is a « transformer ». transformer have to propagate the .fast (I don't like this name, but this is unimportant for now) to its source.

Let'(s consider the following sheme :
source -> transformer (possibly several of them) -> consumer.

Now see example below :

auto transform(R)(R range) {
    struct Transfomer(R) {
        // Operations . . .

        @property
        auto fast() {
            return transform(source.fast);
        }
    }
}

So the fast is used by the consumer and bubble throw all ranges that support it up to the source. Or is made a NOOP in the process in one of the transformers or the source do not support that optimization.

As said before, this is up to the consumer to know it it accept .front to be obliterated. In your case, writeln don't support that, so .fast must not be used with writeln.
November 02, 2012
On Thu, Nov 01, 2012 at 11:40:52PM +0100, deadalnix wrote:
> This is up to the consumer to choose if .front can be oblitered on popFront, not to an intermediate algorithm.
> 
> joiner isn't a consumer, this is a « transformer ». transformer have to propagate the .fast (I don't like this name, but this is unimportant for now) to its source.

What would be a better name for it?


> Let'(s consider the following sheme :
> source -> transformer (possibly several of them) -> consumer.
> 
> Now see example below :
> 
> auto transform(R)(R range) {
>     struct Transfomer(R) {
>         // Operations . . .
> 
>         @property
>         auto fast() {
>             return transform(source.fast);
>         }
>     }
> }
> 
> So the fast is used by the consumer and bubble throw all ranges that support it up to the source. Or is made a NOOP in the process in one of the transformers or the source do not support that optimization.
> 
> As said before, this is up to the consumer to know it it accept .front to be obliterated. In your case, writeln don't support that, so .fast must not be used with writeln.

Ah, I see. That makes sense. So basically it's not the source (or any intermediate step) that decides whether to use the optimization, but the final consumer.

Though now we have the issue that all intermediate ranges must propagate .fast, which is troublesome if every range has to do it manually. Can this be handled automatically by UFCS?


T

-- 
What doesn't kill me makes me stranger.
« First   ‹ Prev
1 2 3 4 5 6 7 8 9 10 11
Top | Discussion index | About this forum | D home