View mode: basic / threaded / horizontal-split · Log in · Help
October 30, 2012
Re: Transience of .front in input vs. forward ranges
On Tuesday, October 30, 2012 15:29:17 H. S. Teoh wrote:
> The problem, of course, is how to check of a range has transient .front
> or not. I proposed adding a .isTransient member to the range, but this
> depends on the range implementor to remember to mark the range with
> .isTransient=true, and we know that coding by convention is not
> scalable. So here's another idea:
> 
> template isTransient(R) if (isInputRange!R)
> {
> static if (is(typeof(R.front) : immutable(typeof(R.front))))
> enum isTransient = false;
> else
> enum isTransient = true;
> }
> 
> The idea is that value types are implicitly convertible to immutable,
> and value types are non-transient. Reference types, if they can be made
> immutable, ensures that calling popFront() will never invalidate them
> (by definition of immutable).

This won't work at all. All that's required is for you to have a range of 
class objects (e.g Object[]), and suddenly it would be claiming that the range 
was transient even though it isn't. Whether the return type is immutable 
really doesn't have anything to do with whether or not the result of front 
will be valid after popFront is called - or at least the lack of immutable 
doesn't say anything about it.

I still posit that this sort of range is extremely rare and abnormal, so if we 
do anything to support them, I think that it's perfectly fair that they have 
to do something to mark themselves as transient rather than trying to 
determine it externally - especially because I don't think that it's really 
possible to determine it externally.

- Jonathan M davis
October 31, 2012
Re: Transience of .front in input vs. forward ranges
On Wed, Oct 31, 2012 at 12:00:50AM +0100, deadalnix wrote:
[...]
> It appears to me that invalidating .front is done for performances
> reasons most of the time. As this is the unsafe face of the coin, I
> strongly think that this shouldn't be the default behavior.
> 
> To me the best solution is to provide a function or a property on
> ranges that provide the unsafe version of the range.
> 
> It is easy to provide a default implementation as UFCS that is a
> NOOP so no code is being broken.
> 
> With that design, it is possible to provide an always safe interface
> while allowing algorithm that can handle non persistent .front to
> run at full speed.

Hmm. I like this idea! This is much better than my proposal. So let's
say we change byLine() to always return a copy of the buffer, so that
.front is never invalidated. Then for algorithms that don't care about
.front being invalidated, they can do something like this:

	auto myAlgo(R)(R inputRange) {
		auto r = inputRange.fastRange;
		while (!r.empty) {
			auto x = r.front;
			// make use of x
			r.popFront();	// this invalidates x
		}
		return result;
	}
	void main() {
		myAlgo(stdin.byLine());
	}

We can use UFCS so that fastRange is always defined:

	// Default implementation
	auto fastRange(R)(R range) {
		return range;
	}

For byLine(), then, we have:

	struct ByLine(...) {
		// safe implementation

		@property auto fastRange() {
			struct UnsafeByLine(...) {
				...
			}
			return UnsafeByLine(this);
		}
	}

I like this. Very clean, and doesn't break backward compatibility, and
allows select algorithms to be optimized for speed without affecting
everything else.


[...]
> If I use your permuting example, it should be done as follow :
> 
> struct InvalidatingAllPermutations(T) {
> 	T[] current, first;
> 	bool done;
> 
> 	this(T[] initialBuf) {
> 		first = initialBuf;
> 		current = first.dup;
> 	}
> 
> 	void popFront() {
> 		nextPermutation(current);
> 		if (equal(current, first))
> 			done = true;
> 	}
> 
> 	@property bool empty() {
> 		return !done;
> 	}
> 
> 	@property T[] front() {
> 		return current;
> 	}
> 
> 	auto save() {
> 		AllPermutations!T copy;
> 		copy.front = this.front;
> 		copy.first = this.first;
> 		copy.done = this.done;
> 		return copy;
> 	}
> }
> 
> struct AllPermutations(T) {
> 	InvalidatingAllPermutations!T innerRange;
> 	alias innerRange this;
> 
> 	T[] current;
> 
> 	this(T[] initialBuf) {
> 		innerRange = InvalidatingAllPermutations!T(initialBuf);
> 		current = innerRange.front.dup;
> 	}
> 
> 	@property T[] front() {
> 		return current;
> 	}
> 
> 	void popFront() {
> 		innerRange.popFront();
> 		current = innerRange.front.dup;
> 	}
> }
> 
> I do think this is the way that conciliate safe as default but still
> allow to be fast when needed, without adding much on most code.

+1. I like this better than my proposal.


T

-- 
Doubtless it is a good thing to have an open mind, but a truly open mind
should be open at both ends, like the food-pipe, with the capacity for
excretion as well as absorption. -- Northrop Frye
October 31, 2012
Re: Transience of .front in input vs. forward ranges
Le 31/10/2012 01:56, H. S. Teoh a écrit :
>
> +1. I like this better than my proposal.
>

The obvious drawback is that this make invalidating ranges harder to 
write. But they seems to be more the exception than the rule.
October 31, 2012
Re: Transience of .front in input vs. forward ranges
On Wed, Oct 31, 2012 at 02:18:55PM +0100, deadalnix wrote:
> Le 31/10/2012 01:56, H. S. Teoh a écrit :
> >
> >+1. I like this better than my proposal.
> >
> 
> The obvious drawback is that this make invalidating ranges harder to
> write. But they seems to be more the exception than the rule.

Actually, there is another problem: many algorithms' output will be
transient or not depending on the input range. For example, we could
write map to use the transient version of byLine, say, but then the
range that map returns will also be transient (no longer safe).

IOW, transience of .front is transitive in some cases. This again makes
things complicated: as soon as you use a single transient range, it
makes downstream ranges transient as well. So we're back to the problem
of how to mark the range as transient in a transparent way. :-(


T

-- 
VI = Visual Irritation
October 31, 2012
Re: Transience of .front in input vs. forward ranges
Le 31/10/2012 18:19, H. S. Teoh a écrit :
> On Wed, Oct 31, 2012 at 02:18:55PM +0100, deadalnix wrote:
>> Le 31/10/2012 01:56, H. S. Teoh a écrit :
>>>
>>> +1. I like this better than my proposal.
>>>
>>
>> The obvious drawback is that this make invalidating ranges harder to
>> write. But they seems to be more the exception than the rule.
>
> Actually, there is another problem: many algorithms' output will be
> transient or not depending on the input range. For example, we could
> write map to use the transient version of byLine, say, but then the
> range that map returns will also be transient (no longer safe).
>
> IOW, transience of .front is transitive in some cases. This again makes
> things complicated: as soon as you use a single transient range, it
> makes downstream ranges transient as well. So we're back to the problem
> of how to mark the range as transient in a transparent way. :-(
>

No it isn't always transitive. But that is true that it is in some cases.

This is why the code should default to the safe behavior. It is up to 
the programmer to ensure correctness when opting for an alternate behavior.
November 01, 2012
Re: Transience of .front in input vs. forward ranges
On Thu, Nov 01, 2012 at 12:05:18AM +0100, deadalnix wrote:
> Le 31/10/2012 18:19, H. S. Teoh a écrit :
[...]
> >Actually, there is another problem: many algorithms' output will be
> >transient or not depending on the input range. For example, we could
> >write map to use the transient version of byLine, say, but then the
> >range that map returns will also be transient (no longer safe).
> >
> >IOW, transience of .front is transitive in some cases. This again
> >makes things complicated: as soon as you use a single transient
> >range, it makes downstream ranges transient as well. So we're back to
> >the problem of how to mark the range as transient in a transparent
> >way. :-(
> >
> 
> No it isn't always transitive. But that is true that it is in some
> cases.
> 
> This is why the code should default to the safe behavior. It is up to
> the programmer to ensure correctness when opting for an alternate
> behavior.

But then can the alternate behaviour be used at all? For example, joiner
will return a range which will be transient if the original range is
transient. But since this is "unsafe" behaviour, joiner's implementation
can't take advantage of the faster behaviour at all, since otherwise the
user, not knowing that it is switching to, say, byLine.fast instead of
the safe version of byLine, may assume that the resulting range is
non-transient, but it is actually transient.

So this nullifies the usefulness of having .fast, since many Phobos
algorithms that would have been able to take advantage of .fast can't,
because their result will be unsafe. In which case, is it even worth the
bother to implement such a thing?

I think we still haven't addressed the fundamental issue, that is, we
need a way to differentiate between transient and non-transient .front
values. Being able to work with transient values is the best, because it
automatically also works for non-transient values, but some algos need
to be able to assume non-transience. This core problem is still
unsolved.


T

-- 
Windows: the ultimate triumph of marketing over technology. -- Adrian von Bidder
November 01, 2012
Re: Transience of .front in input vs. forward ranges
Le 01/11/2012 05:30, H. S. Teoh a écrit :
> On Thu, Nov 01, 2012 at 12:05:18AM +0100, deadalnix wrote:
>> Le 31/10/2012 18:19, H. S. Teoh a écrit :
> [...]
>>> Actually, there is another problem: many algorithms' output will be
>>> transient or not depending on the input range. For example, we could
>>> write map to use the transient version of byLine, say, but then the
>>> range that map returns will also be transient (no longer safe).
>>>
>>> IOW, transience of .front is transitive in some cases. This again
>>> makes things complicated: as soon as you use a single transient
>>> range, it makes downstream ranges transient as well. So we're back to
>>> the problem of how to mark the range as transient in a transparent
>>> way. :-(
>>>
>>
>> No it isn't always transitive. But that is true that it is in some
>> cases.
>>
>> This is why the code should default to the safe behavior. It is up to
>> the programmer to ensure correctness when opting for an alternate
>> behavior.
>
> But then can the alternate behaviour be used at all? For example, joiner
> will return a range which will be transient if the original range is
> transient. But since this is "unsafe" behaviour, joiner's implementation
> can't take advantage of the faster behaviour at all, since otherwise the
> user, not knowing that it is switching to, say, byLine.fast instead of
> the safe version of byLine, may assume that the resulting range is
> non-transient, but it is actually transient.
>
> So this nullifies the usefulness of having .fast, since many Phobos
> algorithms that would have been able to take advantage of .fast can't,
> because their result will be unsafe. In which case, is it even worth the
> bother to implement such a thing?
>
> I think we still haven't addressed the fundamental issue, that is, we
> need a way to differentiate between transient and non-transient .front
> values. Being able to work with transient values is the best, because it
> automatically also works for non-transient values, but some algos need
> to be able to assume non-transience. This core problem is still
> unsolved.
>
>

Would joiner be able to take advantage of this if it was known or not ?
November 01, 2012
Re: Transience of .front in input vs. forward ranges
On Thu, Nov 01, 2012 at 03:18:13PM +0100, deadalnix wrote:
> Le 01/11/2012 05:30, H. S. Teoh a écrit :
> >On Thu, Nov 01, 2012 at 12:05:18AM +0100, deadalnix wrote:
> >>Le 31/10/2012 18:19, H. S. Teoh a écrit :
> >[...]
> >>>Actually, there is another problem: many algorithms' output will be
> >>>transient or not depending on the input range. For example, we
> >>>could write map to use the transient version of byLine, say, but
> >>>then the range that map returns will also be transient (no longer
> >>>safe).
> >>>
> >>>IOW, transience of .front is transitive in some cases. This again
> >>>makes things complicated: as soon as you use a single transient
> >>>range, it makes downstream ranges transient as well. So we're back
> >>>to the problem of how to mark the range as transient in a
> >>>transparent way. :-(
> >>>
> >>
> >>No it isn't always transitive. But that is true that it is in some
> >>cases.
> >>
> >>This is why the code should default to the safe behavior. It is up
> >>to the programmer to ensure correctness when opting for an alternate
> >>behavior.
> >
> >But then can the alternate behaviour be used at all? For example,
> >joiner will return a range which will be transient if the original
> >range is transient. But since this is "unsafe" behaviour, joiner's
> >implementation can't take advantage of the faster behaviour at all,
> >since otherwise the user, not knowing that it is switching to, say,
> >byLine.fast instead of the safe version of byLine, may assume that
> >the resulting range is non-transient, but it is actually transient.
> >
> >So this nullifies the usefulness of having .fast, since many Phobos
> >algorithms that would have been able to take advantage of .fast
> >can't, because their result will be unsafe. In which case, is it even
> >worth the bother to implement such a thing?
> >
> >I think we still haven't addressed the fundamental issue, that is, we
> >need a way to differentiate between transient and non-transient
> >.front values. Being able to work with transient values is the best,
> >because it automatically also works for non-transient values, but
> >some algos need to be able to assume non-transience. This core
> >problem is still unsolved.
> >
> >
> 
> Would joiner be able to take advantage of this if it was known or not ?

I have a working version of joiner that doesn't assume the persistence
of .front. The thing is, if the range you hand it is transient, then the
resulting joined range is also transient. This shows up in one of my
unittests: I can't use writeln() because it assumes persistence of
.front, and I can't use array() either for the same reason. I have to
explicit write a foreach to .dup the elements of the range into an array
before it can be safely used in writeln(), etc..

This makes it impossible for joiner to transparently switch to a fast
implementation (call .fast on the original range), because in this case
the transience is transitive -- the user will have to explicitly call
joiner with .fast. I don't like that, because then how would you know
which algos are safe to call with .fast? It's only by documentation,
i.e., by convention, and is bound to be a hiding place for bugs later
on.


T

-- 
Those who don't understand Unix are condemned to reinvent it, poorly.
November 01, 2012
Re: Transience of .front in input vs. forward ranges
This is up to the consumer to choose if .front can be oblitered on 
popFront, not to an intermediate algorithm.

joiner isn't a consumer, this is a « transformer ». transformer have to 
propagate the .fast (I don't like this name, but this is unimportant for 
now) to its source.

Let'(s consider the following sheme :
source -> transformer (possibly several of them) -> consumer.

Now see example below :

auto transform(R)(R range) {
    struct Transfomer(R) {
        // Operations . . .

        @property
        auto fast() {
            return transform(source.fast);
        }
    }
}

So the fast is used by the consumer and bubble throw all ranges that 
support it up to the source. Or is made a NOOP in the process in one of 
the transformers or the source do not support that optimization.

As said before, this is up to the consumer to know it it accept .front 
to be obliterated. In your case, writeln don't support that, so .fast 
must not be used with writeln.
November 02, 2012
Re: Transience of .front in input vs. forward ranges
On Thu, Nov 01, 2012 at 11:40:52PM +0100, deadalnix wrote:
> This is up to the consumer to choose if .front can be oblitered on
> popFront, not to an intermediate algorithm.
> 
> joiner isn't a consumer, this is a « transformer ». transformer have
> to propagate the .fast (I don't like this name, but this is
> unimportant for now) to its source.

What would be a better name for it?


> Let'(s consider the following sheme :
> source -> transformer (possibly several of them) -> consumer.
> 
> Now see example below :
> 
> auto transform(R)(R range) {
>     struct Transfomer(R) {
>         // Operations . . .
> 
>         @property
>         auto fast() {
>             return transform(source.fast);
>         }
>     }
> }
> 
> So the fast is used by the consumer and bubble throw all ranges that
> support it up to the source. Or is made a NOOP in the process in one
> of the transformers or the source do not support that optimization.
> 
> As said before, this is up to the consumer to know it it accept
> .front to be obliterated. In your case, writeln don't support that,
> so .fast must not be used with writeln.

Ah, I see. That makes sense. So basically it's not the source (or any
intermediate step) that decides whether to use the optimization, but the
final consumer.

Though now we have the issue that all intermediate ranges must propagate
.fast, which is troublesome if every range has to do it manually. Can
this be handled automatically by UFCS?


T

-- 
What doesn't kill me makes me stranger.
« First   ‹ Prev
1 2 3 4 5
Top | Discussion index | About this forum | D home