Jump to page: 1 2
Thread overview
Range returning an array
Apr 09, 2013
Ali Çehreli
Apr 09, 2013
bearophile
Apr 09, 2013
H. S. Teoh
Apr 10, 2013
H. S. Teoh
Apr 10, 2013
Jesse Phillips
April 09, 2013
Hello all,

I have a situation in a program I'm writing where it's convenient to define a range whose return value is an array.  Something like:

struct MySimulation(T)
{
	T[] var;
	T diff, convergence;

	auto front() @property
	{
		return var;
	}

	bool empty() @property
	{
		return (diff < convergence);
	}

	void popFront()
	{
		// update values in var
		// and calculate diff
	}
}

Now, the problem with a design like this is that it's unsafe, in the sense that, let's say I do something like,

	T[] output;
	foreach(opvar; mySim)
		output = opvar;

... then, at the end of this loop, output will not be the same as it was set to with the last assignment, because popFront() will have been called one last time prior to the "empty" condition being set to true.

Now, there are a number of ways I can think of to avoid this.  One is to return var.dup rather than var from front(), but this is undesirable because it will hit performance in a big way.  Another might be to calculate the updated value of var in a temporary array, and update var itself only if the diff is greater than the convergence criterion.

However, I'm curious enough about this problem that I thought I'd throw it open to everyone else to see if anyone has a better idea than one of these.  The challenge here is to ensure really good performance while not risking bad or incorrect results leaking outside of the range.

Any thoughts? :-)

Thanks & best wishes,

    -- Joe
April 09, 2013
On 04/09/2013 01:03 PM, Joseph Rushton Wakeling wrote:

> to return var.dup rather than var from front(), but this is
> undesirable because it will hit performance in a big way.

Your problem is the exact situation that stdio.ByLine has faced. There, the choice has been to use an internal buffer just like your 'var' and leave the .dup call to the user for when it is needed.

That choice is faster but sometimes causes bugs.

Ali

April 09, 2013
Ali Çehreli:

> Your problem is the exact situation that stdio.ByLine has faced. There, the choice has been to use an internal buffer just like your 'var' and leave the .dup call to the user for when it is needed.
>
> That choice is faster but sometimes causes bugs.

In similar situations I add a boolean template argument that switches on or off the dupping of the given result. On default it's on. So it's safe on default and faster on request, and this goes well with the D zen.

Bye,
bearophile
April 09, 2013
On Tue, 09 Apr 2013 16:03:01 -0400, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:


> Any thoughts? :-)

1. documentation.  Make sure the user of the range knows that this data is going to be re-used.
2. Make front() return const if possible.  It is another signal that you aren't supposed to keep this data.
3. For your specific situation, add lastFront():

struct MySimulation(T)
{
	T[] var;
	T[] lastVar;
	T diff, convergence;

	auto front() @property
	{
		return var;
	}

	bool empty() @property
	{
		return (diff < convergence);
	}

	void popFront()
	{
		lastVar[] = var[];
		// update values in var
		// and calculate diff
	}
	
	auto lastFront() @property
	{
		return lastVar;
	}
}

-Steve
April 09, 2013
On 04/09/2013 11:02 PM, Steven Schveighoffer wrote:
> 1. documentation.  Make sure the user of the range knows that this data is going to be re-used.

In this case it's quite unlikely anyone will ever want to use the code apart from me, but yes, good point. :-)

> 2. Make front() return const if possible.  It is another signal that you aren't
> supposed to keep this data.
> 3. For your specific situation, add lastFront():

It's an interesting thought.  I don't think it's ultimately the right way to go -- yes, my application rests strongly on finding the last value, but the problem is very simply that popFront kills the value _before_ finding out if the range is now empty.

The only logical option that I can see is to tweak things so that doesn't happen, which is possible but probably a little bit more finnicky than the current implementation.

April 09, 2013
On Tue, 09 Apr 2013 18:09:07 -0400, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:

> On 04/09/2013 11:02 PM, Steven Schveighoffer wrote:

>> 3. For your specific situation, add lastFront():
>
> It's an interesting thought.  I don't think it's ultimately the right way to go
> -- yes, my application rests strongly on finding the last value, but the problem
> is very simply that popFront kills the value _before_ finding out if the range
> is now empty.

Well here is another solution:

struct MySimulation(T)
{
	T[] var;
        T[] tmpvar;
	T diff, convergence;

	auto front() @property
	{
		return var;
	}

	bool empty() @property
	{
		return (diff < convergence);
	}

	void popFront()
	{
		tmpvar[] = var[];
		// update values in var
		// and calculate diff
		if(empty)
		{
			var[] = tmpvar[]; // revert
			var = null; // optional, detach from original array
		}
	}
}

You could also use alloca to avoid storing tmpvar as a struct member and also avoid allocation.

-Steve
April 09, 2013
On 04/10/2013 12:33 AM, Steven Schveighoffer wrote:
> Well here is another solution

I did consider something like that.  The actual one I have come to after thinking about it is like this:

struct MySimulation(T)
{
	T[] var;
	T diff = T.max;
	T convergence;
	_bool empty = false;

	auto front() @property
	{
		return var;
	}

	bool empty() @property
	{
		return _empty;
	}

	void popFront()
	{
		if(diff < convergence)
			_empty = true;
		else
			// update var and calculate new diff
	}
}

How does this look to people?
April 09, 2013
On 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:
> I did consider something like that.

By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front().  Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).
April 09, 2013
On Tue, 09 Apr 2013 18:53:56 -0400, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:

> On 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:
>> I did consider something like that.
>
> By the way: the reason that I rejected the temporary-variable choice was that I
> couldn't really see the difference cost-wise between doing that, versus
> returning var.dup from front().  Especially as it's not necessarily guaranteed
> that front will be called frequently (I might just popFront() until the range is
> empty and then take the final front value).

Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory.

-Steve
April 09, 2013
On Tue, Apr 09, 2013 at 07:18:25PM -0400, Steven Schveighoffer wrote:
> On Tue, 09 Apr 2013 18:53:56 -0400, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:
> 
> >On 04/10/2013 12:50 AM, Joseph Rushton Wakeling wrote:
> >>I did consider something like that.
> >
> >By the way: the reason that I rejected the temporary-variable choice was that I couldn't really see the difference cost-wise between doing that, versus returning var.dup from front().  Especially as it's not necessarily guaranteed that front will be called frequently (I might just popFront() until the range is empty and then take the final front value).
> 
> Calling front after empty is not good range policy, once empty, front is possibly invalid or points at invalid memory.
[...]

I believe it was proposed that .front should assert if .empty returns true. Personally I think that's a bit too extreme, but nevertheleses, I agree that calling .front after .empty returns true is sloppy coding, and can easily lead to bugs or runtime errors.


T

-- 
The two rules of success: 1. Don't tell everything you know. -- YHL
« First   ‹ Prev
1 2