March 14, 2014
On 3/14/14, 8:58 AM, Don wrote:
> Actually it would help a great deal. In most cases, we do set the length
> to 0. That example code is unusual.
>
> FYI: In D1, this was the most important idiom in the language.
> In the first D conference in 2007, a feature T[new] was described,
> specifically to support this idiom in a safe manner. Implementation was
> begun in the compiler. Unfortunately, it didn't happen in the end. I'm
> not sure if it would actually have worked or not.

Some perspective would be interesting.

1. Do you think eliminating stomping was a strategic mistake? (Clearly it had tactical issues with regard to portability.)

2. If you could do it over again, would you go with T[new]? Something else?


Andrei


March 14, 2014
On Fri, 14 Mar 2014 11:58:06 -0400, Don <x@nospam.com> wrote:

> On Friday, 14 March 2014 at 14:48:13 UTC, Steven Schveighoffer wrote:
>> On Thu, 13 Mar 2014 11:24:01 -0400, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>>
>>
>>> arr.length = 0;
>>>
>> ...
>>> 3. Don's company uses D1 as its language, I highly recommend watching Don's Dconf13 presentation (and look forward to his Dconf14 one!) to see how effective D code can create unbelievable speed, especially where array slices are concerned. But to the above line, in D2, they must add the following code to get the same behavior:
>>>
>>> arr.assumeSafeAppend();
>>
>> Just a quick note, buried in the same thread that Don mentioned, he outlined a more specific case, and this does not involve setting length to 0, but to any arbitrary value.
>>
>> This means my approach does not help them, and although it makes sense, the idea that it would help Sociomantic move to D2 is not correct.
>>
>> -Steve
>
> Actually it would help a great deal. In most cases, we do set the length to 0. That example code is unusual.

If that example is unusual, then that case should be easy to fix (as I stated in the other post). Can you estimate how many ways your code contracts the length of an array? I assume all of them must fully reference the block, since that was the requirement in D1 (the slice had to point at the beginning of the block for append to work).

I imagine that adding this 'feature' of setting length = 0 would help, but maybe just adding a new, but similar symbol for length that means "Do what D1 length would have done" would be less controversial for adding to druntime. Example strawman:

arr.slength = 0; // effectively the same as arr.length = 0; arr.assumeSafeAppend();

It would do the same thing, but the idea is it would work for extension too -- if arr points at the beginning of the block, and slength *grows* into the block, it would work the same as D1 as well -- adjusting the "used" length and not reallocating.

Essentially, you would have to s/.length/.slength, and everything would work. Of course, length is not a keyword, so there may be other cases where length is used (as a read property, for instance) where slength would not necessarily have to be used.
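As a sketch of what the strawman could look like as a plain free function (the name slength and its exact semantics are assumptions here, not anything that exists in druntime):

```d
/// Strawman: D1-style length. Shrinking marks the tail of the block
/// reusable for appends; growing extends in place when capacity allows.
void slength(T)(ref T[] arr, size_t newLen)
{
    if (newLen <= arr.length)
    {
        arr = arr[0 .. newLen];
        arr.assumeSafeAppend(); // the tail is now fair game for appends
    }
    else
    {
        arr.assumeSafeAppend(); // claim the block up to the current end
        arr.length = newLen;    // grows in place if the block is big enough
    }
}
```

With this, `arr.slength = 0;` from the example above would be spelled `arr.slength(0);` unless it were made an actual property.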

However, one thing we cannot fix is:

arr = arr[0..$-1];
arr ~= x;

This would reallocate due to stomp prevention, and I can't fix that. Do you have any cases like this, or do you always use .length?

> FYI: In D1, this was the most important idiom in the language.
> In the first D conference in 2007, a feature T[new] was described, specifically to support this idiom in a safe manner. Implementation was begun in the compiler. Unfortunately, it didn't happen in the end. I'm not sure if it would actually have worked or not.

I think the benefits would have been very minimal. T[new] would be almost exactly like T[]. And you still would have to have updated all your code to use T[new]. When to use T[new] or T[] would have driven most people mad.

I was around back then, and I also remember Andrei, when writing TDPL, stating that the T[new] and T[] differences were so subtle and confusing that he was glad he didn't have to specify that.

> BTW you said somewhere that concatenation always allocates. What I actually meant was ~=, not ~.

OK

> In our code it is always preceded by .length = 0 though.
> It's important that ~= should not allocate, when the existing capacity is large enough.

That is the case for D2, as long as the slice is qualified as ending at the end of the valid data (setting length to 0 doesn't do this, obviously). This is a slight departure from D1, which required the slice to point at the *beginning* of the data.
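To illustrate the D2 rule with a small sketch (a GC-allocated slice is assumed; `capacity` reports 0 when an append would reallocate):

```d
int[] arr = [1, 2, 3, 4]; // slice ends at the end of the valid data

arr.length = 2;             // now it doesn't...
assert(arr.capacity == 0);  // ...so ~= here would reallocate

arr.assumeSafeAppend();     // declare the tail dead; slice end == data end again
assert(arr.capacity >= 2);
arr ~= 5;                   // appends in place, no allocation
```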

-Steve
March 14, 2014
On Friday, 14 March 2014 at 16:38:23 UTC, Andrei Alexandrescu
wrote:
> Some perspective would be interesting.
>
> 1. Do you think eliminating stomping was a strategic mistake? (Clearly it had tactical issues with regard to portability.)
>

We should stomp isolated arrays.
We must not stomp immutable arrays. That means we must not stomp
const ones either.

> 2. If you could do it over again, would you go with T[new]? Something else?
>

Not much choice here, because of the problem mentioned earlier in
this thread. I'm baffled that this doesn't grab more attention.
March 15, 2014
On Thursday, 13 March 2014 at 21:28:46 UTC, deadalnix wrote:
> If I may, and as author of dcollection, you'll certainly agree, I
> think the way forward is improving the language when it comes to
> collections.
>
> That way, any behavior can be provided as a user type. Patching the
> language in such a way is simply moving the problem around.
>

------------------------------------------

If it is not in the language then maybe a design pattern.

------------------------------------------

> The #1 problem we have when it comes to collections is type
> qualifiers for template type parameters. I consider this to be one
> of the top problems of D right now, if not the #1 problem.

------------------------------------------

Number 1 problem. Noted.

------------------------------------------


> The problem is that T!ElementType is a completely different type
> than T!(const(ElementType)). Because of this, it is almost

------------------------------------------

I grasp that you need some sort of an implicit cast from
T!ElementType to T!(const(ElementType)):

void routine(T!(const(ElementType)) p) {}
T!ElementType a;

routine(a); // desired: compiles via the implicit conversion

------------------------------------------

Collection!(const(Element1), immutable(Element2), Element3);

There are a lot of different implicit conversions
stacking up. Now separate the qualifier like this:

T!(ElementType, immutable)

The implicit conversions would then be based on that sole
qualifier at the end. Would this not be sufficient for your
collections? (Not rhetorical.)

Collection!(const(Element1), immutable(Element2), Element3, const);


------------------------------------------


void routine(immutable Collection!(ElementType, const));     //Meh
void routine(immutable Collection!(ElementType, immutable));

I also feel like the above //Meh should not be there.

------------------------------------------


> impossible to provide anything that behaves anything close to
> arrays. That is the reason why collections are so scarce in D, this
> is the reason why efforts to implement AA without the black magic

------------------------------------------

I dunno. Not a magician. Have to think.

------------------------------------------


> involved right now are failing.
>

> This problem is a killer for the viability of D2 on the long run.
> I'm not saying that lightly.

------------------------------------------


Everybody wants a solution to this.

That is why everybody is jumping on
this like crazy instead of responding to the
finalize by default thread. (irony)


Some more "Collection and Const Noise".  Collection!const(Noise)
March 16, 2014
On 03/14/2014 06:38 PM, Steven Schveighoffer wrote:
>
> I imagine that adding this 'feature' of setting length = 0 would help,
> but maybe just adding a new, but similar symbol for length that means
> "Do what D1 length would have done" would be less controversial for
> adding to druntime.  Example strawman:
>
> arr.slength = 0; // effectively the same as arr.length = 0;
> arr.assumeSafeAppend();
>
> It would do the same thing, but the idea is it would work for extension
> too -- if arr points at the beginning of the block, and slength *grows*
> into the block, it would work the same as D1 as well -- adjusting the
> "used" length and not reallocating.
>
> Essentially, you would have to s/.length/.slength, and everything would
> work. Of course, length is not a keyword, so there may be other cases
> where length is used (read property for instance) where slength would
> not necessarily have to be used.
>
> However, one thing we cannot fix is:
>
> arr = arr[0..$-1];
> arr ~= x;
>
> This would reallocate due to stomp prevention, and I can't fix that.

auto sslice(T)(T[] arr, size_t s, size_t e){
    auto r = arr[s .. e];
    r.assumeSafeAppend(); // reclaim the part of the block past e
    return r;
}

arr = arr.sslice(0, arr.length - 1); // $ is only valid inside [] expressions
arr ~= x;

> Do you have any cases like this, or do you always use .length?

I think just replacing all usages of features that have changed behaviour with library implementations that simulate the old behaviour would be quite easy.
March 16, 2014
On 13/03/14 19:09, Steven Schveighoffer wrote:
> On Thu, 13 Mar 2014 13:44:01 -0400, monarch_dodra <monarchdodra@gmail.com> wrote:
>> The "only way" to make it work (AFAIK), would be for "length = 0" to first
>> finalize the elements in the array. However, if you do that, you may accidentally
>> destroy elements that are still "live" and referenced by another array.
>
> In fact, assumeSafeAppend *should* finalize the elements in the array, if it had
> that capability. When you call assumeSafeAppend, you are telling the runtime
> that you are done with the extra elements.
>
>> I'm not too hot about this proposal. My main gripe is that while "length = 0"
>> *may* mean "*I* want to discard this data", there is no guarantee you don't
>> have someone else that has a handle on said data, and sure as hell doesn't
>> want it clobbered.
>
> Again, the =null is a better solution. There is no point to keeping the same
> block in reference, but without any access to elements, unless you want to
> overwrite it.

Problem is, this still seems like safety-by-programmer-virtue.  It's far too easy to write ".length = 0" casually and without any attention to consequences one way or the other.

Isn't the whole point of the requirement to use assumeSafeAppend that it really forces the user to say, "Yes, I want to do away with the contents of the array and I take full responsibility for ensuring there's nothing else pointing to it that will break" ... ?

I must say that personally I'd rather have the safety-by-default and the obligation to write assumeSafeAppend (or use Appender!T) where performance needs it, than risk code breaking because someone's function accidentally throws away stuff that I still had a slice of.
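The hazard can be shown in a few lines (a sketch; which garbage ends up where depends on the runtime, but the aliasing overwrite is the point):

```d
int[] a = [1, 2, 3];
int[] b = a;            // b aliases the same memory block

a.length = 0;
a.assumeSafeAppend();   // "nothing else uses this block" - a lie here
a ~= 42;                // stomps the start of the block...

assert(b[0] == 42);     // ...and b silently sees the overwrite
```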
March 17, 2014
On Thursday, 13 March 2014 at 22:21:56 UTC, sclytrack wrote:

>
> 	Delayed D Language			D Normal
>
> 	immutable int * a;			int * a;
> 	immutable qual(int) * a;		immutable(int) * a;
> 	immutable int qual(*) a;		immutable int * a;
>
> 	-------------
>
> 	struct A
> 	{
> 		int a;
> 		qual(int) b;
> 	}
> 			
>
> 	A vari;
> 	const A cvari;
> 	vari.a = 1;
> 	cvari = vari;
> 	cvari.a = 2;
> 	cvari.b = 3;	//error
>
> 	-------------
>
> 	qual
> 	class Test
> 	{
> 	}
>
>
> 	Delayed D				D struct language
>
> 	immutable Test t;			immutable(Test) * t;
> 	immutable qual(Test) t; 		immutable Test * t;
>
> 	-------------
>
>
> 	struct Array(T)
> 	{
> 		qual(T) * data;
> 		size_t length;
> 	}
>
>
> 	Array!(int) a;	
> 	const Array!(int) b;
>
> 	void routine(const Array!(int) b)
> 	{
>
> 	}
>
> 	b = a;
> 	routine(a);

I'm not sure how to understand your change proposal; you should
make it more explicit. But it seems that you get what the problem is
(or at least, you are poking at the right language constructs in
your proposal).


qual would be the entry point of the qualifier so

immutable int * * * * qual(*) * * * data;

So everything to the left would be immutable and
everything to the right would be mutable.

	struct A
 	{
 		char [] a;
 		qual(char) [] b;
 	}

When defining const A data, the const is again
delayed until the qual(char) entry. So ...

A data;

	struct A
	{
		char [] a;
		char [] b;
	}

const A cdata;

	struct A
	{
		char [] a;
		const(char) [] b;
	}

immutable A idata;

	struct A
	{
		char [] a;
		immutable(char) [] b;
	}

All these A's have a single memory layout and
a single qualifier that modifies them.

cdata = data;
cdata = idata;

---

	struct B
	{
		A data;
		qual(A) qdata;
	}

B bdata;

	struct B
	{
		A data;
		A qdata;
	}


const B cbdata;

	struct B
	{
		A data;
		const(A) qdata;     //struct A { char [] a; const(char) [] b; }		
	}

immutable B ibdata;

	struct B
	{
		A data;
		immutable(A) qdata;	//struct A { char [] a; immutable(char) [] b; }
	}


cbdata = bdata
cbdata = ibdata

For functions the qualifier would also be delayed, which
means that having the word const in front of it doesn't necessarily
mean it is const.

void routine(const A * data)
{
	data.a = "modifiable";
}

If we are forcing a delayed qualifier on A we are forcing
the same one as the one on the pointer.

const A qual(*) data;

It is the same const applied to the pointer and the
struct A.

---

Okay now to more conventional D.

These two are more or less the same type.

Container!(PayloadType)
Container!(PayloadType, const)

The following two are different types.

Container!(PayloadType)
Container!(const(PayloadType))
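The "different types" half is easy to verify in real D with a minimal (hypothetical) Container:

```d
struct Container(T)
{
    T[] payload;
}

// Distinct, unrelated instantiations: no implicit conversion exists.
static assert(!is(Container!(int) == Container!(const(int))));

void view(Container!(const(int)) c) {}

void demo()
{
    Container!(int) a;
    // view(a); // error: Container!int does not convert to Container!(const(int))
}
```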

---

Container!(PayloadType, const)

The qualifier can be applied to your payload
field and even to non-payload fields.

The thing is, only PayloadType is part of the
type definition, and the qualifier at the
end is not.

The qualifier at the end
must not participate in the type definition
in any way, not in static if and so forth.
By some magic this needs to be prevented.

Having to deal with a single qualifier
is much easier than having to deal with
the qualifier of every payload.



Again a bit of qual code.

struct Container!(PayloadType)
{
	qual(char) [] a;
	PayloadType payload;
}

PayloadType participates in the type and
sets the memory layout of the type.

Here the qualifier isn't even on the payload.

---

Container!(const(int), int, char [], const)

The first 3 define the type.

	(const(int), int, char [])

The last bit. The qualifier (const) of the
container does not participate
in the definition of the type.

---
For overloading there are too many options.

void routine( immutable Container!(int, const) data);		//a
void routine( immutable Container!(int, immutable) data);	//b

Here the first one (a) is not needed. In delayed-qualifier D
this would be:

void routine(immutable qual(Container!(int)) data)		//b

Setting the qual right in front of the container means that
immutable applies to all the fields of the Container!(int) and
not only those marked with qual.

---

In Normal D style

Existing style. Already possible.

void routine( Container!(int,  ) data);				//a
void routine( const Container!(int, const) data);		//b
void routine( immutable Container!(int, immutable) data);	//c

The new ones.

void routine( Container!(int, const ) data);	
void routine( Container!(int, immutable) data);

The nonsense ones.

void routine( const Container!(int, ) data);		
void routine( immutable Container!(int, const) data);

---


Anyways. Thank you for your time reading this mess.
	
March 20, 2014
On Sunday, March 16, 2014 13:14:15 Joseph Rushton Wakeling wrote:
> Problem is, this still seems like safety-by-programmer-virtue. It's far too easy to write ".length = 0" casually and without any attention to consequences one way or the other.
> 
> Isn't the whole point of the requirement to use assumeSafeAppend that it really forces the user to say, "Yes, I want to do away with the contents of the array and I take full responsibility for ensuring there's nothing else pointing to it that will break" ... ?
> 
> I must say that personally I'd rather have the safety-by-default and the obligation to write assumeSafeAppend (or use Appender!T) where performance needs it, than risk code breaking because someone's function accidentally throws away stuff that I still had a slice of.

I tend to agree with this. Setting an array's length to 0 with the expectation that you will then reuse that memory is inherently unsafe. What if there are other arrays still referring to the same data? They'll be stomped, which could do some very nasty things - especially if we're talking about structs rather than strings.

assumeSafeAppend is explicitly unsafe and makes it clear what you're doing, whereas while setting an array's length to 0 may be generally nonsensical if you're not intending to reuse the memory again, having that essentially call assumeSafeAppend for you could result in very pernicious bugs when someone is foolish enough to set an array's length to 0 when they still have other slices to some or all of that array. I really think that the assumeSafeAppend needs to be explicit.

- Jonathan M Davis
March 20, 2014
On Friday, March 14, 2014 14:50:43 Dicebot wrote:
> It has not been mentioned as an example of a blocker, but as an example of the most horrible breaking change that can happen if a dmd user is not aware of it before upgrading.
> 
> I think the only practical thing that blocks the D2 transition right now is that no one can afford to work on it full-time, and the amount of changes needed is too big to be done as side work - array stomping is not the only problematic area; I personally think const will cause much more problems.
> 
> Really hope that with recent hires doing a focused porting effort will soon become a real option. D1 is so D1 :P

Just a thought for this particular case. You could write a clearForAppend for D1 which simply sets the array length to 0 and slowly change your current code to use that instead of setting the array length to 0 directly. Then when you do go to port to D2, you can simply change clearForAppend's behavior so that it also calls assumeSafeAppend. That would allow you to do this particular change as a side project rather than focusing on it full time, which obviously isn't enough on its own, but it could help.
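A sketch of that helper (the name clearForAppend is from the suggestion above; in the D1 code base the body would be just the length assignment, with the assumeSafeAppend call being the D2-port addition):

```d
/// Reset an array so its buffer is reused by subsequent appends.
void clearForAppend(T)(ref T[] arr)
{
    arr.length = 0;
    arr.assumeSafeAppend(); // D2 addition: declare the old contents dead
}
```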

- Jonathan M Davis