June 15, 2004
On Tue, 15 Jun 2004 12:20:31 +0800, J Anderson <REMOVEanderson@badmama.com.au> wrote:

> Regan Heath wrote:
>
>> On Tue, 15 Jun 2004 13:11:11 +1000, Derek Parnell <derek@psych.ward> wrote:
>>
>>> On Tue, 15 Jun 2004 10:35:20 +0800, J Anderson wrote:
>>>
>>>> Regan Heath wrote:
>>>>
>>>>> The thread "string performance issues" by "Daniel Horn" got me
>>>>> thinking of an idea for a change to arrays.
>>>>>
>>>>> Basically, concatenation can be slow, as it causes reallocations of
>>>>> the array. If you could pre-allocate the array then it wouldn't be as
>>>>> slow.
>>>>>
>>>>> I tried this:
>>>>>
>>>>> char[] p = "regan";
>>>>>
>>>>> p.length = 10;
>>>>> p ~= "fred";
>>>>>
>>>>> and ended up with a string containing
>>>>>
>>>>> 'r' 'e' 'g' 'a' 'n' '0' '0' '0' '0' '0' 'f' 'r' 'e' 'd'
>>>>>
>>>>> which was not what I was after :)
>>>>>
>>>>> I remembered a thread on arrays requesting renaming the 'length'
>>>>> property to 'reserve' or something like that, and the idea for the
>>>>> addition of a reserve property that simply allocated memory to the
>>>>> array without changing its length came to me. If we could go:
>>>>>
>>>>> char[] p = "regan";
>>>>>
>>>>> p.reserve = 10;
>>>>> p ~= "fred";
>>>>>
>>>>> and end up with a string containing
>>>>>
>>>>> 'r' 'e' 'g' 'a' 'n' 'f' 'r' 'e' 'd' '0'
>>>>>
>>>>> and a length of 9. Then we could do fast concatenation. Otherwise
>>>>> we're left writing a String class that achieves this by setting length
>>>>> and using memcpy. I thought a design goal of D was to avoid this.
>>>>>
>>>>> Thoughts?
>>>>>
>>>> I think this would work:
>>>> char[] p = "regan";
>>>> p.length = 10;
>>>> p.length = 5;
>>>> p ~= "fred";
>>>>
>>>> as D doesn't clean up the memory straight away.
>>>
>>>
>>> How about ...
>>>
>>>  char[] p = "regan";
>>>  p.length = 10;
>>>  p[5..9] = "fred";
>>
>>
>> This works, but see the other thread "string performance issues" for an example of the real problem.
>>
>> Regan
>>
> Personally I'd use block allocation for a problem like that.  That's what you'd do in C.

Yes, the poster was simply saying that it should be easier than that in D; after all, that is the point of having a concatenation operator.

> Then afterwards you simply trim the array to the size you really want (or use length = block; length = 0; beforehand).
>
> I don't think strings should be treated as a specific type of array.

Neither do I, nor was I suggesting that. I want this property on all arrays; the example I gave used a char[] because that is what the original poster wanted it for.

> Whatever applies to char arrays should also apply to every other type of array, except for the automatic conversion to zero-terminated arrays of course.

Agreed.

> Adding a reserve property could increase the string overhead, unless all it did was:
>
> template reserveT(T)
> {
>    void reserve(inout char [] array, uint length)
>    {
>        int oldlen = array.length;
>        array.length = length;
>        array.length = oldlen;
>    }
> }
>
> alias reserveT!(char).reserve reserve;
>
> Hey, what do you know - I just solved your problem *grin*.
>
> Now you can write:
>
> array.reserve(10);

This should work. :)

So why not add this to arrays then?

Let me re-iterate what I am suggesting...

I think all arrays should have a property that specifies how many elements' worth of memory the array should grab right now. This would be distinct from the length property, which specifies the number of elements in the array.

Example..

char[] p;
p.length = 10;

the above adds 10 elements to the array, initializing them to the default value.
This is the current behaviour.

char[] p;
p.reserve = 10;

the above grabs enough memory for 10 elements, but does not add/initialize any.
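
For comparison, here is a minimal sketch of the proposed behaviour as it can be written with a current D compiler. It assumes the `reserve` function and `capacity` property that today's druntime provides for built-in arrays, which as far as I know postdate this thread. `buildName` is just an illustrative name.

```d
// Sketch for a current D compiler. `reserve` and `capacity` come from the
// implicitly imported `object` module; they are not part of this 2004 proposal.
char[] buildName()
{
    char[] p = "regan".dup;   // .dup because string literals are immutable today
    p.reserve(10);            // ask for room for at least 10 elements
    assert(p.length == 5);    // length is untouched, nothing new is initialised
    assert(p.capacity >= 10); // the spare room is there for later appends
    p ~= "fred";              // the append has room to work with
    return p;
}
```

Calling `buildName()` yields a 9-element array holding "reganfred", which is the result asked for above.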

Regan.

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
June 15, 2004
Ivan Senji wrote:

>"J Anderson" <REMOVEanderson@badmama.com.au> wrote in message
>news:calte9$fm0$1@digitaldaemon.com...
>
>>Regan Heath wrote:
>>
>>>On Tue, 15 Jun 2004 13:11:11 +1000, Derek Parnell <derek@psych.ward>
>>>wrote:
>>>
>>>>On Tue, 15 Jun 2004 10:35:20 +0800, J Anderson wrote:
>>>>
>>>>>Regan Heath wrote:
>>>>>
>>>>>>The thread "string performance issues" by "Daniel Horn" got me
>>>>>>thinking of an idea for a change to arrays.
>>>>>>
>>>>>>Basically, concatenation can be slow, as it causes reallocations of
>>>>>>the array. If you could pre-allocate the array then it wouldn't be as
>>>>>>slow.
>>>>>>
>>>>>>I tried this:
>>>>>>
>>>>>>char[] p = "regan";
>>>>>>
>>>>>>p.length = 10;
>>>>>>p ~= "fred";
>>>>>>
>>>>>>and ended up with a string containing
>>>>>>
>>>>>>'r' 'e' 'g' 'a' 'n' '0' '0' '0' '0' '0' 'f' 'r' 'e' 'd'
>>>>>>
>>>>>>which was not what I was after :)
>>>>>>
>>>>>>I remembered a thread on arrays requesting renaming the 'length'
>>>>>>property to 'reserve' or something like that, and the idea for the
>>>>>>addition of a reserve property that simply allocated memory to the
>>>>>>array without changing its length came to me. If we could go:
>>>>>>
>>>>>>char[] p = "regan";
>>>>>>
>>>>>>p.reserve = 10;
>>>>>>p ~= "fred";
>>>>>>
>>>>>>and end up with a string containing
>>>>>>
>>>>>>'r' 'e' 'g' 'a' 'n' 'f' 'r' 'e' 'd' '0'
>>>>>>
>>>>>>and a length of 9. Then we could do fast concatenation. Otherwise
>>>>>>we're left writing a String class that achieves this by setting length
>>>>>>and using memcpy. I thought a design goal of D was to avoid this.
>>>>>>
>>>>>>Thoughts?
>>>>>>
>>>>>I think this would work:
>>>>>char[] p = "regan";
>>>>>p.length = 10;
>>>>>p.length = 5;
>>>>>p ~= "fred";
>>>>>
>>>>>as D doesn't clean up the memory straight away.
>>>>>
>>>>How about ...
>>>>
>>>> char[] p = "regan";
>>>> p.length = 10;
>>>> p[5..9] = "fred";
>>>>
>>>This works, but see the other thread "string performance issues" for
>>>an example of the real problem.
>>>
>>>Regan
>>>
>>Personally I'd use block allocation for a problem like that.  That's
>>what you'd do in C.  Then afterwards you simply trim the array to the
>>size you really want (or use length = block; length = 0; beforehand).
>>
>>I don't think strings should be treated as a specific type of array.
>>Whatever applies to char arrays should also apply to every other type of
>>array, except for the automatic conversion to zero-terminated arrays of
>>course.
>>
>>Adding a reserve property could increase the string overhead, unless all
>>it did was:
>>
>>template reserveT(T)
>>{
>>   void reserve(inout char [] array, uint length)
>
>Did you mean to write:
>
>    void reserve(inout T [] array, uint length)
>:)
>

Good catch.

>>   {
>>       int oldlen = array.length;
>>       array.length = length;
>>       array.length = oldlen;
>>   }
>>}
>>
>>alias reserveT!(char).reserve reserve;
>>
>>Hey, what do you know - I just solved your problem *grin*.
>>
>>Now you can write:
>>
>>array.reserve(10);
>>
>
>Nice :)
>
>>--
>>-Anderson: http://badmama.com.au/~anderson/
>


-- 
-Anderson: http://badmama.com.au/~anderson/
June 15, 2004
Regan Heath wrote:

> Yes, the poster was simply saying that it should be easier than that in D; after all, that is the point of having a concatenation operator.
>
>> Then afterwards you simply trim the array to the size you really want (or use length = block; length = 0; beforehand).
>>
>> I don't think strings should be treated as a specific type of array.
>
>
> Neither do I, nor was I suggesting that. I want this property on all arrays; the example I gave used a char[] because that is what the original poster wanted it for.
>
>> Whatever applies to char arrays should also apply to every other type of array, except for the automatic conversion to zero-terminated arrays of course.
>
>
> Agreed.
>
>> Adding a reserve property could increase the string overhead, unless all it did was:
>>
>> template reserveT(T)
>> {
>>    void reserve(inout char [] array, uint length)
>>    {
>>        int oldlen = array.length;
>>        array.length = length;
>>        array.length = oldlen;
>>    }
>> }
>>
>> alias reserveT!(char).reserve reserve;
>>
>> Hey, what do you know - I just solved your problem *grin*.
>>
>> Now you can write:
>>
>> array.reserve(10);
>
>
> This should work. :)
>
> So why not add this to arrays then?
>
> Let me re-iterate what I am suggesting...
>
> I think all arrays should have a property that specifies how many elements' worth of memory the array should grab right now. This would be distinct from the length property, which specifies the number of elements in the array.

You can make your own class to do that.  The reason against that approach is that it requires more overhead (requiring both memory and CPU).  Having the most primitive array type is essential because then we can do things like you suggest.  If this overhead were included in the primitive type then we would have to lug it around even when we don't need it.  Most of the time you can write your program so that you don't even need reserve tricks for efficiency, by determining the amount of memory necessary.

An array with a reserve would look like:

struct array
{
   uint length;
   uint reserve;
   T * pointer;
}

compare that to:

struct array
{
   uint length;
   T * pointer;
}

Not to mention all the extra operations to maintain the reserved size.

It's not too hard to write an array class that has a reserve (10 lines or so), but it would be much harder the other way round.
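
A rough sketch of such a wrapper, written against a current D compiler. `ReservedArray`, `put` and `data` are hypothetical names rather than Phobos types, and the doubling growth policy is an assumption made for the example.

```d
// Hypothetical wrapper: keeps the reserved size alongside the used length.
struct ReservedArray(T)
{
    private T[] store;   // allocated elements; store.length is the reserve
    private size_t used; // number of elements actually in use

    // Make room for at least n elements without changing length.
    void reserve(size_t n)
    {
        if (n > store.length)
            store.length = n;
    }

    // Append one element, doubling the storage when it is full.
    void put(T value)
    {
        if (used == store.length)
            reserve(store.length ? store.length * 2 : 4);
        store[used++] = value;
    }

    size_t length() const { return used; }
    size_t capacity() const { return store.length; }

    // View of the elements in use; no copy is made.
    inout(T)[] data() inout { return store[0 .. used]; }
}
```

The extra `used` field is exactly the per-array overhead discussed above: it is paid only by code that opts in to the wrapper.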

> Example..
>
> char[] p;
> p.length = 10;
>
> the above adds 10 elements to the array, initializing them to the default value.
> This is the current behaviour.
>
> char[] p;
> p.reserve = 10;
>
> the above grabs enough memory for 10 elements, but does not add/initialize any.

Initialization is pretty cheap for arrays.  That is, initialization takes only a fraction of the time required to allocate the memory for the array.  So whatever time is saved by skipping initialization, the user will most probably not notice it.

> Regan.


All this talk just makes me like D arrays even more.

-- 
-Anderson: http://badmama.com.au/~anderson/
June 15, 2004
J Anderson wrote:

>> This should work. :)
>>
>> So why not add this to arrays then?
>
I do think it should be in phobos somewhere (where you don't need to include it).  I probably didn't outline that enough in the previous email.  That way different vendors can optimise the reserve to what works best for their program and individuals can overload it with their own optimised versions.

-- 
-Anderson: http://badmama.com.au/~anderson/
June 15, 2004
Matthew wrote:
> It means adding another member to slices, and would complicate it in other ways
> also, since one would have to distinguish between slices that own and slices that
> view.

I don't think it means that. It has already been mentioned that

s.reserve(x)

is the same as

s.length=x;
s.length=oldLength;

The compiler could simply rewrite it.

Note that the idea of a global template function is not quite equivalent, since the function is not in the "namespace" of the char type, but rather in global scope. Which means that multiple such functions will only overload if they are aliased.
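
The rewrite could be sketched as a generic function, with the element type taken from the template parameter as Ivan Senji pointed out. This uses current D syntax (`ref` where the thread's code says `inout`), and `reserveByLength` is an illustrative name. Whether a later `~=` actually reuses the spare room depends on the runtime, so the sketch only shows the length round trip.

```d
// Generic form of the length trick; works for any element type T.
void reserveByLength(T)(ref T[] array, size_t length)
{
    immutable oldlen = array.length;
    if (length <= oldlen)
        return;               // nothing to reserve
    array.length = length;    // grow: allocates and default-initialises
    array.length = oldlen;    // shrink back to the original length
}
```

Because it is a free function taking the array as its first parameter, it can be called as `s.reserveByLength(x)`.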


Hauke
June 15, 2004
"J Anderson" <REMOVEanderson@badmama.com.au> wrote in message news:cam7n9$vbv$2@digitaldaemon.com...
> J Anderson wrote:
>
> >> This should work. :)
> >>
> >> So why not add this to arrays then?
> >
> I do think it should be in phobos somewhere (where you don't need to

I agree. D lets us add "properties" to arrays so why not show how this feature can be useful!

> include it).  I probably didn't outline that enough in the previous email.  That way different vendors can optimise the reserve to what works best for their program and individuals can overload it with their own optimised versions.
>
> --
> -Anderson: http://badmama.com.au/~anderson/


June 15, 2004
Regan Heath wrote:
> On Tue, 15 Jun 2004 12:31:48 +1000, Matthew <admin@stlsoft.dot.dot.dot.dot.org> wrote:
> 
>> It means adding another member to slices, and would complicate it in other ways
>> also, since one would have to distinguish between slices that own and slices that
>> view.
> 
> Why do you need to add another member to slices?
> Don't we already have to distinguish between slices that own and slices that view?
<snip>

My experiments show that arrays are allocated in powers of 2 or multiples of 2K.  (At least 1D arrays of an atomic type, otherwise I haven't experimented.)  There obviously is a reserve property, though looking at the ABI it must be of the allocated block rather than of the array reference itself.

My experiments with slicing showed that a slice becomes a newly allocated array if the length is changed, even if it doesn't exceed the originally sliced length.

But there are two cases I haven't tested yet:

	int[] qwert = new int[10];
	int[] yuiop = qwert;
	qwert.length = 14;	// remaining within allocated block
	// ... put some data in qwert
	yuiop.length = 13;

It would follow that qwert and yuiop still have the same start address, and setting the length of yuiop would overwrite qwert[10..13].

	int[] qwert = new int[10];
	int[] yuiop = qwert[0..5];
	yuiop.length = 8;

If the distinction between an owner and a viewer is simply a matter of whether the start address is the start of the allocated block, then likewise here.

Hang on ... better check if length setting and concatenation really have the same effect....
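
The two untested cases can be wrapped in a small test for a current D compiler. The helper `sameStart` and the function name `sliceExperiments` are illustrative. No expected output is given, because the result depends on the runtime's allocation strategy.

```d
import std.stdio : writeln;

// True when both slices begin at the same address.
bool sameStart(T)(const(T)[] a, const(T)[] b)
{
    return a.ptr is b.ptr;
}

void sliceExperiments()
{
    {
        // Case 1: two references to one array, both resized.
        int[] qwert = new int[10];
        int[] yuiop = qwert;
        qwert.length = 14;
        qwert[10 .. 14] = 7;   // put some data in qwert
        yuiop.length = 13;
        writeln("case 1 same start: ", sameStart(qwert, yuiop));
        writeln("case 1 qwert tail: ", qwert[10 .. 13]);
    }
    {
        // Case 2: a slice of the first half, then grown.
        int[] qwert = new int[10];
        int[] yuiop = qwert[0 .. 5];
        yuiop.length = 8;
        writeln("case 2 same start: ", sameStart(qwert, yuiop));
    }
}
```

If case 1 prints a tail of 7s after `yuiop` is resized, the data in `qwert` was not overwritten.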

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the unfortunate victim of intensive mail-bombing at the moment.  Please keep replies on the 'group where everyone may benefit.
June 15, 2004
On Tue, 15 Jun 2004 10:24:21 +0200, Hauke Duden <H.NS.Duden@gmx.net> wrote:
> Matthew wrote:
>> It means adding another member to slices, and would complicate it in other ways
>> also, since one would have to distinguish between slices that own and slices that
>> view.
>
> I don't think it means that. It has already been mentioned that
>
> s.reserve(x)
>
> is the same as
>
> s.length=x;
> s.length=oldLength;
>
> The compiler could simply rewrite it.

Or rather simply do the allocation the above steps would cause.

The only worry I have is that if you do this..

> s.length=x;
> s.length=oldLength;

when, if ever, does it free the memory you are no longer using?

Because if it does so sometime in the middle of a few thousand concat operations then we're almost back at square one.

> Note that the idea of a global template function is not quite equivalent, since the function is not in the "namespace" of the char type, but rather in global scope. Which means that multiple such functions will only overload if they are aliased.

While it's a nice workaround, I cannot see why we can't have a reserve property. IMO it makes sense for all array types, it's the obvious way to do it (so people looking for a way will find it), it's dead simple to change, and it does not cause any problems with existing code...

Regan.

June 15, 2004
On Tue, 15 Jun 2004 15:12:59 +0800, J Anderson <REMOVEanderson@badmama.com.au> wrote:
> Regan Heath wrote:
>
>> Yes, the poster was simply saying that it should be easier than that in D; after all, that is the point of having a concatenation operator.
>>
>>> Then afterwards you simply trim the array to the size you really want (or use length = block; length = 0; beforehand).
>>>
>>> I don't think strings should be treated as a specific type of array.
>>
>>
>> Neither do I, nor was I suggesting that. I want this property on all arrays; the example I gave used a char[] because that is what the original poster wanted it for.
>>
>>> Whatever applies to char arrays should also apply to every other type of array, except for the automatic conversion to zero-terminated arrays of course.
>>
>>
>> Agreed.
>>
>>> Adding a reserve property could increase the string overhead, unless all it did was:
>>>
>>> template reserveT(T)
>>> {
>>>    void reserve(inout char [] array, uint length)
>>>    {
>>>        int oldlen = array.length;
>>>        array.length = length;
>>>        array.length = oldlen;
>>>    }
>>> }
>>>
>>> alias reserveT!(char).reserve reserve;
>>>
>>> Hey, what do you know - I just solved your problem *grin*.
>>>
>>> Now you can write:
>>>
>>> array.reserve(10);
>>
>>
>> This should work. :)
>>
>> So why not add this to arrays then?
>>
>> Let me re-iterate what I am suggesting...
>>
>> I think all arrays should have a property that specifies how many elements' worth of memory the array should grab right now. This would be distinct from the length property, which specifies the number of elements in the array.
>
> You can make your own class to do that.

I know I can, I have.

> The reason against that approach is that it requires more overhead (requiring both memory and CPU).

I don't think so. For example, the current array behaviour if you write

a.length = 1000;
a.length = 1;

is to allocate space for 1000 entries and not free it.

Meaning the array already stores (or calculates) its size in memory in a separate place from the length.

Meaning all the reserve property needs to do is the same thing the above 2 lines already do.

With one possible exception: I am not sure when/if the array frees the unused memory, so it's possible that after those lines, sometime in the middle of 1000 or so concatenation ops, it decides to free the excess and I'd be back at square one (almost).
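
One way to check this on a current D compiler is to ask the runtime how much room it sees. This sketch assumes druntime's `capacity` property and `assumeSafeAppend` function, which as far as I know postdate this thread. `spareRoomAfterShrink` is an illustrative name.

```d
// Sketch for a current D compiler; `capacity` and `assumeSafeAppend`
// come from the implicitly imported `object` module.
size_t spareRoomAfterShrink()
{
    int[] a;
    a.length = 1000;       // allocates and zero-fills 1000 ints
    a.length = 1;          // shrinks the slice; the block itself is kept
    a.assumeSafeAppend();  // promise that nothing else uses a[1 .. 1000]
    return a.capacity;     // elements that fit without reallocating
}
```

In current runtimes the block is only released by the garbage collector once nothing references it, so the spare room should not vanish in the middle of a run of appends on the same array.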

> Having the most primitive array type is essential because then we can do things like you suggest.  If this overhead were included in the primitive type then we would have to lug it around even when we don't need it.  Most of the time you can write your program so that you don't even need reserve tricks for efficiency, by determining the amount of memory necessary.

I agree totally, I don't believe my suggestion adds any overhead whatsoever.

> An array with a reserve would look like:
>
> struct array
> {
>     uint length;
>     uint reserve;
>     T * pointer;
> }
>
> compare that to:
>
> struct array
> {
>     uint length;
>     T * pointer;
> }
>
> Not to mention all the extra operations to maintain the reserved size.

I think you'll find they're already there in some form as arrays already display the behaviour I want, just not in an easy to use/understand/find form. It looks like a hack currently.

> It's not too hard to write an array class that has a reserve (10 lines or so), but it would be much harder the other way round.
>
>> Example..
>>
>> char[] p;
>> p.length = 10;
>>
>> the above adds 10 elements to the array, initializing them to the default value.
>> This is the current behaviour.
>>
>> char[] p;
>> p.reserve = 10;
>>
>> the above grabs enough memory for 10 elements, but does not add/initialize any.
>
> Initialization is pretty cheap for arrays.  That is, initialization takes only a fraction of the time required to allocate the memory for the array.  So whatever time is saved by skipping initialization, the user will most probably not notice it.

Everything counts if you do it enough times. Why do something you do not need?

>> Regan.
>
>
> All this talk just makes me like D arrays even more.

I like em too. Writing the class/struct to give me this behaviour was trivial, easier than in C/C++.

But I still believe this change is a good one; it has pros and no cons.

Regan.

June 15, 2004
On Tue, 15 Jun 2004 15:15:59 +0800, J Anderson <REMOVEanderson@badmama.com.au> wrote:

> J Anderson wrote:
>
>>> This should work. :)
>>>
>>> So why not add this to arrays then?
>>
> I do think it should be in phobos somewhere (where you don't need to include it).  I probably didn't outline that enough in the previous email.  That way different vendors can optimise the reserve to what works best for their program and individuals can overload it with their own optimised versions.

FYI Walter mentioned that std.outbuffer has a struct/class for doing what I want also.

Regan.
