.length modification question - D Programming Language Discussion Forum

September 19, 2003

.length modification question

Posted by John Boucher

If
args.length = args.length - 1 ;
is OK, why do
args.length -= 1 ;
and
args.length-- ;
produce the compilation error
'args.length' is not an lvalue
?

I hope it's a bug rather than a (poor) design decision.

John Boucher
The King had Humpty pushed.

September 19, 2003

Re: .length modification question

Posted by J Anderson
in reply to John Boucher

John Boucher wrote:

>If
>args.length = args.length - 1 ;
>is OK, why do
>args.length -= 1 ;
>and
>args.length-- ;
>produce the compilation error
>'args.length' is not an lvalue
>?
>
>I hope it's a bug rather than a (poor) design decision.
>
>John Boucher
>The King had Humpty pushed.
> 
>
It's not necessarily a poor design decision.  There's been quite a bit of debate about this.  It's really to keep programmers from doing:

for (int n=0; n<x; n++)
{
    args.length++; //or args.length--;
    ...
}

which is much less efficient than:

args.length = args.length + x;
for (int n=0; n<x; n++)
{
    ...
}

As I see it, most of the time you should be increasing/decreasing an array size in large blocks. Code should very rarely need to increase an array size by one.  The longer syntax is to discourage bad programming.

-Anderson
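
For readers who want to try the two patterns side by side, here is a minimal sketch in present-day D syntax. The function names are made up for illustration and are not from the thread. Both functions build the same array; the first resizes on every iteration, the second resizes once.

```d
// Hypothetical helper: grows the array by one element per iteration,
// so the runtime may have to resize (and possibly copy) x times.
int[] growOneAtATime(size_t x)
{
    int[] a;
    for (size_t n = 0; n < x; n++)
    {
        a.length = a.length + 1;   // one resize per iteration
        a[$ - 1] = cast(int) n;
    }
    return a;
}

// Hypothetical helper: resizes once, then only writes into existing slots.
int[] growInOneBlock(size_t x)
{
    int[] a;
    a.length = a.length + x;       // single resize up front
    for (size_t n = 0; n < x; n++)
        a[n] = cast(int) n;
    return a;
}
```

How much slower the first version is in practice depends on the allocator, so treat this as an illustration of the pattern rather than a benchmark.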

September 19, 2003

Re: .length modification question

Posted by John Boucher
in reply to J Anderson

Ah, so the latter. I guess I'll stick with C# then.

September 19, 2003

Re: .length modification question

Posted by Charles Sanders
in reply to John Boucher

You guess you'll stick with C#? What is that supposed to mean?

C

"John Boucher" <John_member@pathlink.com> wrote in message news:bkfr7m$8qm$1@digitaldaemon.com...
> Ah, so the latter. I guess I'll stick with C# then.

September 20, 2003

Re: .length modification question

Posted by Andrew Edwards
in reply to John Boucher

"John Boucher" <John_member@pathlink.com> wrote in message news:bkfr7m$8qm$1@digitaldaemon.com...
> Ah, so the latter. I guess I'll stick with C# then.
>
You still here? Don't let the door hit you in the ass on the way out!

Andrew

September 20, 2003

Re: .length / .reserve suggestion

Posted by Helmut Leitner
in reply to J Anderson

> J Anderson wrote:
> 
> As I see it, most of the time, you should be increasing/decreasing an array size in large blocks. Code should very rarely need to increase an array size by one.  The longer syntax is to discourage bad programming.
> 
> -Anderson

Is it really "to discourage bad programming"?

I would have assumed that this is an arbitrary implementation detail.

Any sufficiently complex programming language will leave every door open for "bad programming", and there is no way to stop this. Grandmothering was never C's style, at least. Is it in D?

==

One idea that haunts me is the Java String/StringBuffer problem, which is somehow reflected in the D OutBuffer class.

It basically means that there are situations where you want to reserve space for an array which will then grow and shrink below this limit without reallocations.

In Java - and currently in D - you need to create special classes for this purpose.

If any array had a .reserve field (Walter's C++ Array class even has one),
lots of ways would open up for efficient programming.
And we could drop the OutBuffer class completely.

We could also write
   args.reserve=100;
   args.length++;
without being inefficient.

The cost of 4 bytes per array seems steeper than it is. It would only hurt
with small strings. Even with them (allocated in N*16 bytes) the
effect would be small.

On the positive side, you could just write:

   alias char [] string;
   string buffer;
   buffer.reserve=100000;
   foreach(file; sourcefiles) {
      FileGetStr(file,buffer);
      ...
   }

without any reallocation inefficiencies in typical conditions.

This would also go a long way towards solving the problem
of efficient formatted output, which is still hindered by the
fact that string reallocation is unavoidable.

With a .reserve you could just write
   string s;
   s.reserve=100;
   StrFormat(s,"Name=",name);
   StrCatFormat(s,"Age=%d",age);
so that you can handle this like a normal string, while
no reallocation or object creation has to happen inside.

====

Therefore, please consider the suggestion to add a property
   .reserve
to the array type in this way:
   -  reserve can be read and set like length
   -  reallocations happen according to max(reserve,length)
advantage:
   -  if length changes below or equal reserve,
      no reallocations need to happen.

-- 
Helmut Leitner    leitner@hls.via.at
Graz, Austria   www.hls-software.com
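
The `.reserve` property proposed above is not what the language had at the time of this thread. For comparison, present-day D runtimes expose a `reserve` function and a `capacity` property on dynamic arrays, which cover a similar use case. A minimal sketch, with a hypothetical function name:

```d
// Hypothetical helper: reserves room for `count` elements, then appends.
// `reserve` and `capacity` are the druntime array primitives of current D,
// not the property syntax suggested in the post above.
int[] appendWithReserve(size_t count)
{
    int[] a;
    a.reserve(count);              // ask the runtime for room up front
    assert(a.capacity >= count);   // reserve guarantees at least this much
    foreach (i; 0 .. count)
        a ~= cast(int) i;          // appends fit in the reserved block
    return a;
}
```

Note the difference from the proposal: here the reserved capacity belongs to the allocated memory block rather than being a third field of the array itself.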

September 20, 2003

Re: .length / .reserve suggestion

Posted by J Anderson
in reply to Helmut Leitner

Helmut Leitner wrote:

>If any array had a .reserve field (Walter's C++ Array class even has one),
>lots of ways would open up for efficient programming.
>And we could drop the OutBuffer class completely.

C++'s vector class has a reserve too.

>Therefore, please consider the suggestion to add a property
>   .reserve
>to the array type in this way:
>   -  reserve can be read and set like length
>   -  reallocations happen according to max(reserve,length)
>advantage:
>   -  if length changes below or equal reserve,
>      no reallocations need to happen.
>
> 
>
The reserve thing was debated in much detail as well. I think there were nine or ten different "reserve" techniques offered.  Actually, a couple of the "reserve" techniques required no extra memory at all. However, they all had overhead (not just memory overhead): there's an extra check at each resize.  The resulting consensus was to have another type, defined in the standard lib.

A programmer should be given the option of using a reserve or not. Leaving it out of the standard D array does exactly that, because they can use the standard lib one, or any other allocator scheme they like.

-Anderson
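
As an illustration of the "separate type in the standard lib" approach, present-day Phobos ships `std.array.Appender`, which keeps its own capacity and can be told to reserve space. Whether this is the type the earlier consensus had in mind is not stated in the thread, so take the sketch below as one possible reading. The function name and the 1024 figure are arbitrary.

```d
import std.array : appender;

// Hypothetical helper: concatenates lines into one string through an
// Appender, so the growth policy lives in the library type rather than
// in the built-in array.
string joinLines(string[] lines)
{
    auto buf = appender!string();
    buf.reserve(1024);        // arbitrary size hint chosen for this sketch
    foreach (line; lines)
    {
        buf.put(line);
        buf.put('\n');
    }
    return buf.data;          // the characters written so far
}
```

A caller that does not want the reserve overhead can keep using plain arrays, which is the trade-off described above.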

September 23, 2003

Re: .length / .reserve suggestion

Posted by Walter
in reply to Helmut Leitner

Actually, the implementation of D arrays does have a bit of a 'reserve', since the garbage collector allocates by using power of 2 buckets. This is an implementation detail, though. It still makes for much more efficient code to set the .length to some reasonably expected reserve value, then fill up the array, then reset the .length to the final size.
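
The pattern described here (set `.length` to an expected upper bound, fill, then set `.length` to the final size) can be sketched as follows in present-day D. The function name and the even-number filter are only an example; the point is the separate counter that tracks how much of the array is actually in use.

```d
// Hypothetical helper: keeps only the even values of `input`.
int[] collectEvens(const int[] input)
{
    int[] result;
    result.length = input.length;   // expected upper bound, set once
    size_t used = 0;                // counter for the filled part
    foreach (v; input)
    {
        if (v % 2 == 0)
            result[used++] = v;     // plain store, no resize
    }
    result.length = used;           // reset to the final size
    return result;
}
```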

September 23, 2003

Re: .length / .reserve suggestion

Posted by J C Calvarese
in reply to Walter

Walter wrote:
> Actually, the implementation of D arrays does have a bit of a 'reserve',
> since the garbage collector allocates by using power of 2 buckets. This is
> an implementation detail, though. It still makes for much more efficient
> code to set the .length to some reasonably expected reserve value, then fill
> up the array, then reset the .length to the final size.

But how do you keep track of the currently filled length using this method?  I'm guessing you have an extra variable whose use may not be obvious.  I'd rather use .length for the current length and use .reserve to set a reasonable expected length.  I think Helmut's idea makes a lot of sense.  Would it be difficult to implement?

Justin

September 23, 2003

Re: .length / .reserve suggestion

Posted by Helmut Leitner
in reply to Walter

Walter wrote:
> 
> Actually, the implementation of D arrays does have a bit of a 'reserve', since the garbage collector allocates by using power of 2 buckets.

This will help little. Let's assume someone decides to read a file line by line (or reads it in one sweep but ends up with each line in a string instead of a slice). The line will be reallocated hundreds of times.

  char line[];
  line.reserve=1024;

would set aside enough space so that reallocations would almost never happen.

> This is
> an implementation detail, though. It still makes for much more efficient
> code to set the .length to some reasonably expected reserve value, then fill
> up the array, then reset the .length to the final size.

Yes, I know that. But very often this is a repeated process and reallocation can only be avoided with a .reserve.

How does your dmd compiler avoid reallocation of the "current source buffer"?
If it does, it must use some kind of .reserve.
If it doesn't, there would be room for making it faster.

This reserve thing is not theoretical. After 20 years of C programming I've based all my C dynamic array handling on a "generic" Buffer structure that looks like this (translated to fit):

  typedef struct buffer {
     void *ptr;
     int length;
     int reserve;
     int element_size;  /* sometimes implicit */
  } BUFFER ;

There are generic functions for reallocations, insertions, deletions, sorting...
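
To make the idea concrete in D, here is a rough sketch of such a buffer as a templated struct. It is not the C library described above. All names are hypothetical, and the doubling growth policy is an assumption chosen for the example.

```d
// Hypothetical D counterpart of a (ptr, length, reserve) buffer.
struct Buffer(T)
{
    private T[] store;     // allocated block; store.length acts as the reserve
    private size_t used;   // logical length

    size_t length() const { return used; }
    size_t reserved() const { return store.length; }

    // Grow the reserve; never shrinks it.
    void setReserve(size_t n)
    {
        if (n > store.length)
            store.length = n;
    }

    // Change the logical length; reallocates only when n exceeds the reserve.
    // Slots reused after a shrink keep their old contents.
    void setLength(size_t n)
    {
        if (n > store.length)
            store.length = n;
        used = n;
    }

    // Append one element, doubling the reserve when it is exhausted.
    void put(T value)
    {
        if (used == store.length)
            store.length = store.length ? store.length * 2 : 16;
        store[used++] = value;
    }

    // Slice covering only the elements in use.
    T[] data() { return store[0 .. used]; }
}
```

With this layout, any change of length at or below the reserve is just an integer assignment, which is the behavior the proposal asks for.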

The reserve is an essential feature for optimization, because you can often use heuristics. For example, you build a hash-bucket table for a dictionary. You don't know how many entries you will have (perhaps 2000 or 10000), but you definitely don't want to start at the default (maybe 50?) and then reallocate and rehash 5-8 times to reach the final size. A simple .reserve=1000 will give you a much better start and the feeling that you have incorporated your partial knowledge as well as possible.

How do you (Walter) build symbol tables during DMD compilation? Do you reallocate
them as the symbols come in? You can't know beforehand how large these tables will grow.
But I suppose you don't think much about small sources or about minimizing
the compiler's memory needs for them. So you will reserve moderate space for
these arrays, won't you?

====

I see two problems with reserve.

The first is an implementation problem. An array is not an object. It is stored
and passed as a (ptr, size) pair. And this is also how slices are handled (I think).
So adding reserve is not just adding a field to an object.

The second is a performance issue. Typically compiler builders have different performance interests than programmers. Compiler builders want to look good in benchmarks. Programmers want to write fast applications easily. A .reserve might worsen benchmarks (I don't think so, but we don't know), but would be an enormous benefit for the application programmer, because he needn't write special classes (which he often won't do) to get optimum performance.

An example of this conflict of interest is the official wc sample program,
which will perform but not scale, because
        if (inword)
        {   char[] word = input[wstart .. input.length];
            dictionary[word]++;
        }
will keep the basic input buffer (the whole file) in memory. So you would
be surprised by this code if you extended it to count the symbols
in a larger set of files: it will eat all your memory. It would be a
good "test of experienced D programming capability" to change this into an
example that both performs and scales.
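
One possible way to take up that challenge, sketched in present-day D: store a copy of each word as the dictionary key instead of a slice into the file buffer, so the buffer is no longer referenced once counting is done. The function name and the ASCII-letters-only definition of a word are simplifications made for this sketch, not the logic of the official wc sample.

```d
// Hypothetical helper: counts words in `input` without keeping slices of it.
uint[string] countWords(const(char)[] input)
{
    uint[string] dictionary;
    size_t wstart;
    bool inword = false;
    foreach (i, c; input)
    {
        immutable isWordChar = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        if (isWordChar)
        {
            if (!inword)
            {
                wstart = i;
                inword = true;
            }
        }
        else if (inword)
        {
            // .idup copies the word, so the key does not reference `input`
            dictionary[input[wstart .. i].idup]++;
            inword = false;
        }
    }
    if (inword)
        dictionary[input[wstart .. input.length].idup]++;
    return dictionary;
}
```

The copy costs an allocation per word occurrence, so this trades some speed on a single file for memory that stays proportional to the number of distinct words across many files.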

====

I also want to restate, that "basic IO" is still missing in D.

This means you can't write a tutorial that makes sense, because there is no
sensible IO available that you would want to show to anyone new to the
language (you have to go back to printf and its friends).

Why is this so? Because "char []" can't be used for IO in a performant way!
So this decision and implementation has been pushed and pushed ....
And you can't work with "char buffer[N]" because
  - you can never be sure about the buffer size
  - you will need conversions all the time
The obvious solution, the final, powerful and performing String class
is a pure vision. I suppose it will never come.

.reserve would be a pragmatic solution to this.

-- 
Helmut Leitner    leitner@hls.via.at
Graz, Austria   www.hls-software.com

Copyright © 1999-2021 by the D Language Foundation