resizing arrays won't work (page 6) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » D » resizing arrays won't work (page 6)

June 18, 2002

Re: resizing arrays won't work

Posted by Pavel Minayev
in reply to Nic Tiger

Pavel Minayev

Posted in reply to Nic Tiger

On Tue, 18 Jun 2002 10:06:23 +0400 "Nic Tiger" <nictiger@pt.comcor.ru> wrote:

> Didn't anyone noticed that the following example is not valid not only for current DMD (as long as Walter doesn't allow ++ on properties) but is simply *buggy*?
>>                int array[];
>>                array[array.length++] = x;
> 
> AFAIK, the sequence of event will be:
>     array[array.length] = x;
>     array.length=array.length+1;
> 
> And it is simply not valid array expanding. It's a bug which in the best case will give GPF (in the worst it silently corrupts memory). In fact, the

Isn't subscript expression supposed to be calculated BEFORE the subscript operator itself gets called???

June 18, 2002

Re: resizing arrays won't work

Posted by Sean L. Palmer
in reply to Nic Tiger

Sean L. Palmer

Posted in reply to Nic Tiger

"Nic Tiger" <nictiger@pt.comcor.ru> wrote in message news:aemia1$1ap7$1@digitaldaemon.com...
> Didn't anyone noticed that the following example is not valid not only for current DMD (as long as Walter doesn't allow ++ on properties) but is
simply
> *buggy*?
> >                int array[];
> >                array[array.length++] = x;
>
> AFAIK, the sequence of event will be:
>     array[array.length] = x;
>     array.length=array.length+1;

No it should evaluate to:

int tmp = array.length;
array.length = array.length+1;
array[tmp] = x;

Due to operator precedence.

> And it is simply not valid array expanding. It's a bug which in the best case will give GPF (in the worst it silently corrupts memory). In fact,
the
> sequence should be:
>     array.length=array.length+1;
>     array[array.length-1] = x;
>
> which corresponds to
>     array[(++array.length)-1] = x;

That's fine, it's the same thing but probably won't optimize as well.  A compiler could be made that would check for this case, of course.

> And after all, I really think that form is not very good. I agree that properties should have only gettors and settors, and ++, --, etc, should
not
> be used with them.

Perhaps they shouldn't be in the base language either.  ++ on int, is it more clear to be x  = x + 1 ?

> Nic Tiger.

June 18, 2002

Re: resizing arrays won't work

Posted by Sean L. Palmer
in reply to Matthew Wilson

Sean L. Palmer

Posted in reply to Matthew Wilson

I guess I'm only disagreeing about the percentage.  150% of 1 is 1 .. 150% of 2 is 3

(0+1)*1.5 = 1
(1+1)*1.5 = 3
(3+1)*1.5 = 6

Or are you doing it like so:

1.5*0+1 = 1
1.5*1+1 = 2
1.5*2+1 = 4

But 1.25 would work more like this:

1.25*0+1 = 1
1.25*1+1 = 2
1.25*2+1 = 3
1.25*3+1 = 4
1.25*4+1 = 6

Doesn't start padding til #5, it's safer in a memory tight environment especially if the allocation is big.

But don't get me wrong, 150% is a fine choice and will work most of the time and you're probably right about the long-term growth characteristics.

Sean

"Matthew Wilson" <matthew@thedjournal.com> wrote in message news:aemmi3$1fuo$1@digitaldaemon.com...
> You sounds to be agreeing violently with me, or is is too late in the day?
>
> "Sean L. Palmer" <seanpalmer@earthlink.net> wrote in message news:aemmca$1fpa$1@digitaldaemon.com...
> > It's not however the best balance of wasted memory (the memory allocated
> but
> > never used or detected as "wasted")
> >
> > I was thinking this over the other day and by the time you get to 5 or 6
> it
> > should start giving you one extra...at about 12 it should give you 2
> extra.
> >
> > Think about it this way.  You're working with modules that are so big
that
> > (or memory is so tight that) if one extra gets allocated, the game won't
> run
> > (it will run out of memory and crash).  How many of those are you likely
> to
> > be allocating.
> >
> > If it's so big that allocating only 1 will be possible, you obviously
want
> > no extras.
> >
> > If you can fit 2, good, but still wouldn't want to bet on having room
for
> > one more.
> >
> > But say they've allocated 3 already.  Will they allocate more or not?
> It's
> > the gamble.  It may be that they only had enough memory for 3 of them.
Or
> > it could be the start of a very big pattern.
> >
> > I think that by about 4 you can be pretty sure they are small enough
that
> if
> > you allocate one extra it might not be such a big problem.
> >
> > Or perhaps the allocator could be based on size of object.  Really large objects don't allocate as many in advance.  The smaller, the more likely
> to
> > allocate more in advance (down to the optimal 1.5x approach).  For byte sizes I'd have it start out allocating 4 or 8 more just in case.  The
heap
> > structure usually needs this much alignment anyway.  At size 4 bytes I'd start allocating exactly how many they asked for times 2.  2.0x growth (round down).  For a 64K object, probably wouldn't want any extra preallocation (but you *might!*)  And somewhere in between is the 150%, probably at about the OS page level or cacheline level.  This could be
> info
> > that the profiler could provide.  In fact, a standard allocator could perhaps be instructed to analyze usage patterns and come up with
> recommended
> > allocation strategy percentage as a result (modify its "prediction" over time).  If you pluck these values out from the debugger after letting
the
> > program run for a while you can drop that number in during
initialization
> to
> > give the allocator a "hint" as to how you're going to be using it ("I'm going to be allocating a bunch of these" == 10.0x or more and "This is a singleton" == 0.0x)  In any case the allocator would be able to modify
the
> > number over time if it wished so it could be self-correcting.  For that
> the
> > allocator would need access to some kind of timer (to tell the
difference
> > between 1 allocation per frame and 10000 allocations per frame).
Perhaps
> it
> > could be a method
> >
> > allocator.time_elapsed(1000); // 1 second in milliseconds
> >
> > During which it could update its prediction value.  Default would be
about
> > 1.25.
> >
> > I tried some functions with log, and you *way* don't want to go there. Exponentiation is not the thing we're after here.
> >
> > This would need to be a compiler switch probably (memory allocation prediction factor).  Or a configurable parameter of a standard
allocator.
> >
> > Sean
> >
> > "Matthew Wilson" <dmd@synesis.com.au> wrote in message news:aellgq$fgj$2@digitaldaemon.com...
> > > Koenig showed in (can't remember reference) that reallocation by 150%
is
> > the
> > > best balance of speed/size. I would be quite happy with that.
> > >
> > > I enhanced that in a string class to act only from the second
allocation
> > > onwards, i.e.. the first allocation is exact, in order to deal with
> > constant
> > > strings. The same could be applied to any array type

June 18, 2002

Re: resizing arrays won't work

Posted by Burton Radons
in reply to Pavel Minayev

Burton Radons

Posted in reply to Pavel Minayev

Pavel Minayev wrote:
> On Mon, 17 Jun 2002 17:56:36 -0700 "Walter" <walter@digitalmars.com> wrote:
> 
>>That was the original plan, but due to the way array slicing works, there's
>>no way for the GC to tell that there's "unused" space off the end of the
>>array. Hence, it must reallocate and copy every time. The result is terrible
>>performance - the cure is to preallocate the length you'll need, or use
>>OutBuffer which will do block reallocation in suitable increments.
> 
> Does this mean that ~= also performs a reallocation each time?
> It would be inacceptible...

Yes, ~= is functionally identical to incrementing the length yourself. I'm working on a journal article detailing internal behaviour of dynamic arrays.  It's either this way, making slice operations an active copy, or a maximum field.

June 18, 2002

Re: resizing arrays won't work

Posted by Nic Tiger
in reply to Pavel Minayev

Nic Tiger

Posted in reply to Pavel Minayev

ðÏÓÍÏÔÒÅÌ ASM, ËÏÔÏÒÙÊ ÇÅÎÅÒÉÒÕÅÔÓÑ, É ×ÙÎÕÖÄÅÎ ÓÏÇÌÁÓÉÔØÓÑ.
ðÏ ËÒÁÊÎÅÊ ÍÅÒÅ ÄÌÑ
  int[5] p;
  int ind=3;
  p[ind++] = 100;

ÐÏÓÌÅÄÏ×ÁÔÅÌØÎÏÓÔØ ËÏÍÁÎÄ
  tmp=ind;
  ind ++;
  p[tmp] = 100;

Nic Tiger.

"Pavel Minayev" <evilone@omen.ru> wrote in message news:CFN374255125238194@news.digitalmars.com...
> On Tue, 18 Jun 2002 10:06:23 +0400 "Nic Tiger" <nictiger@pt.comcor.ru>
wrote:
>
> > Didn't anyone noticed that the following example is not valid not only
for
> > current DMD (as long as Walter doesn't allow ++ on properties) but is
simply
> > *buggy*?
> >>                int array[];
> >>                array[array.length++] = x;
> >
> > AFAIK, the sequence of event will be:
> >     array[array.length] = x;
> >     array.length=array.length+1;
> >
> > And it is simply not valid array expanding. It's a bug which in the best case will give GPF (in the worst it silently corrupts memory). In fact,
the
>
> Isn't subscript expression supposed to be calculated BEFORE the subscript operator itself gets called???

June 18, 2002

Re: resizing arrays won't work

Posted by Sean L. Palmer
in reply to Burton Radons

Sean L. Palmer

Posted in reply to Burton Radons

Those are our 3 choices?

I'll take the maximum field.  Useful for doing a sort of "reserve" operation.  After that, ~= and array.length++ aren't nearly so much of a problem.  The only problem is that dynamic arrays then need 4 more bytes of storage.  Well ram is cheap, and getting cheaper.  ;)

Sean


"Burton Radons" <loth@users.sourceforge.net> wrote in message news:3D0EF52D.90600@users.sourceforge.net...
> Pavel Minayev wrote:
> > On Mon, 17 Jun 2002 17:56:36 -0700 "Walter" <walter@digitalmars.com>
wrote:
> >
> >>That was the original plan, but due to the way array slicing works,
there's
> >>no way for the GC to tell that there's "unused" space off the end of the array. Hence, it must reallocate and copy every time. The result is
terrible
> >>performance - the cure is to preallocate the length you'll need, or use OutBuffer which will do block reallocation in suitable increments.
> >
> > Does this mean that ~= also performs a reallocation each time? It would be inacceptible...
>
> Yes, ~= is functionally identical to incrementing the length yourself. I'm working on a journal article detailing internal behaviour of dynamic arrays.  It's either this way, making slice operations an active copy, or a maximum field.

June 18, 2002

Re: resizing arrays won't work

Posted by anderson
in reply to Nic Tiger

anderson

Posted in reply to Nic Tiger

Yer I was thinking that too. I saw it on th D page and couldn't work out why it worked.

Well that makes it impossible to go

array[array.length++] = foo();

I'd have to be,
array[(++array.length)-1] = foo();

Instead, which is kinda longer hand anyway.

I still think that ++ -- += -= should be allowed somehow with properties. Well *sigh* I'll have to wait for language E to include that.

June 18, 2002

Re: resizing arrays won't work

Posted by anderson
in reply to Roberto Mariottini

anderson

Posted in reply to Roberto Mariottini

*sigh*

But of coarse I was discussing the exceptions to this rule. ie when you are collecting or grouping data from an array and you have no idea what size the array needs to be expanded to. If you increased it by n every time it could be in-efficient if n = 10000 and the resize only needed to be 10. Also versa versa.

Sometimes things like this can be solved in two array passes, but when the results directly related to the movement of data in the arrays this it is impossible to use that method.

June 18, 2002

Re: resizing arrays won't work

Posted by anderson
in reply to Pavel Minayev

anderson

Posted in reply to Pavel Minayev

That's a very interesting point!

All this time,

array[array.length++] = Value;

Could have been done more simply like

array ~= Value;

I forgot about the ~ character

Parhaps Walter should change the documentation

"
The D Way
D supports dynamic arrays, which can be easilly resized. D supports all the
requisite memory management.
               int array[];

               array[array.length++] = x;

"

to

"
The D Way
D supports dynamic arrays, which can be easilly resized. D supports all the
requisite memory management.
               int array[];

               array ~= x;

"

"Pavel Minayev" <evilone@omen.ru> wrote in message
news:CFN374255085725347@news.digitalmars.com...
On Mon, 17 Jun 2002 17:43:25 -0700 "Walter" <walter@digitalmars.com> wrote:

> I see that kind of code all the time, over and over. I saw it again
recently
> in a popular benchmark program. I've shown many people how they can
improve
> performance a thousand+ fold by preallocating the array.
>
> I believe that by supporting ++ on the .length, that would implicitly
bless
> that wrong technique.

But still... you do support operator ~= (add element to the end of
the array), don't you? ++ is essentially the same, it adds the element
and initializes it with default value for its type, so what's the
difference? We're talking of dynamic arrays, after all, I understand
that preallocation is a useful technique, but it is not always useable.
Since ~= is there, ++ should also be just for consistency! =)

Also, I wonder why there is no -- ? It would be used even wider, since it is practically a stack-pop operator...

June 18, 2002

Re: resizing arrays won't work

Posted by Burton Radons
in reply to anderson

Burton Radons

Posted in reply to anderson

anderson wrote:
> Yer I was thinking that too. I saw it on th D page and couldn't work out why
> it worked.
> 
> Well that makes it impossible to go
> 
> array[array.length++] = foo();
> 
> I'd have to be,
> array[(++array.length)-1] = foo();

No - disinformation sure spreads quickly.  Nic was right but for the wrong reason: C and D have very wide flexibility in subexpression evaluation order.  The first example can be implemented as any of:

   array.length += 1; array [array.length - 1] = foo ();
   array [array.length] = foo (); array.length += 1;
   array [(array.length += 1) - 1] = foo ();

Or any of the multitude of combinations that assembly provides, and the second example is no better.  The only guarantee that C and D make is that if you modify the value of a parameter, the side-effect will be stored in memory by the end of the expression (or the next sequence point, such as || and && and some others that may not be sequence points in D).

C also adds that if you read and write a variable in the same sequence, the result is undefined, meaning 42 can always be the answer above in C.  D's a little more explicit: "If the compiler can determine that the result of an expression is illegally dependent on the order of evaluation, it can issue an error (but is not required to). The ability to detect these kinds of errors is a quality of implementation issue."

Unfortunately it's very easy for it to become impossible to unambiguously determine whether there's a subexpression evaluation order dependency when functions and methods are involved, but when they do occur you should assume it's going to yield 42.

The comp.lang.c FAQ (ftp://ftp.eskimo.com/u/s/scs/C-faq/faq) devotes questions 3.1 to 3.9 to this issue.

> Instead, which is kinda longer hand anyway.
> 
> I still think that ++ -- += -= should be allowed somehow with properties.
> Well *sigh* I'll have to wait for language E to include that.

Agreed, this should be allowed.

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation