Array bounds checking causes algorithmic nasties (page 3) - D Programming Language Discussion Forum

July 14, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Matthew
in reply to Walter

"Walter" <newshound@digitalmars.com> wrote in message news:cd3voa$2rie$1@digitaldaemon.com...
> I don't know what is happening inside the loop, but what it superficially looks like here is trying to apply C style pointer arithmetic optimizations to D. With foreach, I'll argue that 1) it isn't necessary and 2) using the index form, the optimizer can transform it to the pointer form automatically.
>
> While doing the C pointer form is still possible in D, such as:
>
> ORJRecordA* begin = cast(ORJRecordA*)m_database.records;
> ORJRecordA* end = begin + m_database.records.length;
>
> I'd argue that one will be better off using foreach or the index form. One reason is that using the pointer form is NOT necessarily the most efficient. Another is that the pointer form can impair more aggressive optimizations. Using the higher level construct will enable advanced compilers to do a better job of code generation than if the source usurps that by going directly to pointer arithmetic.
>
> [Note: the 'index form' would be:
>     for (size_t i = 0; i < m_database.records.length; i++)
>     {
>         ... m_database.records[i] ...;
>     }
> ]

That all may be true. My main point is that because one has used the "C" address-of operator, array bounds checking should not apply.

But I can change. :)

> Small D-style nit:
>
> In D, declare pointers as:
>     char* p;
> rather than the C style:
>     char *p;
> because in D:
>     char* p,q;    // p and q are both pointers to char
> whereas in C:
>     char *p,q;    // p is a pointer, q is a char
>
> Using whitespace in this way helps illustrate the left-associativity of D's * rather than the right-associativity of C.

Excellent point. Thankfully I've not been bitten by this since I never do multiple declarations on the same line. In fact, I didn't even know this. Nonetheless, I'll try and move my splats.
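
The declaration rule quoted above can be checked at compile time. This is a minimal sketch, assuming a present-day D compiler; the variable names are only illustrative.

```d
void main()
{
    // In D the pointer marker belongs to the type, so it applies to
    // every name in the declaration.
    char* p, q;

    static assert(is(typeof(p) == char*));
    static assert(is(typeof(q) == char*));   // q is a pointer too, unlike in C
}
```

Writing the star next to the type keeps the source looking like what the compiler actually does.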

July 14, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Walter
in reply to Matthew

"Matthew" <admin@stlsoft.dot.dot.dot.dot.org> wrote in message news:cd481m$adi$2@digitaldaemon.com...
>
> "Walter" <newshound@digitalmars.com> wrote in message
> > > Doesn't work, since it gives me copies of the record structures, and I need their addresses.
> >
> > For what purpose?
>
> So that the Record instance can hold a pointer to the underlying ORJRecordA structure, which lives in a contiguous block headed by the ORJDatabaseA structure. (This is one of the nice things about OpenRJ: there are only two memory (re-)allocations in the creation of the database from the database file. In almost all circumstances this amounts to one block, since only other threads might incur an allocation that would require the second ORJ allocation to not expand the original block.)

Ok, I see. For that, I think using the index form of the loop will work, and you can take the address of the item within the loop.
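
A rough sketch of that suggestion, assuming a present-day D compiler. `Record` and `addressesOf` are hypothetical stand-ins, not part of OpenRJ; the point is only that indexing inside the loop yields the original element, whose address can then be stored.

```d
struct Record { int id; }   // hypothetical stand-in for ORJRecordA

/// Collects a pointer to each element using the index form of the loop.
Record*[] addressesOf(Record[] records)
{
    Record*[] result;
    for (size_t i = 0; i < records.length; i++)
    {
        // records[i] refers to the original element (bounds-checked),
        // so &records[i] points into the array's own storage.
        result ~= &records[i];
    }
    return result;
}

void main()
{
    auto records = [Record(1), Record(2), Record(3)];
    auto ptrs = addressesOf(records);

    assert(ptrs.length == records.length);
    assert(ptrs[0] == records.ptr);       // first pointer is the start of the block
    assert(ptrs[2] == records.ptr + 2);   // later pointers follow contiguously
}
```

Every index used here is strictly less than the length, so the bounds check never fires.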

July 14, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Walter
in reply to Matthew

"Matthew" <admin@stlsoft.dot.dot.dot.dot.org> wrote in message news:cd481n$adi$3@digitaldaemon.com...
> My main point is that because one has used the "C" address-of operator, array bounds checking should not apply.

I understand, but it may be difficult to implement right, since one still would want to disallow things like:

    p = &array[array.length + 1];
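
For comparison, a one-past-the-end pointer can be formed without indexing at all, which sidesteps the bounds check instead of relaxing it. The sketch below assumes present-day D, where slices expose a `.ptr` property; `endOf` is a hypothetical helper name.

```d
/// Returns a pointer one past the last element without indexing the array.
T* endOf(T)(T[] array)
{
    // .ptr is the address of element 0; adding .length only computes an
    // address and dereferences nothing, so no bounds check is involved.
    return array.ptr + array.length;
}

void main()
{
    int[] data = [10, 20, 30];
    int sum = 0;

    // Classic begin/end pointer walk over the array's storage.
    for (int* p = data.ptr, end = endOf(data); p != end; ++p)
        sum += *p;

    assert(sum == 60);
}
```

The end pointer is only ever compared against, never dereferenced.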

July 14, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Regan Heath
in reply to Walter

On Wed, 14 Jul 2004 11:59:33 -0700, Walter <newshound@digitalmars.com> wrote:
> I don't know what is happening inside the loop, but what it superficially
> looks like here is trying to apply C style pointer arithmetic optimizations
> to D. With foreach, I'll argue that 1) it isn't necessary and 2) using the
> index form, the optimizer can transform it to the pointer form
> automatically.
>
> While doing the C pointer form is still possible in D, such as:
>
> ORJRecordA* begin = cast(ORJRecordA*)m_database.records;
> ORJRecordA* end = begin + m_database.records.length;
>
> I'd argue that one will be better off using foreach or the index form. One
> reason is that using the pointer form is NOT necessarily the most
> efficient. Another is that the pointer form can impair more aggressive
> optimizations. Using the higher level construct will enable advanced
> compilers to do a better job of code generation than if the source usurps
> that by going directly to pointer arithmetic.
>
> [Note: the 'index form' would be:
>     for (size_t i = 0; i < m_database.records.length; i++)
>     {
>         ... m_database.records[i] ...;
>     }
> ]

It appears that using inout on foreach also gives the originals, and thus allows you to take their addresses. Is this guaranteed behaviour? The docs say "inout can be used to update the original elements".

void main()
{
  char[] test = "regan";
  char *begin = cast(char*)test;
  char *end   = begin+test.length;

  for(; begin != end; ++begin)
  {
    printf("%08x\n",begin);
  }
  printf("\n");

  foreach(inout char c; test)
  {
    printf("%08x\n",&c);
  }
  printf("\n");
}

prints:

0040f080
0040f081
0040f082
0040f083
0040f084

0040f080
0040f081
0040f082
0040f083
0040f084

Regan
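
The same experiment can be written as assertions instead of printed addresses. This sketch assumes present-day D, where the by-reference loop variable in foreach is spelled `ref` rather than `inout`; `refAliasesElements` is a hypothetical helper name.

```d
/// Returns true if every `ref` loop variable aliases the matching array element.
bool refAliasesElements(char[] text)
{
    size_t i = 0;
    foreach (ref char c; text)
    {
        // &c should be the address of text[i], not of a temporary copy.
        if (&c != text.ptr + i)
            return false;
        ++i;
    }
    return true;
}

void main()
{
    char[] test = "regan".dup;   // .dup yields a mutable copy of the literal
    assert(refAliasesElements(test));

    foreach (ref char c; test)
        c = 'x';                 // writes go through to the original storage
    assert(test == "xxxxx");
}
```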

> Small D-style nit:
>
> In D, declare pointers as:
>     char* p;
> rather than the C style:
>     char *p;
> because in D:
>     char* p,q;    // p and q are both pointers to char
> whereas in C:
>     char *p,q;    // p is a pointer, q is a char
>
> Using whitespace in this way helps illustrate the left-associativity of D's
> * rather than the right-associativity of C.



July 15, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Matthew
in reply to Andrew

"Andrew" <Andrew_member@pathlink.com> wrote in message news:cd3f9r$1vjl$1@digitaldaemon.com...
> In article <cd37fn$1iv6$1@digitaldaemon.com>, Matthew says...
> >
> >
> >"Andrew Edwards" <ridimz_at@yahoo.dot.com> wrote in message news:cd31h6$1901$1@digitaldaemon.com...
> >>
> >> My experience has been that I can always get an address(index) of an array by specifying an int counter in the foreach loop.
> >>
> >> typedef char[] ORJRecordA;
> >>
> >> struct database
> >> {
> >>    ORJRecordA[] records;
> >> }
> >>
> >> void main ()
> >> {
> >>    database m_database;
> >>    m_database.records ~= cast(ORJRecordA)"Contents @ Address 0";
> >>    m_database.records ~= cast(ORJRecordA)"Contents @ Address 1";
> >>
> >>    foreach(int address, ORJRecordA rec; m_database.records)
> >>     printf("%2d: %.*s\n",address,cast(char[])rec);
> >> }
> >
> >Mate, you'll have to explain what you're doing here. This looks like a grand hack. A mighty beguiling one, to be sure, but a hack nonetheless.
> >
>
> Simply put, all arrays (including char[]) have both an index and a value at that index location. foreach normally allows access to the value; however, you can always access the index by explicitly identifying it.
>
> void main() {
>     char[] string = "this is a string";
>
>     foreach(int idx, char c; string) {
>         printf("%d: %c\n",idx,c);
>         string[idx] = c + 1;
>     }
>     printf(string);
> }

Excellent! I never knew that. :)
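
A small variation on the index-and-value form, written as a sketch against present-day D. `shifted` is a hypothetical helper; it writes into a copy rather than into the array being iterated, and uses `size_t` for the index.

```d
/// Returns a copy of `text` with every character code bumped by one.
char[] shifted(const(char)[] text)
{
    auto result = text.dup;
    foreach (size_t idx, char c; text)
    {
        // idx is the element's position, c is a copy of its value.
        result[idx] = cast(char)(c + 1);
    }
    return result;
}

void main()
{
    assert(shifted("abc") == "bcd");
}
```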

July 15, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Matthew
in reply to Walter

"Walter" <newshound@digitalmars.com> wrote in message news:cd4b5i$gjg$1@digitaldaemon.com...
>
> "Matthew" <admin@stlsoft.dot.dot.dot.dot.org> wrote in message news:cd481n$adi$3@digitaldaemon.com...
> > My main point is that because one has used the "C" address-of operator, array bounds checking should not apply.
>
> I understand, but it may be difficult to implement right, since one still would want to disallow things like:
>
>     p = &array[array.length + 1];

Well, bearing in mind a weakening resolve to argue this point, I'd say that's up to the programmer. They've taken a trip to C-world. Let them swim with the sharks.

July 15, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Matthew
in reply to Regan Heath

"Regan Heath" <regan@netwin.co.nz> wrote in message news:opsa5ek4sh5a2sq9@digitalmars.com...
> On Wed, 14 Jul 2004 11:59:33 -0700, Walter <newshound@digitalmars.com> wrote:
> > I don't know what is happening inside the loop, but what it superficially
> > looks like here is trying to apply C style pointer arithmetic optimizations
> > to D. With foreach, I'll argue that 1) it isn't necessary and 2) using the
> > index form, the optimizer can transform it to the pointer form
> > automatically.
> >
> > While doing the C pointer form is still possible in D, such as:
> >
> > ORJRecordA* begin = cast(ORJRecordA*)m_database.records;
> > ORJRecordA* end = begin + m_database.records.length;
> >
> > I'd argue that one will be better off using foreach or the index form. One
> > reason is that using the pointer form is NOT necessarily the most
> > efficient. Another is that the pointer form can impair more aggressive
> > optimizations. Using the higher level construct will enable advanced
> > compilers to do a better job of code generation than if the source usurps
> > that by going directly to pointer arithmetic.
> >
> > [Note: the 'index form' would be:
> >     for (size_t i = 0; i < m_database.records.length; i++)
> >     {
> >         ... m_database.records[i] ...;
> >     }
> > ]
>
> It appears that using inout on foreach also gives the originals, and thus allows you to take their addresses. Is this guaranteed behaviour? The docs say "inout can be used to update the original elements".
>
> void main()
> {
>    char[] test = "regan";
>    char *begin = cast(char*)test;
>    char *end   = begin+test.length;
>
>    for(; begin != end; ++begin)
>    {
>      printf("%08x\n",begin);
>    }
>    printf("\n");
>
>    foreach(inout char c; test)
>    {
>      printf("%08x\n",&c);
>    }
>    printf("\n");
> }
>
> prints:
>
> 0040f080
> 0040f081
> 0040f082
> 0040f083
> 0040f084
>
> 0040f080
> 0040f081
> 0040f082
> 0040f083
> 0040f084

Excellent work! I'll use that one.

<abashed>
Actually, all this is now moot, since the implementation of Open-RJ has moved on.
Ha ha!
</abashed>

July 15, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Lars Ivar Igesund
in reply to Lars Ivar Igesund

Lars Ivar Igesund wrote:
> Matthew wrote:
> 
>> "Lars Ivar Igesund" <larsivar@igesund.net> wrote in message
>> news:cd2op0$n6m$1@digitaldaemon.com...
>>
>>> Matthew wrote:
>>>
>>>>   ORJRecordA *begin = &m_database.records[0];
>>>>   ORJRecordA *end = &m_database.records[m_database.records.length];
>>>>
>>>>   for(; begin != end; ++begin)
>>>>   {
>>>>        . . .
>>>>
>>>> The process halts with an ArrayBoundsError on the second line above.
>>>
>>>
>>> Well, the last element should be:
>>>
>>>   m_database.records[m_database.records.length - 1];
>>
>>
>>
>> Not correct. Give it another think. :)
>>
> 
> Maybe I misunderstand something, but you get an ArrayBoundsError if you do
> arr[arr.length];
> 
> Lars Ivar Igesund

After rereading your message (rethinking wasn't needed) with some flu cleared from my brain, I understand that my solution was no solution. Sorry for acting stupid :)

Lars Ivar Igesund

July 15, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Matthew Wilson
in reply to Lars Ivar Igesund

"Lars Ivar Igesund" <larsivar@igesund.net> wrote in message news:cd5bi7$1pp2$1@digitaldaemon.com...
> Lars Ivar Igesund wrote:
> > Matthew wrote:
> >
> >> "Lars Ivar Igesund" <larsivar@igesund.net> wrote in message news:cd2op0$n6m$1@digitaldaemon.com...
> >>
> >>> Matthew wrote:
> >>>
> >>>>   ORJRecordA *begin = &m_database.records[0];
> >>>>   ORJRecordA *end = &m_database.records[m_database.records.length];
> >>>>
> >>>>   for(; begin != end; ++begin)
> >>>>   {
> >>>>        . . .
> >>>>
> >>>> The process halts with an ArrayBoundsError on the second line above.
> >>>
> >>>
> >>> Well, the last element should be:
> >>>
> >>>   m_database.records[m_database.records.length - 1];
> >>
> >>
> >>
> >> Not correct. Give it another think. :)
> >>
> >
> > Maybe I misunderstand something, but you get an ArrayBoundsError if you do
> > arr[arr.length];
> >
> > Lars Ivar Igesund
>
> After rereading your message (rethinking wasn't needed) with some flu cleared from my brain, I understand that my solution was no solution. Sorry for acting stupid :)

No worries, mate. :-)

July 15, 2004

Re: Array bounds checking causes algorithmic nasties

Posted by Rex Couture
in reply to Walter

In article <cd3voa$2rie$1@digitaldaemon.com>, Walter says...
>I'd argue that one will be better off using foreach or the index form. One reason is that using the pointer form is NOT necessarily the most efficient. Another is that the pointer form can impair more aggressive optimizations. Using the higher level construct will enable advanced compilers to do a better job of code generation than if the source usurps that by going directly to pointer arithmetic.

Yeah!
