Thread overview
[bug] static array slice as associative array index
Apr 11, 2004
serge
Apr 11, 2004
serge
Re: [BUG] static array slice as associative array index
Apr 11, 2004
larry cowan
Apr 11, 2004
larry cowan
Apr 12, 2004
serge
Apr 12, 2004
School
Apr 12, 2004
J Anderson
Apr 13, 2004
School
Apr 13, 2004
J Anderson
Apr 13, 2004
larry cowan
Apr 14, 2004
Ben Hinkle
Re: [bug] static array slice as associative array index - hash-test.zip
Apr 14, 2004
serge
Apr 14, 2004
serge
Apr 14, 2004
Ben Hinkle
Apr 16, 2004
Ben Hinkle
Apr 16, 2004
C
April 11, 2004
I have just found the D language and was interested in its performance. I found this page http://www.functionalfuture.com/d/ and was not impressed by the 'hash' benchmark performance, so I tried to write my own implementation. The problem is that it seems to work for small values of 'n' but starts producing incorrect results when 'n' becomes larger (for n = 100000, for example, it gives 25659 instead of 18699). I use DMD 0.82 on Linux. If there is no bug in the compiler, I would like to know what's wrong with my program. Thanks.


import std.c.stdio;
import std.string;

int main(char[][] args)
{
    int n = args.length < 2 ? 1 : atoi(args[1]);
    int[char[]] X;
    char[32] buf;
    int c = 0;

    for (int i = 1; i <= n; i++)
    {
        sprintf(buf, "%x", i);
        X[buf[0..strlen(buf)]] = i;
    }

    for (int i = n; i >= 1; i--)
    {
        sprintf(buf, "%d", i);
        if (buf[0..strlen(buf)] in X) c++;
    }

    printf("%d\n", c);

    return 0;
}


April 11, 2004
In article <c5b7km$osi$1@digitaldaemon.com>, serge@lxnt.info says...
>
>I have just found the D language and was interested in its performance. I found this page http://www.functionalfuture.com/d/ and was not impressed by the 'hash' benchmark performance, so I tried to write my own implementation. The problem is that it seems to work for small values of 'n' but starts producing incorrect results when 'n' becomes larger (for n = 100000, for example, it gives 25659 instead of 18699). I use DMD 0.82 on Linux. If there is no bug in the compiler, I would like to know what's wrong with my program. Thanks.

Here is a smaller testcase:

int main()
{
    char[1] key;
    int[char[]] dictionary;

    key[0] = 'a';
    dictionary[key] = 1;
    key[0] = 'b';
    dictionary[key] = 2;

    foreach(char[] k, int v; dictionary)
        printf("dictionary[\"%.*s\"] = %d\n", k, v);

    return 0;
}

It gives the following results when executed:

dictionary["b"] = 1
dictionary["b"] = 2

Replacing 'dictionary[key]' with 'dictionary[key.dup]' helps to fix this problem.
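
For anyone trying this on a current D compiler, here is a minimal sketch of the same fix. It assumes D2, where string keys are immutable and `.idup` is the immutable counterpart of `.dup`; as far as I know, the unduplicated form is rejected at compile time there. The helper name `buildDictionary` is made up for illustration and is not part of the testcase above.

```d
import std.stdio : writefln;

// Build the two-entry dictionary from the testcase, copying the key
// before each insert so that later writes to `key` cannot affect
// entries that are already stored.
int[string] buildDictionary()
{
    char[1] key;
    int[string] dictionary;

    key[0] = 'a';
    dictionary[key[].idup] = 1;   // stores an independent copy of "a"
    key[0] = 'b';
    dictionary[key[].idup] = 2;   // stores an independent copy of "b"

    return dictionary;
}

void main()
{
    // Iteration order of an associative array is not specified.
    foreach (k, v; buildDictionary())
        writefln("dictionary[\"%s\"] = %d", k, v);
}
```

With the copies in place the dictionary holds one entry for "a" and one for "b", instead of two entries that both read "b".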


April 11, 2004
In article <c5c9r5$2bsb$1@digitaldaemon.com>, serge says...
>
>Here is a smaller testcase:
>
>int main()
>{
>char[1] key;
>int[char[]] dictionary;
>
>key[0] = 'a';
>dictionary[key] = 1;
>key[0] = 'b';
>dictionary[key] = 2;
>
>foreach(char[] k, int v; dictionary)
>printf("dictionary[\"%.*s\"] = %d\n", k, v);
>
>return 0;
>}
>
>It gives the following results when executed:
>
>dictionary["b"] = 1
>dictionary["b"] = 2
>
>Replacing 'dictionary[key]' with 'dictionary[key.dup]' helps to fix this problem.
>

Basically, you have solved your own problem.  It is one of the things you are going to have to just get used to - at least for now.

You have a single array defined (key) and set it to "a", then load a dictionary item ("a", 1), which saves a reference to the array and the value 1. Then you change the array content (no size change, no copy) to "b" and add a second item, which picks up a reference to the same key (the char array) and the value 2.

If you then tried to get the value for the literal "a", you would get no answer, for 2 reasons: 1. the reference for both entries points to an array (key) with current value "b"; 2. the literal "a" doesn't match anything. Furthermore, if you now set the array to "a" or "c", it would match both entries. Non-intuitive, you bet! I don't like the way it works either.

Asking the dictionary to store a dup freezes the key's value, and then intuitive matching can occur. But I don't see when you would ever want the current behavior, unless you were loading a prebuilt array of keys with appropriate values, or using some other load method where each key and value has its own distinct reference.

And finally, if you add the line
printf("key = %.*s, value = %d\n","x",dictionary["x"]);
before your display loop, you will somehow add an entry "x" with a value of 0 just for having made the reference.
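
A hedged sketch of how to look a key up without risking that side effect: the `in` operator only tests for the key. The code assumes a current D2 compiler, where `in` yields a pointer to the stored value (null when the key is missing); the function name `lookup` and the fallback parameter are hypothetical, chosen for illustration.

```d
import std.stdio : writeln;

// Return the value stored under `key`, or `fallback` when the key is absent.
// `key in dictionary` does not add anything to the array.
int lookup(int[string] dictionary, string key, int fallback)
{
    if (auto p = key in dictionary)
        return *p;
    return fallback;
}

void main()
{
    int[string] dictionary = ["a": 1, "b": 2];

    writeln(lookup(dictionary, "x", 0));   // key is missing, fallback is used
    writeln(dictionary.length);            // still two entries
}
```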

I think there are problems in the current implementation, but I don't know whether it's by accident or by design.  Maybe there are some rules for use that will prevent all this, but they are not posted anywhere I know about.

Generally, the performance of the compiler and its code is very good, but there are still limitations (bugs?) in both the design and the implementation.  Some of the library stuff is still in alpha state and can be too slow.

Everything is still in an as-is - help improve it - state as long as the release level stays below 1.0 - sort-of alpha, sort-of beta, but coming along very nicely with lots of potential.  Criticisms, suggested or donated code,  bug reports, etc are all welcome.  Join the party.


April 11, 2004
>>In article <c5b7km$osi$1@digitaldaemon.com>, serge@lxnt.info says...

As far as your original program is concerned, after 6999 it starts incrementing twice for each entry found instead of doing just the one c++. This looks like a compiler bug.

Also, you don't need the slices; "buf" works just as well. The sprintf calls are extern C and fill the D array buf, and the strlen works because the \0 that sprintf ends buf with is either still there or carried back by the current D compiler. I don't know whether you can count on this as part of the compiler specification or whether it is just Walter's current method of implementation.


April 12, 2004
In article <c5ckom$2t8g$1@digitaldaemon.com>, larry cowan says...
>
>
>>>In article <c5b7km$osi$1@digitaldaemon.com>, serge@lxnt.info says...
>
>As far as your original program is concerned, after 6999, then it starts incrementing twice for each entry found instead of just the c++.  This looks like a compiler bug.

The original program is not written correctly, so it is no surprise that it behaves strangely :) Inserting the same ".dup" suffix fixes all the problems, at least for me.
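
For reference, a sketch of the corrected program with the key copied on insert. It is written for a current D2 compiler (an assumption; the thread uses DMD 0.82), so it uses `core.stdc` imports, `snprintf`, and `.idup` rather than `.dup`. The function name `countMatches` is hypothetical; it only wraps the two loops so the result can be checked.

```d
import core.stdc.stdio : snprintf;
import core.stdc.string : strlen;
import std.stdio : writeln;

// Insert the hex strings of 1..n as keys, then count how many decimal
// strings of 1..n are found among them.
int countMatches(int n)
{
    int[string] X;
    char[32] buf = '\0';
    int c = 0;

    for (int i = 1; i <= n; i++)
    {
        snprintf(buf.ptr, buf.length, "%x", cast(uint) i);
        // Copy the slice: the table must not keep a reference into buf.
        X[buf[0 .. strlen(buf.ptr)].idup] = i;
    }

    for (int i = n; i >= 1; i--)
    {
        snprintf(buf.ptr, buf.length, "%d", i);
        // A lookup does not store the key, so no copy is needed here.
        if (buf[0 .. strlen(buf.ptr)] in X)
            c++;
    }

    return c;
}

void main()
{
    writeln(countMatches(100000));
}
```

For n = 100000 this should give the 18699 that the first post expects.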

>Also, you don't need the slices - "buf" works just as well - because the sprintf's are extern C, and create a D array for buf, but the strlen works because the \0 that the sprintf ends buf with is either still there, or carried back by the current D compiler. I don't know if you can count on this as a compiler specification or just as Walter's current method of implementation.

Using just "buf" is not quite the same: it is a string 32 characters long with embedded '\0' bytes, but the program needs just the hexadecimal strings as associative array indexes. For example, formatting strings with sprintf into a static buffer could result in the following two buffers: "123\0A\0\0\0...", "123\0B\0\0\0...". With just "buf.dup" they are different keys, but with "buf[0..strlen(buf)].dup" they are the same.
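
The difference can be sketched as follows, assuming a current D2 compiler. The helpers `formatTwice`, `wholeKey` and `slicedKey` are made-up names; formatting a longer number first and a shorter one second is just a way to leave stale characters after the terminating '\0', as a reused static buffer would.

```d
import core.stdc.stdio : snprintf;
import core.stdc.string : strlen;
import std.stdio : writeln;

// Format `first` and then `second` into the same zeroed buffer.
void formatTwice(ref char[32] buf, int first, int second)
{
    buf[] = '\0';
    snprintf(buf.ptr, buf.length, "%d", first);
    snprintf(buf.ptr, buf.length, "%d", second);
}

// Key made from all 32 bytes, including whatever follows the first '\0'.
string wholeKey(int first, int second)
{
    char[32] buf;
    formatTwice(buf, first, second);
    return buf[].idup;
}

// Key made only from the characters before the first '\0'.
string slicedKey(int first, int second)
{
    char[32] buf;
    formatTwice(buf, first, second);
    return buf[0 .. strlen(buf.ptr)].idup;
}

void main()
{
    // Both buffers start with "123\0" but differ in the stale byte after it.
    writeln(wholeKey(12345, 123) == wholeKey(12346, 123));    // false
    writeln(slicedKey(12345, 123) == slicedKey(12346, 123));  // true
}
```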

I wonder if it is possible to add runtime checks to detect modifications of associative array keys, as this could be a source of bugs that are hard to debug (just forget to duplicate a string used as an array index and you are in trouble).

Also, experimenting more with static arrays, I came up with the following (incorrect) program. It returns a reference to a string allocated on the stack, and the DMD compiler does not report it as an error. Such bugs are very easy to make for people unfamiliar with D (newcomers like me), so it might be useful for the compiler to detect this situation as an error.

import std.string;

char[] f()
{
    char[32] x;           // lives in f's stack frame
    strcpy(x, "abc");
    return x;             // incorrect: the result refers to stack memory
}

int main()
{
    printf("%.*s\n", f());
    return 0;
}
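
A sketch of one way to write `f` safely: return a heap copy of the filled part of the buffer instead of the buffer itself. This assumes a current D2 compiler with `core.stdc.string`; as far as I know, such a compiler also rejects the direct `return x;` shown above.

```d
import core.stdc.string : strcpy, strlen;
import std.stdio : writeln;

char[] f()
{
    char[32] x = '\0';
    strcpy(x.ptr, "abc");
    // .dup allocates a copy on the heap, which outlives f's stack frame.
    return x[0 .. strlen(x.ptr)].dup;
}

void main()
{
    writeln(f());   // prints abc
}
```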


April 12, 2004
serge wrote:

> I wonder if it is possible to add runtime checks to detect modifications of associative array keys, as it could be a source of the bugs which are hard to debug (just forget to duplicate a string for array index and you are in a trouble).
I hope future releases will give a warning. In fact, it is not a bug; it is just the way D handles char arrays, or strings in general.
-- 
School, yet another nickname for anonymous.
:D ;-D
April 12, 2004
School wrote:

> I hope future releases will give a warning. In fact, it is not a bug; it is just the way D handles char arrays, or strings in general.

There are no warnings in D. Walter's position is that code is either acceptable to the compiler or it is not, with nothing in between.

-- 
-Anderson: http://badmama.com.au/~anderson/
April 13, 2004
J Anderson wrote:
> There are no warnings in D. With warnings, Walter believes that it's either true or false, not in between.
> 
Oh, sorry, I hadn't realized that. I had been wondering why I only ever get errors. It is simply because the D compiler has no so-called warnings.

-- 
School, yet another nickname for anonymous.
:D ;-D
April 13, 2004
School wrote:

>J Anderson wrote:
>
>>There are no warnings in D. With warnings, Walter believes that it's either true or false, not in between.
>
>Oh, sorry, I hadn't realized that. I had been wondering why I only ever get errors. It is simply because the D compiler has no so-called warnings.

Walter:
"D compilers will not generate warnings for questionable code. Code will either be acceptable to the compiler or it will not be. This will eliminate any debate about which warnings are valid errors and which are not, and any debate about what to do with them. The need for compiler warnings is symptomatic of poor language design."
http://www.digitalmars.com/d/overview.html

I agree.

Once you've used several different C++ compilers with different errors and different interpretations of warnings, warnings become a real pain. ...And then there are people who simply ignore warnings until they have hundreds. What's the use of warnings then? You're not going to read them all, and neither is anyone new who joins the team (and if they do, they're wasting their time as well, because you know 99% are OK). If you go through and fix or hide them anyway, then they should have been errors in the first place.

--  -Anderson: http://badmama.com.au/~anderson/
April 13, 2004
In article <c5e1im$20dn$1@digitaldaemon.com>, serge says...
>
..
>
>Using just "buf" is not quite the same, it is a string 32 characters long with embedded '\0' bytes. But the program just needs hexadecimal strings as associative array indexes. For example, formatting strings using sprintf into a static buffer could result in the following two strings: "123\0A\0\0\0...", "123\0B\0\0\0...". So using just "buf.dup", they will be different keys, but using "buf[0..strlen(buf)].dup", they are the same.

(I presume you are talking about linefeed x0A vs carriage return x0D.) To slice the last char off the string you need "buf[0..strlen(buf)-1]", whether or not you are using ".dup".

You are right about the length of buf, but that opens the question of whether key comparisons should be string compares or array compares: should "123\0456\0789\0\0..." be the same as "123\0654\0987\0\0..." or not?

>I wonder if it is possible to add runtime checks to detect modifications of associative array keys, as it could be a source of the bugs which are hard to debug (just forget to duplicate a string for array index and you are in a trouble).
>

That would probably be terribly expensive for the compiler to insert, and for you to do by hand as well. I feel that the point where the string is taken as a key for the array is where the compiler should take a copy, not a pointer. The key should be frozen data, not some multi-use area with changeable contents. I don't like the potential for obscure tricks being developed with the current method or, worse, the likelihood of insecure programs being developed so easily.

