Thread overview | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
March 07, 2010 Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
This comes from a small thread that is going on in digitalmars.D.learn. This program asserts: import std.c.string; struct S { // 16 bytes, with a hole ushort s; double d; } void main() { S s1, s2; memset(&s1, ubyte.min, S.sizeof); memset(&s2, ubyte.max, S.sizeof); s1.s = s2.s = 0; s1.d = s2.d = 0; assert(s1 == s2); } But a correctly implemented opEquals (and opCmp) among structs has to ignore the contents of the holes, because they can be filled with anything, for example if the structs where not initialized (with =void). A correct opEquals has to work recursively (so if the struct contains a string, it has to control the string equality too). And a built-in recursive opCmp/opHash for structs is about as useful as having a correct opEquals. A correct opEquals can be a little slower, but correctness come first. In the uncommon situations where I really need max speed and I don't care of correctness I can use a memcmp(&s1, &s2, S.sizeof). (And the compiler is free to use just memcmp when a struct has no holes and doesn't contain references, associative arrays and dynamic arrays). Correctness of basic struct operators is not an optional feature, like properties or even enums. If the opEquals among structs is wrong then it's better to not have it at all. Bye, bearophile |
March 07, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On 2010-03-07 15:09:53 +0100, bearophile <bearophileHUGS@lycos.com> said:
> This comes from a small thread that is going on in digitalmars.D.learn.
>
> This program asserts:
>
>
> import std.c.string;
> struct S { // 16 bytes, with a hole
> ushort s;
> double d;
> }
> void main() {
> S s1, s2;
> memset(&s1, ubyte.min, S.sizeof);
> memset(&s2, ubyte.max, S.sizeof);
> s1.s = s2.s = 0;
> s1.d = s2.d = 0;
> assert(s1 == s2);
> }
>
>
> But a correctly implemented opEquals (and opCmp) among structs has to ignore the contents of the holes, because they can be filled with anything, for example if the structs where not initialized (with =void).
>
> A correct opEquals has to work recursively (so if the struct contains a string, it has to control the string equality too).
>
> And a built-in recursive opCmp/opHash for structs is about as useful as having a correct opEquals.
>
> A correct opEquals can be a little slower, but correctness come first. In the uncommon situations where I really need max speed and I don't care of correctness I can use a memcmp(&s1, &s2, S.sizeof). (And the compiler is free to use just memcmp when a struct has no holes and doesn't contain references, associative arrays and dynamic arrays).
>
> Correctness of basic struct operators is not an optional feature, like properties or even enums. If the opEquals among structs is wrong then it's better to not have it at all.
one could argue that the unsafe operation is memset.
The compiler always initializes a struct, so that what you describe should never happen in a safe program.
Still as you say the following example that might indeed considered a bug:
S s1=void,s2=void;
s1.s=0; s1.d=0;
s2.s=0; s2.d=0;
assert(s1 == s2);
here the assert might fail depending on the previous content of the memory.
I am not sure of what is the best solution, I am not sure that defining a special comparison operation by default for each struct is the correct solution (it can be quite some bloat), please note that a user defined comparison function will not have these problems.
Still I agree that traking down a bug due to this might be very ugly...
|
March 07, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Fawzi Mohamed | Fawzi Mohamed: >I am not sure of what is the best solution,< This is true for me too, that's why I have created this thread, to find a way to solve this problem. >I am not sure that defining a special comparison operation by default for each struct is the correct solution (it can be quite some bloat), please note that a user defined comparison function will not have these problems.< Note that if you write a opEquals for each struct then you have the same code bloat. The difference is that you have had to write lot of boring code, that can contain bugs, and your program is longer. In a program you don't use == and != (or hashing) on all the structs, so in theory the compiler can add such opEquals only to the structs that are tested for equality in the program. I guess the usual modular compilation requirement of D code asks too all structs to have such opEquals. In this case the Link-Time Optimization of LLVM comes to the rescue and removed unused functions from the whole program/whole compilation block. Another possible solution is to have a single opEquals in the runtime, that works like the array sort, using run time knowledge about the types. But this is of probably slower. Bye, bearophile |
March 07, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | There are other solutions. For example remove the current opEquals from structs, so doing == among two structs become a syntax error. And then add a property like @equable that when present beside the struct name adds a specific and correct and recursive opEquals to it. @equable stuct Foo { int x; } Probably instead of @equable it's better to have an attribute that adds opHash, opEquals and opCmp to a struct. This is safe, avoids most code bloat, and keeps code fast. The disadvantage is a that it adds a new attribute to the language. If the user can create new attributes, then it's surely possible to define this attribute with D code and some compile-time reflection, keeping the language simple enough. There is one or more Java libs that use attributes for this specific purpose. Bye, bearophile |
March 07, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | bearophile wrote:
> This comes from a small thread that is going on in digitalmars.D.learn.
>
> This program asserts:
>
>
> import std.c.string;
> struct S { // 16 bytes, with a hole
> ushort s;
> double d;
> }
> void main() {
> S s1, s2;
> memset(&s1, ubyte.min, S.sizeof);
> memset(&s2, ubyte.max, S.sizeof);
> s1.s = s2.s = 0;
> s1.d = s2.d = 0;
> assert(s1 == s2);
> }
>
>
> But a correctly implemented opEquals (and opCmp) among structs has to ignore the contents of the holes, because they can be filled with anything, for example if the structs where not initialized (with =void).
> Correctness of basic struct operators is not an optional feature, like properties or even enums. If the opEquals among structs is wrong then it's better to not have it at all.
Yes. Most of the problems with struct opEquals were fixed in bug 3433. You've found an annoying case which wasn't covered.
|
March 07, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | bearophile wrote: > But a correctly implemented opEquals (and opCmp) among structs has to > ignore the contents of the holes, because they can be filled with > anything, The "holes" are defined to be filled with 0, and are when initialized by the compiler. This is specifically so that memcmp can be done to compare the struct contents. > for example if the structs where not initialized (with =void). If you use =void, or allocate with malloc(), it is your responsibility to deal with the holes. |
March 07, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Fawzi Mohamed | Fawzi Mohamed wrote: > one could argue that the unsafe operation is memset. That's right. > The compiler always initializes a struct, so that what you describe should never happen in a safe program. Right. > Still as you say the following example that might indeed considered a bug: > > S s1=void,s2=void; > s1.s=0; s1.d=0; > s2.s=0; s2.d=0; > assert(s1 == s2); > > here the assert might fail depending on the previous content of the memory. > I am not sure of what is the best solution, I am not sure that defining a special comparison operation by default for each struct is the correct solution (it can be quite some bloat), please note that a user defined comparison function will not have these problems. No, it's not a bug in the language, it's a bug in the user code. Using =void is an advanced feature, and it requires the user to know what he's doing with it. That's why it isn't allowed in safe mode. > Still I agree that traking down a bug due to this might be very ugly... The idea with =void initializations is that they are findable using grep, and so can be singled out for special attention when there is a problem. |
March 08, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright:
> The "holes" are defined to be filled with 0, and are when initialized by the compiler. This is specifically so that memcmp can be done to compare the struct contents.
Thank you for your answer. I didn't read this important detail in the online documentation, I must have missed it, sorry. It's a detail that D programmers must keep in mind well.
A future lint program for D will need to try to warn the programmer that using the default opEquals among uninitialized structs is wrong.
Bye,
bearophile
|
March 08, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright Wrote:
> Fawzi Mohamed wrote:
> > one could argue that the unsafe operation is memset.
>
> That's right.
>
>
> > The compiler always initializes a struct, so that what you describe should never happen in a safe program.
>
> Right.
>
>
> > Still as you say the following example that might indeed considered a bug:
> >
> > S s1=void,s2=void;
> > s1.s=0; s1.d=0;
> > s2.s=0; s2.d=0;
> > assert(s1 == s2);
> >
> > here the assert might fail depending on the previous content of the memory.
> > I am not sure of what is the best solution, I am not sure that defining
> > a special comparison operation by default for each struct is the correct
> > solution (it can be quite some bloat), please note that a user defined
> > comparison function will not have these problems.
>
> No, it's not a bug in the language, it's a bug in the user code. Using =void is an advanced feature, and it requires the user to know what he's doing with it. That's why it isn't allowed in safe mode.
>
>
> > Still I agree that traking down a bug due to this might be very ugly...
>
> The idea with =void initializations is that they are findable using grep, and so can be singled out for special attention when there is a problem.
The compiler knows at compile-time what variables are initialized with "=void". The compiler than can add a compilation error when such a struct variable is used in an equals expression. this doesn't cover use of malloc() which must be the user's responsebility.
e.g.
S s1=void,s2=void;
s1.s=0; s1.d=0;
s2.s=0; s2.d=0;
assert(s1 == s2); // <- this line should not compile
|
March 08, 2010 Re: Holes in structs and opEquals | ||||
---|---|---|---|---|
| ||||
Posted in reply to yigal chripun | yigal chripun wrote:
> The compiler knows at compile-time what variables are initialized with "=void". The compiler than can add a compilation error when such a struct variable is used in an equals expression. this doesn't cover use of malloc() which must be the user's responsebility.
>
> e.g.
> S s1=void,s2=void;
> s1.s=0; s1.d=0;
> s2.s=0; s2.d=0;
> assert(s1 == s2); // <- this line should not compile
>
It can, but I don't agree that it should. For an =void initialization, it's the user's responsibility to initialize it properly. The use of =void implies the user knows what he's doing with it, and will take care to initialize the 'holes' as necessary.
Trying to disable == for such structs is a losing battle, anyway, as the compiler could only detect the most obvious cases. Pass a reference to it to a function, store it in a data structure, etc., and all that goes away.
|
Copyright © 1999-2021 by the D Language Foundation