April 17, 2010
bearophile wrote:
> Walter Bright:
>> You won't be able to cast pointers from integral types in safe
>> functions.
> 
> That doesn't solve the problem, because I will surely want to use
> unsafe code in D, and unsafe modules will keep having the same
> undefined-derived bugs inherited from C. What I was asking for in
> this thread is to fix some of the C holes, not to just forbid the
> things I was looking for in D in the first place. If I use D instead
> of, for example, Python, it is because D has unions and pointers, which
> allow me to create the tight data structures that have good
> performance. I am not interested in using D just as a Java.
> 
> This can be an irreducible difference between my ideal language and
> D. Maybe my purpose is hopeless, who knows. My ideal system language
> is like a C that helps me avoid a large percentage of possible bugs.
> A language whose behavior the programmer can predict, with
> lower level features. Maybe someday I'll try to create this language
> :-)

I don't see any way to make conversions between pointers and ints implementation defined, and make dereferencing a pointer coming from some int anything but undefined behavior.

April 17, 2010
bearophile wrote:
> Lars T. Kyllingstad:
>> The effect of @safe would be to forbid code that leads to undefined
>>  behaviour, not make it well-defined.
> 
> Right, but that's not the solution I was looking for, and it's not
> going to solve the problems inherited from C. People who use D want
> to use unsafe code too; otherwise they would use C#/Java. Having
> safe modules in D is a good idea, but safe modules can't be a
> replacement for efforts to make the low level code safer too.

I'm confused. It appears you want to write unsafe code and yet have it be guaranteed safe.

Functions tagged with @system are where you should put unsafe code.
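
A minimal sketch of that split, assuming a current D compiler and its @safe rules as I understand them. The helper names `addressOf` and `pointerFrom` are made up for illustration and do not appear in the thread. Reading a pointer's address is accepted in @safe code, while building a pointer from an integer is kept in a @system function:

```d
import std.stdio : writefln;

// Hypothetical helper: converting a pointer to an integer only reads the address.
@safe size_t addressOf(int* p)
{
    return cast(size_t) p;
}

// Hypothetical helper: creating a pointer from raw bits belongs in @system code.
@system int* pointerFrom(size_t bits)
{
    return cast(int*) bits;
}

// Assumption: the same cast inside a @safe function literal is rejected.
static assert(!__traits(compiles,
    (size_t bits) @safe { int* p = cast(int*) bits; }));

void main()
{
    int x = 14;
    size_t bits = addressOf(&x);
    int* p = pointerFrom(bits);   // round trip back to the same address
    assert(p is &x);
    writefln("%x", bits);         // e.g. for debug output
}
```

The `static assert` only documents the expectation; if a given compiler accepts the cast in @safe code, the build fails at that line instead of silently disagreeing with the comment.
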
April 17, 2010
Hello Walter,

> bearophile wrote:
> 
> I'm confused. It appears you want to write unsafe code and yet have it
> be guaranteed safe.

Currently the described code is legal, unsafe (it can result in invalid pointers) and has undefined semantics (it can result in unpredictable, implementation defined results). What I think bearophile wants is for only the last to be changed; that is, you can still do things that result in invalid pointers, but the code does so in a well defined way (at least with regard to the bit pattern the pointer ends up as).



-- 
... <IXOYE><



April 17, 2010
BCS wrote:
> Currently the described code is legal, unsafe (it can result in invalid pointers) and has undefined semantics (it can result in unpredictable, implementation defined results). What I think bearophile wants is for only the last to be changed; that is, you can still do things that result in invalid pointers, but the code does so in a well defined way (at least with regard to the bit pattern the pointer ends up as).

I don't think that's a useful thing to specify - where's the advantage, and if D is on a machine that does pointers differently, why make it impossible to port standard D to it?
April 17, 2010
On 2010-04-17 13:35:16 -0400, Walter Bright <newshound1@digitalmars.com> said:

> BCS wrote:
>> Currently the described code is legal, unsafe (it can result in invalid pointers) and has undefined semantics (it can result in unpredictable, implementation defined results). What I think bearophile wants is for only the last to be changed; that is, you can still do things that result in invalid pointers, but the code does so in a well defined way (at least with regard to the bit pattern the pointer ends up as).
> 
> I don't think that's a useful thing to specify - where's the advantage, and if D is on a machine that does pointers differently, why make it impossible to port standard D to it?

Methinks this is not a very good argument. Supporting obscure platforms isn't very useful; that's why D only supports two's-complement arithmetic (you said it yourself).

There is a very good reason to disallow manipulating the bit pattern in safe D however: memory safety. If you can dereference a pointer made from an arbitrary bit pattern, you may have an exploitable flaw similar to a buffer overrun. Dereferencing an arbitrary value is definitely *not* memory-safe and should *not* be allowed in safe D.

So you shouldn't be able to cast a value to a pointer. The reverse, casting a pointer to a value, makes sense in my opinion: you may want to print the pointer value in a debug output of some sort. There's nothing unsafe with that so it should be allowed.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

April 18, 2010
Hello Walter,

> BCS wrote:
> 
>> Currently the described code is legal, unsafe (it can result in
>> invalid pointers) and has undefined semantics (it can result in
>> unpredictable, implementation defined results). What I think
>> bearophile wants is for only the last to be changed; that is, you can
>> still do things that result in invalid pointers, but the code does so
>> in a well defined way (at least with regard to the bit pattern the
>> pointer ends up as).
>> 
> I don't think that's a useful thing to specify - where's the
> advantage, and if D is on a machine that does pointers differently,
> why make it impossible to port standard D to it?
> 

#1 Can you point to a machine in use now (keeping in mind D already dumped near/far pointers) that "does pointers differently"?

#2 Why allow code that compiles, runs without error and works on one architecture, but on another compiles, runs without error and does NOT work?

I'll grant Michel's point about pointer->int for debugging, etc. but even then I'd consider requiring an explicit cast.

In the end, while I see the point and see some merit, I'm almost neutral on the subject.

-- 
... <IXOYE><



April 18, 2010
Michel Fortin wrote:
> There is a very good reason to disallow manipulating the bit pattern in safe D however: memory safety. If you can dereference a pointer made from an arbitrary bit pattern, you may have an exploitable flaw similar to a buffer overrun. Dereferencing an arbitrary value is definitely *not* memory-safe and should *not* be allowed in safe D.

And it is not allowed in safe functions.

> So you shouldn't be able to cast a value to a pointer. The reverse, casting a pointer to a value, makes sense in my opinion: you may want to print the pointer value in a debug output of some sort. There's nothing unsafe with that so it should be allowed.

These are allowed in safe functions.
April 18, 2010
Walter Bright:

Sorry for the delay, I was away.
In this post I try to write in a quite explicit way.


>I don't see any way to make conversions between pointers and ints implementation defined,<

I see. Thank you for the explanation, I'm often ignorant enough.


In my original post I was talking about all the places where the C standard leaves things undefined. I'm not a C language lawyer, so I don't know all the things the C standard leaves undefined, but I know there are other undefined things in C besides the pointer <-> int conversion. That's why I was saying that it can be quite useful to write down a list of such things. So even if there is no hope of fixing this pointer <-> int hole, maybe there are other C holes that can be fixed. I will not be able to write down a complete list, but I think having a complete list would be a good starting point.

In my original post I listed two more things that I think the C standard leaves undefined:
- Pointer aliasing;
- Read of a union field different from the last field written;

The first of them is fixed in C99 with the 'restrict' keyword. I guess the D compiler has to assume all pointers can alias each other (but I don't remember if the D docs say this explicitly somewhere) because I think D prefers not to offer keywords that the compiler itself can't then check for correctness.

The second of them relates to code like:

union SI { short s; int i; }
void main() {
  SI e;
  e.i = 1_000_000;
  int foo = e.s;
}

I think that according to the C standard the result of this code (the contents of foo) is undefined. Is D going to define this, or is it going to leave this undefined as in C? (Leaving it undefined can speed up the D code a little, but making it defined can make D more flexible; for example, you can use a union to split an int into two shorts in a reliable way.) Note: here I am talking about D unsafe modules, because I think safe D modules can't use unions. So I am talking about the possibility of removing some undefined behaviours from unsafe D modules.

Probably the C standard leaves other things undefined. Some of them can cause bugs in unsafe D code.

Bye,
bearophile
April 19, 2010
bearophile wrote:
> The first of them is fixed in C99 with the 'restrict' keyword. I
> guess the D compiler has to assume all pointers can alias
> each other (but I don't remember if the D docs say this explicitly
> somewhere) because I think D prefers not to offer keywords that the
> compiler itself can't then check for correctness.

'restrict' is not at all about eliminating undefined behavior. It is about providing more information to the optimizer so better code can be generated. If restrict is used incorrectly, however, undefined behavior can result. D doesn't have this problem because D doesn't have the restrict qualifier.
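
To illustrate what the optimizer has to assume without such a qualifier, here is a small sketch. The function `addTwice` is a made-up example, not something from the thread. Because `a` and `b` may point to the same int, the compiler cannot keep `*b` cached across the first store, and both calls below have well-defined results:

```d
// Hypothetical example function: adds *b to *a twice.
void addTwice(int* a, int* b)
{
    *a += *b;
    *a += *b; // *b has to be reloaded: the first store may have changed it
}

void main()
{
    // Distinct objects: 1 + 1 + 1
    int x = 1, y = 1;
    addTwice(&x, &y);
    assert(x == 3);

    // Same object passed twice: 1 + 1 = 2, then 2 + 2 = 4
    int z = 1;
    addTwice(&z, &z);
    assert(z == 4);
}
```

In C99, declaring both parameters `restrict` would be a promise that the second call never happens; breaking that promise is where the undefined behavior comes from.
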

> The second of them is relative to code like:
> 
> union SI { short s; int i; }
> void main() {
>   SI e;
>   e.i = 1_000_000;
>   int foo = e.s;
> }
> 
> I think that according to the C standard the result of this code (the
> contents of foo) is undefined. Is D going to define this, or is it
> going to leave this undefined as in C? (Leaving it undefined can speed
> up the D code a little, but making it defined can make D more
> flexible; for example, you can use a union to split an int into two
> shorts in a reliable way.) Note: here I am talking about D unsafe
> modules, because I think safe D modules can't use unions. So I am
> talking about the possibility of removing some undefined behaviours
> from unsafe D modules.

D leaves byte ordering (endianness) implementation defined. I see no way to do otherwise without incurring severe performance penalties.

> Probably the C standard leaves other things undefined. Some of them
> can cause bugs in unsafe D code.

Yes, endianness issues can cause bugs.
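
A short sketch of the endianness dependence, assuming the quoted type is meant to be a union. The helpers `lowHalf` and `highHalf` are hypothetical names introduced here. The value read through the union depends on byte order, while splitting with shifts and masks gives the same halves on any target:

```d
union SI { short s; int i; }

// Hypothetical helpers: portable split of an int into two 16-bit halves.
short lowHalf(int v)  { return cast(short)(v & 0xFFFF); }
short highHalf(int v) { return cast(short)(v >> 16); }

void main()
{
    SI e;
    e.i = 1_000_000;               // 0x000F_4240

    // The overlapping short sees whichever bytes come first in memory.
    version (LittleEndian) assert(e.s == 0x4240);
    version (BigEndian)    assert(e.s == 0x000F);

    // The shift-based split does not depend on byte order.
    assert(lowHalf(1_000_000)  == 0x4240);
    assert(highHalf(1_000_000) == 0x000F);
}
```

The shift version costs a couple of ALU instructions, which is usually cheap compared with a forced byte swap on every union access.
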
April 19, 2010
On 04/18/2010 02:46 PM, Walter Bright wrote:
> Michel Fortin wrote:
>> So you shouldn't be able to cast a value to a pointer. The reverse,
>> casting a pointer to a value, makes sense in my opinion: you may want
>> to print the pointer value in a debug output of some sort. There's
>> nothing unsafe with that so it should be allowed.
>
> These are allowed in safe functions.

Just checking, this is allowed:

import std.random : uniform;

@safe void crash_maybe() {
    int* p = cast(int*)uniform(size_t.min, size_t.max);
    *p = 14;
}

right?