July 09, 2016
On Saturday, 9 July 2016 at 16:38:02 UTC, Max Samukha wrote:
> On Saturday, 9 July 2016 at 14:58:55 UTC, Andrew Godfrey wrote:
>> On Saturday, 9 July 2016 at 06:31:01 UTC, Max Samukha wrote:
>>> On Saturday, 9 July 2016 at 04:32:25 UTC, Andrew Godfrey wrote:
>>>
>
>> This is a tangent from the subject of this thread, but: No, that just says how it is implemented, not what it means / intends. See "the 7 stages of naming", here: http://arlobelshee.com/good-naming-is-a-process-not-a-single-step/
>>
>> (That resource is talking about identifier naming, not keywords. But it applies anyway.)
>
> You have a point, but the name is still not 'just bonkers', all things considered. Metonymy is justified in many cases, and I think this is one of them. What better name would you propose?

First, I'm not proposing a change to existing keywords, I'm using existing examples to talk about future language changes. Second, I had to look up "metonymy" in Wikipedia. Using its example: Suppose "Hollywood" referred to both the LA movie industry and, say, the jewelry industry; that's roughly equivalent to the pattern I'm talking about.

Others in this thread have suggested alternatives; many of those have things to criticize, but I would prefer something cryptic over something that has multiple subtly different meanings in the language.
I'm drawn to "#if", except people might end up thinking D has a macro preprocessor. "ifct" seems fine except I'm not sure everyone would agree how to pronounce it. Compile-time context seems significant enough that maybe it warrants punctuation, like "*if" or "$if".

I especially want to establish: If we were adding a new feature as significant as "static if", and we decided a keyword was better than punctuation, could we stomach the cost of making a new keyword, or would we either shoehorn it into one of the existing keywords unused in that context or start talking about using attributes? I have a lot of experience with backward compatibility, but I still don't understand the reluctance to introduce new keywords (assuming a freely available migration tool).
July 09, 2016
On 07/09/2016 06:36 PM, Timon Gehr wrote:
> Undefined behaviour means the language semantics don't define a
> successor state for a computation that has not terminated. Do you agree
> with that definition? If not, what /is/ UB in D, and why is it called UB?

Yah, I was joking with Walter that effectively the moment you define undefined behavior it's not undefined any longer :o). It happens to the best of us. I think we're all aligned here.

There's some interesting interaction here. Consider:

int fun(int x)
{
    int[10] y;
    ...
    return ++y[9 >> x];
}

Now, under the "shift by negative numbers is undefined" rule, the compiler is free to eliminate the bounds check from the indexing because it's always within bounds for all defined programs. If it isn't, memory corruption may ensue. However, if the compiler says "shift by negative numbers is implementation-specified", the compiler cannot portably eliminate the bounds check.
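
To make the reasoning concrete (a sketch of what the optimizer is allowed to assume, with a hypothetical function name; this is illustration, not actual compiler output):

// For every input on which the program is defined, the index is in bounds:
//
//   x in [0, 31]:  9 >> x is one of {9, 4, 2, 1, 0}, always < 10
//   any other x:   the shift itself is undefined, so "cannot happen"
//
// so the compiler may emit code equivalent to:
int fun_optimized(int x)
{
    int[10] y;
    // ... other work on y, as in the example above ...
    return ++y.ptr[9 >> x]; // pointer indexing: the bounds check is gone
}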

It's a nice example illustrating how things that seem to have nothing to do with memory corruption do affect it.


Andrei

July 09, 2016
On Sat, Jul 09, 2016 at 07:17:59PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 07/09/2016 06:36 PM, Timon Gehr wrote:
> > Undefined behaviour means the language semantics don't define a successor state for a computation that has not terminated. Do you agree with that definition? If not, what /is/ UB in D, and why is it called UB?
> 
> Yah, I was joking with Walter that effectively the moment you define undefined behavior it's not undefined any longer :o). It happens to the best of us. I think we're all aligned here.
> 
> There's some interesting interaction here. Consider:
> 
> int fun(int x)
> {
>     int[10] y;
>     ...
>     return ++y[9 >> x];
> }
> 
> Now, under the "shift by negative numbers is undefined" rule, the compiler is free to eliminate the bounds check from the indexing because it's always within bounds for all defined programs. If it isn't, memory corruption may ensue. However, if the compiler says "shift by negative numbers is implementation-specified", the compiler cannot portably eliminate the bounds check.

I find this rather disturbing, actually.  There is a fine line between taking advantage of asserts to elide stuff that the programmer promises will not happen, and eliding something that's defined to be UB and thereby resulting in memory corruption.

In the above example, I'd be OK with the compiler eliding the bounds check if there were an assert(x >= 0) either in the function body or in the in-contract.  Having the compiler elide the bounds check without any assert or any other indication that the programmer has made assurances that UB won't occur is very scary to me, as plain ole carelessness can easily lead to exploitable security holes.  I hope D doesn't become an example of this kind of security hole.
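
Something along these lines would do it (a minimal sketch using D's in/body contract syntax; the body is just Andrei's example, and I've also bounded x from above since an oversized shift amount is just as suspect):

int fun(int x)
in
{
    assert(x >= 0 && x < 32, "shift amount must be in [0, 31]");
}
body
{
    int[10] y;
    // ... other work on y ...
    return ++y[9 >> x]; // with the contract in place, eliding this check is justified
}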

At the very least, I'd expect the compiler to warn that the function argument may cause UB, and suggest that an in-contract or assert be added.

On a more technical note, I think eliding the bounds check on the grounds that shifting by negative x is UB is based on a fallacy. Eliding a bounds check should only be done when the compiler has the assurance that the bounds check is not needed. Just because a particular construct is UB does not meet this condition, because, being UB, there is no way to tell if the bounds check is needed or not, therefore the correct behaviour IMO is to leave the bounds check in. The elision should only happen if the compiler is assured that it's actually not needed.

To elide simply because negative x is UB basically amounts to saying "the programmer ought to know better than writing UB code, so therefore let's just assume that the programmer never makes a mistake and barge ahead fearlessly FTW!". We all know where blind trust in programmer reliability leads: security holes galore because humans make mistakes. Assuming humans don't make mistakes, which is what this kind of exploitation of UB essentially boils down to, leads to madness.


> It's a nice example illustrating how things that seem to have nothing to do with memory corruption do affect it.
[...]


T

-- 
Stop staring at me like that! It's offens... no, you'll hurt your eyes!
July 10, 2016
On Saturday, 9 July 2016 at 23:44:07 UTC, H. S. Teoh wrote:
> On a more technical note, I think eliding the bounds check on the grounds that shifting by negative x is UB is based on a fallacy. Eliding a bounds check should only be done when the compiler has the assurance that the bounds check is not needed. Just because a particular construct is UB does not meet this condition, because, being UB, there is no way to tell if the bounds check is needed or not, therefore the correct behaviour IMO is to leave the bounds check in. The elision should only happen if the compiler is assured that it's actually not needed.
>
> To elide simply because negative x is UB basically amounts to saying "the programmer ought to know better than writing UB code, so therefore let's just assume that the programmer never makes a mistake and barge ahead fearlessly FTW!". We all know where blind trust in programmer reliability leads: security holes galore because humans make mistakes. Assuming humans don't make mistakes, which is what this kind of exploitation of UB essentially boils down to, leads to madness.

There is also a huge practical benefit in leaving such checks
in the code.  I've worked a lot in Perl over the last decade,
and one soon finds that it has great error-checking sprinkled
throughout the implementation.  Based on that experience, I can
tell you it's tremendously helpful for development efforts if
unexpected problems are detected immediately when they occur,
as opposed to forcing the programmer to debug based on the wild
particles left over after an atom-smashing experiment.
July 09, 2016
On 7/9/16 7:44 PM, H. S. Teoh via Digitalmars-d wrote:
> On Sat, Jul 09, 2016 at 07:17:59PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
>> On 07/09/2016 06:36 PM, Timon Gehr wrote:
>>> Undefined behaviour means the language semantics don't define a
>>> successor state for a computation that has not terminated. Do you
>>> agree with that definition? If not, what /is/ UB in D, and why is it
>>> called UB?
>>
>> Yah, I was joking with Walter that effectively the moment you define
>> undefined behavior it's not undefined any longer :o). It happens to
>> the best of us. I think we're all aligned here.
>>
>> There's some interesting interaction here. Consider:
>>
>> int fun(int x)
>> {
>>     int[10] y;
>>     ...
>>     return ++y[9 >> x];
>> }
>>
>> Now, under the "shift by negative numbers is undefined" rule, the
>> compiler is free to eliminate the bounds check from the indexing
>> because it's always within bounds for all defined programs. If it
>> isn't, memory corruption may ensue. However, if the compiler says
>> "shift by negative numbers is implementation-specified", the the
>> compiler cannot portably eliminate the bounds check.
>
> I find this rather disturbing, actually.  There is a fine line between
> taking advantage of asserts to elide stuff that the programmer promises
> will not happen, and eliding something that's defined to be UB and
> thereby resulting in memory corruption.

Nah, this is cut and dried. You should just continue being nicely turbed. "Shifting by a negative integer has undefined behavior" is what it is. Now I'm not saying it's good to define it that way, just that if it's defined that way then these are the consequences.

> In the above example, I'd be OK with the compiler eliding the bounds
> check if there were an assert(x >= 0) either in the function body or in the
> in-contract.  Having the compiler elide the bounds check without any
> assert or any other indication that the programmer has made assurances
> that UB won't occur is very scary to me, as plain ole carelessness can
> easily lead to exploitable security holes.  I hope D doesn't become an
> example of this kind of security hole.

Yeah, we'd ideally like very little UB and no UB in safe code. I think we should define shift with out-of-bounds values as "implementation specified".
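
For instance, an implementation could document that the shift count is reduced modulo the bit width, the way x86 hardware does (a sketch of one possible choice, with a hypothetical helper name, not a concrete proposal):

int shr(int value, int amount)
{
    // Defined for any amount: negative or oversized counts are masked into [0, 31].
    return value >> (amount & 31);
}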

> At the very least, I'd expect the compiler to warn that the function
> argument may cause UB, and suggest that an in-contract or assert be
> added.

You should expect the compiler to do what the language definition prescribes.

> On a more technical note, I think eliding the bounds check on the
> grounds that shifting by negative x is UB is based on a fallacy.

No.

> Eliding
> a bounds check should only be done when the compiler has the assurance
> that the bounds check is not needed. Just because a particular construct
> is UB does not meet this condition, because, being UB, there is no way
> to tell if the bounds check is needed or not, therefore the correct
> behaviour IMO is to leave the bounds check in. The elision should only
> happen if the compiler is assured that it's actually not needed.
>
> To elide simply because negative x is UB basically amounts to saying
> "the programmer ought to know better than writing UB code, so therefore
> let's just assume that the programmer never makes a mistake and barge
> ahead fearlessly FTW!". We all know where blind trust in programmer
> reliability leads: security holes galore because humans make mistakes.
> Assuming humans don't make mistakes, which is what this kind of
> exploitation of UB essentially boils down to, leads to madness.

You're overthinking this. Undefined is undefined. We're done here.


Andrei


July 09, 2016
On 7/9/16 6:58 PM, Andrew Godfrey wrote:
> On Saturday, 9 July 2016 at 16:38:02 UTC, Max Samukha wrote:
>> On Saturday, 9 July 2016 at 14:58:55 UTC, Andrew Godfrey wrote:
>>> On Saturday, 9 July 2016 at 06:31:01 UTC, Max Samukha wrote:
>>>> On Saturday, 9 July 2016 at 04:32:25 UTC, Andrew Godfrey wrote:
>>>>
>>
>>> This is a tangent from the subject of this thread, but: No, that just
>>> says how it is implemented, not what it means / intends. See "the 7
>>> stages of naming", here:
>>> http://arlobelshee.com/good-naming-is-a-process-not-a-single-step/
>>>
>>> (That resource is talking about identifier naming, not keywords. But
>>> it applies anyway.)
>>
>> You have a point, but the name is still not 'just bonkers', all things
>> considered. Metonymy is justified in many cases, and I think this is
>> one of them. What better name would you propose?
>
> First, I'm not proposing a change to existing keywords, I'm using
> existing examples to talk about future language changes. Second, I had
> to look up "metonymy" in Wikipedia. Using its example: Suppose
> "Hollywood" referred to both the LA movie industry and, say, the jewelry
> industry; that's roughly equivalent to the pattern I'm talking about.

Way ahead of ya. The average English noun has 7.8 meanings, and the average verb has 12.

> Others in this thread have suggested alternatives, many of those have
> things to criticize, but I would prefer something cryptic over something
> that has multiple subtly-different meanings in the language.
> I'm drawn to "#if", except people might end up thinking D has a macro
> preprocessor. "ifct" seems fine except I'm not sure everyone would agree
> how to pronounce it. Compile-time context seems significant enough that
> maybe it warrants punctuation, like "*if" or "$if".

No. As an aside I see your point but "static if" is the worst example to support it, by a mile.

> I especially want to establish: If we were adding a new feature as
> significant as "static if", and we decided a keyword was better than
> punctuation, could we stomach the cost of making a new keyword, or would
> we shoehorn it either into one of the existing keywords unused in that
> context, or start talking about using attributes? I have a lot of
> experience with backward-compatibility but I still don't understand the
> reticence to introduce new keywords (assuming a freely available
> migration tool).

It just depends. There is no rigid strategy here. Worrying about the hypothetical possibility seems unnecessary.


Andrei

July 10, 2016
On Saturday, 9 July 2016 at 08:39:10 UTC, Walter Bright wrote:
> Seems that in order to make it useful, users had to extend it. This doesn't fit the criteria.

Scheme is a simple functional language which is easy to extend. Why would you conflate "useful" with "used for writing complex programs"?

Anyway, there are many other examples, but less known.

> Wirth's Pascal had the same problem. He invented an elegant, simple, consistent, and useless language. The usable Pascal systems all had a boatload of dirty, incompatible extensions.

I am not sure if Pascal is elegant, but it most certainly is useful. So I don't think I agree with your definition of "useful".


> What programmers think of as "intuitive" is often a collection of special cases.

I think I would need examples to understand what you mean here.

July 10, 2016
On Sunday, 10 July 2016 at 02:29:15 UTC, Andrei Alexandrescu wrote:
> You're overthinking this. Undefined is undefined. We're done here.

Andrei, you're underthinking this.  You're treating it like an
elegant academic exercise in an ivory tower, without consideration
for the practical realities of using the language productively
(i.e., getting direct feedback when the programmer makes mistakes,
which we all do, so s/he doesn't need to spend hours in a debugger).
July 10, 2016
On Sunday, 10 July 2016 at 02:44:14 UTC, Ola Fosheim Grøstad wrote:
> On Saturday, 9 July 2016 at 08:39:10 UTC, Walter Bright wrote:
>> Seems that in order to make it useful, users had to extend it. This doesn't fit the criteria.
>
> Scheme is a simple functional language which is easy to extend. Why would you conflate "useful" with "used for writing complex programs"?
>
> Anyway, there are many other examples, but less known.
>
>> Wirth's Pascal had the same problem. He invented an elegant, simple, consistent, and useless language. The usable Pascal systems all had a boatload of dirty, incompatible extensions.
>
> I am not sure if Pascal is elegant, but it most certainly is useful. So I don't think I agree with your definition of "useful".
>
>
>> What programmers think of as "intuitive" is often a collection of special cases.
>
> I think I would need examples to understand what you mean here.

I agree with Walter here. Scheme is not a language that you can generally do useful things in. If you want to do anything non-trivial, you switch to Racket (which is not as minimalistic and "pure" as Scheme).
July 10, 2016
On 7/9/16 11:52 PM, Observer wrote:
> On Sunday, 10 July 2016 at 02:29:15 UTC, Andrei Alexandrescu wrote:
>> You're overthinking this. Undefined is undefined. We're done here.
>
> Andrei, you're underthinking this.  You're treating it like an
> elegant academic exercise in an ivory tower, without consideration
> for the practical realities of using the language productively
> (i.e., getting direct feedback when the programmer makes mistakes,
> which we all do, so s/he doesn't need to spend hours in a debugger).

Oh, I'm all for defining formerly undefined behavior. But don't call it undefined. I'm just fact checking over here. -- Andrei