August 09, 2005
>> void doSomething(int* i, int* j)
>
> [snip] But still the outcome is undefined by design.

Yep. Other code, same problem. That's what I was trying to say.
August 09, 2005
Manfred Nowak wrote:
> Derek Parnell <derek@psych.ward> wrote:
>>I think the most useful position for us would be to have D
>>continue to allow this possibility, and provide a standard
>>function that we can use to detect overlapping arguments. Then
>>the coder can choose how to react to the situation.
> 
> 
> Agreed. But now to something completely different: the details :-)
> 
> -manfred
> 

assert( !std.memOverlap( i, j ) );
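No such std.memOverlap function is shown in this thread, so the following is only a sketch, written in C, of what an overlap test might look like. The names mem_overlap and do_something are hypothetical, and the check assumes that converting pointers to uintptr_t gives comparable addresses, which holds on common flat-memory platforms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when the byte ranges [a, a+a_len) and
   [b, b+b_len) share at least one byte. Addresses are compared as
   integers because relational comparison of pointers into unrelated
   objects is not defined by the C standard. */
bool mem_overlap(const void *a, size_t a_len, const void *b, size_t b_len)
{
    uintptr_t a_lo = (uintptr_t)a;
    uintptr_t b_lo = (uintptr_t)b;

    if (a_len == 0 || b_len == 0)
        return false;               /* an empty range overlaps nothing */
    return a_lo < b_lo + b_len && b_lo < a_lo + a_len;
}

/* The coder decides how to react; here the reaction is an assertion. */
void do_something(int *i, int *j)
{
    assert(!mem_overlap(i, sizeof *i, j, sizeof *j));
    *i = 1;
    *j = 2;
}
```

The caller remains free to pick a different reaction, such as copying one argument to a temporary before the call.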

August 09, 2005
In article <dd8b1o$4an$1@digitaldaemon.com>, Manfred Nowak says...

>Applying the same actual parameter to both formal parameters would render the first assignment in the function body useless.

No. It will just be overwritten. If you don't want this, then don't use inout parameters. If you want the language to protect you from this, then do as all the other languages out there do and forbid "inout"; as long as you don't, this code is absolutely legal. And if you use an inout parameter like a local variable, then it is your personal programming skills that must be improved, not the language.

>And in D dead code is as illegal as expressions that do not have an effect.

Sorry, but there is no reason to forbid effectless expressions, and there is no language in the world that forbids them. So this statement is simply wrong.

And it is not dead code: dead code is defined as code that static program analysis shows can't be reached. So a useless statement is a useless statement, but not dead code. Dead code is also not illegal, as there are many good reasons to write it, especially if you have a "version" feature.




August 09, 2005
Manfred Nowak wrote:

> Burton Radons <burton-radons@smocky.com> wrote:
> 
> [ some remarks about some people ]
> 
> I really like developers getting personal and through my hole business life I could not resist mentioning them to my boss with the emphasis that these are really nice people.

It isn't "getting personal" to comment on your conduct, which was to find some place in the spec where no behaviour is defined at all, to assume that an ambiguity must mean that an implementation is free to act in any way it wants (which would mean that any use of inout/out would be undefined!), to make a silly thread where your intentions could not be less clear, and then to act like you duped us all when a simple question could have cleared up the issue right away.  You acted like an ass; I noted it.

By contrast, an ad hominem attack would be for me to say you can't be right because you're a Spaniard, or that you must believe something because you come from a FORTH background.

> What I never understood in total was, that these people usually resignated to the next possible date.

Okay, you really don't write good English.  This is fine and I won't punish you for that, but please be as elaborate as you can so that we can figure out what you mean, repeating important segments with different wordings.  Did you mean "resigned at", as in quit the job, and is this sarcastic/coy/ironic/etc.?  Please don't use sarcasm when typing in English; it's difficult enough to detect in plaintext when conversing with another English speaker.

> So, if you want me that I praise your skills to the quality assurer of your company send me a private mail.

Oh, it's a threat.  Great.

> Meanwhile simply explain how you would solve the task that your implementation should forbid any call with identical actual parameters.

I can't even begin to discuss this until you say straight out what your problem is with the way D implements it!  What problem am I trying to solve in my implementation?  What am I implementing?  What task?  I'm not even sure if you're using "quality assurance" with the same meaning that I have of it.
August 09, 2005
In article <dd3mon$2eds$1@digitaldaemon.com>, Manfred Nowak says...
>
>1) Review this code:
>
>void f( inout int i, inout int j){
>  i= 1;
>  j= 2;
>}
>
>
>2) Pass or Fail?

Pass. The call f(i,i) should probably fail to compile. It is at least undefined.

>3) Explain.

This is another reason to forbid aliasing between function arguments. (i.e.
implicit "restrict" on all function parameters.)
IIRC this was suggested by Walter a while back.

That would define the call f(i,i) as illegal. (But not always checkable at
compile time).

It would also allow for more optimizations... *ducks*

/O


August 09, 2005
Hi,

>>2) Pass or Fail?
>Pass. The call f(i,i) should probably fail to compile. It is at least undefined.

Why would it fail to compile? The call is _not_ undefined. If we take this step by step, the function:

FIRST : Sets param1 (i) to 1.
SECOND: Sets param2 (j) to 2.

If param1 happens to also be param2 (&i == &j), then that one, single variable will:

FIRST : Be set to 1.
SECOND: Be set to 2.

Why does everybody keep saying this is undefined? It is _perfectly_ well defined. I know _exactly_ what it will do, and it will always do the same thing. No calling convention or (correct) optimization will alter this fact.
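As a sketch of that step-by-step reading, here is a C analogue. It assumes the inout parameters behave like plain pointers, which is an assumption about the implementation and not something quoted from the D spec; call_with_alias is a made-up name.

```c
#include <assert.h>

/* C analogue of the D function f(inout int i, inout int j),
   with inout modelled as a pointer (an assumption). */
void f(int *i, int *j)
{
    *i = 1;   /* FIRST : set param1 */
    *j = 2;   /* SECOND: set param2 */
}

/* Passes the same variable for both parameters and returns its
   final value. */
int call_with_alias(void)
{
    int x = 0;
    f(&x, &x);       /* both parameters refer to x */
    assert(x == 2);  /* the second store is the one that survives */
    return x;
}
```

Without a no-alias promise such as restrict, a C compiler has to preserve the order of the two stores, so the result is 2 at every optimization level.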

If you consider the function/call to be semantically confusing, then that's another thing entirely. Anyway, please, the code the OP posted is:

A) Not undefined.
B) Not unstable.
C) Not unpredictable.
D) Not inherently unsafe.
E) Not a "fault" of the language.
F) Not even remotely related to application-level security.

Consider this the "punchline" of this whole pseudo-ironic joke.

>>3) Explain.
>
>This is another reason to forbid aliasing between function arguments. (i.e.
>implicit "restrict" on all function parameters.)
>IIRC this was suggested by Walter a while back.

Ah. Now we are getting somewhere because this changes the rules of the game. Sure, if you say "no aliased pointers/references," then the OP's code becomes illegal/undefined. But as it stands, that is _not_ the case.

FWIW, I like the idea of implicit restrict. It could even help with the "const" optimization that Walter mentioned, which was also due to aliasing.

>That would define the call f(i,i) as illegal. (But not always checkable at
>compile time).

Yes. This could get tricky. If people aren't careful, it could lead to some really interesting bugs down the road. Then again, I've never sent aliased pointers/references. So in the end, I am for it.

Perhaps there could be an implicit run-time check (in -debug) to see if the restrict contract was broken at call time. This would help a lot against the subtle bugs.
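A hand-written approximation of such a check, sketched in C: the name f_checked is hypothetical, and assert is used because it is compiled out when NDEBUG is defined, roughly as a -debug-only check would vanish from a release build.

```c
#include <assert.h>

/* Hypothetical debug-only aliasing check placed at function entry.
   In a release build (NDEBUG defined) the assert disappears and the
   function simply trusts its caller. */
void f_checked(int *i, int *j)
{
    assert(i != j && "restrict contract broken: i and j alias");
    *i = 1;
    *j = 2;
}
```

A pointer-equality test like this only catches exact aliasing of two scalars; partially overlapping arrays would need a range comparison.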

>It would also allow for more optimizations... *ducks*

I like this kind of thinking ;).

Cheers,
--AJG.



August 10, 2005
In article <ddb17v$b8k$1@digitaldaemon.com>, AJG says...
>
>Hi,
>
>>>2) Pass or Fail?
>>Pass. The call f(i,i) should probably fail to compile. It is at least undefined.
>
>Why would it fail to compile? The call is _not_ undefined.

Is the spec clear that inout arguments are passed as pointers? Is it
unthinkable to pass them in (caller-saved) registers and let the caller then do
the assignment to i in the f(i,i) case? (Unthinkable if you want aliased
parameters to work.) I just can't find it in the spec. Therefore undefined.

>FWIW, I like the idea of implicit restrict. It could even help with the "const" optimization that Walter mentioned, which was also due to aliasing.
>
>Perhaps there could be an implicit run-time check (in -debug) to see if the restrict contract was broken at call time. This would help a lot against the subtle bugs.

Yes. It would probably catch most cases, though I guess it is possible to make obscure ones where such a check would fail too.

/O


August 10, 2005
AJG wrote:

> Hi,
>>As I see it, the solution is this: if you write a function where two or more reference parameters could refer to the same thing, and at least one of them is non-const, you should make an explicit note to potential users of the function as to whether that's allowed, and what the results are. Sometimes it's quite reasonable to pass the same thing in twice, and, by careful ordering of the writes (& reads) you can arrange to get the expected result. If you intend to *call* such a function with overlapped operands, and the author has *not* made such a note, then you're on your own... Even if the behaviour happens to be what you want, there's no contract that it won't change.
> 
> 
> I think we're back to restricted pointers here. IIRC, the compiler trusts the
> programmer not to send the same thing twice. I wish there were automated ways to
> check for this kind of thing. Walter said it's not possible "in the general
> case," so it always kinda comes down to trusting the programmer.
> 
Right.
Here's a very specific case. Suppose you want to write a function which negates a bunch of int16s and stores them to a separate array (or possibly back to the same place).
The core might be

   for( int i = 0; i < n; ++i ){ d[i] = -s[i]; }  /* (1) */

A common trick to make this go a little faster, by reducing loop overhead (assume n is a multiple of 4), is:

   for( int i = 0; i < n; i += 4 ){  /* (2) */
      d[i]   = -s[i];
      d[i+1] = -s[i+1];
      d[i+2] = -s[i+2];
      d[i+3] = -s[i+3];
   }

Some processors have zero-overhead loop modes, which means the above won't help. What can be surprisingly helpful is:

   for( int i = 0; i < n; i += 4 ){  /* (3) */
      int16 d0 = -s[i];
      int16 d1 = -s[i+1];
      int16 d2 = -s[i+2];
      int16 d3 = -s[i+3];
      d[i]   = d0;
      d[i+1] = d1;
      d[i+2] = d2;
      d[i+3] = d3;
   }

The transformation from (2) to (3) may change the results, but only if the src and dst buffers are overlapping and non-identical. But it can result in a noticeable speedup on some machines. What happens is, you are separating the four reads from the four writes, and giving the compiler explicit permission to overlap the four read/negate/write operations. It can't do this itself, since the writes to d[] may affect the reads from s[]. On x86 this doesn't help, I think; it deals with overlapping on-the-fly in silicon. (It might actually make things much worse on an x86, since the compiler has to read all four s[] values before writing any d[] values, and there aren't enough registers to store them all, plus 'i' and the pointers. So there may be copies done to the stack, which the silicon then tries to optimize out. Ick. Of course, you'd really want to use MMX in this particular case.)

On a RISC processor, typically there are effect latencies after load operations, and lots of regs to store concurrent operations, so (3) can make a very noticeable difference in inner loops. The code can "read t0=s[i], read t1=s[i+1], neg d0=-t0, neg d1=-t1," etc. The interesting thing is, in (3), you're not telling the compiler how to optimize it (as was the case with the good ol' *d++ = *s++, for PDP-11's or 68K); you are simply giving it some space in which to work. The compiler's instruction scheduler knows all the effect latencies of different operations, but it can't make use of them if there's only one thread of evaluation to schedule. In (2), each d[]=-s[] has to be completed before the next one starts.
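To make the overlap caveat concrete, here is a sketch in C99 of variants (2) and (3). The function names negate_v2 and negate_v3 are made up for illustration, int16_t stands in for int16, and n is assumed to be a multiple of 4 as above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Variant (2): every store is issued before the next load. */
void negate_v2(int16_t *d, const int16_t *s, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        d[i]     = (int16_t)-s[i];
        d[i + 1] = (int16_t)-s[i + 1];
        d[i + 2] = (int16_t)-s[i + 2];
        d[i + 3] = (int16_t)-s[i + 3];
    }
}

/* Variant (3): four loads and negations first, then four stores. */
void negate_v3(int16_t *d, const int16_t *s, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        int16_t d0 = (int16_t)-s[i];
        int16_t d1 = (int16_t)-s[i + 1];
        int16_t d2 = (int16_t)-s[i + 2];
        int16_t d3 = (int16_t)-s[i + 3];
        d[i]     = d0;
        d[i + 1] = d1;
        d[i + 2] = d2;
        d[i + 3] = d3;
    }
}
```

Take a five-element buffer {1, 2, 3, 4, 0} with s at its first element, d at its second, and n = 4. Variant (2) leaves {1, -1, 1, -1, 1}, because each store feeds the next load, while variant (3) leaves {1, -1, -2, -3, -4}. With d == s, or with disjoint buffers, the two variants agree.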

However, I really don't like having to do this. It's a quite machine-specific thing, which is exactly what compilers are supposed to take care of. But they can't, since the language doesn't give you a way to tell the compiler that s and d are independent. The best you can say is that it's better than writing it in assembler, and allows you to get close to the same result, if you have a good optimizer to work with.
Some compilers have extensions which allow you to state that 's' and 'd' are 'noalias' pointers. This gives the compiler permission to make optimizations that affect the results only when the buffers overlap, which means it becomes possible for a compiler to see the code in (1) and actually generate the code in (3), or something else, according to the underlying machine.
This is the kind of improvement I'd really like to see over C. (But I'm aware that, unless they are writing speed-critical signal processing code, most programmers won't see much point to this.)
At first, this seems dangerous, since somebody might call the function with overlapping buffers, and the optimization would change the effect of the code. But, of course, you can write functions in ordinary C or D which work fine on non-overlapping buffers and blow up when overlapping buffers are supplied. You really aren't any further behind; it's still, as you say, a question of trusting the programmer to make their own rules and follow them.
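In standard C, the C99 'restrict' qualifier plays the role of such a 'noalias' annotation. A minimal sketch, with the hypothetical name negate_restrict and a debug assertion added here for illustration only (it assumes pointers convert to comparable integer addresses):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 'restrict' is a promise from the caller that d and s do not
   overlap. The compiler may then reorder loads and stores, for
   example turning loop (1) into something like (3). Breaking the
   promise is undefined behaviour. */
void negate_restrict(int16_t *restrict d, const int16_t *restrict s,
                     size_t n)
{
    /* Debug-only check of the promise; compiled out under NDEBUG. */
    assert((uintptr_t)d + n * sizeof *d <= (uintptr_t)s ||
           (uintptr_t)s + n * sizeof *s <= (uintptr_t)d);

    for (size_t i = 0; i < n; ++i)
        d[i] = (int16_t)-s[i];
}
```

Note that this signature also rules out the in-place case (d == s), which the unqualified loops above handle correctly.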

I'm not sure this kind of extension is a good idea for C++ or D; there are more hidden things going on, and it's harder for the programmer to understand where the real dangers for aliasing are (e.g., a function parameter might point to the same memory pointed to by a member variable of 'this').

Another issue, a big one, is that C compilers generally need to assume that all function calls can change any data, except for local variables whose addresses have not been made known. Clearly this inhibits optimization of operations on global variables, or operations on memory pointed at by local variables, both of which are extremely common. A little-appreciated benefit of C++ (or gcc C) inline functions is that the inlined code becomes part of the data-flow analysis in the calling function; so any side-effects of that code are known, rather than assumed worst case. This benefit can be greater than that arising from the elimination of the call overhead.


I feel that this sort of thing is the brick wall up against which C code optimization finds itself. It's been a very long time since I used Fortran, but, in that language, the compiler generally can know exactly what you are doing with arrays (since you can do a lot less), so optimizations like (1)->(3) are at least possible in theory. I think the original appeal of C came from the fact that it let you do your own optimizations when the compiler couldn't; programmers, at that time, generally switched from assembler to C, and were quite happy using pointers etc. Now that programmers are more concerned with higher-level problems (esp. maintainability) and not code efficiency, advanced code optimization technology is much more important, and C's flexibility is holding it back IMO.

C's relatively recent change to the alias rules (gcc 3 knows about them, gcc 2 didn't) is a step in the right direction, anyway. Basically, the compiler is allowed to assume, for instance, that a 'double *' and an 'int *' can't point at the same thing.
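A small sketch of what that assumption permits; store_both is a made-up name, and whether a given compiler actually performs the optimization depends on its version and flags.

```c
#include <assert.h>

/* Under the type-based alias rules the compiler may assume that
   *ip and *dp are distinct objects, so it is free to return the
   constant 1 without reloading *ip after the store through dp. */
int store_both(int *ip, double *dp)
{
    assert(ip != 0 && dp != 0);
    *ip = 1;
    *dp = 2.0;
    return *ip;
}
```

If both parameters were 'int *', the compiler would have to reload *ip, since the second store might have changed it.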

--Greg



August 10, 2005
Hi,

>>>>2) Pass or Fail?
>>>Pass. The call f(i,i) should probably fail to compile. It is at least undefined.
>>
>>Why would it fail to compile? The call is _not_ undefined.
>
>Is the spec clear on that inout arguments are passed as pointers? Is it
>unthinkable to pass them as (caller save) registers and let the caller then do
>the assignment to i in the f(i,i) case? (Unthinkable if you want aliased
>parameters to work) I just can't find it in the spec. Therefore undefined.

No, that would be an "optimization," and it would be an erroneous one at that, because it's based on an incorrect assumption. Nowhere does it say pointers/references can't be aliased. Therefore, defined.

>>FWIW, I like the idea of implicit restrict. It could even help with the "const" optimization that Walter mentioned, which was also due to aliasing.
>>
>>Perhaps there could be an implicit run-time check (in -debug) to see if the restrict contract was broken at call time. This would help a lot against the subtle bugs.
>
>Yes. It would probably catch most cases, though I guess it is possible to make obscure ones where such a check would fail too.

Perhaps. To make it more robust the whole program could be statically analyzed for restriction violations, but this also only works if enough info is available. Sigh...

Cheers,
--AJG.


August 11, 2005
llothar <llothar_member@pathlink.com> wrote:

[...]
> And if you use inout parameter like a local variable then your personal programming skills must be improved, not the language.

Sorry. If this is your professional opinion on the needs of QA in the production of a piece of software that by malfunction may cause loss of lives, would you trust your life to such software delivered by your own company?


>>And in D dead code is as illegal as expressions that do not have an effect.
> 
> Sorry, but there is no reason to forbid effectless expressions, and there is no language in the world that forbids them. So this statement is simply wrong.
> 
> And it is not dead code: dead code is defined as code that static program analysis shows can't be reached. So a useless statement is a useless statement, but not dead code. Dead code is also not illegal, as there are many good reasons to write it, especially if you have a "version" feature.

Nice try. But I do not see any use of a 'version' feature in the presented piece of software.

Furthermore, I ask whether the conjecture is true that you, as a member of a QA team, would let the presented piece of software pass even if the first statement in the body of the function were preceded by something like

    if( ackermann( long.max / 2 ) > 42)
      transferLotsOfMoneyTo( "llothar");

accompanied by the appropriate definitions of the two functions used, because the two assignment statements of the program originally presented may still not be dead code and still are not forbidden by the language?

-manfred