June 23, 2007
Walter Bright wrote:
> Sean Kelly wrote:
>> Walter Bright wrote:
>>> Sean Kelly wrote:
>>>> Matter of opinion, I suppose.  The C++ design was immediately clear to me, though it obviously wasn't for others.  I grant that the aliasing problem can be confusing, but I feel that it is a peripheral issue.
>>>
>>> I don't think it is a peripheral issue. It completely screws up optimization, is useless for threading support, and has spawned endless angst about why C++ code is slower than Fortran code.
>>
>> As a programmer, I consider the optimization problem to be a non-issue.  Optimization is just magic that happens to make my program run faster.
> 
> Optimization often makes the difference between a successful project and a failure. C++ has failed to supplant FORTRAN, because although in every respect but one C++ is better, that one - optimization of arrays - matters a whole lot. It drives people using C++ to use inline assembler. They spend a lot of time on the issue. Various proposals to fix it, like 'noalias' and 'restrict', consume vast amounts of programmer time. And time is money.

FORTRAN is also helped by having a fairly standardized ABI that can be called easily from lots of languages, which C++ lacks.  But C has that, and it has also failed to supplant FORTRAN for numeric code.  But I think Sean's right.  A lot of that is just that the language supports things like actual multi-dimensional arrays (by 'actual' I mean contiguous memory rather than pointers to pointers), and mathematical operations on them right out of the box.  Telling a numerics person that C/C++ will give them much better IO and GUI support, but take them a step back in terms of core numerics is like trying to sell a hunter a fancy new gun with a fantastic scope that will let you pinpoint a mouse at 500 yards but -- oh, I should mention it only shoots bb's.

On the other hand, I suspect there's lots of code that's written in FORTRAN supposedly for performance reasons that doesn't really need to be.  Just as there's lots of code written in C++ that would perform fine in a scripting language.  But people will still swear up and down that {whatever} is the only language fast enough.  A lot of numerics folks do realize this, however.  They just go from Fortran straight to Matlab, and skip the other compiled languages altogether.

I guess what I'd like to say in summary is that I'm skeptical about the claim that optimization "often" makes the difference between success and failure.  "Occasionally" I could believe.  Ill-advised premature optimization has probably led to the demise of many more projects than actual optimization problems in the end product.  We'll all gladly take a free 20% speed improvement if the compiler can give it to us, but I don't believe there are that many projects that will fail simply for lack of that 20%.

--bb
June 23, 2007
Walter Bright wrote:
> Sean Kelly wrote:
>> Walter Bright wrote:
>>> In C++, sometimes const means invariant, and sometimes it means readonly view. I've found even C++ experts who don't know how it works.
>> Odd.  The C++ system always seemed extremely simple to me.
> 
> It isn't.

It is to many people; you seem to have more experience with people who have trouble understanding it, maybe.  I don't know why else it is that I find your diatribes about the terrible problems of C++'s const to be so strange, given that C++'s const works widely and well.

> I run into people all the time who are amazed to discover that const references can change.

In C++ code without undefined behavior, references cannot change.

Maybe you're referring to the fact that some people don't learn that const means read-only in C++, rather than the "invariant" notion that you've introduced for D 2.0.  That seems not to be hard for competent programmers to learn, compared to many of the other things they have to learn.

(I'm concerned at using two synonyms to mean different
things, but it's almost certainly too late to change
that now.)

> Few understand when const is invariant and
> when it isn't. I've never even seen anyone mention the problem where the
> non-transitive const destroys any hope of having FP like capabilities in
> C++.

And yet C++ *has* "FP-like" capabilities, depending on how
we define our terms.  Const works well in the hands of many,
many C++ programmers.  Bad programmers will be bad programmers
with D too.

Value, rather than reference, semantics are a great aid in providing FP capabilities in a manner that scales to parallel systems.  I'm glad to see that D is getting closer to C++ in providing more capable structs.

>> I personally find the use of three keywords to represent three overlapping facets of const behavior to be very confusing, and am concerned about trying to explain it to novice programmers.  With three keywords, there are six possible combinations:
>>
>> final
>> const
>> final const
>> final invariant
>> const invariant
>> final const invariant
> 
> Probably the thing to do is simply outlaw using more than one.

That would simplify things, I think.

-- James
June 23, 2007
Walter Bright wrote:
> Sean Kelly wrote:
>> Matter of opinion, I suppose.  The C++ design was immediately clear to me, though it obviously wasn't for others.  I grant that the aliasing problem can be confusing, but I feel that it is a peripheral issue.
> 
> I don't think it is a peripheral issue.

No, you clearly don't.  But to many of us, it is a peripheral issue, and the fact that you make it so central seems odd.

> It completely screws up optimization,

It has some detrimental effects on some optimizations.  It's a huge aid to good design.

> is useless for threading support, and has spawned endless angst about why C++ code is slower than Fortran code.

That will continue, even though C++ compilers can now often beat Fortran compilers for speed (when used with idiomatic C++ style rather than lower-level, C-style code that makes extensive use of pointers).

>> Or am I missing something?
> 
> Yes, you've missed the distinction between a readonly view and a truly immutable value.

I doubt that.

> That's not surprising, since my experience is that very
> few who are used to C++ const see the difference, and it took me a while
> to figure it out.

It's pretty trivial.  Which isn't to say that it's of no use, but it's not complicated.  Then again, neither is the C++ model of const, so maybe it depends on perspective.

-- James
June 23, 2007
Sean Kelly wrote:
> However, my point was that to a programmer, optimization should be invisible.

I can't agree with that. The compiler and the programmer need to cooperate with each other in order to produce fast programs. If a programmer just throws the completed program "over the wall" to the optimizer and expects good results, he'll be sadly disappointed.


> I can appreciate that 'invariant' may be of tremendous use to the compiler, but I balk at the notion of adding language features that seem largely intended as compiler "hints."

It's a lot more than that. First, there's the self-documenting aspect of it. Second, it opens the way for functional programming, which can be of huge importance.


> Rather, it would be preferable if the language were structured in such a way as to make such hints unnecessary.

That would be preferable, but experience with such languages is that they are complete failures at producing fast code. Fast code comes from a language that enables the programmer to work with the optimizer to produce better code.

Lots of people think that just 'cuz they write in C++, which has good optimizers available, they'll get fast code. That's not even close to being true.


> To that end, and speaking as someone who isn't primarily involved in numerics programming, my impression of FORTRAN is that the language is syntactically suited to numerics programming, while C++ is not.  Even if C++ performed on par with FORTRAN for similar work (and Bjarne suggested last year that it could), I would likely still choose FORTRAN over C++ because the syntax seems so much more appealing for that kind of work.

I programmed for years in FORTRAN. The syntax is not appealing; in fact, it sucks. The reason it is suited for numerics programming is that FORTRAN arrays, by definition, cannot be aliased. This means that optimizers can go to town parallelizing array operations, which is a big, big deal for speed.

You can't do that with C/C++ because arrays can be aliased. I have a stack of papers 6" deep in the basement on trying to make C more suitable for numerics, and it's mostly about, you guessed it, fixing the alias problem. They failed.
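A minimal C++ sketch of the alias problem; the function and variable names are hypothetical and chosen only for illustration. Because the two pointers may refer to the same object, the compiler cannot keep `*src` in a register across the first store:

```cpp
#include <cassert>

// Adds *src to *dst twice. Since dst and src may point at the same int,
// the compiler has to reload *src after the first store to *dst.
void add_twice(int* dst, const int* src) {
    *dst += *src;
    *dst += *src;
}

// Exercises both cases: distinct objects and aliased pointers.
void alias_demo() {
    int x = 1, y = 10;
    add_twice(&x, &y);   // no aliasing: x = 1 + 10 + 10
    assert(x == 21);

    int z = 1;
    add_twice(&z, &z);   // aliasing: z = 1 + 1 = 2, then 2 + 2 = 4
    assert(z == 4);
}
```

Had the compiler cached `*src`, the aliased call would produce 3 instead of 4, so it must assume the worst. The same uncertainty applies to every array element touched inside a loop, which is what blocks reordering and parallelizing array operations.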

There are some template libraries for C++ which parallelize array operations and can finally approach FORTRAN. But they are horrifically kludgy and complicated. No thanks. But the effort that has gone into them is astonishing, and is indicative of how severe the problem is.


>> I'm less sure about that. I think we're all so used to C++ and its mushy concept of const that we don't know yet what will emerge from the use of invariant. I do know, however, that those who want to do advanced array optimizations are going to want to be using invariant function parameters.
> 
> You may be right, and I'm certainly willing to give it a try.  This is simply my initial reaction to the new design, and I wanted to voice it before becoming placated by experience.  My gut feeling is that a better design is possible, and I'm not yet ready to close the door on alternatives.

Andrei, Bartosz, and I have each expended probably a hundred hours trying to figure this out, and we've tried a lot of designs. If there is a better design, it's not like we haven't tried. I wish to reiterate that const designs in other languages like C++ (and to some extent Java) utterly fail at the objectives we set for const in D. Furthermore, Andrei is familiar with the many research papers on the topic, which were of invaluable help. D's const system is more ambitious than any of them.



> The compiler can inspect the code however, and a global const is as good as an invariant for optimization (as far as I know).  As for the rest, I think the majority of remaining cases aren't ones where 'invariant' would apply anyway: dynamic buffers whose contents are guaranteed not to change either in word or by design, etc.

But they do apply - that's the whole array optimization thing. You're not just going to have global arrays.
June 23, 2007
Bill Baxter wrote:
> Walter Bright wrote:
>> Optimization often makes the difference between a successful project and a failure. C++ has failed to supplant FORTRAN, because although in every respect but one C++ is better, that one - optimization of arrays - matters a whole lot. It drives people using C++ to use inline assembler. They spend a lot of time on the issue. Various proposals to fix it, like 'noalias' and 'restrict', consume vast amounts of programmer time. And time is money.
> 
> FORTRAN is also helped by having a fairly standardized ABI that can be called easily from lots of languages, which C++ lacks.  But C has that, and it has also failed to supplant FORTRAN for numeric code.  But I think Sean's right.  A lot of that is just that the language supports things like actual multi-dimensional arrays (by 'actual' I mean contiguous memory rather than pointers to pointers),

C has them too:
	int array[3][5];
is not an array of pointers to arrays.
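A small sketch that checks the layout, written as C++ (the declaration itself is also valid C). The helper names are hypothetical, and the address arithmetic assumes the usual flat-address platforms:

```cpp
#include <cassert>
#include <cstdint>

// Returns the distance in bytes between the starts of row 0 and row 1.
std::uintptr_t row_stride_bytes() {
    static int array[3][5];
    // The whole object is exactly 15 ints: no per-row pointers are stored.
    static_assert(sizeof(array) == 3 * 5 * sizeof(int), "rows are packed");
    auto row0 = reinterpret_cast<std::uintptr_t>(&array[0][0]);
    auto row1 = reinterpret_cast<std::uintptr_t>(&array[1][0]);
    return row1 - row0;
}

// Row 1 starts immediately after the 5 ints of row 0.
void contiguity_demo() {
    assert(row_stride_bytes() == 5 * sizeof(int));
}
```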

> and mathematical operations on them right out of the box.

No, FORTRAN does not have array operations out of the box. It has no more mathematical operations than C does (in fact, it has fewer).

> Telling a numerics person that C/C++ will give them much better IO and GUI support, but take them a step back in terms of core numerics is like trying to sell a hunter a fancy new gun with a fantastic scope that will let you pinpoint a mouse at 500 yards but -- oh, I should mention it only shoots bb's.

I've read the papers on it. It's very clear that the only technical reason FORTRAN is better than C at numerics is it doesn't have array aliasing.

That's it.

> I guess what I'd like to say in summary is that I'm skeptical about the claim that optimization "often" makes the difference between success and failure.  "Occasionally" I could believe.  Ill-advised premature optimization has probably led to the demise of many more projects than actual optimization problems in the end product.  We'll all gladly take a free 20% speed improvement if the compiler can give it to us, but I don't believe there are that many projects that will fail simply for lack of that 20%.

When you're paying by the minute for supercomputer time, 20% is a big deal.

When you're predicting the weather, a 20% slowdown means you're producing history rather than predictions.

If Google could get 20% more speed out of their servers, they could cut the size of their server farm by 20%. That's hundreds of millions of dollars.

When you're writing a game, numerics performance is what makes your game's graphics better than the competition.

When you're writing code for embedded systems, faster code means you might be able to use a slower, cheaper processor, which can translate into millions of dollars in cost savings when you're shipping millions of units.

I don't want D to be fundamentally locked out of these potential markets. If D compilers can produce fundamentally better code than C++, that's a big selling point for D into companies like Google. And when they use it for their critical server farm apps, they'll naturally tend to use it for much more.
June 23, 2007
James Dennett wrote:
> Walter Bright wrote:
>> Sean Kelly wrote:
>>> Walter Bright wrote:
>>>> In C++, sometimes const means invariant, and sometimes it means
>>>> readonly view. I've found even C++ experts who don't know how it works.
>>> Odd.  The C++ system always seemed extremely simple to me.
>> It isn't. 
> 
> It is to many people; you seem to have more experience with
> people who have trouble understanding it, maybe.

Many of these people are well known C++ experts (no, I won't name names <g>). It's a lot more complicated than it appears.


> I don't
> know why else it is that I find your diatribes about the
> terrible problems of C++'s const to be so strange, given
> that C++'s const works widely and well.

Probably because not many people realize that two well known shortcomings of C++, the alias optimization problem and the parallelization problem, are related to inadequacies of const.


> Maybe you're referring to the fact that some people don't
> learn that const means read-only in C++, rather than the
> "invariant" notion that you've introduced for D 2.0.

Yes. But sometimes C++ const means invariant, sometimes it means final, and sometimes it means readonly view.
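A minimal C++ sketch of the three uses; all names are hypothetical, and the mapping onto D's invariant/final/const terms is an informal analogy, not an exact equivalence:

```cpp
#include <cassert>

// "Invariant"-like: the object itself is const and never changes
// (modifying it by any route would be undefined behavior).
const int limit = 5;

// "Readonly view"-like: returns what a const reference observes after the
// referent is changed through a different, non-const name.
int view_after_write() {
    int counter = 1;
    const int& view = counter; // cannot write through `view`
    counter = 2;               // ...but the underlying object still changes
    return view;
}

// "Final"-like: returns the value written through a pointer that
// cannot be reseated but whose pointee is freely writable.
int write_through_fixed_pointer() {
    int a = 1;
    int* const p = &a;         // p always points at a
    *p = 7;                    // allowed: only the pointer itself is const
    return a;
}

void const_demo() {
    assert(limit == 5);
    assert(view_after_write() == 2);
    assert(write_through_fixed_pointer() == 7);
}
```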

> That
> seems not to be hard for competent programmers to learn,
> compared to many of the other things they have to learn.

C++ programmers with many years' experience are often surprised by this. C++ is a hard language to learn, and takes several years to master.


> (I'm concerned at using two synonyms to mean different
> things, but it's almost certainly too late to change
> that now.)

It's better than C++ using one word (const) to mean 3 different things, which is the root of why people are confused with what C++ const means.


>> Few understand when const is invariant and
>> when it isn't. I've never even seen anyone mention the problem where the
>> non-transitive const destroys any hope of having FP like capabilities in
>> C++.
> 
> And yet C++ *has* "FP-like" capabilities, depending on how
> we define our terms.

Not really. C++ has been pretty resistant to attempts to automatically parallelize code, and that's because of the weakness of const.

> Const works well in the hands of many, many C++ programmers.

For its limited application, yes. But rethinking the role of const can open the door to a lot more.

> Value, rather than reference, semantics are a great aid in
> providing FP capabilities in a manner that scales to parallel
> systems.  I'm glad to see that D is getting closer to C++ in
> providing more capable structs.

Invariant can provide value semantics with the performance of references. After all, you can't really pass a dictionary by value.

The other crucial bit is transitivity of const/invariant. In C++, you can say with const that you won't change the root of the dictionary, but you simply cannot specify that you won't change any nodes of it. Without that, you don't have FP. It's why people who want to parallelize C++ have to write extensions to the language to do it.
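A minimal C++ sketch of the non-transitivity, using a hypothetical two-node linked structure in place of a real dictionary:

```cpp
#include <cassert>

struct Node {
    int value;
    Node* next;   // const on a Node does not extend to what `next` points at
};

// Takes the root by const reference, yet still modifies the second node.
void touch_tail(const Node& root) {
    // root.value = 0;        // would not compile: root is const here
    root.next->value = 42;    // compiles: the pointee is a plain Node
}

void transitivity_demo() {
    Node tail{1, nullptr};
    Node root{0, &tail};
    touch_tail(root);
    assert(root.value == 0);   // the root itself is untouched
    assert(tail.value == 42);  // but a node reachable from it was changed
}
```

The signature `const Node&` therefore promises nothing about the rest of the structure, so neither a caller nor the compiler can conclude that the data is left unmodified.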

Writing extensions to C++, then implementing it as a translator that outputs C++, is a very difficult and expensive thing to do. People are not going to do it unless they can expect a very large return on their investment. This means they are trying to address a serious shortcoming of the language.

Based on this, I don't agree that C++ const is good enough for the future. It's a 20 year old design, and we can learn from it and do better.
June 23, 2007
Walter Bright wrote:
> 
>> I can appreciate that 'invariant' may be of tremendous use to the compiler, but I balk at the notion of adding language features that seem largely intended as compiler "hints."
> 
> It's a lot more than that. First, there's the self-documenting aspect of it. Second, it opens the way for functional programming, which can be of huge importance.
> 
You mention functional programming a fair bit with respect to const, which is nice to hear. But nothing in the current const system allows you to declare a verifiably 'pure' function; can we expect some annotation for functions which says 'this function doesn't read/write any global variables?'

    int b;

    int foo()
    {
        b++;
        return b * 2;
    }

    pure int square(int x)
    {
        return x * x;
    }

    pure int baz()
    {
        return foo(); // fails: foo is not pure
    }
June 23, 2007
Walter Bright wrote:

> http://www.digitalmars.com/d/const.html

Reading the discussion about the complicated const/final/invariant design, I wondered whether we really need the pointer "*" at all.
Maybe we could cover that with ".ptr".

As far as I understand, we need it only to interface with C. OK, that is important enough.

I know many people who don't like the pointer "*" and steer clear of D because of it.

My 2 cents.

Manfred


June 23, 2007
"Reiner Pope" <some@address.com> wrote in message news:f5ipqd$2m6j$1@digitalmars.com...

> can we expect some annotation for functions which says 'this function doesn't read/write any global variables?'

Personally, I like the PHP way; each global variable must be declared as such before use; the syntax is similar to a local variable. Any function including any "global" declarations is thus "not pure".

    int b;

    int foo()
    {
        global int b;

        b++;
        return b * 2;
    }

    int square(int x)
    {
        return x * x;
    }

    pure int baz()
    {
        return foo(); // fails: foo is impure (it has "global" declarations)
    }

    pure int bar(int x)
    {
        global int b; // fails: global conflicts with pure

        return x*(b++); // would fail
    }


June 23, 2007
Martin Howe wrote:
> "Reiner Pope" <some@address.com> wrote in message news:f5ipqd$2m6j$1@digitalmars.com...
> 
>> can we expect some annotation for functions which says 'this function doesn't read/write any global variables?'
> 
> Personally, I like the PHP way; each global variable must be declared as such before use; the syntax is similar to a local variable. Any function including any "global" declarations is thus "not pure".

That might actually work in D: in PHP, it was, for me, the source of countless bugs, as you don't have to declare local variables before use. Thus I'd happily use something like "return $b++" without "global $b" and wonder why the function always returned zero.

A good plus about such a requirement is that it discourages you from writing functions that rely on many globals: when you start thinking about something like http://www.php.net/manual/en/language.variables.scope.php#18748 you should consider refactoring instead.

However, it can still make code unwieldy when you're writing a lot of one-liners, each of which needs to refer to the same global, and you need to repeat the "global foo bar" in each one.

I'd say "pure" wins here, though "global" isn't without its merits.
