May 14, 2012
On 5/13/2012 10:34 PM, Alex Rønne Petersen wrote:
> I have yet to see any compiler make sensible use of the information provided by
> both C++'s const and D's const.

D's const is part of purity, which is optimizable.

> const in particular is completely useless to an optimizer because it does not
> give it any information that it can use for anything. The kind of information
> that an optimization pass, in general, wants to see is whether something is
> guaranteed to *never* change. const does not provide this information. const
> simply guarantees that the code working on the const data cannot alter it (but
> at the same time allows *other* code to alter it), which, as said, is useless to
> the optimizer.
>
> immutable is a different story. immutable actually opens the door to many
> optimization opportunities exactly because the optimizer knows that the data
> will not be altered, ever. This allows it to (almost) arbitrarily reorder code,
> fold many computations at compile time, do conditional constant propagation,
> dead code elimination, ...

You cannot have immutable without also having const. Or, at least, it would be impractical.

> This seems reasonable. But now consider that the majority of functions *are
> written for const, not immutable*. Thereby, you're throwing away the immutable
> guarantee, which is what the *compiler* (not the *programmer*) cares about.
> immutable is an excellent idea in theory, but in practice, it doesn't help the
> compiler because you'd have to either
>
> a) templatize all functions operating on const/immutable data so the compiler
> can retain the immutable guarantee when the input is such, or
> b) explicitly duplicate code for the const and the immutable case.

strings are immutable, not just const. It's been very successful.


> Both approaches clearly suck. Templates don't play nice with polymorphism, and
> code duplication is...well...duplication. So, most of druntime and phobos is
> written for const because const is the bridge between the mutable and immutable
> world, and writing code against that rather than explicitly against
> mutable/immutable data is just simpler. But this completely ruins any
> opportunity the compiler has to optimize!

That isn't true when it comes to purity.


> (An interesting fact is that even the compiler engineers working on compilers
> for strictly pure functional languages have yet to take full advantage of the
> potential that a pure, immutable world offers. If *they* haven't done it yet, I
> don't think we're going to do it for a long time to come.)

It isn't just about what the compiler can do; purity and immutability offer a means to prove things about code.

> Now, you might argue that the compiler could simply say "okay, this data is
> const, which means it cannot be changed in this particular piece of code and
> thus nowhere else, since it is not explicitly shared, and therefore not touched
> by any other threads". This would be great if shared wasn't a complete design
> fallacy. Unfortunately, in most real world code, shared just doesn't cut it, and
> data is often shared between threads without using the shared qualifier
> (__gshared is one example).

Yes, if you're thinking like a C programmer!


> shared is another can of worms entirely. I can list a few initial reasons why
> it's unrealistic and impractical:
>
> 1) It is extremely x86-biased; implementing it on other architectures is going
> to be...interesting (read: on many architectures, impossible at ISA level).

I don't see why.

> 2) There is no bridge between shared and unshared like there is for mutable and
> immutable. This means that all code operating on shared data has to be
> templatized (no, casts will not suffice; the compiler can't insert memory
> barriers then) or code has to be explicitly duplicated for the shared and
> unshared case. Funnily, the exact same issue mentioned above for const and
> immutable!

Frankly, you're doing it wrong if you're doing more than trivial things with shared types. Running an algorithm on a shared type is just a bad idea.

> 3) It only provides documentation value. The low-level atomicity that it is
> supposed to provide (but doesn't yet...) is of extremely questionable value. In
> my experience, I never actually access shared data from multiple threads
> simultaneously, but rather, transfer the data from one thread to another and use
> it exclusively in the other thread (i.e. handing over the ownership). In such
> scenarios, shared just adds overhead (memory barriers are Bad (TM) for
> performance).

Transferring data between threads should be done either using value types, which are copied, or references which are typed as shared only transitorily.
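
For what it's worth, a minimal sketch of that pattern with std.concurrency (the worker and payload here are illustrative):

    import core.thread : thread_joinAll;
    import std.concurrency;
    import std.stdio;

    void worker()
    {
        // Receive an immutable slice: no locks or barriers are needed,
        // because the data can never change after construction.
        receive((immutable(int)[] data) {
            writeln("got ", data.length, " items");
        });
    }

    void main()
    {
        auto tid = spawn(&worker);
        immutable(int)[] payload = [1, 2, 3];
        // send() only accepts value types and immutable/shared references;
        // passing a mutable int[] here would be a compile-time error.
        send(tid, payload);
        thread_joinAll();  // wait for the worker to finish
    }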

May 14, 2012
On Monday, 14 May 2012 at 05:31:01 UTC, H. S. Teoh wrote:
> That's part of the reason I didn't really get comfortable with D programming until I bought TDPL -- I needed to learn D "from scratch", as it were, to think about it from a fresh perspective instead of bringing along my years of C/C++ baggage. For that, looking at a bunch of online reference docs didn't help: you're just learning the "vocabulary", as it were, and not really "thinking in the language". As any foreign language learner knows, you will never speak the language well if you just keep translating from your native language; you have to learn to "think in that language". I needed to read through TDPL like a newbie in order to learn to write D the way it's supposed to be written.
>
> Once I started doing that, many things began to make a lot more sense.

 Exactly! If you don't have a good, concise overview or explanation of the language and its features, you end up with guesswork from all over the place. Thankfully my only OO experience besides D2 has been Java, and that helped me understand polymorphism and interfaces, giving me what I needed without cluttering me with the rest of the language. I am perhaps as 'unlearned' as I can be, aside from doing some ASM/C work (and getting away from pointer and -> manipulation is nice :) )

 I'm still trying to find a good OO book that will teach me how to think and build with OOP properly. I'm probably only half as effective as I could be. If you have any good book recommendations I'll try and get ahold of them.
May 14, 2012
On 5/13/2012 11:39 PM, Era Scarecrow wrote:
> I'm still trying to find a good OO book that will teach me how to think and
> build with OOP properly. I'm probably only half as effective as I could be. If
> you have any good book recommendations I'll try and get ahold of them.

The classic is Object-Oriented Software Construction by Bertrand Meyer.
May 14, 2012
On Monday, 14 May 2012 at 06:27:15 UTC, Walter Bright wrote:
> On 5/13/2012 11:09 PM, Mehrdad wrote:
>> (1) Compiler helps you write correct multithreaded code
>> (2) You help compiler perform optimizations based on contracts
>> (3) I don't think there exists a #3 that's very different from #1 and #2
>
> #3 Improves self-documentation of code - it's more understandable and less susceptible to breakage during maintenance.
>
> #4 Improves encapsulation
>
> #5 Makes function purity possible

And all this together culminates in a much greater ability to _reason about code_.

Suppose I have code like this:
    immutable item = ...;
    auto res = aPureFunct(item);
    // tons of code
    auto calc = someVal * aPureFunct(item);

Barring the fact that the compiler will probably optimize the second call out for you in D... If you were writing C++ code like this, you could tell _almost nothing_ about it from the fragment I gave you. The mere fact that I can tell you _something_ is striking. I can figure out properties and reason about the code, even in small chunks like this. I know that I can replace "aPureFunct(item)" with res and it will be equivalent.

I'd say that this is my favorite reason (but non-obvious unless you actually code using const/immutable):
#6 Reduced cognitive load

I posted this elsewhere, but it also ties multiple things together (including the GC):
Suppose you want to count how often each substring of length 10 occurs in some large string (maybe 4 million characters long?). In D:
    uint[string] countmap;
    foreach (i; 0 .. str.length - 10 + 1)  // include the final window
        ++countmap[str[i .. i + 10]];

And you're done. Because your string is immutable, D will do the correct thing and use pointer+length to the original huge string and won't waste time copying things that are guaranteed not to change into its hash table. Thanks to the GC, you don't have to worry about watching your pointers to make sure they get deleted whenever it's appropriate (when you're done, you can just clear out countmap and the GC will collect the string if it's not used by anyone else...). Presumably the associative array implementation could take advantage of both the immutability of the string and the purity of toHash to cache the result of hashing (although, I'm not certain it does).

Now imagine the code you'd have to write in C++ or many other languages to do this _right_. Not just fast, but _also_ correct.

Considering all of D's features, the whole is greater than the sum of its parts. Can you find something wrong with each part? Of course. But each of them supports the others in such a way that they synergistically improve the language ... take any one of them away, and everything else becomes less useful than before.

This is why you really ought to spend some time with const and immutable (and, yeah, they're mostly inseparable) and really get to know what they enable you to do. Tons of features and capabilities depend on them, so they're really not something that "should be gotten rid of to make the language more marketable". I'm not saying I wouldn't use D if we lost const/immutable, but it would certainly sour my experience with it.
May 14, 2012
On Monday, 14 May 2012 at 06:27:15 UTC, Walter Bright wrote:
> #3 Improves self-documentation of code - it's more understandable and less susceptible to breakage during maintenance.

#3 is also valid for C++, so I wasn't exactly considering that.

> #4 Improves encapsulation

o.O How does it improve encapsulation?

> #5 Makes function purity possible

Ah yea, this one I forgot.
May 14, 2012
On 14-05-2012 08:37, Walter Bright wrote:
> On 5/13/2012 10:34 PM, Alex Rønne Petersen wrote:
>> I have yet to see any compiler make sensible use of the information
>> provided by
>> both C++'s const and D's const.
>
> D's const is part of purity, which is optimizable.

A function can still be weakly pure while operating on const arguments, meaning you *still* have no opportunity for optimization. Strongly pure functions are easily optimizable because they operate on immutable data (or, at least, data with no mutable indirection).
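
To illustrate the distinction (a minimal sketch; the function names are made up):

    // Weakly pure: a const(int)[] parameter may alias mutable data that
    // other code changes between calls, so calls cannot be cached or folded.
    pure int sum(const(int)[] a)
    {
        int s = 0;
        foreach (x; a)
            s += x;
        return s;
    }

    // Strongly pure: an immutable(int)[] parameter can never change, so the
    // compiler may reorder, fold, or reuse calls with the same argument.
    pure int sumImm(immutable(int)[] a)
    {
        int s = 0;
        foreach (x; a)
            s += x;
        return s;
    }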

>
>> const in particular is completely useless to an optimizer because it
>> does not
>> give it any information that it can use for anything. The kind of
>> information
>> that an optimization pass, in general, wants to see is whether
>> something is
>> guaranteed to *never* change. const does not provide this information.
>> const
>> simply guarantees that the code working on the const data cannot alter
>> it (but
>> at the same time allows *other* code to alter it), which, as said, is
>> useless to
>> the optimizer.
>>
>> immutable is a different story. immutable actually opens the door to many
>> optimization opportunities exactly because the optimizer knows that
>> the data
>> will not be altered, ever. This allows it to (almost) arbitrarily
>> reorder code,
>> fold many computations at compile time, do conditional constant
>> propagation,
>> dead code elimination, ...
>
> You cannot have immutable without also having const. Or, at least, it
> would be impractical.

I agree entirely. I'm just saying that the way it is in the language right now doesn't make it easy for the compiler to optimize for immutable at all, due to how programmers tend to program against it.

>
>> This seems reasonable. But now consider that the majority of functions
>> *are
>> written for const, not immutable*. Thereby, you're throwing away the
>> immutable
>> guarantee, which is what the *compiler* (not the *programmer*) cares
>> about.
>> immutable is an excellent idea in theory, but in practice, it doesn't
>> help the
>> compiler because you'd have to either
>>
>> a) templatize all functions operating on const/immutable data so the
>> compiler
>> can retain the immutable guarantee when the input is such, or
>> b) explicitly duplicate code for the const and the immutable case.
>
> strings are immutable, not just const. It's been very successful.

And yet, the majority of functions operate on const(char)[], not immutable(char)[], thereby removing the guarantee that string was supposed to give about immutability.
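
For example, a typical Phobos-style signature (illustrative, not an actual Phobos function):

    // Accepts mutable, const, and immutable character arrays alike, which
    // is convenient -- but inside the function the compiler only sees
    // "const", so the caller's immutability guarantee is erased here.
    pure size_t countSpaces(const(char)[] s)
    {
        size_t n = 0;
        foreach (c; s)
            if (c == ' ')
                ++n;
        return n;
    }

Both a string (immutable(char)[]) and a mutable char[] buffer compile against this one signature, which is exactly why so much code is written this way.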

>
>
>> Both approaches clearly suck. Templates don't play nice with
>> polymorphism, and
>> code duplication is...well...duplication. So, most of druntime and
>> phobos is
>> written for const because const is the bridge between the mutable and
>> immutable
>> world, and writing code against that rather than explicitly against
>> mutable/immutable data is just simpler. But this completely ruins any
>> opportunity the compiler has to optimize!
>
> That isn't true when it comes to purity.

I don't follow. Can you elaborate?

>
>
>> (An interesting fact is that even the compiler engineers working on
>> compilers
>> for strictly pure functional languages have yet to take full advantage
>> of the
>> potential that a pure, immutable world offers. If *they* haven't done
>> it yet, I
>> don't think we're going to do it for a long time to come.)
>
> It isn't just what the compiler can do, purity and immutability offer a
> means to prove things about code.

Absolutely, and I think that has significant value. Keep in mind that I am only contesting the usefulness of const in terms of optimizations in a normal compiler.

>
>> Now, you might argue that the compiler could simply say "okay, this
>> data is
>> const, which means it cannot be changed in this particular piece of
>> code and
>> thus nowhere else, since it is not explicitly shared, and therefore
>> not touched
>> by any other threads". This would be great if shared wasn't a complete
>> design
>> fallacy. Unfortunately, in most real world code, shared just doesn't
>> cut it, and
>> data is often shared between threads without using the shared qualifier
>> (__gshared is one example).
>
> Yes, if you're thinking like a C programmer!

Or if you're doing low-level thread programming (in my case, for a virtual machine).

>
>
>> shared is another can of worms entirely. I can list a few initial
>> reasons why
>> it's unrealistic and impractical:
>>
>> 1) It is extremely x86-biased; implementing it on other architectures
>> is going
>> to be...interesting (read: on many architectures, impossible at ISA
>> level).
>
> I don't see why.

Some architectures with weak memory models just plain don't have fence instructions.

>
>> 2) There is no bridge between shared and unshared like there is for
>> mutable and
>> immutable. This means that all code operating on shared data has to be
>> templatized (no, casts will not suffice; the compiler can't insert memory
>> barriers then) or code has to be explicitly duplicated for the shared and
>> unshared case. Funnily, the exact same issue mentioned above for const
>> and
>> immutable!
>
> Frankly, you're doing it wrong if you're doing more than trivial things
> with shared types. Running an algorithm on a shared type is just a bad
> idea.

So you're saying that casting shared away when dealing with message-passing is the right thing to do? (Using immutable is not always the answer...)

>
>> 3) It only provides documentation value. The low-level atomicity that
>> it is
>> supposed to provide (but doesn't yet...) is of extremely questionable
>> value. In
>> my experience, I never actually access shared data from multiple threads
>> simultaneously, but rather, transfer the data from one thread to
>> another and use
>> it exclusively in the other thread (i.e. handing over the ownership).
>> In such
>> scenarios, shared just adds overhead (memory barriers are Bad (TM) for
>> performance).
>
> Transferring data between threads should be done either using value
> types, which are copied, or references which are typed as shared only
> transitorially.
>

But with references comes the issue that they are practically unusable with shared, forcing one to cast it away. This should be a clear sign that the feature is incomplete.

-- 
- Alex
May 14, 2012
On 13/05/2012 23:50, Jonathan M Davis wrote:
<snip>
> Caching and lazy
> evaluation _will_ be impossible in those functions without breaking the type
> system.

Unless stuff is added to the type system to accommodate it.

For example, a type modifier that adds a "set" flag.  When unset, a const method cannot read it (doing so would throw an AssertError or similar) but can assign to it, at which point it becomes set.  When set, it acts just like any other member (non-const methods have read-write access, const methods have read-only access, etc.).  Non-const methods can also unset it, which they would do when they change the state of the object in a way that invalidates the cached value.

Alternatively, you could argue that it's the compiler's job to implement caching as an optimisation for pure methods.  But it would have to implement the logic to decache it when relevant state of the object changes, which could get complicated if you want to do it efficiently.

> Anything that absolutely requires them will probably have to either
> break the type system or use _other_ functions with the same
> functionality but without those attributes.
<snip>

I think that's half the point of std.functional.memoize - to be such a function for you to use when you want it.
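
For instance, a hedged sketch of using it for the caching case discussed above (the computed function is a made-up stand-in):

    import std.functional : memoize;

    // A stand-in for an expensive pure computation over an object's data.
    pure ulong factorial(uint n)
    {
        ulong r = 1;
        foreach (i; 2 .. n + 1)
            r *= i;
        return r;
    }

    // The cache lives in the memoize wrapper, not in the object, so even a
    // const method can call this without mutating its own state.
    alias cachedFactorial = memoize!factorial;

That placement of the cache outside the object is what lets it coexist with const.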

Stewart.
May 14, 2012
On Mon, 14 May 2012 02:35:16 -0400, Jakob Ovrum <jakobovrum@gmail.com> wrote:

> On Sunday, 13 May 2012 at 17:02:46 UTC, Stewart Gordon wrote:
>> On 13/05/2012 17:41, Alex Rønne Petersen wrote:
>> <snip>
>>> I agree with everything but toString(). I'm afraid that forcing toString() to be const
>>> will have harm flexibility severely. Can't we do better, somehow?
>>
>> How exactly?
>>
>> If you're talking about memoization, it ought to be possible to make use of std.functional.memoize to implement it.
>>
>> Otherwise, using toString to change the state of an object is bending semantics.  If you want a method to generate a string representation of an object in a way that might do this, then create your own method to do it.
>>
>> Stewart.
>
> How about logically constant opEquals, toString etc? Currently, this is perfectly possible by just *not using const*. Logical constancy goes beyond memoization.

This means you cannot compare two const objects.

The issue is, non-const opEquals makes sense on some objects, and const opEquals makes sense on others.  However, you must make them all come together in Object.opEquals.

I think we already have the hooks to properly compare objects without requiring Object.opEquals.

Right now, when two objects are compared, the compiler calls object.opEquals (that's little o for object, meaning the module function *not* the class method).

So why can't object.opEquals be a template that decides whether to use Object.opEquals (which IMO *must be* const) or a derived version?  I don't think it's that much of a stretch (not sure if we need a compiler fix for this).
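
A rough, purely hypothetical sketch of what such a template might look like (this is not what druntime does today, and the name is made up):

    // Hypothetical module-level equality dispatcher: resolve against the
    // most derived opEquals the static types offer, const or not, instead
    // of forcing everything through the single Object.opEquals slot.
    bool typedEquals(L, R)(L lhs, R rhs)
        if (is(L : const(Object)) && is(R : const(Object)))
    {
        if (lhs is rhs)
            return true;
        if (lhs is null || rhs is null)
            return false;
        // Overload resolution happens here at compile time per L and R,
        // so a const opEquals on a derived class is used when available.
        return lhs.opEquals(rhs) && rhs.opEquals(lhs);
    }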

-Steve
May 14, 2012
Alex Rønne Petersen:

> But how would you memoize the value in the instance of the object if it's const?

I opened a thread on such matters:
http://forum.dlang.org/thread/gtpdmrfektaygfmecupj@forum.dlang.org

-------------------

Jonathan M Davis:

>Caching and lazy evaluation _will_ be impossible in those functions without breaking the type system.<

Take a look at the articles I've linked.

Bye,
bearophile
May 14, 2012
On Sun, 13 May 2012 16:52:15 -0400, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

> On 14.05.2012 0:48, Stewart Gordon wrote:
>> On 13/05/2012 20:42, Walter Bright wrote:
>> <snip>
>>> I'd like to see std.stream dumped. I don't see any reason for it to
>>> exist that std.stdio
>>> doesn't do (or should do).
>>
>> So std.stdio.File is the replacement for the std.stream stuff?
>>
>> How does/will it provide all the different kinds of stream that
>> std.stream provides, as well as the other kinds of stream that
>> applications will need?
>>
>
> I think I've seen proper replacement (or rather a draft of it). If only Steven can be bothered to finish it :)

Yes, I know.

I hate starting things and then not finishing them.  Especially when I've got so much of it completed...

I'm going to make some time to finish this.  I need probably a good few days (of solid time).  Which means, based on my normal schedule, 2-3 weeks.  Most of the difficult parts are complete, I have a working buffer implementation, fast unicode translation (meaning UTF-8, UTF-16, UTF-16LE, UTF-32, and UTF-32LE), and a path to use RAII for seamless integration with std.stdio.File.  I even have preliminary agreement from Andrei and Walter on a design (no guarantees they accept the final product, but I think I can massage it into acceptance).

The one last puzzle to solve is sharing.  File is this half-breed of sharing, because it contains a FILE *, which is a shared type, but File is not.  Then it does some casting to get around the problems.  We need a better solution than this, but shared is so difficult to use, I think I'm going to have to implement something similar.  It has been stipulated by Walter and Andrei that fixing this shared situation is a requirement for any new replacement.  I have some ideas, but I have to play around to see if they actually work and make sense.

-Steve