April 03, 2008
Walter Bright wrote:
> Bill Baxter wrote:
>> So does that mean
>>
>>   pure int f(const Class c)
>>   { ...
>>   }
>>
>> will *not* be allowed?  Because some part of c *could* change, even if it's not f() doing the changing.  I.e. another thread could be changing c concurrently.
> 
> Right. That declaration of f is erroneous.
> 
>> If so, that seems unnecessarily restrictive to me.  Any side effect or indeterminacy of f() in that case is not because of f()'s doing.  So f()'s behavior *is* pure even if it takes a const argument.  It's not f's fault if the environment it's living in is unstable.
> 
> A function is pure or it isn't, there really isn't any room for "unstable" purity, or the point of purity is lost.

I think there is. A function which accesses a global is inherently impure. But, a function which takes a non-invariant argument, can in theory safely be called from inside a pure function, provided any non-invariant arguments are local variables, or invariant.

eg.

class C
{
  int numcalls;
  this() { numcalls=0; }
  void foo() { ++numcalls; } // Not pure - but no global side effects.
}

pure int bar(int x)
{
    C c = new C;
    for (int i=0; i<x; ++i) c.foo();
    return c.numcalls;
}

Is bar() pure or not?

Incidentally this type of code currently works in CTFE.


> 
> Look at it another way. You can take the value of the bits pushed on the stack as arguments to a pure function, and compare that with previous bits passed to the same function, and if there's a binary bit match, you can skip calling the function and return a previously cached result.
> 
> If that cannot be done, then the function is NOT pure.
> 
>> Maybe there need to be two levels of pure?  One says this function by itself has no side effects, the other says additionally that the function is invariant with respect to the environment.  The former category of functions can all be run in parallel safely provided there are no other threads modifying the environment simultaneously.
> 
> No, you cannot run them simultaneously. The problem with const (instead of invariant) is that the transitive state of the passed object is NOT guaranteed to be the same:
April 03, 2008
"Don Clugston" wrote
> Walter Bright wrote:
>> Bill Baxter wrote:
>>> So does that mean
>>>
>>>   pure int f(const Class c)
>>>   { ...
>>>   }
>>>
>>> will *not* be allowed?  Because some part of c *could* change, even if it's not f() doing the changing.  I.e. another thread could be changing c concurrently.
>>
>> Right. That declaration of f is erroneous.
>>
>>> If so, that seems unnecessarily restrictive to me.  Any side effect or indeterminacy of f() in that case is not because of f()'s doing.  So f()'s behavior *is* pure even if it takes a const argument.  It's not f's fault if the environment it's living in is unstable.
>>
>> A function is pure or it isn't, there really isn't any room for "unstable" purity, or the point of purity is lost.
>
> I think there is. A function which accesses a global is inherently impure. But, a function which takes a non-invariant argument, can in theory safely be called from inside a pure function, provided any non-invariant arguments are local variables, or invariant.
>
> eg.
>
> class C
> {
>   int numcalls;
>   this() { numcalls=0; }
>   void foo() { ++numcalls; } // Not pure - but no global side effects.
> }
>
> pure int bar(int x)
> {
>     C c = new C;
>     for (int i=0; i<x; ++i) c.foo();
>     return c.numcalls;
> }
>
> Is bar() pure or not?
>
> Incidentally this type of code currently works in CTFE.

Yes, bar is pure, but in order to be statically verifyable that it is pure, I think you need to slap a pure tag on C.foo, because without source code, the compiler cannot check foo to see if it accesses any other values.

I think Walter's vision is to have pure be statically verified by the compiler to ensure that one does not erroneously label a non-pure function as pure.  In this context, I am very interested to hear the set of rules that allow this.  What I am also interested is whether general pointers to non-invariant data will be allowed to be used.

For example:

char[] globaldata;

pure invariant(char)[] transform(invariant(char)[] input)
{
    char[] result = input.dup;
    // transform result
    return cast(invariant)result;
}

Will this compile?  If so, how does the compiler statically know during the transformation phase that result was pointing to data that was local, since it is on the heap?  Is the compiler going to 'tag' such variables as pure-safe?  i.e. how does it have the context information about result to know that it is ok to use, it could have been pointing to globaldata?

-Steve


April 03, 2008
"Walter Bright" wrote
> Jason House wrote:
>> Walter Bright wrote:
>>
>>> If you do away with transitive const, you cannot have invariant either. Const allows you to reuse code that works on invariants to work with mutables, too.
>>>
>>> Logical const just utterly breaks the whole system. Every scheme we've looked at for logical const winds up in some way breaking invariants. If invariants are broken, the advantages of them all just go away. I suspect that logical const is like perpetual motion - if you think you've solved it, you've just made a mistake somewhere <g>. I also suspect that the failure of logical const validates the D scheme of invariants as being correct - there shouldn't be any holes in it.
>>>
>>> You're right that invariant by itself is not enough to specify a pure function, a pure function also cannot read or write global state. But invariant is still a necessary condition, it's just not sufficient.
>>
>> Is it possible to describe what tricks the compiler can do given:
>>  A. transitive const      w/ mutable access to globals/statics
>>  B. transitive invariance w/ mutable access to globals/statics
>>  C. transitive const      w/ const access to globals/statics
>>  D. transitive invariance w/ invariant access to globals/statics
>>
>> It'd be extremely helpful to have that kind of discussion in the const
>> FAQ. Without it, I feel like we're all discussing theoretical stuff with
>> little
>> basis for our arguments.
>
> Sure, but static class members are *not* part of the transitive closure state of the object. They're just global variables.

Neither are mutable member variables.

-Steve


April 03, 2008
Walter Bright wrote:

> Bill Baxter wrote:
>> So const isn't a guarantee of thread safety.
> 
> You're right, it isn't sufficient. But it is necessary.
> 
> 
>> Hey! That probably *is* thread safe, and it didn't even use const at all.
> 
> You can certainly write thread-safe code that doesn't use const at all.
> You just have to be careful not to change it - i.e. the checking will be
>   done by you, the programmer, rather than the compiler.
> 
>> I can see invariant having some value in conjunction with pure, but I also suspect that invariant, being so strict, is also something that will not get used a lot outside of simple constants and code carefully designed for running in parallel.  Like what Don's said.
> 
> Invariant strings have turned out to be a resounding success (and I was very, very skeptical of that initially). I suspect that more and more programs will gravitate towards using invariant as people get more used to the idea. I know my programs will.

Do you have any references to this resounding success?

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
April 03, 2008
On 03/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> Have you no imagination? :)

Apparently.

So to cut a long story short, you're saying that you can have a class in which a non-pure function caches a result (in a mutable variable), and in which the pure version of the same function doesn't cache the result.

You might just as well say, you can have a class in which a non-const function caches a result, and in which the const version of the same function doesn't. What's the difference?

I guess what I'm trying to get at is that it is never /necessary/ to use mutable variables. There's always a way of rewriting the code so you don't need them. In this case:

    class Calclulator
    {
        int config;
        int f(int x) const { ... }
        pure int f(int x) invariant { ... }
    }

    class CachingCalculator
    {
        Calculator calculator;
        int[int] cache;
        int f(int x)
        {
            auto p = x in cache;
            if (p !is null) return *p;
            int r = calculator.f(x);
            cache[x] = r;
            return r;
        }
    }

You just have to wean yourself off the addiction to logical const. After you've gone cold turkey for a bit, you'll just stop wanting to use it.
April 03, 2008
"Janice Caron" wrote
> On 03/04/2008, Steven Schveighoffer wrote:
>> Have you no imagination? :)
>
> Apparently.
>
> So to cut a long story short, you're saying that you can have a class in which a non-pure function caches a result (in a mutable variable), and in which the pure version of the same function doesn't cache the result.
>
> You might just as well say, you can have a class in which a non-const function caches a result, and in which the const version of the same function doesn't. What's the difference?

The difference is that things start breaking when they expect to receive a const calculator.  For example, if I have a function that looks like this:

int[] calculateSomeValues(const(Calculator) c);

written by some other developer who says "I won't be changing c, and c defines that f is const, so I'll declare that c is const".

And then I say "hey, why don't I add a caching feature to Calculator that speeds up the code tremendously", but now, since I didn't write calculateSomeValues, I cannot pass the caching version to it.  So what ends up happening is I'll create a globby so I can "work around" the const system and create faster executing code (the globby has a penalty in the AA lookup, but we're assuming that each call to f is much more expensive than that). The problem is that in this case, const gets in the way, and didn't help me at all.  Const is supposed to help multiple developers work together, not prevent them from doing so.

You could say "so pass the cache in with the calculator", which is fine, but then I'm losing the whole point of being able to group related objects together!  Not only that, but now caluclulateSomeValues has to know at least that I'm passing some mutable state to it, so it has knowledge of the implementation, which it shouldn't have to know.  Why not encapsulate the mutable state in with the argument?  All a mutable state says is to the compiler "this part is not part of the object, so ignore it for const purposes."  That's all.  Think of it as an extra argument to all functions which take a class of that type, and that argument is implicitly passed to all member functions of the object, just like the 'this' pointer is.

>
> I guess what I'm trying to get at is that it is never /necessary/ to use mutable variables. There's always a way of rewriting the code so you don't need them. In this case:
>
>    class Calclulator
>    {
>        int config;
>        int f(int x) const { ... }
>        pure int f(int x) invariant { ... }
>    }
>
>    class CachingCalculator
>    {
>        Calculator calculator;
>        int[int] cache;
>        int f(int x)
>        {
>            auto p = x in cache;
>            if (p !is null) return *p;
>            int r = calculator.f(x);
>            cache[x] = r;
>            return r;
>        }
>    }
>
> You just have to wean yourself off the addiction to logical const. After you've gone cold turkey for a bit, you'll just stop wanting to use it.

I don't use const in most of my stuff, since 1) I only use Tango, and 2) all my D apps, and most of my other apps, are written only by me :)

The problem is that inevitably, I'll want to use someone else's library whose author decided to use const, and inevitably, they will have incorrectly implemented it, since they didn't think of all possible ways someone can use their lib.

-Steve


April 03, 2008
== Quote from Sean Kelly (sean@invisibleduck.org)'s article
> == Quote from Walter Bright (newshound1@digitalmars.com)'s article
> > Sean Kelly wrote:
> > > My traditional argument in support of logical const is this:
> > >
> > >     class C
> > >     {
> > >         mutable mutex monitor;
> > >         std::string name;
> > >     public:
> > >         std::string name() const
> > >         {
> > >             scoped_lock sl( monitor );
> > >             return name;
> > >         }
> > >
> > >         void name( std::string const& n )
> > >         {
> > >             scoped_lock sl( monitor );
> > >             name = n;
> > >         }
> > >     };
> > >
> > > Here, the mutex has nothing to do with the state of the object and must be modified even during logically non-modifying operations. Similar behavior might be necessary for a logging system in some instances, etc.  However, I'm not certain whether this is sufficient to justify logical const in general.  As you say--it has some serious problems as well.
> > If C.name were invariant, there would be no need for any locks. I don't think this is a good example, because with D's invariant strings there is no need for such locking.
> I'm not sure that's true.  Mutexes do two things: first, they guarantee atomicity for an arbitrary sequence of instructions and second, they provide a memory barrier so the result of those instructions is made visible to any other thread which later acquires the same mutex.  The invariant label provides a slightly different guarantee: that the data underlying an invariant reference will not ever change.

I realized that I didn't mention it in this message so I wanted to follow up.
From my perspective, the advantage of invariant strings are that the user
doesn't have to worry about COW or anything like that because the duping
is done up front.  So:

    class Person
    {
        private string m_name;
        void name( string n ) { synchronized m_name = n; }
        string name() { synchronized return m_name; }
    }

If m_name were not invariant, the code would have to look like this to guarantee safety:

    class Person
    {
        private char[] m_name;
        void name( string n ) { synchronized m_name = n.dup; }
        string name() { synchronized return m_name.dup; } // added .dup
    }

Without the explicit copying, the creator of Person would have to rely on
documentation to enforce ownership rules (ie. that passing a string to
Person.name indicates a transferral of ownership and that the user should
not retain a reference to this buffer or modify it after the transfer takes place)
because there's no way to indicate them in code in D.  In C++, as I'm sure
you're aware, std::auto_ptr is used to indicate transfer of ownership for
dynamic data, or plain old pass by value is used instead.


Sean
April 03, 2008
Walter Bright Wrote:

> 
> Logical const just utterly breaks the whole system.


I think we had a different expectations about invariant and pure.

Do you believe that (A),

   invariant A a;
   int x = f_pure(a);
   g_not_pure(a); // potentially mutating mutable members of A
   int y = f_pure(a);
   int z = f_pure(a);

which is can be transformed without problem to (B),

   invariant A a;
   int x = f_pure(a);
   g_not_pure(a);
   int y = f_pure(a);
   int z = y;

MUST also be equivalent to (C) ?

   invariant A a;
   int x = f_pure(a);
   int y = x;
   int z = x;
   g_not_pure(a);


A = B = C is only possible if invariant A as no mutable part. I agree. Is this your position?

I was expecting a weaker pure model, where only A = B is true (optimization based on purity only possible inside a contiguous block of pure functions).

If this is correct then I understand your concerns against logical const.


April 03, 2008
On 03/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>  For example, if I have a function that looks like this:
>
>  int[] calculateSomeValues(const(Calculator) c);
>
>  written by some other developer who says "I won't be changing c, and c
>  defines that f is const, so I'll declare that c is const".

The difference between Calculator and CachingCalculator is a bit like the difference between File and BufferedFile. We all /know/ that file access is slow, and so buffering speeds things up. Ditto here. That other developer should have written

    int[] calculateSomeValues(CachingCalculator c);

and if they were dumb enough to use an underpowered class, then don't call their function.



>  and inevitably, they [other people] will have
>  incorrectly implemented it

One doesn't introduce new language features on the assumption that other people will incorrectly implement other language features.

It's just education, that's all. When using D, you program "the D way". And that means, you don't declare something const, if you know that bits of it are going to change. That's how it works in D. Look at it like this - D mandates good habits.
April 03, 2008
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Sean Kelly wrote:
> > I know you like to talk about the unreliability of const in C++ because of const_cast and the like, but in my opinion those are theoretical objections because I've never encountered even a single use of const_cast in actual code.
> It's not even const_cast. It's simply going *one* level of indirection, and all the const checking is gone, gone, gone. You can have const objects in C++, but not const data structures.
> > So while I do agree that C++ const may not be useful for static
> > analysis, I also believe that C++ const works just fine for achieving more
> > demonstrably correct code.  For example, if I create this class:
> >
> >     class BoxedInt
> >     {
> >         int val;
> >     public:
> >         BoxedInt( int v ) : val( v ) {}
> >         int get() const { return val; }
> >         void set( int v ) { val = v; }
> >     };
> >
> > I am confident that if someone tries to call set() on a const instance of
> > BoxedInt a compile-time error will occur.  To me, this means that C++
> > const helps to write demonstrably correct code because the compiler
> > will trap at least some logic errors related to intended program behavior.
> > Sure, the user could theoretically cast away const-ness of his BoxedInt
> > object and thus break everything, but as far as I'm concerned that's his
> > problem.
> You've got no references in BoxedInt. How many data structures do you use that contain no references?

No references at all?  Perhaps half.  But very few of my C++ objects contain references to external data, and references are almost always managed via shared_ptr or the equivalent.  I can say with confidence that I've never been bitten by a bug resulting from non-transitive const in C++.  That isn't to say that I think there's no need for transitive const--I think it's a great idea-- but only that this hasn't been a problem in the code that I've written.

> Essentially, const in C++ only works for
> trivial objects. (I'm not saying trivial objects are not useful, I'm
> just saying they are trivial.)

I think this statement is open to interpretation.  From a theoretical perspective
I agree.  But with RAII objects managing references and the like, it's rare for
even very complex objects to contain raw pointers, and such RAII objects can
be made to fake transitivity of const by overloading their const and non-const
methods appropriately.

> >> This will become more obvious as C++ tries
> >> to compete in the multiprogramming arena. The C++ compiler *cannot* help
> >> with multiprogramming issues; the C++ programmer is entirely dependent
> >> on visual inspection of the code, making C++ a tedious, labor intensive
> >> language for multiprogramming.
> >
> > I agree with this.  C++ const is intended to catch logical mistakes and is not useful for multiprogramming.
> >
> >> How many times have you looked at the documentation for a function API, and wondered which of its parameters were input and which were output?
> >
> > Never.
> Ok, but I do it all the time.

That's fine.  I recognize that I'm not singularly representative of the entire programming world :-)

>  > Though it perhaps helps a bit that C++ classes are value types.
> They are value types, true, but I don't think that is relevant to the
> issue of transitive const.
> > I know that counter-examples could be provided along the lines of:
> >
> >     struct C
> >     {
> >         char* buf;
> >     };
> >
> >     void fn( const C val )
> >     {
> >         val.buf[0] = 0;
> >     }
> >
> > But I've never seen anyone actually make this kind of mistake.  Perhaps I've just been lucky.
> So what you're saying is you follow the *convention* that const in C++ is transitive. I suggest that this convention is the norm, and that normal idioms are ripe fruit for codification into the language rules.

I agree.

> I further suggest that since the C++ compiler cannot help you find any of those mistakes, your projects may unknowingly suffer from them. I do know that when I've tried to do COW in C++ (which relies on transitive const), there have been many bugs and it's been hard to get them flushed out.

Yup.

> >> When you talk about the "misapplication" of const, does that mean failure to understand what the specific API of a function is?
> >
> > Most often, the problem I've seen is for const to not be consistently applied
> > to class member functions or within the program itself.  It's been a while so
> > I'm actually fuzzy on what the exact problems were, but I remember modifying
> > a bunch of unrelated code when I created a new member function in some class.
> > If I had to guess I'd say this had something to do with the non-transitive nature
> > of const in C++ and that the class contained pointers which were being used in
> > questionable ways.
> Doesn't something used in a questionable way sound like a bug that const helped you find, even if it was just pointing out something that should be clarified and better documented?

Sure.  I'll definitely agree that the code was poorly written and not at all documented, even if it actually worked as-is.


Sean