Jump to page: 1 2
Thread overview
[Issue 1961] New: Allow "scoped const" contracts
Mar 31, 2008
d-bugmail
Apr 01, 2008
d-bugmail
Apr 01, 2008
d-bugmail
Apr 01, 2008
d-bugmail
Apr 01, 2008
d-bugmail
Apr 08, 2008
d-bugmail
Apr 27, 2008
d-bugmail
Apr 27, 2008
d-bugmail
Apr 27, 2008
d-bugmail
Apr 28, 2008
d-bugmail
Dec 02, 2008
d-bugmail
Dec 02, 2008
d-bugmail
Dec 02, 2008
d-bugmail
Dec 02, 2008
d-bugmail
Dec 02, 2008
d-bugmail
Dec 07, 2008
d-bugmail
March 31, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961

           Summary: Allow "scoped const" contracts
           Product: D
           Version: 2.013
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: schveiguy@yahoo.com


The current const scheme does not allow scoped const.  A scoped const contract means a function will not change an input parameter or anything referred to through that input parameter, but the function does not restrict the contract that the calling function has with the parameter.

First I will describe the problem I am trying to solve, then describe my solution.

Two examples come to mind, C's strstr function, and C++'s min function, written in D form without const here:

char[] strstr(char[] source, char[] pattern);

T min(T)(T a, T b);

First strstr.  Source is an input type, and should not be changed in the function.  But the return value is a substring of source, and so the constancy must be consistent.  In the current const scheme, this would be implemented as:

const(char)[] strstr(const(char)[] source, const(char)[] pattern)

Now, however, strstr cannot be used in an iterative pattern on a mutable or invariant string:

char[] x = "hello".dup;

x = strstr(x, "l"); // error

So essentially, strstr is modifying the contract that the calling function has with x, namely that once strstr is called on x, any result must be const.  This sort of contract modification is not necessary in the const scheme, as one is perfectly allowed to change x.

With an invariant version, you cannot re-assign to the argument also:

auto y = "hello";
y = strstr(y, "l"); // error

Again, we're modifying the contract.

If we solve the problem using templates, then we have:

T[] strstr(T)(T[] source, const(T)[] pattern);

Note, however, that this does not achieve what we want.  We are not guaranteeing that strstr does not modify source in the case of a mutable argument passed into the function.  There could be a static-if clause in the function that allows strstr to compile and modify source if it is mutable.  A user of the function cannot rely on strstr not changing source by looking at the method signature, he must inspect the implementation.

Now let us look at the second example, min:

T min(T)(T a, T b);

Like the template version of strstr, this does not guarantee the constancy of it's arguments if T is mutable.

If we add const such that min does not modify it's arguments it becomes:

const(T) min(T)(const(T) a, const(T) b);

And now, min is no longer as useful, because it cannot be used in the most common case:

t = min(t, maxvalue); // clamp t to be <= maxvalue

if t is mutable or invariant.

What we need is a way to say that a parameter is constant only for the scope of the function, and then reverts to its original constancy when returning.

Here is my proposal (with help from Janice) on how to solve this problem:

Let the 'inout' keyword describe this contract.  The inout keyword is completely obsoleted by ref, and so can be used for this purpose.  Used like const, the inout keyword used on a parameter means that that parameter is essentially const  for the scope of the function.  When used on the return value, it means that the return value is casted back to the constancy factor of the inout parameters at the call-site.  The constancy factor of the parameters follows the following rules:

  - If all inout parameters are homogeneous, meaning they are all const, all
invariant, all mutable, or all inout (think recursive functions), then the
constancy factor is equivalent to the constancy of the parameters.
  - If any inout parameters differ in constancy, then the constancy factor is
equivalent to const.

The following rules are in force during the function scope:

 - inout(T) can be implicitly cast to const(T)
 - inout(T) is transitively inout, meaning access to any member of T is also
inout.
 - The only thing implicitly castable to inout(T) is T.init (think return
null).

At the call-site, the compiler makes the decision as to what the constancy factor is for the call (see rules above).  It then follows the rules of const as if the constancy factor was substituted for inout.

Example, strstr:

inout(char)[] strstr(inout(char)[] source, const(char)[] pattern)
{
   /*
    * this would be an error, as const(char)[] is not implicitly castable to
    * inout(char)[]
    */
   // return pattern;

   for(int i = 0; i + pattern.length <= source.length; i++)
   {
      inout(char)[] tmp = source[i..pattern.length]; // ok
      if(tmp == pattern) // ok, tmp implicitly casts to const(char)[]
         return source[i..$]; // implicitly casts back to call-site source
   }
   return source[$..$]; // to be consistent with strstr.
}

void f()
{
   auto a = "hello";
   a = strstr(a, "llo"); // cf (constancy factor) == invariant
   // char[] b = strstr(a, "llo"); // error, cannot cast invariant to mutable
   char[] b = "hello".dup;
   b = strstr(b, "llo"); // cf == mutable (note that "llo" doesn't play a role
         // because that parameter is not inout)
   const(char)[] c = strstr(b, "llo"); // cf = mutable, ok because mutable
         // implicitly casts to const
   c = strstr(a, "llo"); // cf = invariant, ok invariant casts to const
}

Now, let's see how min works:

inout(T) min(T)(inout(T) a, inout(T) b)
{
  return a < b ? a : b;
}

void f()
{
   invariant(char)[] i = "hello";
   const(char)[] c = "there";
   char[] m = "Walter".dup;

   //i = min(i, c); // error, since i and c vary in constancy, the result
           // is const, and you cannot implicitly cast const to invariant.

   c = min(i, c); // ok, cf == const, because not homogeneous
   c = min(m, c); // ok, cf == const
   c = min(m, i); // ok, cf == const
   i = min(i, "blah"); // ok, cf == invariant, homogeneous
   //m = min(m, c); // error, cf == const because not homogeneous.
   //m = min(m, "blah"); // error, cf == const
   m = min(m, "blah".dup); // ok
}

The implementation in the compiler of this produces one version of the function for all permutations of constancy.  This makes this solution superior over the template solution (in addition to enforcing const for all versions) in that there is only one function versus 3^x, where x is the number of inout parameters.

Optionally, the compiler may generate multiple functions if certain versions of constancy can be built optimized.  for example, if all the inout parameters are invariant, the resulting function may be optimized differently.

Also optionally, the overload resolution could allow specializations.  For example, if invariant allows you to avoid locking mutexes, then you may want to write a special version of the method where everything is invariant.  In this context, the overload resolution could match against the inout version last.

Note that the choice of 'inout' as a keyword should not degrade from the semantic meaning of this enhancement.  If 'inout' is no good as a keyword, then something else is fine with me.


-- 

April 01, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #1 from andrei@metalanguage.com  2008-03-31 21:34 -------
Looks interesting. I discussed the issue with Janice a bit. The solution seems sound, and it's not templated, which is important for virtual functions. My only concern is that it's a new feature, unrelated to all else in the language, and as such must be learned. Since some people already complain that the const regime is already complicated, I fear that people might say that inout adds even more complication.


-- 

April 01, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #3 from andrei@metalanguage.com  2008-03-31 22:59 -------
(In reply to comment #2)
> On 01/04/2008, d-bugmail@puremagic.com <d-bugmail@puremagic.com> wrote:
> > Since some people already complain that the const
> >  regime is already complicated,
> 
> Yes, they complain about the fact that the current regime forces us to duplicate many functions three times. (C++ makes us do it twice, which is bad enough!) :-)
> 
> They complain about the choice of words, "const" and "invariant". They /especially/ complain most furious about the word "enum", and the pointlessness of "in".
> 
> They complain that "const R f()" looks like f is returning a const(R).
> 
> But they don't (often) complain about transitivity, or anything that
> really matters.

I think the choice of words is a classic dilemma. Whatever words are chosen, some won't like them, and they will be the ones to voice their opinion. I think this will just pass. What I'm worried about is arms thrown in the air, "This was supposed to be simpler than C++!" etc.

I agree that prefix const for methods is confusing. I personally like
const(this), as you proposed.

I believe "in" will become more interesting once we figure "scope" out (no pun
intended).

And enum... you'll have to yank that out from my dead cold hands. Extending enum instead of adding yet another way of defining symbolic constants is The Right Thing to do. I am sure people would have realized how ridiculous the whole "manifest" thing is if we first proposed it. We just can't define one more way for each kind of snow there is.

But by and large, as Walter said, although the current constancy regime is excellent, we need to find a good PR consultant :o).

> >  I fear that people might say that inout adds
> >  even more complication.
> 
> It's a wildcard. People understand wildcards. It's simple enough to explain in terms of "this stands for mutable, const or invariant". From that point of view, it can be presented not as a new flavor of constancy, but as token which stands for any of the existing flavors. I think that with that description, people will get it.

I didn't say they don't understand. I just said they have to run to the manual when they see it. It's yet another thing to learn.

> However, I agree that simplifying const is something we should do /first/, before throwing in new stuff. Priorities. :-)

The challenge is how...


-- 

April 01, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #5 from aarti@interia.pl  2008-04-01 09:15 -------
> And enum... you'll have to yank that out from my dead cold hands. Extending enum instead of adding yet another way of defining symbolic constants is The Right Thing to do. I am sure people would have realized how ridiculous the whole "manifest" thing is if we first proposed it. We just can't define one more way for each kind of snow there is.

While I agree completely with what you said about 'manifest' keyword, I quite don't agree about choosing 'enum' as word for manifest constant. IMHO there is much better keyword for this purpose: 'alias'.

Please notice that using 'alias' will be very consistent:
1. Alias defines new symbols. Literals (e.g true, false, [], "text")are in fact
also symbols as they don't have its address. They exists only inside compiler.
So both are of 'same kind'. Personally I think about them as "#define" from C.

2. It seems that there will be also alias syntax like below (as you suggested
in [1] comment #4):
alias TRUE = true;
So there will be no real difference between 'enum' and 'alias' as a way of
defining manifest constants. Two ways to define same think seems just one too
much.

3. Enum can retain its previous meaning as a way to define *collection*. Please notice that not every set of manifest constant creates collection. Also there can possibly be collections of different things than manifest constants e.g. collections of variables. So there is many ways to extend meaning of enum to make it more programmer friendly and useful. That said I think that enum should only define subset of all manifest constant usages - manifest constants which can be ordered into collection.

If alias will be extended to syntax 'alias TRUE=true;' (which seems to be right thing) then there is no way to avoid completely duplication of functionality between 'enum' and 'alias'. But IMHO better way is to specialize 'enum' to mean collection (in specific case collection of manifest constant) and 'alias' to mean definition of symbol.

Sorry for being a bit off topic. I can put it into new enhancement req. if you wish.

[1] I suggested it already in enhancement request 1827# . http://d.puremagic.com/issues/show_bug.cgi?id=1827


-- 

April 01, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #6 from andrei@metalanguage.com  2008-04-01 10:30 -------
(In reply to comment #5)
> > And enum... you'll have to yank that out from my dead cold hands. Extending enum instead of adding yet another way of defining symbolic constants is The Right Thing to do. I am sure people would have realized how ridiculous the whole "manifest" thing is if we first proposed it. We just can't define one more way for each kind of snow there is.
> 
> While I agree completely with what you said about 'manifest' keyword, I quite don't agree about choosing 'enum' as word for manifest constant. IMHO there is much better keyword for this purpose: 'alias'.
> 
> Please notice that using 'alias' will be very consistent:

These are sensible points. Alias was a close contender. There is one issue with it. Consider:

alias x = foo();

To the unwary, this looks like, hey, it's an alias - so whenever I write "x",
it will be automagically rewritten into "foo()". In reality CTFE will be done
against foo().

Before any other proposal comes forth: Enum does not have this problem and has a learning curve that looks more like a flatline. Anonymous enums already have 99.9% of the needed properties, they only didn't accept other types than integrals. Walter made them accept other types than integrals. It is good language design, it solves the problem, and it removes an unjustified limitation. Why the hubbub over this?

We have so much bigger fish (whales!) to fry, it's frivolous to spend time on this one shrimp. Let us move on, please.


-- 

April 08, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #7 from brunodomedeiros+bugz@gmail.com  2008-04-08 09:42 -------
Andrei & co., please don't discuss topics in the bugzilla, especially if they are not related to the bug. That's what the NG is for, no?

As for this issue, isn't there a similar problem with member functions and the this parameter? I.e., suppose you want to write a getter:

  class Foo {
    Bar bar;
    Bar getBar() {
      return bar;
    }
  }

But then the return value should have the same constness of the this parameter, that is, when invoked with a const this, the return value should be const, but when invoked with a mutable this, the return value should be mutable (and similarly for invariant). It's the same concept as your inout but applied to the this parameter. However if the getter is defined as in the code above, it will only work for mutable this.

How is this problem solved currently in D? I had the impression there was a way to do it, but I can't remember it, and I couldn't it find in the documentation either. Or am I mistaken, and there is no way to do it in D (without creating 3 distinct functions)?


-- 

April 27, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #8 from brunodomedeiros+bugz@gmail.com  2008-04-27 13:08 -------
(In reply to comment #7)
> How is this problem solved currently in D? I had the impression there was a way to do it, but I can't remember it, and I couldn't it find in the documentation either. Or am I mistaken, and there is no way to do it in D (without creating 3 distinct functions)?

I've just updated myself on D's const system and on old unread posts about
const. I found that I was mistaken, and that D didn't already provide a
mechanism to do what "scoped const" does.
More unexpectedly, I found that it seems neither Walter nor Andrei had thought
of this issue before (I assume this from Andrei because of his "Looks
interesting." comment... I could be wrong)
I find this woefully disconcerting due to the reasons detailed here:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=70593
. (basicly Walter and Andrei seem to be ignorant of a very good and thorough
article about a const system proposal for Java, unnecessarily hitting the wall
on several issues)

Back to proposal at hand, it seems the main question now is this one:

Walter Bright wrote:
> 
> For me, the question is is solving these issues a large enough problem that justifies adding a rather confusing new variation on const?


-- 

April 27, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #9 from andrei@metalanguage.com  2008-04-27 13:36 -------
(In reply to comment #8)
> More unexpectedly, I found that it seems neither Walter nor Andrei had thought
> of this issue before (I assume this from Andrei because of his "Looks
> interesting." comment... I could be wrong)
> I find this woefully disconcerting due to the reasons detailed here:
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=70593
> . (basicly Walter and Andrei seem to be ignorant of a very good and thorough
> article about a const system proposal for Java, unnecessarily hitting the wall
> on several issues)

We were well aware on the work on Javari. The link to the relevant paper has been on our internal discussion wiki for over a year, and I knew of that work long before that.

If we've been hitting the wall, it must be stupidity more than ignorance.


-- 

April 27, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #10 from brunodomedeiros+bugz@gmail.com  2008-04-27 15:12 -------
(In reply to comment #9)
> We were well aware on the work on Javari. The link to the relevant paper has been on our internal discussion wiki for over a year, and I knew of that work long before that.

Well then, were you previously aware of Javari's romaybe?


-- 

April 28, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1961





------- Comment #11 from schveiguy@yahoo.com  2008-04-28 09:59 -------
(In reply to comment #8)
> Back to proposal at hand, it seems the main question now is this one:
> 
> Walter Bright wrote:
> > 
> > For me, the question is is solving these issues a large enough problem that justifies adding a rather confusing new variation on const?
> 

IMO, using this proposal does not create more confusion, it creates less confusion.  With this proposal, and with properly implemented library functions one does not

- need to worry about whether a template function truly does not molest the
parameters
- need to worry about which function to call
- need to read documentation that contains umpteen different versions of the
same function, which vary just on constancy
- need to do any special casting to get around deficiencies in the library.

And a developer no longer needs to implement umpteen different versions of the same function.

I think the result of this proposal being adopted might be that people feel less intimidated by const, if they no longer have to worry about whether they can call functions with any variable types.  And the results make intuitive sense.


-- 

« First   ‹ Prev
1 2