July 03, 2005
"Anders F Björklund" <afb@algonet.se> wrote in message news:da8ivk$12cm$1@digitaldaemon.com...
> Stefan Zobel wrote:
>
>> For what it's worth, that String isn't immutable in Java/C# sense, is it? Every time you hand out a char[], you give the caller a reference to your internal member. Should be a "return str.dup;" :-(
>
> This goes for most implementations of toString(), I guess ? This actually would be the key issue we are debating here...
>
> The char[] that is given out *is* read-only, there's just no way of indicating that - except for the Gentlemen's Agreement.

Exactly!

You can theoretically reproduce String (an immutable string value), but it will be *completely useless and ineffective*.

What is the point of having a String which you cannot use, e.g., here:

FILE * fopen([const] char* path, [const] char* flags);

?


>
>
> And, no, it should NOT be a dup (as my "StringBuffer" did do that).
> It's _supposed_ to return a reference to the (immutable) contents.
>
> And you could still peek and poke it by casting to char* and changing the memory inside, but that's beside the point here.
>
>
> I don't think that D needs a string class, but "something else" ? (as in Java and C#, the class internals are protected by the VM)
>
> I like the current strings, it's just that they are (or: could be) a little dangerous - just like a raw char* in C could be, I guess.

Agreed, String as a class is not needed in D.
But String as an entity consists of two things:

string value : const char[], char#[], whatever you like,

and string manipulation buffer (array): char[].

And this *duo* is a must for success.

>
>
> But for C/C++ there is "const" - D needs something similar, but better.

D already has the better part: in, inout, out.
These combined with const *are* better from a notational point of view
than what C/C++ currently has.












July 03, 2005
"Dave" <Dave_member@pathlink.com> wrote in message news:da931u$1fto$1@digitaldaemon.com...
> "Ben Hinkle" <ben.hinkle@gmail.com> wrote in message news:da8qa5$17op$1@digitaldaemon.com...
>>
>> "Dave" <Dave_member@pathlink.com> wrote in message news:da8nqk$161t$1@digitaldaemon.com...
>>>
>>> "Walter" <newshound@digitalmars.com> wrote in message news:da6k71$11rs$1@digitaldaemon.com...
>>>>
>>>> "Ben Hinkle" <bhinkle@mathworks.com> wrote in message news:da4afr$24jc$1@digitaldaemon.com...
>>>>> How would this cover the following case that Walter's proposal covers:
>>>>> class A { int x; }
>>>>> void foo(in A obj){
>>>>>   int y = obj.x;
>>>>>   // do whatever you want...
>>>>>   assert( obj.x == y );
>>>>> }
>>>>> void main() {
>>>>>   A obj = new A;
>>>>>   foo(obj); // says it doesn't change obj
>>>>> }
>>>>> Is it practical to force A to implement some immutability just because foo doesn't change its inputs? I'm not saying Walter's proposal is perfect. It's very strong for foo to say no other thread or reference will change obj since it has no control over what other threads are doing. I would be taking a pretty much blind leap of faith when declaring an object reference as 'in' and saying that no other thread is stepping on that object during the call lifetime.
>>>>
>>>> You're right. I have misgivings about it for that reason.
>>>>
>>>> On the other hand, look at C++ "const". Nothing about "const" says that some other thread or reference cannot change the value out from under you at any moment. Furthermore, it can be cast away and modified anyway. The semantic value of C++ "const" is essentially zip. This is why I am often flummoxed why it is given such weight in C++.
>>>>
>>>
>>> It is equally probable that another thread could step on obj within the foo() call lifetime as it is now, regardless of any optimizations that the "in" proposal allows, so I don't think that concern should be a show stopper. Either way the results are the same - sporadically incorrect results, and either way it is going to be a bear to debug and proper synchronization by the /user/ of obj and foo() will be the way to fix the problem - same as any other multi-threading issue.
>>>
>>> I see no problem with just removing the part of the proposal that promises that no other /thread/ will change obj, because I don't think that is practical in any sense at the language implementation level for a language with the goals of D (D doesn't offer or promise auto-synchronization in any other case, does it?).
>>>
>>> - Dave
>>
>> You're right - the threading model shouldn't enter into it. I'm liking Walter's suggestion more now. Earlier I thought it was too strong a condition but now I can see, for example, lots of places in phobos where 'in' can be used.
>
> I think Walter's idea is a great idea and a good compromise!
>
> Truth be told, I actually think the following would be the best:
>
> [none]/in/out/inout - [none] and 'in' would follow the new proposal with regard to 'immutable', and /all/ would allow the new optimizations that I think Walter has in mind. The exceptions would be pointer params. and functions declared extern or exported, which would simply operate as they do now, where the expectation is that they are treated differently anyway.
>

1) How will 'in' solve this:

class Record
{
    Field[] fields()
    {
        return m_fields.dup; // this should not be needed, should it?
                             // but it must be here in current D.
    }
}

2) 'in' does not allow distinguishing these cases:

void toUpper(in char[] str_in_place);
char[] toUpper(in const char[] str);

and selecting the optimal implementation.

> volatile/volatile in/volatile out/volatile inout - 'volatile', 'volatile in', 'volatile out' and 'volatile inout' params. would basically operate exactly as they do now. Just like other statements in any other scope, a "volatile" param. would mean "inhibit optimizations on the variable that may affect memory reads/writes". Not even a new keyword..
>
> The whole justification for these new defaults is because the cases where "volatile" would have to be used for the code to operate correctly are a small minority, so shouldn't the default go with what benefits the great majority of cases?
>
> Also, the same justification goes for making the default [none] and 'in' operate the same way w.r.t. the new proposal - majority of cases rules. Another big plus would be not having to litter code with 'in' (like C++ code is littered with 'const').
>

"littered with 'const'"....

Well, it depends... If a European sees a stone garden in Japan, he will find it littered with stones.

For me personally, code which does not use const looks unprofessional and written by a young hacker in 'fire-and-forget' mode: unmaintainable, unreliable, unoptimized.

Someone could say that:

private int something1
public int something2

is littered with private and public.
Does it *really* mean that it is littered?

Andrew.



July 03, 2005
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:da93ap$1g54$1@digitaldaemon.com...
>
> "Walter" <newshound@digitalmars.com> wrote in message news:da88rb$ruj$1@digitaldaemon.com...
>>
>> "AJG" <AJG_member@pathlink.com> wrote in message news:da7vvd$lm6$1@digitaldaemon.com...
>>> >Yet the value can change anyway. Another reference can change it, and another thread can change it. It isn't "constant" at all, and cannot be relied on to be constant, even if your program is 100% "const-correct". The C++ const is *completely useless* to the compiler optimizer for that reason.
>>>
>>> I can't fathom that the compiler can't make use of the fact that you are telling it: "The following function body will not modify [variable] at all, ever."?
>>
>> Consider the following C++ function:
>>
>>    int foo(const int *p, int *q)
>>    {
>>        int i = *p;
>>        *q = 3;
>>        return *p;
>>    }
>>
>> At the return statement, could we replace *p with i, thereby saving ourselves a dereference operation? After all, p points to "const" data, right? Wrong. Consider calling foo() this way:
>>
>>    int a[3];
>>    ...
>>    foo(a, a);
>>
>> Since p and q now hold the same value, *q = 3; will change the contents of what p points to.
>>
>> This code is perfectly legal C++, and is "const-correct". It happens often enough in real code, too, as I discovered when trying to implement this optimization.
>
> Walter, it is enough for me to know that
>
> int foo(const int *p, int *q)
>   {
>       int i = *p;
>       *q = 3;
>
>       .... 200 lines of code
>
>       return *p;
>   }
>
> will not change *p. I am not asking for optimization here. Optimization happened before - when I wrote this function declaration using const parameters.
>

The whole problem is that, with the C++ rules, the optimizations cannot even be done by the code generator. You may think you've optimized something with const, but in reality the code generator cannot make use of it because of how const is implemented in C++, and the compiler cannot figure out with 100% certainty what is being done with referenced data.

> I understand that you are focused on optimization as a compiler and code generator writer. This is perfectly good and thank you for that.
>

What I think Walter has in mind with his "explicit in" proposal for D is "constness" while still allowing for the optimizations that C++ can't do, or at least can't do without making the compiler much more complicated.

> But const is a matter of language design and not of optimization. Well, not exactly: if I write both
>
> int foo(const int *p, int *q)
> int foo(int *p, int *q)
>
> then I will make an act of optimization on the design level.
>
> Again: from the point of view of optimization, 'public' and 'private' attributes are almost worthless.

'private' and 'package' data members may have some optimizations applied because of the protection attribute scope rules.

private and package methods can't be virtual, so there are some optimizations having to do with inlining and the vtable that can be (and are) done by the compiler, because 'private' or 'package' guarantees that those are not virtual methods.

> But despite that, they are in the language. const is the same - it is a contract. Any argument that you can *intentionally* break constness is true. The same applies to visibility attributes.
>
> Andrew.
>
> 


July 03, 2005
In article <da88rb$ruj$1@digitaldaemon.com>, Walter says...
>
>Consider the following C++ function:
>
>    int foo(const int *p, int *q)
>    {
>        int i = *p;
>        *q = 3;
>        return *p;
>    }
>
>At the return statement, could we replace *p with i, thereby saving ourselves a dereference operation? After all, p points to "const" data, right? Wrong. Consider calling foo() this way:
>
>    int a[3];
>    ...
>    foo(a, a);
>
>Since p and q now hold the same value, *q = 3; will change the contents of what p points to.
>
>This code is perfectly legal C++, and is "const-correct". It happens often enough in real code, too, as I discovered when trying to implement this optimization.

Why do I suddenly feel like we're moving towards defining 'restrict' in D? :)


Sean


July 03, 2005
"Walter" <newshound@digitalmars.com> wrote in message news:da888e$rhs$1@digitaldaemon.com...
>
> "Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:da818n$md2$1@digitaldaemon.com...
>>
>> "Walter" <newshound@digitalmars.com> wrote in message news:da7u7k$kkh$1@digitaldaemon.com...
>> >
>> > "Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:da7s6r$jcd$1@digitaldaemon.com...
>> >> "I'll give you my slice and don't forget to dup it as it is not
>> >> yours."
>> >> This does not work in serious programming as
>> >> slice creation happens in one place and its consuming
>> >> in galaxy far far away and code of three developers
>> >> happens in between. It is EXTREMELY difficult to debug.
>> >> I know. It happened to me once in Harmonia and I wasted
>> >> almost three days to catch it. Sigh...
>> >> And I was designing it by myself - not even in team...
>> >
>> > The COW rule for when to .dup is:
>> >
>> >    When you're going to change the data, and you're not *sure* you're the sole handle to it.
>>
>> In real projects and in real life this *not sure* is almost always.
>
> True, but it's also true that most of the time, one doesn't care because one has no need to modify the data.

Absolutely true. This is why, e.g., string literals should have type const char[] by default and not just char[].
This is a stupid design mistake in C++:
"hello"[0] = 0;
will compile and run on some machines and not on others.


>
>> > I don't see why the need is there for m_fields.dup in openrj.d. Preemptively duping something by the provider is the wrong approach to COW. Duping should be done by the consumer of it, and then only if the consumer modifies it.
>>
>> I know this theory.
>>
>> The whole team are C++ programmers. They get used to const.
>> And you know.... this is really a "code culture" thing -
>> respect of developers who are working with you and using
>> your code.
>> const - it is mine - don't touch it. no const - it is yours.
>> And C++ compiler helps us a lot here to prevent stupid mistakes.
>>
>> All the most respected C++ guys are united in 'const is good':
>> http://artima.com/intv/const.html
>> Shall I tell my team that const is wrong because
>> "it is useless for optimizing"?
>> I will never do that, sorry.
>
> C++ const benefits are greater than zero, even though it is useless to the optimizer as it gives no meaningful semantic information.
>
>> Pointers are so effective but dangerous. Without const they are more dangerous by an order of magnitude. The same applies to D slices.
>
> I find it curious that many features of C++ have leaked out into other languages, but not const.

Walter, the need for const appears together with raw pointers. It is critical to have const together with pointers (slices included).

If some language takes ++ from C but not pointers then
it does not need const so much.

There is no language in active use which has raw pointers and has no concept of pointers to const data.

Even C: it had no const at the very beginning,
but it has had it since ANSI C (C89).

const is simply a type designator:

const typename* and
typename*

*are* different types - with different behavior.

Combining them into one entity, as you did in the design of D, is exactly the same as having only one int type without an unsigned counterpart.
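
To make the "different types, different behavior" point concrete, here is a small C++ sketch. The `describe` overloads are hypothetical helpers invented for this illustration; the type traits are standard `<type_traits>`.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helpers: overload resolution treats the two pointer
// types as distinct and picks a different function for each.
inline const char* describe(const int*) { return "pointer to const"; }
inline const char* describe(int*)       { return "pointer to mutable"; }

// The type system sees two different types.
static_assert(!std::is_same<const int*, int*>::value,
              "const int* and int* are different types");

// Adding const is an implicit conversion; dropping it needs a cast.
static_assert(std::is_convertible<int*, const int*>::value,
              "int* converts to const int*");
static_assert(!std::is_convertible<const int*, int*>::value,
              "const int* does not convert to int*");

// Small usage check for the overloads above.
inline void describe_demo() {
    int x = 1;
    const int cx = 2;
    assert(std::strcmp(describe(&x), "pointer to mutable") == 0);
    assert(std::strcmp(describe(&cx), "pointer to const") == 0);
}
```

The asymmetric conversion is what lets a const-taking function accept both kinds of argument, while a mutating function refuses read-only data at compile time.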

>
> I agree with you that C++ has a culture of "const is good".
>
> I'd like to find something better, something that works (i.e. that gives useful semantic information).
>
> I also am not understanding how const would help with COW mistakes.
>

What are the "COW mistakes"? I don't understand this exactly....

Anyway....

void  toLower(in char[] str)
char[]  toLower(in const char[] str)


Having both is a matter of optimization.
The first does the transformation in place; the second allocates
a new string.

Do these two together make sense? Definitely, yes.





July 03, 2005
"Anders F Björklund" <afb@algonet.se> wrote in message news:da8but$t3j$1@digitaldaemon.com...
> Walter wrote:
>
> > So I am not understanding why const or immutable is needed for D strings. I've written a number of string processing programs in D (including a macro text processor, and the silly program that transforms newsgroup postings into the "archive" pages), and didn't have any difficulty with D strings. They went together with much less effort and far fewer bugs than their C++ counterparts.
>
> No offense, but didn't you design the language and string handling ? :-)

Yup (with some help from others like Jan Knepper and Arcane Jill), but it did go through a few iterations.

> I've written several segfaults by changing "readonly" literals, and gotten some strange results by setting hash string keys without duping them first (and then later changing the value of the string, that is...) I also think it's pretty easy to "mess up" when using any array slices.

There is a mindset to COW that takes some getting used to.
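
A rough sketch of that mindset, written in C++ rather than D so that the sharing is explicit. It assumes a `std::shared_ptr<const std::string>` as the shared handle; `SharedText` and `to_upper_cow` are names made up for this example, not anything from Phobos.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>

// Shared, read-only handle: many holders, nobody writes through it.
using SharedText = std::shared_ptr<const std::string>;

// Hypothetical consumer: returns an upper-cased version of `text`.
// Following the COW rule, it duplicates only when it must modify.
SharedText to_upper_cow(const SharedText& text) {
    for (std::size_t i = 0; i < text->size(); ++i) {
        unsigned char c = static_cast<unsigned char>((*text)[i]);
        if (std::islower(c)) {
            // First character that needs changing: copy now,
            // then modify only the private copy.
            auto copy = std::make_shared<std::string>(*text);
            for (std::size_t j = i; j < copy->size(); ++j) {
                unsigned char d = static_cast<unsigned char>((*copy)[j]);
                (*copy)[j] = static_cast<char>(std::toupper(d));
            }
            return copy;  // converts to shared_ptr<const std::string>
        }
    }
    return text;  // nothing to change, so no copy was made
}
```

The producer never copies; the consumer pays for a copy only on the path where it actually writes, and the original stays intact for every other holder.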

> It would be great if there were some "extra mechanism" enforcing CoW.

I agree, but I am not convinced that "const" is it.

> > I do understand the desire for const. But there are many different ways to approach the problem, and I'm not willing to just duplicate the C++ const with all its weaknesses and problems. There's got to be a better way.
>
> It would be nice to come up with an approach that would:
> a) default local strings/arrays to read-write
> b) default external parameters to read-only
>
> "in"/"out" doesn't really work, since it only affects the array pointer/length and not the actual characters. Maybe something like "char[out] string", but that looks a little like a variable...
>
> Anyway, I wouldn't want to see "const" (or "readonly", "immutable")
> all over the place either - better to have a "readwrite" or "mutable"
> keyword for the (few?) cases when you *do* want to modify the contents ?

Yes.

> I guess it's the D Gentlemen's Agreement, while keeping on searching. (it's easy enough* to implement java.lang.String and StringBuffer in D, but I'm still hoping there is some compromise between "C" and "Java")
>
> --anders
>
> * PS. Here's one such hack (mostly for the discussion really) http://www.algonet.se/~afb/d/dcaf/html/class_string.html http://www.algonet.se/~afb/d/dcaf/html/class_string_buffer.html


July 03, 2005
In article <da89e7$s7m$1@digitaldaemon.com>, Walter says...
>
>
>"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:da7ulp$kub$1@digitaldaemon.com...
>> ooohh... could you first provide me an option to *declare* e.g. slice as immutable thus I'll be able to return it from function without duping it? *That* would be optimization.
>
>In following COW principles, you do not need to dup the slice when returning it. The producer of the value doesn't need to dup it, the *consumer* does, and only if the *consumer* modifies it. Taking a slice of an array does not modify it in any way, so no dup is required.
>
>So I am not understanding why const or immutable is needed for D strings. I've written a number of string processing programs in D (including a macro text processor, and the silly program that transforms newsgroup postings into the "archive" pages), and didn't have any difficulty with D strings. They went together with much less effort and far fewer bugs than their C++ counterparts.

To me, C++ 'const' is just a means of indicating, in code, that a value is not intended to be modifiable.  So if I have a string property in a class, I can return a const reference to my internal copy and be somewhat assured that the client will have to make a conscious attempt if he wants to modify the data. This is really only useful for strings IMO in D, as they can be expensive to copy and there's no way to restrict their functionality.  But I'm not sure it's worth adding a language feature just to help idiot-proof string use.
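
The C++ pattern described above might look roughly like this. `Record` and `shout` are hypothetical names used only for the sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class with a string property.
class Record {
public:
    explicit Record(std::string name) : name_(std::move(name)) {}

    // Hands out a read-only reference to the internal string.
    // No copy is made; modifying it would require a deliberate const_cast.
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// A client that wants different text has to take its own copy first,
// which makes the duplication visible in the code.
inline std::string shout(const Record& r) {
    std::string copy = r.name();
    copy += "!";
    return copy;
}
```

This gives intent and a speed bump, not a guarantee: the same string can still change through `Record` itself or through a cast.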

>I do understand the desire for const. But there are many different ways to approach the problem, and I'm not willing to just duplicate the C++ const with all its weaknesses and problems. There's got to be a better way.

I agree.  Hopefully, something will present itself.


Sean


July 03, 2005
"AJG" <AJG_member@pathlink.com> wrote in message news:da8vgt$1bos$1@digitaldaemon.com...
> In article <da88rb$ruj$1@digitaldaemon.com>, Walter says...
> >
> >
> >"AJG" <AJG_member@pathlink.com> wrote in message news:da7vvd$lm6$1@digitaldaemon.com...
> >> >Yet the value can change anyway. Another reference can change it, and another thread can change it. It isn't "constant" at all, and cannot be relied on to be constant, even if your program is 100% "const-correct". The C++ const is *completely useless* to the compiler optimizer for that reason.
> >>
> >> I can't fathom that the compiler can't make use of the fact that you are telling it: "The following function body will not modify [variable] at all, ever."?
> >
> >Consider the following C++ function:
> >
> >    int foo(const int *p, int *q)
> >    {
> >        int i = *p;
> >        *q = 3;
> >        return *p;
> >    }
>
> >    int a[3];
> >    ...
> >    foo(a, a);
>
> Hm... that's an interesting situation. Ok, say pointers are not considered, because they are inherently unsafe anyway. Let's take only regular in/out/inout variables. Could the same trick be done simply with these?

With simple value variables, no. But that isn't interesting, as there isn't much of any point to "const" value variables as function parameters. With reference parameters, yes, it can happen in the same way.
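
For illustration, the reference-parameter version of the earlier pointer example could be sketched in C++ as follows. `Reads` and `foo_ref` are names invented here; the function records what it sees through the const reference before and after the write.

```cpp
#include <cassert>

// What was observed through the const reference before and after the write.
struct Reads {
    int before;
    int after;
};

// Same shape as foo(const int *p, int *q), but with references.
Reads foo_ref(const int& p, int& q) {
    Reads r;
    r.before = p;  // first read through the const reference
    q = 3;         // write through the non-const reference
    r.after = p;   // second read: cannot be replaced by r.before
    return r;
}

// Calling it with the same variable for both parameters is legal.
inline void foo_ref_demo() {
    int a = 1;
    Reads r = foo_ref(a, a);
    assert(r.before == 1);
    assert(r.after == 3);  // the "const" value changed underneath us
}
```

With two distinct arguments both reads agree, which is exactly why the compiler cannot decide the question from the signature alone.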


> >This code is perfectly legal C++, and is "const-correct". It happens often enough in real code, too, as I discovered when trying to implement this optimization.
>
> It's possible that such code, although legal, is simply bad code that can't be protected against. The function documentation should state which variables shouldn't be the same. I think it's simply undefined behaviour.
>
> I dunno if all optimization should be off simply because of the occasional badly coded, documented and used function.
>
> It's kind of like modifying the iterating array inside a foreach - perfectly valid too, but that's going to be ugly (perhaps that should be made illegal too, btw). Or for example, sending the same variable to multiple parameters of a string-manipulation (str*) function. Consider this:
>
> // char *str = "Some text";
> // str = strcat(str, str);
>
> ---------------------------
>
> Once again, what if pointers are not considered, would this make things easier? If you take out pointers, couldn't the compiler detect if you were sending the same variable to multiple differently-accessible parameters?

The compiler cannot detect it in the general case, as a function may receive a non-const reference to the same data by an arbitrarily complex path.

What you're suggesting is different from C++ "const" - you're suggesting that const actually mean "constant". This is definitely more interesting than C++ const, but then there's the problem of undefined behavior and trying to detect it.
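
One way such an indirect path can arise, sketched in C++. The `Registry` type and `sum_after_reset` function are hypothetical, made up to show the shape of the problem: the caller never passes the same variable twice, yet the data behind the const reference still changes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical object that happens to hold non-const access
// to some data registered with it earlier.
struct Registry {
    std::vector<int>* data = nullptr;

    void reset_all() {
        if (data) {
            for (int& v : *data) v = 0;
        }
    }
};

// The signature promises not to modify `values` through this reference,
// yet the call to reset_all() may change them by another route.
int sum_after_reset(const std::vector<int>& values, Registry& reg) {
    reg.reset_all();
    int total = 0;
    for (int v : values) total += v;
    return total;
}
```

Whether `values` and `reg.data` refer to the same vector depends on what happened elsewhere in the program, possibly in another translation unit, so a per-call check of the argument list would not catch it.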


July 03, 2005
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:da93ap$1g54$1@digitaldaemon.com...
> Walter, it is enough for me to know that
>
> int foo(const int *p, int *q)
>    {
>        int i = *p;
>        *q = 3;
>
>        .... 200 lines of code
>
>        return *p;
>    }
>
> will not change *p. I am not asking for optimization here. Optimization happened before - when I wrote this function declaration using const parameters.
>
> I understand that you are focused on optimization as a compiler and code generator writer. This is perfectly good and thank you for that.

The optimization issue is a symptom of the problem with const, not the problem itself. The inability to make *any* use of const for the compiler to learn something about the program is, to me anyway, indicative that the semantic value of const is not there.

Optimizers work by being able to prove certain properties of code to be true. At a higher level, being able to prove certain things about a program is important for all kinds of code analysis tools, like verifiers, etc. Being able to prove something about a program is also useful for the programmer, as knowing something that must be true about a piece of code is the first step to debugging it.

C++ const doesn't enable proving anything about the const reference. The optimizer can't prove the values won't change, and *neither can the programmer* rely on it. If I see a const reference in C++ code, I don't know if it really is constant or not. As a programmer looking at someone else's code, it offers no value.

> But const is a matter of language design and not of optimization. Well, not exactly: if I write both
>
> int foo(const int *p, int *q)
> int foo(int *p, int *q)
>
> then I will make an act of optimization on the design level.
>
> Again: from the point of view of optimization, 'public' and 'private' attributes are almost worthless. But despite that, they are in the language. const is the same - it is a contract. Any argument that you can *intentionally* break constness is true. The same applies to visibility attributes.

My point (with the example given) is you are NOT breaking const. The code is legal, supported and is const-correct. It's not like const_cast, which is known to be an escape from const, can be grepped for, and can be warned about. There is no commonly accepted C++ convention that says such code is bad form, undefined, etc.

And yes, these things do appear in real, working, production code, as I found out when attempting to use const in the optimizer.

The only way "const" can be constant in C++ is if you follow an informal convention for your own code that is not enforced by the compiler, the language, or any third party whose code you might wish to use. You cannot assume that const is constant.


July 03, 2005
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:da95gq$1j4t$1@digitaldaemon.com...
> 1) How will 'in' solve this:
>
> class Record
> {
>     Field[] fields()
>     {
>         return m_fields.dup; // this should not be needed, should it?
>                              // but it must be here in current D.
>     }
> }

It won't, but then again, I don't see that the .dup is necessary there, as fields() is a producer of a value, not a consumer.

> 2) 'in' does not allow distinguishing these cases:
>
> void toUpper(in char[] str_in_place);
> char[] toUpper(in const char[] str);
>
> and selecting the optimal implementation.

That's correct, there would be no overloading based on const-ness. I know this is commonplace in C++, but frankly I think it's poor design. A function overloaded on const implicitly has significantly different behavior, and so should have a different name.
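
As a sketch of the "different name" alternative, here is how the two behaviors might be split in C++. The names `to_upper_in_place` and `to_upper_copy` are assumptions chosen for this example, not existing library functions.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Modifies the caller's buffer; the name says so.
void to_upper_in_place(std::string& s) {
    for (char& c : s) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
}

// Leaves the argument untouched and returns a new string.
std::string to_upper_copy(const std::string& s) {
    std::string result = s;      // the one allocation happens here
    to_upper_in_place(result);   // reuse the in-place version on the copy
    return result;
}
```

At the call site the reader can tell from the name alone whether the argument survives the call, without checking which overload the argument's const-ness selected.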


> > volatile/volatile in/volatile out/volatile inout - 'volatile', 'volatile in', 'volatile out' and 'volatile inout' params. would basically operate exactly as they do now. Just like other statements in any other scope, a "volatile" param. would mean "inhibit optimizations on the variable that may affect memory reads/writes". Not even a new keyword..
> >
> > The whole justification for these new defaults is because the cases where "volatile" would have to be used for the code to operate correctly are a small minority, so shouldn't the default go with what benefits the great majority of cases?
> >
> > Also, the same justification goes for making the default [none] and 'in' operate the same way w.r.t. the new proposal - majority of cases rules. Another big plus would be not having to litter code with 'in' (like C++ code is littered with 'const').
> >
>
> "littered with 'const'"....
>
> Well, it depends... If a European sees a stone garden in Japan, he will find it littered with stones.
>
> For me personally, code which does not use const looks unprofessional and written by a young hacker in 'fire-and-forget' mode: unmaintainable, unreliable, unoptimized.
>
> Someone could say that:
>
> private int something1
> public int something2
>
> is littered with private and public.
> Does it *really* mean that it is littered?

The idea is that the most common case should be the default. Would you agree that for the vast majority of function parameters, they are read only? That mutable ones are relatively rare? If so, then it makes sense to have the uncommon ones need the extra syntax.

As for myself, since most parameters in C++ code are const, the const everywhere tends to visually clutter up the code, obscuring the other stuff.