June 02, 2005
Re: Java String vs wchar[] Was: Re: inner classes
Andrew Fedoniouk wrote:

> 
> Yep, only one:
> "the reference can't be changed"
> I think this is too strict.
> 
> const char[] Dolli = "McArtur"; // fine
> /// marriage happens
>                    Dolli = "O'Connor"; // should also be fine.
> /// but an attempt to break Dolli's "private parts" ((C) Booch) - to change
> /// the value itself - will be an error:
>                      Dolli[0] = '\0'; /// ERROR
Though I think I get your point - this just doesn't feel right to me.
Assume for a moment that "const" in D means the simplest thing - the 
reference cannot be changed and the data cannot be changed.  Your 
example can be rewritten as:

char[] Dolli = "McArtur";
const char[] safeDolli = Dolli;
// do things that aren't allowed to change safeDolli
// Dolli gets married
Dolli = "O'Connor"; // safeDolli also is now "O'Connor"

I am all for things being simple, and to me the simplest use of const is 
to make both the reference and the data immutable.

Brad
June 02, 2005
Re: Java String vs wchar[] Was: Re: inner classes
On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek@psych.ward> wrote:
> On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:
>
> [snip]
>
>> So, exploring a syntax for enabling all 3 options, it looks like we  
>> have:
>>
>> #1 const char[] x
>> #2 char[] const y;
>> #3 const char[] const z;
>
> I'm getting confused now; sorry. Do these three things mean ...
>
>  #1 I can't modify the reference but I can modify the data.
>  #2 I can modify the reference but I can't modify the data.
>  #3 I can't modify the reference and I can't modify the data.

Yep. Assuming we need all 3 of them.
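
For comparison, C++ pointers already distinguish these three cases, so a small C++ sketch may make the mapping concrete. This is only an analogy: D arrays are not C++ pointers, and the function names below are made up for illustration.

```cpp
#include <cassert>

// #1: constant reference (pointer), mutable data.
inline char first_after_write(char* const p) {
    assert(p != nullptr);
    p[0] = 'X';          // OK: the data may be changed
    // p = nullptr;      // error: the pointer itself is const
    return p[0];
}

// #2: mutable reference (pointer), constant data.
inline char first_after_rebind(const char* p, const char* other) {
    assert(other != nullptr);
    p = other;           // OK: the pointer may be re-pointed
    // p[0] = 'X';       // error: the data is const when accessed through p
    return p[0];
}

// #3: constant reference, constant data.
inline char first_read_only(const char* const p) {
    assert(p != nullptr);
    // p = nullptr;      // error: the pointer is const
    // p[0] = 'X';       // error: the data is const
    return p[0];
}
```

Uncommenting any of the marked lines should make the compiler reject the function, which is the kind of compile-time enforcement being discussed here.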

Regan
June 02, 2005
Re: Java String vs wchar[] Was: Re: inner classes
Regan Heath wrote:
> So, exploring a syntax for enabling all 3 options, it looks like we have:
> 
> <snip />
> 
> 1 - have a constant reference, to non-constant data
> 2 - have a non-constant reference, to constant data
> 3 - have a constant reference, to constant data 


What about:

#1 char const[] x;
#2 const char[] y;
#3 const char const[] z;

It's consistent with the way D declarations are parsed by myBrain(tm)


-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/
June 02, 2005
Re: Java String vs wchar[] Was: Re: inner classes
"Brad Beveridge" <brad@somewhere.net> wrote in message 
news:d7o3d9$2fqb$1@digitaldaemon.com...
> Andrew Fedoniouk wrote:
>
>>
>> Yep, only one:
>> "the reference can't be changed"
>> I think this is too strict.
>>
>> const char[] Dolli = "McArtur"; // fine
>> /// marriage happens
>>                    Dolli = "O'Connor"; // should also be fine.
>> /// but an attempt to break Dolli's "private parts" ((C) Booch) - to change
>> /// the value itself - will be an error:
>>                      Dolli[0] = '\0'; /// ERROR
> Though I think I get your point - this just doesn't feel right to me.
> Assume for a moment that "const" in D means the simplest thing - the 
> reference cannot be changed and the data cannot be changed.  Your example 
> can be rewritten as:
>
> char[] Dolli = "McArtur";
> const char[] safeDolli = Dolli;
> // do things that aren't allowed to change safeDolli
> // Dolli gets married
> Dolli = "O'Connor"; // safeDolli also is now "O'Connor"
>
> I am all for things being simple, and to me the simplest use of const is 
> to make both the reference and the data immutable.

Take a look at this:

for( const int* p = ...; p < end; ++p )
 {
 }

You can enumerate but you cannot change.
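
Spelled out as a complete C++ function, the loop might look like the sketch below. The name `sum` and its parameters are illustrative only, not taken from any library.

```cpp
#include <cassert>

// Walks [begin, end) through a pointer to const int: the pointer moves,
// but the elements cannot be written through it.
inline long sum(const int* begin, const int* end) {
    assert(begin <= end);
    long total = 0;
    for (const int* p = begin; p < end; ++p) {
        total += *p;     // reading is allowed
        // *p = 0;       // error: cannot assign through a pointer to const
    }
    return total;
}
```

The pointer `p` is advanced freely while the data it visits stays protected, which is case #2 from the list of options.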

Again, D already has mechanisms for practical implementations of const
references, e.g. in, inout and out for parameters,

but there are no convenient and effective ways to protect reference values.

Also, if const always meant a const reference to const data, you would not be able to write:

foo( out const char[]  ), which is a rare but desirable use case.

Andrew.
June 02, 2005
Re: Java String vs wchar[] Was: Re: inner classes
"Regan Heath" <regan@netwin.co.nz> wrote in message 
news:opsrrmmoii23k2f5@nrage.netwin.co.nz...
> On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek@psych.ward> wrote:
>> On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:
>>
>> [snip]
>>
>>> So, exploring a syntax for enabling all 3 options, it looks like we 
>>> have:
>>>
>>> #1 const char[] x
>>> #2 char[] const y;
>>> #3 const char[] const z;
>>
>> I'm getting confused now; sorry. Do these three things mean ...
>>
>>  #1 I can't modify the reference but I can modify the data.
>>  #2 I can modify the reference but I can't modify the data.
>>  #3 I can't modify the reference and I can't modify the data.
>
> Yep. Assuming we need all 3 of them.
>
> Regan

Let's just have

#1 I can't modify the reference but I can modify the data.

It is quite enough.

Cases where you need #2 and #3 are very rare in C++ and
can always be expressed by other means.

E.g. in D, if you want a non-changeable reference field in a class,
you can always do

class Foo
{
   private char[] _bar;

   const char[] bar() { return _bar; }
}

and that's it.
June 02, 2005
Re: Java String vs wchar[] Was: Re: inner classes
Andrew Fedoniouk wrote:

> You can enumerate but you cannot change.
> 
> Again, there are mechanisms for practical implementations of const 
> references in D now
> e.g.: in, inout, out for parameters
> 
> but there are no convenient and effective ways to protect reference values.
> 
> Also if you have const ref on const data then you will not be able to do:
> 
> foo( out const char[]  ) which is a rare but desirable use case.
> 
You have convinced me :)
Const reference + const data is too simplistic.

Brad
June 02, 2005
Re: Java String vs wchar[] Was: Re: inner classes
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message 
news:d7o4tf$2h13$1@digitaldaemon.com...
>
> "Regan Heath" <regan@netwin.co.nz> wrote in message 
> news:opsrrmmoii23k2f5@nrage.netwin.co.nz...
>> On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek@psych.ward> 
>> wrote:
>>> On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:
>>>
>>> [snip]
>>>
>>>> So, exploring a syntax for enabling all 3 options, it looks like we 
>>>> have:
>>>>
>>>> #1 const char[] x
>>>> #2 char[] const y;
>>>> #3 const char[] const z;
>>>
>>> I'm getting confused now; sorry. Do these three things mean ...
>>>
>>>  #1 I can't modify the reference but I can modify the data.
>>>  #2 I can modify the reference but I can't modify the data.
>>>  #3 I can't modify the reference and I can't modify the data.
>>
>> Yep. Assuming we need all 3 of them.
>>
>> Regan
>
> Let's just have
>
> #1 I can't modify the reference but I can modify the data.

Sorry, the above should read:
#2 I can modify the reference but I can't modify the data.



>
> It is quite enough.
>
> cases when you need #2 and #3 are very rare in C++ and
> always can be expressed by other methods .
>
> E.g. in D if you want to have non-changeable reference field in the class
> you can always do
>
> class Foo
> {
>    private char[] _bar;
>
>    const char[] bar() { return _bar; }
> }
>
> and this is it.
June 03, 2005
Re: Java String vs wchar[] Was: Re: inner classes
Walter wrote:
> "Andrew Fedoniouk" <news@terrainformatica.com> wrote in message
> news:d7gtvf$qs0$1@digitaldaemon.com...
> 
>>java.lang.String class has a) methods b) String owns buffer - it controls
>>buffer.
>>
>>In D is possible:
>>int[char[]] map;
>>char[] s = "something";
>>map[s] = 1;
>>s[0] = '?'; // I have no idea what result will be. sure not good.
>>
>>And you can bump into such problem quite easily in D. I personally
>>did many times. And too hard to find source sometimes.
>>
>>In Java such collision is not possible in principle: String is final and
>>immutable.
> 
> 
> A number of languages use the immutable string idiom, and its corollary
> "always implicitly copy the string when writing to it". They all share
> another common characteristic - they're slow, and they're slow in a manner
> that is *not fixable*. And they're not just slower by a factor, many
> algorithms run *exponentially* slower because of the copying.
> 
> D must be fast, and the only way to be fast with strings (and arrays) is to
> not have the language implicitly copy them, but to allow the programmer the
> flexibility to copy or not copy. To know when to copy, use the Copy On Write
> principle (COW). That is, if you're not *sure* you've got the only copy of a
> string, .dup it before modifying it.
> 
> So why isn't that just as bad as the languages that implicitly copy on
> write? The answer is that often, you know that you are the sole owner, such
> as:
> 
>     char[] s = new char[10];
>     for (i = 0; i < 10; i++)
>         s[i] = 'c';
> 
> Those other languages are doomed to make 10 copies of s. The D programmer
> needs to make 0 copies.
> 
> As to your example above, when you pass a reference to a string to an
> associative array, then you aren't the sole owner of that string anymore.
> Don't change it. .dup it.
> 
> 
If D had a standard string class there wouldn't be any problem!

The string class would implement immutable strings without COW.

That's what you need in 99% of all applications.
Applications that are an exception to the rule should use char arrays 
with dup if needed.

That's a win-win situation - and no performance hit.
The only thing that people need to get used to is that "normal" strings 
are immutable, but that's not really hard to accept.

Cheers,
Jan
June 06, 2005
Re: Java String vs wchar[] Was: Re: inner classes
"kris" <fu@bar.org> wrote in message news:d7jemq$k2n$1@digitaldaemon.com...
> Ben Hinkle wrote:
>>>>
>>>>>And what will be your advice then for:
>>>>>
>>>>>class Url {
>>>>>  char[] _hostname;
>>>>>  char[] hostname() { return _hostname; }
>>>>>}
>>>>>
>>>>>_hostname should not be changeable nor intentionally
>>>>>nor accidentally.
>>>>>hostname access pattern is primarily read. But it could possibly be
>>>>>passed in some third party functions.
>>>>>
>>>>>I am serious. I really want to know how to design it better.
>>>>
>>>>Third party functions should follow the COW principle too. They should not
>>>>modify strings that they don't know they are the owner of.
>>>
>>>Yes, and cyclists shouldn't run red lights either.
>>>
>>>We have to code in a world in which many people using our libraries don't
>>>care about what they 'should' do; they use anything that seems like an
>>>expedient idea at the time. Yes I know that not following the CoW rules is
>>>dangerous, but it's not as dangerous as cyclists running red lights and they
>>>continue to do that.
>>
>>
>> hmm. around here it isn't the cyclists that run red lights - it's the 
>> things with 4 wheels and that unused pedal called the "brake". :-P
>>
>> But more to topic I'm with Walter that when you look at the big picture 
>> COW is a reasonable balance of trade-offs. The only suggestion I have is 
>> to put COW more front-and-center in the array help so that people see it 
>> from the start and it becomes second nature. Compiler protection against 
>> malicious code isn't that important to me since people will go out of 
>> their way to write malicious code no matter what the compiler does. I'm 
>> more worried about the accidental D-newbie who doesn't know about arrays 
>> or COW. For those cases talking about COW right away in the doc will 
>> decrease the likelihood of newbie errors.
>
>
> Ben; Walter;
>
> I think perhaps you're missing a significant point being made? CoW is not 
> the issue at stake ~ instead, what's being asked for is a mechanism to 
> /enforce/ CoW.
>
> For example: the little example above should not be dup'ing the content 
> before return, if it's only being used for reference (read-only) purposes 
> by both parties (caller and callee). I think we can all agree on that? 
> Yes?
>
> What's being asked for is a means whereby the compiler will 'prohibit' 
> some other caller from using the returned array as a writable lValue; at 
> compile time. That is, the CoW should be performed by the caller (not the 
> callee), /if and when the caller needs to perform a write upon it/. And 
> only at that time.

*exactly*.

>
> Again, CoW is not being questioned. It's the total lack of enforcement 
> that would be good to do something about. The compiler goes out of its way 
> to catch out-of-bounds errors WRT arrays ~ we're asking for something 
> similar here to avoid a source of silly, easily preventable, and hard to 
> track down bugs. It would add some noticeable weight to any story regarding 
> robustness.
>
> Turn things around for a minute, and assume such a facility was available. 
> It's not hard to see how this would be viewed in a most favourable light. 
> And there's no downside for the code, or for the developer. Best of all 
> worlds?
>

The proposed const costs nothing at runtime. Even better, it helps
to reduce unnecessary allocations.
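
To make the allocation point concrete, here is a C++ sketch of the Url example quoted above. It is an analogy only: `std::string` and a const reference stand in for D's `char[]` and the proposed const, and `upper_first` and `example` are made-up names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class mirroring the Url example from the quoted post.
class Url {
public:
    explicit Url(std::string host) : hostname_(std::move(host)) {}

    // Returns a read-only reference: no allocation, no copy.
    const std::string& hostname() const { return hostname_; }

private:
    std::string hostname_;
};

// A caller that needs to modify the value makes its own copy first,
// which is the copy-on-write step moved to the caller.
inline std::string upper_first(const Url& url) {
    // url.hostname()[0] = 'X';          // error: the reference is to const
    std::string copy = url.hostname();   // explicit copy, only when writing
    if (!copy.empty() && copy[0] >= 'a' && copy[0] <= 'z') {
        copy[0] = static_cast<char>(copy[0] - 'a' + 'A');
    }
    return copy;
}

// Usage: the writer pays for one copy, the Url itself is never modified.
inline void example() {
    Url url("example.org");
    assert(upper_first(url) == "Example.org");
    assert(url.hostname() == "example.org");
}
```

Read-only callers never pay for a copy, and the getter needs no defensive dup.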

Let's take a look at some code fragments from Phobos:

module std.openrj -----------------------------------

class Record
{
    Field[] fields() { return m_fields.dup; } // just in case?
    // probably the following is better?
    // const Field[] fields() { return m_fields; }
}

class Database
{
    Record[] records() { return m_records.dup; } // the same
    Field[]  fields()  { return m_fields.dup; }
}

std.file ----------------------------------------------

class FileException : Exception
{
    this(char[] name, uint errno)
    {
        char* s = strerror(errno);              // I have no idea
        this(name, std.string.toString(s).dup); // what is going on here.
        this.errno = errno;
    }
}

void listdir(char[] pathname, bool delegate(char[] filename) callback)
{
    ....
    int len = std.string.strlen(fdata.d_name);
    // is dup really needed here???
    // allocation of a new string on each entry! Doh!
    if (!callback(fdata.d_name[0 .. len].dup))
    ....
}

std.loader ----------------------------------------------

public class ExeModuleException : Exception
{
    this(uint errcode)
    {
        super(std.string.toString(strerror(errcode)).dup); // why?
    }
}

std.socket ----------------------------------------------

void populate(protoent* proto)
{
    type = cast(ProtocolType)proto.p_proto;
    name = std.string.toString(proto.p_name).dup; // why?
    ....
    aliases = new char[][i];
    for(i = 0; i != aliases.length; i++)
    {
        aliases[i] = std.string.toString(proto.p_aliases[i]).dup; // what for?
    }
    ....
}
----------------------------------------
etc.
June 06, 2005
Re: Java String vs wchar[] Was: Re: inner classes
[snip]
>Let's take a look in some code fragments of Phobos:
>
>module std.openrj -----------------------------------
>
>class Record
>{
>    Field[] fields() { return m_fields.dup;  } // just in case?
>    // probably following is better?
>    // const Field[] fields() { return m_fields;  }
>}
>
>class Database
>{
>    Record[]  records()  {   return m_records.dup;   } // the same
>    Field[]      fields()      {   return m_fields.dup;  }
>}
>
>std.file----------------------------------------------
>
>class FileException : Exception
>{
>    this(char[] name, uint errno)
>    { char* s = strerror(errno);             //  I have no idea
> this(name, std.string.toString(s).dup); //  what is going on here.
> this.errno = errno;
>    }
>}
>
>void listdir(char[] pathname, bool delegate(char[] filename) callback)
>{
> ....
>     int len = std.string.strlen(fdata.d_name);
>     if (!callback(fdata.d_name[0 .. len].dup)) // is dup really needed here???
>             // allocation of new string on each entry! Doh!
>
> ....
>}
>
>std.loader ----------------------------------------------
>public class ExeModuleException : Exception
>{
>    this(uint errcode)
>    {
> super(std.string.toString(strerror(errcode)).dup); // why?
>    }
>}
>
>std.socket ----------------------------------------------
>
>void populate(protoent* proto)
> {
>  type = cast(ProtocolType)proto.p_proto;
>  name = std.string.toString(proto.p_name).dup; // why?
>....
>   aliases = new char[][i];
>   for(i = 0; i != aliases.length; i++)
>   {
>    aliases[i] = std.string.toString(proto.p_aliases[i]).dup; // what for?
>   }
>....
>
>}
>----------------------------------------
>etc.

Are the comments in the code above your own editorial remarks, or are they actually in the
code? I'd say someone needs to look at Phobos to clean up the dups. If you've
already done the sweep then sending Walter the fixes would be helpful. Phobos
can contain non-D programming styles occasionally - each module has a strong
indication of the author's attitudes IMHO.

ps - for kicks this weekend I've been adding a parameter to the MinTL containers
to indicate read-only vs read-write. For example:

struct List(Value, bit ReadOnly = false) {
    static if (!ReadOnly) {
        void addTail(Value v){...}
        .. other functions that modify the list ...
    }
    .. functions that don't modify the list ...
}
You get a read-only view of a container by using the "readonly" property. I'll
be finishing this stuff up soon and post to D.dtl later in the week.
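
A rough C++ analogue of the same idea, for readers who think in templates: this is a sketch under assumptions, not MinTL's actual design. The class layout, the `std::vector` storage and the copying `readonly()` are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container, loosely following the List(Value, ReadOnly) idea.
template <typename Value, bool ReadOnly = false>
class List {
public:
    List() = default;
    explicit List(std::vector<Value> items) : items_(std::move(items)) {}

    // Mutating member: compiles only when used on a writable list.
    void addTail(const Value& v) {
        static_assert(!ReadOnly, "addTail is not available on a read-only List");
        items_.push_back(v);
    }

    // Non-mutating members are available in both variants.
    std::size_t length() const { return items_.size(); }
    const Value& operator[](std::size_t i) const { return items_[i]; }

    // Returns a read-only copy of this list. A real implementation would
    // presumably share storage instead of copying.
    List<Value, true> readonly() const { return List<Value, true>(items_); }

private:
    std::vector<Value> items_;
};
```

Calling `addTail` on a `List<int, true>` fails at compile time, because the member function body is only instantiated when it is actually used.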