July 06, 2007
BCS wrote:
> The one issue I can see with this is where an input is const but may be changed (and .duped) at any of a number of points. The data though only needs to be .duped once.
> 
> |char[] Whatever(const char[] str)
> |{
> | if(c1) str = Mod1(str.dup);
> | if(c2) str = Mod2(str.dup);
> | if(c3) str = Mod3(str.dup);
> | return str;
> |}
> // causes excess duping

My experience with this is:

1) Such cases are unusual

2) In the few cases where they do happen, they are not in that 5% of the code that is a bottleneck

3) If such code is performance critical, there's usually a better way to write it that will yield even better performance than taking repeated passes over the same string. Best performance usually comes by merging all the operations into one pass.
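
A minimal sketch of the "copy at most once" idea, assuming present-day D syntax. The helpers mod1, mod2 and mod3 are hypothetical stand-ins for Mod1..Mod3 and are assumed to edit their buffer in place; the flags c1..c3 are passed as parameters only to keep the example self-contained.

```d
// Hypothetical stand-ins for Mod1..Mod3; each edits its buffer in place.
void mod1(char[] s) { foreach (ref c; s) if (c == ' ') c = '_'; }
void mod2(char[] s) { foreach (ref c; s) if (c >= 'a' && c <= 'z') c = cast(char)(c - 32); }
void mod3(char[] s) { foreach (ref c; s) if (c == '\t') c = '_'; }

const(char)[] whatever(const(char)[] str, bool c1, bool c2, bool c3)
{
    if (!(c1 || c2 || c3))
        return str;           // nothing to change: hand back the input, no copy
    char[] buf = str.dup;     // the only copy, however many steps run
    if (c1) mod1(buf);
    if (c2) mod2(buf);
    if (c3) mod3(buf);
    return buf;
}

unittest
{
    const(char)[] s = "a b\tc";
    assert(whatever(s, false, false, false) is s);    // untouched input comes back as-is
    assert(whatever(s, true, true, true) == "A_B_C"); // all three steps share one copy
    assert(s == "a b\tc");                            // the caller's data is never modified
}
```

This still walks the buffer once per step; merging the three loops into a single pass would be the next refinement if the code were a hot spot.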
July 06, 2007
Walter Bright wrote:
> Regan Heath wrote:
>> Aaargh!  You're confusing empty and non-existent (null) again!  <g>
> 
> In this case, no.

But a way of emptying something was asked for, and you showed a way to make it null, not empty -- can you explain your "In this case, no"?

>> In some cases there is an important difference between the two.
> 
> The only case is when you're extending into a preallocated buffer.

I've found many times when the difference between an empty string and no string was important; they generally have nothing to do with extending at all.  I'd be interested to know why you assert that no such cases exist.

-- James
July 06, 2007
James Dennett wrote:
> I've found many times when the difference between an empty
> string and no string was important; they generally have
> nothing to do with extending at all.  I'd be interested to
> know why you assert that no such cases exist.

I'd like to know of such cases.
July 06, 2007
On Thu, 05 Jul 2007 20:58:11 -0700, Walter Bright wrote:

> James Dennett wrote:
>> I've found many times when the difference between an empty string and no string was important; they generally have nothing to do with extending at all.  I'd be interested to know why you assert that no such cases exist.
> 
> I'd like to know of such cases.

  char[] Option;

  Option = getOptionFromUser();
  if (Option.ptr is null)
  {
   Option = DefaultOption;
  }

However, if the user sets the option to "" then that is what they want and not the default one.
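
A small sketch of that check, assuming present-day D. The helper name orDefault is hypothetical; the point is only that `is null` separates "nothing supplied" from "supplied, but empty", while `==` compares contents and so cannot.

```d
// Hypothetical helper: fall back to the default only when no option was supplied.
string orDefault(string option, string defaultOption)
{
    // `is null` is true only for the null array (null pointer, zero length);
    // an empty literal such as "" has a non-null pointer.
    return option is null ? defaultOption : option;
}

unittest
{
    assert(orDefault(null, "fallback") == "fallback"); // unset: use the default
    assert(orDefault("", "fallback").length == 0);     // explicitly empty: keep it
    assert(orDefault("", "fallback") !is null);

    string empty = "";
    assert(empty == null); // == compares contents, so it sees the two as equal
}
```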

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
July 06, 2007
On Fri, 6 Jul 2007 14:23:43 +1000, Derek Parnell wrote:

> On Thu, 05 Jul 2007 20:58:11 -0700, Walter Bright wrote:
> 
>> James Dennett wrote:
>>> I've found many times when the difference between an empty string and no string was important; they generally have nothing to do with extending at all.  I'd be interested to know why you assert that no such cases exist.
>> 
>> I'd like to know of such cases.
> 
>   char[] Option;
> 
>   Option = getOptionFromUser();
>   if (Option.ptr is null)
>   {
>    Option = DefaultOption;
>   }
> 
> However, if the user sets the option to "" then that is what they want and not the default one.

And if you must nitpick that one can code this a different way then here is another example.

Let's say that there is this library routine, which is closed source and I don't have access to its source, that accepts a string as its argument. Furthermore, if that passed string is null the routine uses a default value - whatever that is because I don't know it. Now in my code I call it with ...

   SomeFunc("");   // Use an empty string to do its magic
   SomeFunc(null); // But this time, use the default value

Remember, I have no control over the SomeFunc routine's implementation.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
6/07/2007 2:54:45 PM
July 06, 2007
Derek Parnell wrote:
> On Fri, 6 Jul 2007 14:23:43 +1000, Derek Parnell wrote:
> 
>> On Thu, 05 Jul 2007 20:58:11 -0700, Walter Bright wrote:
>>
>>> James Dennett wrote:
>>>> I've found many times when the difference between an empty
>>>> string and no string was important; they generally have
>>>> nothing to do with extending at all.  I'd be interested to
>>>> know why you assert that no such cases exist.
>>> I'd like to know of such cases.
>>   char[] Option;
>>
>>   Option = getOptionFromUser();
>>   if (Option.ptr is null)
>>   {
>>    Option = DefaultOption;
>>   }
>>
>> However, if the user sets the option to "" then that is what they want and
>> not the default one.
> 
> And if you must nitpick that one can code this a different way then here is
> another example.
> 
> Let's say that there is this library routine, which is closed source and I
> don't have access to its source, that accepts a string as its argument.
> Furthermore, if that passed string is null the routine uses a default
> value - whatever that is because I don't know it. Now in my code I call it
> with ...
> 
>    SomeFunc("");   // Use an empty string to do its magic
>    SomeFunc(null); // But this time, use the default value
> 
> Remember, I have no control over the SomeFunc routine's implementation.
> 

In databases NULL being different from empty seems to be a big deal too.

Anyway googling for "null versus empty" turns up a bevy of hits, so from that I think we can presume that the distinction is important to a non-empty subset of programmers.

--bb
July 06, 2007
Derek Parnell wrote:
> Let's say that there is this library routine, which is closed source and I
> don't have access to its source, that accepts a string as its argument.
> Furthermore, if that passed string is null the routine uses a default
> value - whatever that is because I don't know it. Now in my code I call it
> with ...
> 
>    SomeFunc("");   // Use an empty string to do its magic
>    SomeFunc(null); // But this time, use the default value
> 
> Remember, I have no control over the SomeFunc routine's implementation.

Of course, if a function is documented to behave that way, and you have no control over it, you must adhere to its documentation.

There are other ways to do default arguments. I suspect we could argue about it like we could argue about tab stops, and never reach any sort of resolution <g>.
July 06, 2007
Walter Bright wrote:
> Derek Parnell wrote:
>> Let's say that there is this library routine, which is closed source and I
>> don't have access to its source, that accepts a string as its argument.
>> Furthermore, if that passed string is null the routine uses a default
>> value - whatever that is because I don't know it. Now in my code I call it
>> with ...
>>
>>    SomeFunc("");   // Use an empty string to do its magic
>>    SomeFunc(null); // But this time, use the default value
>>
>> Remember, I have no control over the SomeFunc routine's implementation.
> 
> Of course, if a function is documented to behave that way, and you have no control over it, you must adhere to its documentation.
> 
> There are other ways to do default arguments. I suspect we could argue about it like we could argue about tab stops, and never reach any sort of resolution <g>.

The first argument which I think holds water is that it is trivial to represent empty and non-existent in C, e.g.

char *empty = "";
char *nonexistent = NULL;

The other argument is the one made earlier about databases.  In a database, empty and non-existent are important, distinct states a value can have.

Currently, D can model these but it worries me that you don't seem to think that it's important.  So, perhaps in future you might decide to get rid of this, or do so accidentally.
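
One way to model the database case without leaning on array nullness at all is sketched below. It assumes a present-day Phobos that provides std.typecons.Nullable, which is not something this thread discusses; Row, nickname and describe are hypothetical names used only for illustration.

```d
import std.typecons : Nullable;

// Hypothetical record with a column that may be absent, empty, or filled.
struct Row
{
    Nullable!string nickname;
}

string describe(Row r)
{
    if (r.nickname.isNull)
        return "no value";                 // the database NULL
    return r.nickname.get.length == 0 ? "empty" : "has text";
}

unittest
{
    Row unset;                             // default-initialised: the null state
    Row blank = Row(Nullable!string(""));  // present, but empty
    Row named = Row(Nullable!string("bob"));
    assert(describe(unset) == "no value");
    assert(describe(blank) == "empty");
    assert(describe(named) == "has text");
}
```

The three states stay distinct here regardless of how empty arrays themselves are represented.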

Regan
July 06, 2007
Bruno Medeiros wrote:
>>> The current signature:
>>>   const(char)[] tolower(const(char)[] str)
>>> is kinda incorrect, because it returns a const reference for an array that has no mutable references, and that is the same as an invariant reference, so tolower might as well return invariant(char)[].
>>
>> Again, that only holds if a copy was actually made at run time. If no copy was made the original input is returned, to which there may be mutable references.
> 
> You're right, if a copy is not made *every* time (which is the case
> after all), then the above doesn't hold.
> But then, what I think is happening is that Phobos's current tolower is
> suboptimal in terms of usefulness, because we don't know
> if a new copy is made or not. I'm wondering now what would be the more
> useful form, or forms, of tolower (and similar functions) to have.
> Now that I think of it again (admittedly I haven't got much experience with string manipulation in C++ or D, though), but perhaps the best form is an in-place mutable version:
>   char[] tolower(char[] str);
> And it's this one after all that is the most general form. If you want to call tolower on a const or invariant array you dup it yourself on the call:
>   char[] str = tolower("FOO".dup);

True.. but it's unfortunate that the most efficient case, where no duplication is needed, is no longer possible :(

If we template the function, e.g.

T tolower(T)(T input)
{
}

and we have some way to check whether the input is const or not (at runtime is(string) or something?) perhaps we can code the existing efficient solution (no dup of const data) as well as the general case where it mutates.  In the mutate case it can dup if the input is const and not dup if it isn't (adding an efficient solution which doesn't currently exist).

The only problem is that the case where you pass const data and it has to dup, you get back a const reference to a piece of data with no other owner (meaning it doesn't need to be const) which might cause another dup in your code at a later point.
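
A rough sketch of such a template, assuming present-day D, where the mutability check is done at compile time with `static if` rather than at run time. The name asciiToLower and the two helpers are hypothetical, and only ASCII is handled to keep it short; this is not how Phobos implements tolower.

```d
import std.exception : assumeUnique;

// Lower-cases ASCII letters in place.
void lowerInPlace(char[] s)
{
    foreach (ref c; s)
        if (c >= 'A' && c <= 'Z')
            c = cast(char)(c + 32);
}

// Reports whether any ASCII upper-case letter is present.
bool hasUpper(const(char)[] s)
{
    foreach (c; s)
        if (c >= 'A' && c <= 'Z')
            return true;
    return false;
}

T asciiToLower(T)(T input)
    if (is(T == string) || is(T == char[]))
{
    static if (is(T == char[]))
    {
        lowerInPlace(input);        // mutable input: edit in place, never dup
        return input;
    }
    else
    {
        if (!hasUpper(input))
            return input;           // nothing to change: same string, no dup
        char[] copy = input.dup;    // one dup, then edit the copy
        lowerInPlace(copy);
        return assumeUnique(copy);  // no other references to the copy exist
    }
}

unittest
{
    string s = "abc";
    assert(asciiToLower(s) is s);           // unchanged string comes back as-is
    assert(asciiToLower("ABC") == "abc");   // changed string is a fresh copy
    char[] m = "ABC".dup;
    assert(asciiToLower(m) is m);           // mutable array is edited in place
    assert(m == "abc");
}
```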

Regan
July 06, 2007
Bruno Medeiros wrote:
> It doesn't make sense to template it, because you'd still have two different function versions that would work differently. The one that receives a string does a dup; the one that receives a char[] does not. The return type of tolower(string str) might also be char[] and not string, if tolower(string str) always did a dup, even when no character modifications are necessary.

If the template is

T tolower(T)(T input) {}

then you have

string tolower(string input) {}
char[] tolower(char[] input) {}

and your cases are:

1. input string, output same string (no dup)
2. input string, output string (dup)
3. input char[], output same char[] (no dup)

Case #2 is admittedly not ideal because it may cause a later dup in your code.  But case #1 handles the efficient no modification case of string and case #3 handles both modification and non-modification without any call to dup.

I think the above is better than the current implementation as it avoids a dup in case #3.

Regan