June 30, 2004
---

s = p.getValue("foo");
if (s.length)

---

What's wrong with doing it this way?

Charlie

In article <opsadsu8f75a2sq9@digitalmars.com>, Regan Heath says...
>
>On Mon, 28 Jun 2004 22:54:23 -0700, Andy Friesen <andy@ikagames.com> wrote:
>
>> Regan Heath wrote:
>>> ... we need an array specialisation for strings, so I'll have to write my own. This defeats the purpose of char[] in the first place, which was to be a better, more consistent string handling method than is possible in C/C++.
>>
>> That would work, but it might be better to adjust your thinking to match the language instead of trying to shoehorn the way you're used to thinking onto an abstraction that clearly wasn't built for it.  Don't think in Java/C++/etc.  Think in D. :)
>
>You may be right, so in an effort to change my thinking, pls consider this...
>
>struct Item {
>	char[] label;
>	char[] value;
>}
>
>class Post {
>	Item[] items;
>
>	char[] getValue(char[] label)
>	{
>		foreach(Item item; items)
>		{
>			if (item.label == label)
>				return item.value;
>		}
>		//return null; not allowed
>		return "";
>	}
>}
>
>Web page...
>
><form post.. >
><input type="text" name="foo" value="">
><input type="text" name="bar" value="">
></form>
>
>Code to do something with the post.
>
>char[] s;
>Post p;
>
>s = p.getValue("foo");
>if (s) ..
>s = p.getValue("bar");
>if (s) ..
>
>Right...
>
>If I cannot return null, then (using the code above) I cannot tell the difference between foo or bar not being passed at all and being passed with an empty value.
>
>So I have to add a function, something like
>
>class Post {
>	bool isPresent(char[] label)
>	{
>		foreach(Item item; items)
>		{
>			if (item.label == label)
>				return true;
>		}
>		return false;
>	}
>}
>
>and in my code..
>
>if (p.isPresent("foo")) {
>	s = p.getValue("foo");
>	..
>}
>
>looks more complex. In addition I am searching for the label/value twice, doing twice the work.
>
>To avoid that I can add a parameter to the getValue function i.e.
>
>class Post {
>	char[] getValue(char[] label, out bool isNull)
>	{
>		foreach(Item item; items)
>		{
>			if (item.label == label)
>				return item.value;
>		}
>		//return null; not allowed
>		isNull = true;
>		return "";
>	}
>}
>
>then my code looks like...
>
>char[] s;
>bool isn;
>
>s = p.getValue("foo",isn);
>if (!isn) {
>}
>
>More complex code again, and less obvious. A third option springs to mind: instead of returning a char[] from getValue, I could return existence and fill a passed char[], i.e.
>
>class Post {
>	bool getValue(char[] label, out char[] value)
>	{
>		foreach(Item item; items)
>		{
>			if (item.label == label)
>			{
>				value = item.value;
>				return true;
>			}
>		}
>		return false;
>	}
>}
>
>so my code now looks like...
>
>char[] s;
>
>if (getValue("foo",s)) {
>}
>
>This is perhaps the best solution so far. But let's consider what happens if this were extended to get 2 or more char[] values (this is perfectly reasonable/likely: say they are loaded from a file; why process the file twice when you can do so once and get both values?).
>
>bool getValue(out char[] val1, out char[] val2)
>{
>}
>
>what do we return if val1 exists but val2 does not? a set of flags? yuck.
>
>It just seems to me, that all this is done to emulate a reference type.. so why not have a reference type?
>
>We already have one, all it would take to make it consistent is 2 minor changes.
>
>If you have a solution to the above that is both as simple, elegant and easy to code as being able to return null.. pls educate me.
>
>Regan.
>
>-- 
>Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/


June 30, 2004
Farmer wrote:

> Andy Friesen <andy@ikagames.com> wrote in
> news:cbpsi6$1u7d$1@digitaldaemon.com: 
> 
> [snip]
> 
>>C++ containers cannot represent null either.  D will (and does) get along just fine if its array type works the same way.
> 
> [snip]
> 
> And probably that is one reason why programmers don't use std::vector.

They don't?  Do you have a source to back that up?  As far as I've ever noticed, bigwig C++ people have always made it clear that std::vector is preferable over an array and that std::string is preferable to a char*.

The concern for distinguishing empty vs null has quite honestly never even occurred to me until it was mentioned here.  Think about expressing the distinction a different way and move on.

I do apologize if I sound naive, (I'll assume that comment was directed at me :) ) but I honestly can't comprehend a situation in which the distinction is going to have any measurable cost on clarity, let alone performance.

 -- andy
June 30, 2004
Sean Kelly wrote:
> In article <cbrhd9$1a0o$1@digitaldaemon.com>, Sam McCall says...
> 
>>>This might be very handy.  If so, I wouldn't mind seeing rbegin and rend
>>>parameters as well though.
>>
>>Huh? They're pointers... wouldn't rbegin == end and rend == begin?
>>I think I missed the point...
> 
> 
> Actually, rbegin == end-1 and rend == begin-1.
Oops. Yeah, this would be useful.

>>>Plus, it raises the question of what they return for
>>>associative arrays.
>>
>>The concept doesn't apply to associative arrays afaics, so they wouldn't exist.
> 
> It does apply to associative arrays IMO.  I iterate through the contents of such
> containers quite regularly in C++.  I've done something similar with an iterator
> wrapper for associative arrays in D, but it would be nice to have this built-in
> if we move towards the iterator methodology.
We're talking about pointers for low-level iteration; this doesn't apply to associative arrays, whose data structure is opaque. I don't think we're moving towards iterators, just talking about pointers. The fact that iterators pretend to be pointers in their syntax is neither here nor there ;)
If you really want "official" iterators, there's always (or will always be) the DTL...
Sam
June 30, 2004
On Wed, 30 Jun 2004 00:52:17 +0000 (UTC), Charlie <Charlie_member@pathlink.com> wrote:

> ---
>
> s = p.getValue("foo");
> if (s.length)
>
> ---
>
> Whats wrong with this way ?

An empty char[] has a length of 0, so the check above cannot tell a field that was passed in the form with an empty value from one that was not passed at all.
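
A minimal sketch of that point, written against present-day D (`string`, `!is`) rather than the compiler discussed in this thread, so treat the exact syntax as an assumption. It shows that length and equality cannot separate a null string from an empty one, while identity can.

```d
import std.stdio;

void main()
{
    string missing = null;   // no backing storage at all
    string empty = "";       // zero length, but a non-null pointer

    // Length cannot tell the two apart.
    assert(missing.length == 0 && empty.length == 0);

    // Equality compares contents, so the two compare equal.
    assert(missing == empty);

    // Identity compares pointer and length, so it can tell them apart.
    assert(missing is null);
    assert(empty !is null);

    writeln("null and empty differ by identity only");
}
```

This is why `if (s.length)` loses the "present but empty" case.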

Regan.

June 30, 2004
On Tue, 29 Jun 2004 18:16:25 -0700, Andy Friesen <andy@ikagames.com> wrote:
> Farmer wrote:
>
>> Andy Friesen <andy@ikagames.com> wrote in
>> news:cbpsi6$1u7d$1@digitaldaemon.com: [snip]
>>
>>> C++ containers cannot represent null either.  D will (and does) get along just fine if its array type works the same way.
>>
>> [snip]
>>
>> And probably that is one reason why programmers don't use std::vector.
>
> They don't?  Do you have a source to back that up?  As far as I've ever noticed, bigwig C++ people have always made it clear that std::vector is preferable over an array and that std::string is preferable to a char*.
>
> The concern for distinguishing empty vs null has quite honestly never even occurred to me until it was mentioned here.  Think about expressing the distinction a different way and move on.

Sure.. can you show me how. I am having trouble doing it, it must be my C fixated brain.
Pls use the example in the post I made to you earlier today..

> I do apologize if I sound naive, (I'll assume that comment was directed at me :) )

LOL.. I thought it was me..

> but I honestly can't comprehend a situation in which the distinction is going to have any measurable cost on clarity, let alone performance.

I think my example in my previous post does show a cost on either or both.
Basically I think a reference type allows me to *express* more than a value type does.

Regan.

June 30, 2004
> >In D, there is no such thing as a non-existent int; there is no such thing as a
> >non-existent struct; and there is no such thing as a non-existent string.
>
> Something like that would be cool, just like option in SML. I think I have to write something like this.


Perhaps,

class Option(VALUE)
{
    VALUE Item;
}

template SOME(VALUE)
{
    Option!(VALUE) SOME(VALUE x)
    {
        Option!(VALUE) e = new Option!(VALUE)();
        e.Item = x;
        return e;
    }
}

alias Option!(uint) INDEX;


class Array(VALUE)
{
    ...

    INDEX Index(VALUE x)
    {
        foreach (uint i, VALUE z; Items)
        {
            if (x == z)
            {
                return SOME!(uint)(i); // INDEX is Option!(uint), so wrap the index, not a VALUE
            }
        }
        return null;
    }
}

Somewhat non-ideal though.
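
For comparison, a sketch of the same "value or nothing" idea using `Nullable` from `std.typecons` in present-day D. The helper name `findIndex` and the `int` element type are assumptions made for illustration; they are not from the code above.

```d
import std.typecons : Nullable;

// Hypothetical helper: position of x in items, or an empty Nullable if absent.
Nullable!size_t findIndex(const int[] items, int x)
{
    foreach (i, z; items)
    {
        if (x == z)
            return Nullable!size_t(i);
    }
    return Nullable!size_t.init;   // the default state reports isNull == true
}

void main()
{
    auto data = [10, 20, 30];

    auto hit = findIndex(data, 20);
    assert(!hit.isNull && hit.get == 1);

    auto miss = findIndex(data, 99);
    assert(miss.isNull);
}
```

Unlike the class-based Option, this needs no heap allocation per result, at the cost of an explicit `isNull` check at the call site.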


June 30, 2004
Regan Heath wrote:

> ... I could return existence and
> fill a passed char[]...  so my code now looks like...
> 
> char[] s;
> if (getValue("foo",s))

I like this.  It's simple and obvious.

> if this were extended to get 2 or more char[] values...
> bool getValue(out char[] val1, out char[] val2) {}

In this case, I would say that the best thing to do on failure is to throw an exception.  Asking for a number of values all at once looks (to me, anyhow) to be implying that you expect them all to be present.  If you don't, you'll have to test them all individually at some point anyway, in which case the previous form allows you to test and retrieve in one step.

It may also be useful to return all the attributes as an associative array.  They're easy to mutate and iterate through.

> It just seems to me, that all this is done to emulate a reference type.. so why not have a reference type?

You got me there, but it seems to me that things could get very weird if you need to express a non-null array of 0 length.

> If you have a solution to the above that is both as simple, elegant and easy to code as being able to return null.. pls educate me.

Exposing POST data as an associative array seems like a win to me; it's faster and can be iterated over conveniently.  Also, as a language intrinsic, it's a bit more likely to plug into other APIs easily.
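
A rough sketch of how that could look in present-day D, where `key in aa` yields a pointer to the value or null. The `describe` helper and the sample field names are made up for illustration.

```d
// Hypothetical helper: classify a form field as absent, empty, or filled.
string describe(string[string] post, string name)
{
    if (auto p = name in post)      // p points at the stored value, or is null
        return (*p).length == 0 ? "empty" : "filled";
    return "absent";
}

void main()
{
    // Stand-in for parsed POST data.
    string[string] post = ["foo": "", "bar": "hello"];

    assert(describe(post, "foo") == "empty");
    assert(describe(post, "bar") == "filled");
    assert(describe(post, "baz") == "absent");
}
```

The lookup happens once, and all three states stay distinguishable without a null string.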

If you *really* need to, you could probably get away with doing something like:

    const char[] nadda = "nadda";
    if (!(s is nadda)) { ... }

 -- andy
June 30, 2004
On Tue, 29 Jun 2004 19:26:22 -0700, Andy Friesen <andy@ikagames.com> wrote:
> Regan Heath wrote:
>
>> ... I could return existence and
>> fill a passed char[]...  so my code now looks like...
>>
>> char[] s;
>> if (getValue("foo",s))
>
> I like this.  It's simple and obvious.

I agree.

>> if this were extended to get 2 or more char[] values...
>> bool getValue(out char[] val1, out char[] val2) {}
>
> In this case, I would say that the best thing to do on failure is to throw an exception. Asking for a number of values all at once looks (to me, anyhow) to be implying that you expect them all to be present.

Nope. This is taken from a real-life example: I have a config file with 10 different settings, all optional. I want 3 of them at this point in the code, so I process the file once and load the 3 settings, which may or may not be present and may or may not have zero-length values.

> If you don't, you'll have to test them all individually at some point anyway

Yes, at that point I need to be able to tell if the setting was present, present with zero length value, or not present at all.

> , in which case the previous form allows you to test and retrieve in one step.

Which previous form? Do you mean the one that takes only one parameter? If so, that would involve parsing the file 3 times, which is not acceptable.

> It may also be useful to return all the attributes as an associative array.  They're easy to mutate and iterate through.

It's the same problem all over again, say I have:

char[][char[]] list;
char[] s1,s2,s3;

fn(list);
s1 = list["setting1"];
s2 = list["setting2"];
s3 = list["setting3"];

The result needs to be null for s3 (setting3), empty for s2 (setting2) and "foobar" for s1 (setting1).

I believe this is currently the case, but, as Farmer has shown, if I then wrote

if (s2 == s3) //this would evaluate to true

and that's a problem.

>> It just seems to me, that all this is done to emulate a reference type.. so why not have a reference type?
>
> You got me there, but it seems to me that things could get very weird if you need to express a non-null array of 0 length.

char[] s = "";

s is a non-null array of 0 length.

>> If you have a solution to the above that is both as simple, elegant and easy to code as being able to return null.. pls educate me.
>
> Exposing POST data as an associative array seems like a win to me;

I agree, it's a more D thing to do also :)
I believe the same problem still applies (see above)

> it's faster and can be iterated over conveniently.  Also, as a language intrinsic, it's a bit more likely to plug into other APIs easily.
>
> If you *really* need to, you could probably get away with doing something like:
>
>      const char[] nadda = "nadda";
>      if (!(s is nadda)) { ... }

True, but this is yucky and what if a setting actually had a value of "nadda"?

Regan.

June 30, 2004
Regan Heath wrote:
> This is taken from a real-life example: I have a config file with 10 different settings, all optional. I want 3 of them at this point in the code, so I process the file once and load the 3 settings, which may or may not be present and may or may not have zero-length values.

I guess it's just a matter of preference.  I don't have a problem with something like this:

    char[][char[]] attribs = ...;

    if ("a" in attribs && "b" in attribs && "c" in attribs) {

If nonexistence is an alias for some default, fill the array before parsing the file.  Attributes that are present will override those which are not.

Python offers a get() method which takes two arguments: a key, and a default value which is returned should the key not exist.  I use this a lot.
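
Present-day D has a comparable `get` for associative arrays; the sketch below assumes that library function rather than anything available to the compiler in this thread, and the keys and the "default" value are invented.

```d
void main()
{
    // Stand-in for attributes parsed from a config file.
    string[string] attribs = ["a": "1", "b": ""];

    // Present with a value: the stored value wins.
    assert(attribs.get("a", "default") == "1");

    // Present but empty: the empty value is kept, not replaced.
    assert(attribs.get("b", "default") == "");

    // Absent: the fallback is returned and the array is left unchanged.
    assert(attribs.get("c", "default") == "default");
    assert(("c" in attribs) is null);
}
```

This only helps when nonexistence really is an alias for some default; it does not report which of the two cases occurred.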

>> things could get very weird if you need to express a non-null array of 0 length.
> 
> char[] s = ""
> 
> s is a non-null array of 0 length.

What about non-char types?

>> If you *really* need to, you could probably get away with doing something like:
>>
>>      const char[] nadda = "nadda";
>>      if (!(s is nadda)) { ... }
> 
> 
> True, but this is yucky and what if a setting actually had a value of "nadda"?

That's why you use 'is' and not ==.  'is' performs a pointer comparison.   The array has to point into that exact string literal for the comparison to be true.  The only catch is string pooling.  It'd be okay as long as the string literal "nadda" isn't declared anywhere in the source code.

Come to think of it, this is better:

   char[] nonString = new char[1]; // don't mutate me!  Just compare with 'is'!
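
A small sketch of why the sentinel has to be compared with 'is', using present-day D syntax (`!is`) as an assumption. The variable names other than nonString are invented.

```d
void main()
{
    char[] nonString = new char[1];   // the sentinel: never mutate, only compare with 'is'
    char[] lookalike = new char[1];   // same contents, different memory

    // == compares contents, so a real value could collide with the sentinel.
    assert(lookalike == nonString);

    // 'is' compares pointer and length, so only the sentinel itself matches.
    assert(lookalike !is nonString);

    char[] returned = nonString;      // what a lookup would hand back for "not found"
    assert(returned is nonString);
}
```

Because the sentinel comes from its own allocation, string pooling of literals cannot make another value identical to it.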

I'm officially out of ideas now.  heh.

 -- andy
June 30, 2004
In article <cbsufo$a8u$1@digitaldaemon.com>, Sam McCall says...

>Okay, suppose java had a 21- or 32-bit char type.

I'm led to believe there was a lot of debate about this. Some folk said that Java's char could NOT be anything other than 16 bits wide because it was defined that way and changing it would break things. Other folk looked under the hood of the JVM and decided that actually it probably wouldn't break anything after all. I don't know the ins and outs of it, but I gather the first lot won. The way it's going to go is UTF-16 support, with functions like isLetter() taking an int rather than a char.




>Glyphs aren't really a practical option as the logical element type of strings if they can't be easily represented as a fixed-width number, I'd imagine.

Well, they can, with a bit of sneaky manipulation. The trick is to map only those ones you actually USE to the unused codepoints between 0x110000 and 0xFFFFFFFF. So long as such a mapping stays within the application (like, don't try to export it), you can indeed have one dchar per glyph. But it would be a temporary one - not one you could write to a file, for example.

In general, you're right.



>But you can't do obvious "list-of-characters" things like index by character or even slice at any offset.

True.




>> It /is/ okay to use ASCII. All valid ASCII also happens to be valid UTF-8. UTF-8 was designed that way.
>So this means a char[] has two purposes depending on the app?

I'm not sure I follow that. If you say char[] a = "hello world"; then you will get a string containing eleven chars, and it will be both valid ASCII and valid UTF-8. It's not like you have to choose.
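
To make that concrete, a short sketch in present-day D, assuming `std.utf.count` for counting code points. The second string is an invented example of what changes once a non-ASCII character appears.

```d
import std.utf : count;

void main()
{
    // Pure ASCII: one char per character, valid as ASCII and as UTF-8.
    string ascii = "hello world";
    assert(ascii.length == 11);
    assert(count(ascii) == 11);

    // U+00E9 takes two chars in UTF-8, so length counts bytes, not characters.
    string accented = "h\u00E9llo";
    assert(accented.length == 6);
    assert(count(accented) == 5);
}
```

Indexing or slicing `accented` by char offset can therefore land in the middle of an encoded character, which is the "list-of-characters" limitation mentioned above.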


>On the one hand, ASCII/Unicode being a per-app decision is fair enough.

That isn't what I said. It's possible we may be misunderstanding each other somehow.



>Also, if people are going to use char[] as ASCII, they may write libraries that assume char[] is ASCII

Well, that would be a bug, of course. It's perfectly ok to choose only to store ASCII characters in chars, but NOT perfectly okay to assume that chars will only contain ASCII characters. Anyone writing a library containing such a bug should simply be press-ganged into fixing it.



>or worse, "a character in some unknown encoding".

Again, that would be a bug, and at odds with D's definition of what a char is.


>If it were documented as only working for ASCII, sure, otherwise you might assume it was a UTF-8 encoded character list. And I'm still not sure it'd be reasonable unless a wchar/dchar version was provided, how good is a language's unicode support if string manipulation functions only work on ascii?

I'm not completely clear what functions you're talking about, as I haven't read the source code for std.string. Am I correct in assuming that the quote below is an extract?



>Anyway:
>/************************************
>  * Construct translation table for translate().
>  */
>
>char[] maketrans(char[] from, char[] to)
>     in
>     {
>	assert(from.length == to.length);
>     }
>     body
>     {
>	char[] t = new char[256];
>	int i;
>
>	for (i = 0; i < 256; i++)
>	    t[i] = cast(char)i;
>
>	for (i = 0; i < from.length; i++)
>	    t[from[i]] = to[i];
>
>	return t;
>     }
>

This is a bug. ASCII stops at 0x7F. Characters above 0x7F are not ASCII. If this function is intended as an ASCII-only function then (a) it should be documented as such, and (b) it should leave all bytes >0x7F unmodified. Char values between 0x80 and 0xFF are reserved for the role they play in UTF-8. You CANNOT mess with them (unless you're a UTF-8 engine).

You're right. I'd prefer to see a dchar version of this routine. Of course, you wouldn't want a lookup table with 0x110000 entries in it, but an associative array should do the job.

Assuming this is from std.string, I guess one of us should report this as a bug.
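
A rough sketch of what such a dchar version might look like in present-day D. The names `makeTrans` and `translate`, and their signatures, are hypothetical and are not the std.string API.

```d
// Hypothetical replacement for the 256-entry table: maps code points, not bytes.
dchar[dchar] makeTrans(dstring from, dstring to)
{
    assert(from.length == to.length);
    dchar[dchar] table;
    foreach (i, c; from)
        table[c] = to[i];
    return table;
}

// Hypothetical translate: decodes UTF-8, maps each code point, re-encodes.
string translate(string s, dchar[dchar] table)
{
    string result;
    foreach (dchar c; s)            // foreach with dchar decodes one code point at a time
    {
        if (auto p = c in table)
            result ~= *p;           // appending a dchar re-encodes it as UTF-8
        else
            result ~= c;            // unmapped code points pass through unchanged
    }
    return result;
}

void main()
{
    auto table = makeTrans("\u00E9"d, "e"d);
    assert(translate("\u00E9t\u00E9", table) == "ete");
    assert(translate("plain", table) == "plain");
}
```

Only the code points actually mapped are stored, so the table stays small while bytes above 0x7F are never touched individually.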

Arcane Jill