June 29, 2004
Matthias Becker wrote:
> In C++ I never had the wish to pass a container/collection as a pointer. I
> always pass them as C++-reference. So I'm sure there always is a collection
> and I don't have to check for this.
> If there are no values to pass in, I just pass an empty collection.
> 
> 
> Could you please make some example where it makes sense not to pass a collection
> instead of passing an empty collection?
To request default behaviour a la optional arguments, without restrictions on the number or position of the arguments.
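
A minimal sketch of that idea, written in present-day D rather than the 2004 dialect. The function `joinWith` and its default separator are made up for illustration; the only language feature relied on is the identity test `is null`, which is true for a null array and false for the literal `""`.

```d
// Hypothetical helper: joins `parts` with `sep`.
//   sep is null -> the caller asked for the default separator ", "
//   sep is ""   -> the caller explicitly wants no separator at all
string joinWith(string[] parts, string sep)
{
    if (sep is null)
        sep = ", ";

    string result;
    foreach (i, part; parts)
    {
        if (i > 0)
            result ~= sep;   // separator goes between elements only
        result ~= part;
    }
    return result;
}
```

With this convention `joinWith(parts, null)` asks for the default, while `joinWith(parts, "")` asks for no separator. Any array parameter, in any position, could use the same trick, which is the point about optional arguments.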

Sam
June 29, 2004
"Arcane Jill" <Arcane_member@pathlink.com> wrote in message
news:cbre1b$15j0$1@digitaldaemon.com
|
| ...
|
| Yeah - but nobody has yet answered WHY? Why would ANYONE want to allow
| uninitialized array handles (as opposed to array content) to exist in D. It
| makes no sense.
|
| Please, can someone who is arguing in favor of allowing a distinction between
| initialized and uninitialized dynamic array handles, explain exactly why you want
| such a distinction to exist?
|
|
| Arcane Jill

Regan already said why:

"Regan Heath" <regan@netwin.co.nz> wrote in message
news:opr99w0st25a2sq9@digitalmars.com
|
| ...
|
| We *need* to have *both* null and empty arrays. The reason is pretty
| simple:
|    - null means does not exist
|    - empty means exists, but has no value (or empty value)
|
| This is important in situations like the original poster mentioned and in
| my experience for example... When reading POST input from a web page, you
| get a string like so:
|
|    Setting1=Regan+Heath&Setting2=&&
|
| when requesting items you might have a function like:
|
|    char[] getFormValue(char[] label);
|
| the code to get the values for the above form might go:
|
|    char[] s;
|
|    s = getFormValue("Setting1"); // s is "Regan Heath"
|    s = getFormValue("Setting2"); // s is ""
|    s = getFormValue("Setting3"); // s is null
|
| It is important the above code can tell that Setting3 was not passed in
| the form, so it can decide not to overwrite whatever current value that
| setting has, whereas it can tell Setting2 was passed and will overwrite
| the current value with a new blank one.
|
| ...
|

Personally, I would use an associative array to represent such a thing (instead of using a function), but that's an implementation difference, and the language should let Regan do it the way he wants.
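
Roughly what I mean, as a sketch in present-day D. The function `describe` and the form contents are hypothetical; the `in` operator on associative arrays is standard and yields a pointer to the stored value, or null when the key is missing.

```d
// Hypothetical helper: reports whether `key` is absent, empty, or set.
string describe(string[string] form, string key)
{
    // `key in form` is a pointer to the value, or null if there is no such key.
    if (auto p = key in form)
        return (*p).length == 0 ? "present but empty" : "present: " ~ *p;
    return "absent";
}
```

Here "absent" and "present but empty" are separated by the container, so the value type itself never has to carry a null state.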

I've run into such cases before ("" !== null), I know that. I just can't remember any of them right now :D

Two more things: I don't think this should only be for strings, but for any array. And I'm 100% sure this has been raised before.

-----------------------
Carlos Santander Bernal


June 29, 2004
Arcane Jill wrote:

> In article <cbs0bj$1vhf$1@digitaldaemon.com>, Sam McCall says...
> 
>>I'm not saying it shouldn't be well-defined, but Java doesn't require the user to understand the intricacies of unicode encodings to manipulate strings.
> 
> 
> Yes it does. Java chars operate in UTF-16. If you want to store the character
> U+012345 in a Java string, you need to worry about UTF-16.
Whoops. Having never had to deal with this case (and taken a series of CS courses where we've iterated over chars countless times and they never mentioned this once :-\) I hadn't thought about this.
Okay, suppose Java had a 21- or 32-bit char type.
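
To make the U+012345 case concrete, here is a small sketch in present-day D. The helper `toSurrogates` is hypothetical and assumes its argument lies above U+FFFF. The same code point takes a different number of code units in each encoding:

```d
// One code point outside the Basic Multilingual Plane, in three encodings.
immutable string  utf8  = "\U00012345";   // 4 code units (char)
immutable wstring utf16 = "\U00012345"w;  // 2 code units (wchar), a surrogate pair
immutable dstring utf32 = "\U00012345"d;  // 1 code unit  (dchar)

// Hypothetical helper: splits a code point above U+FFFF into its
// UTF-16 high and low surrogates.
wchar[2] toSurrogates(dchar c)
{
    uint v = c - 0x10000;                          // 20 significant bits remain
    wchar[2] pair;
    pair[0] = cast(wchar)(0xD800 + (v >> 10));     // top 10 bits
    pair[1] = cast(wchar)(0xDC00 + (v & 0x3FF));   // bottom 10 bits
    return pair;
}
```

Only the 32-bit form lets one array element stand for one character here, which is what a wider char type would buy.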

>>Probably not, although if reading an encoded string and then writing it again doesn't produce the same byte-output, I'm sure I could find a contrived example... copy-pasting text invalidating a digital signature?
> 
> That's what normalization is for. We'll have that soon in a forthcoming version
> of etc.unicode.
Of course... so no, the program shouldn't care, but...

>>Am I right in assuming a glyph can be fairly complicated?
> Very much so. Especially if you're a font designer, since Unicode allows you to
> munge any two glyphs together into a bigger glyph (a ligature). In practice,
> fonts only provide a small subset of all possible ligatures (as you can
> imagine!).
Glyphs aren't really a practical option as the logical element type of strings if they can't be easily represented as a fixed-width number, I'd imagine.

>>Yeah. It's just a bit disappointing after hearing "Strings are character arrays and everything about them makes sense" to realise that you either have to grok UTF-N or treat these "characters" as opaque... the advantages over a class are gone, and a class has reference semantics and member functions.
> 
> 
> Not really. So long as you remember that characters <= 0x7F are OK in a char,
> and that characters <= 0xFFFF are fine in a wchar, you're sorted.
But you can't do obvious "list-of-characters" things like index by character or even slice at any offset.
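
A sketch of the problem in present-day D, assuming the Phobos modules `std.range` and `std.utf` as they exist today. The helper names `codePoints` and `sliceIsValid` are made up for illustration:

```d
import std.range : walkLength;
import std.utf : validate, UTFException;

// Counts code points rather than code units; walking a string decodes it.
size_t codePoints(string s)
{
    return s.walkLength;
}

// True if s[from .. to] is still well-formed UTF-8 after slicing.
bool sliceIsValid(string s, size_t from, size_t to)
{
    try
    {
        validate(s[from .. to]);   // throws UTFException on malformed input
        return true;
    }
    catch (UTFException e)
    {
        return false;
    }
}
```

For `"h\u00E9llo"`, `.length` is 6 while `codePoints` gives 5, and the slice `[0 .. 2]` cuts the two-byte character in half.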

>>Yeah, it's the partly-there that's frustrating... my selfish side would be happy with just ASCII ;-). It just seems sometimes that if it's not easy and consistent to make things unicode-friendly, it won't happen. 
> 
> 
> Right, but it's a question of where that support comes from. To demand it all of
> the language itself is asking /a lot/ from poor old Walter. If we can add it,
> piece by piece, in libraries, I'd say we're not doing too badly.
A decent unicode string class could be almost entirely library based, and would only require a little magic language support (for string literals). I might have a play around with one, on the assumption that if people find it useful, the horribly inefficient/incorrect bits could be fixed by people who know what they're doing ;)

> It /is/ okay to use ASCII. All valid ASCII also happens to be valid UTF-8. UTF-8
> was designed that way.
So this means a char[] has two purposes depending on the app?
On the one hand, ASCII/Unicode being a per-app decision is fair enough.
On the other hand, that's not what it looked like to me in the docs, and I still think Unicode should be the "default".
Also, if people are going to use char[] as ASCII, they may write libraries that assume char[] is ASCII or worse, "a character in some unknown encoding".

>>and assume chars are characters if you want. The standard library even does this, in std.string no less.
> So long as they make no assumptions about characters > 0x7F, that's perfectly
> reasonable.
If it were documented as only working for ASCII, sure; otherwise you might assume it was a UTF-8 encoded character list. And I'm still not sure it'd be reasonable unless a wchar/dchar version was provided. How good is a language's Unicode support if string manipulation functions only work on ASCII?
Anyway:
/************************************
 * Construct translation table for translate().
 */

char[] maketrans(char[] from, char[] to)
    in
    {
	assert(from.length == to.length);
    }
    body
    {
	char[] t = new char[256];
	int i;

	for (i = 0; i < 256; i++)
	    t[i] = cast(char)i;

	for (i = 0; i < from.length; i++)
	    t[from[i]] = to[i];

	return t;
    }
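
For contrast, a sketch of a code-point based translation in present-day D. `translateChars` is a hypothetical helper, not the Phobos function; it swaps the 256-entry table indexed by `char` for an associative array keyed by `dchar`, so non-ASCII characters are handled as whole characters rather than as individual UTF-8 bytes.

```d
// Hypothetical helper: maps each code point of `s` through `table`,
// leaving code points without an entry unchanged.
string translateChars(string s, dchar[dchar] table)
{
    string result;
    foreach (dchar c; s)          // foreach with dchar decodes the UTF-8 input
    {
        if (auto p = c in table)
            result ~= *p;         // appending a dchar re-encodes it as UTF-8
        else
            result ~= c;
    }
    return result;
}
```

The trade-off is a hash lookup per character instead of an array index, which is presumably why the table version exists for the ASCII case.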

>>Yeah, fonts are a problem. My ideal world would have a (huge!) complete system default font (or one each for serif, sans, and mono) supplied with the OS, that would be the fallback for nonexistent characters.
> I absolutely agree. There are free fonts which do this, but they don't display
> well at small point-size because of something called "hinting", which apparently
> you can't do without paying someone royalties because of some stupid IP
> nonsense.
Ew, does that apply to creating fonts too? I thought most free fonts weren't manually hinted because it'd take forever, especially for unicode... I know freetype doesn't interpret hints by default, but there's a #define somewhere: "set this to 1 if you have permission from Apple Legal, or live somewhere sane". On my distro of choice, this was set by default :-D

>>Yes. What gets me is that in 5 years we'll (hopefully) be far enough down the unicode road that D's approach will seem backward, and I'll have to wait for someone to reinvent a similar language, with a more thorough unicode integration.
> Yup. That's the way it goes. So what else shall we imagine for D++?
Fix C's broken precedence rules?

Sam
June 29, 2004
Farmer wrote:

> Arcane Jill <Arcane_member@pathlink.com> wrote in
> news:cbr53s$op8$1@digitaldaemon.com: 
> 
>>Maybe the real solution would be to make it a compile error to assign an
>>array with null, or to compare it with null. This would then force
>>people to say what they mean, and all such problems would go away.
> 
> 
> I agree, that would help to avoid some confusion. Unfortunately, people would be forced to either say 'I mean empty' or to shut up completely and use something completely different.
We don't have array literals, so we can't do this:
foo( [] );
At the moment we can do this:
foo( null );
If we outlawed using nulls as arrays, we'd be left with
foo( new int[0] )
which is maybe a bit messy?
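
For what it's worth, later versions of D did gain array literals, so all three spellings compile in present-day D. A sketch, with the hypothetical `sum` standing in for `foo`:

```d
// Stand-in for foo: any function that only looks at the elements
// cannot tell the three "no values" spellings apart.
int sum(int[] values)
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}
```

`sum([])`, `sum(null)` and `sum(new int[0])` all see a length of zero, so the choice between them is purely about how the call site reads.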
Sam
June 30, 2004
On Mon, 28 Jun 2004 22:54:23 -0700, Andy Friesen <andy@ikagames.com> wrote:

> Regan Heath wrote:
>> ... we need an array specialisation for strings, so I'll have to write my own. This defeats the purpose of char[] in the first place, which was to be a better, more consistent string handling method than is possible in C/C++.
>
> That would work, but it might be better to adjust your thinking to match the language instead of trying to shoehorn the way you're used to thinking onto an abstraction that clearly wasn't built for it.  Don't think in Java/C++/etc.  Think in D. :)

You may be right, so in an effort to change my thinking, pls consider this...

struct Item {
	char[] label;
	char[] value;
}

class Post {
	Item[] items;

	char[] getValue(char[] label)
	{
		foreach(Item item; items)
		{
			if (item.label == label)
				return item.value;
		}
		//return null; not allowed
		return "";
	}
}

Web page...

<form post.. >
<input type="text" name="foo" value="">
<input type="text" name="bar" value="">
</form>

Code to do something with the post.

char[] s;
Post p;

s = p.getValue("foo");
if (s) ..
s = p.getValue("bar");
if (s) ..

Right...

If I cannot return null, then (using the code above) I cannot tell whether foo or bar was missing from the form or was passed with an empty value.

So I have to add a function, something like

class Post {
	bool isPresent(char[] label)
	{
		foreach(Item item; items)
		{
			if (item.label == label)
				return true;
		}
		return false;
	}
}

and in my code..

if (p.isPresent("foo")) {
	s = p.getValue("foo");
	..
}

This looks more complex. In addition, I am searching for the label/value twice, doing twice the work.

To avoid that I can add a parameter to the getValue function i.e.

class Post {
	char[] getValue(char[] label, out bool isNull)
	{
		foreach(Item item; items)
		{
			if (item.label == label)
				return item.value;
		}
		//return null; not allowed
		isNull = true;
		return "";
	}
}

then my code looks like...

char[] s;
bool isn;

s = p.getValue("foo",isn);
if (!isn) {
}

More complex code again, and less obvious. A third option springs to mind: instead of returning a char[] from getValue, I could return existence and fill a passed char[], i.e.

class Post {
	bool getValue(char[] label, out char[] value)
	{
		foreach(Item item; items)
		{
			if (item.label == label)
			{
				value = item.value;
				return true;
			}
		}
		return false;
	}
}

so my code now looks like...

char[] s;

if (p.getValue("foo",s)) {
}

This is perhaps the best solution so far. But let's consider what happens if this were extended to get 2 or more char[] values. (This is perfectly reasonable/likely: say they are loaded from a file; why process the file twice when you can do so once and get both values?)

bool getValue(out char[] val1, out char[] val2)
{
}

what do we return if val1 exists but val2 does not? a set of flags? yuck.

It just seems to me, that all this is done to emulate a reference type.. so why not have a reference type?

We already have one, all it would take to make it consistent is 2 minor changes.

If you have a solution to the above that is as simple, elegant and easy to code as being able to return null, pls educate me.
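
One library-level possibility, sketched in present-day D. It assumes `std.typecons.Nullable`, which as far as I know postdates this thread, so treat it as an illustration of the shape of an answer and not as something available in 2004. `Item` and `Post` mirror the structures above with `string` in place of `char[]`.

```d
import std.typecons : Nullable, nullable;

struct Item
{
    string label;
    string value;
}

class Post
{
    Item[] items;

    // Returns the value wrapped in a Nullable; the wrapper is "null"
    // when the label was not posted at all.
    Nullable!string getValue(string label)
    {
        foreach (item; items)
        {
            if (item.label == label)
                return nullable(item.value);   // present, possibly ""
        }
        return Nullable!string.init;            // absent
    }
}
```

The caller tests `isNull` and then calls `get`, so there is one search per label and no extra out parameter. Several values could each be returned as their own Nullable, which sidesteps the set-of-flags problem. It is still more ceremony than simply returning null.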

Regan.

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
June 30, 2004
On Tue, 29 Jun 2004 09:50:35 +0000 (UTC), Arcane Jill <Arcane_member@pathlink.com> wrote:

> In article <cbr9e5$vai$1@digitaldaemon.com>, Derek Parnell says...
>
>> Because that's not what is being meant. I'd like to differentiate between
>> INITIALIZED and UNINITIALIZED vectors.
>
> Why?
>
> D's dynamic arrays are the same thing as C++ std::vectors (as I'm sure you
> realize). In C++, there is no such thing as an uninitialized vector. Why on
> Earth would you want them in D?
>
>
>
>> This non-existent thing is a
>> red herring. 'empty' means initialized and length of zero. 'non-existent'
>> means not initialized yet.
>
> Yeah - but nobody has yet answered WHY? Why would ANYONE want to allow
> uninitialized array handles (as opposed to array content) to exist in D. It
> makes no sense.
>
> Please, can someone who is arguing in favor of allowing a distinction between
> initialized and uninitialized dynamic array handles, explain exactly why you want
> such a distinction to exist?

Pls read the reply I just made to Andy's post that started this branch in this thread i.e. just go up a little bit in a threaded reader, or look for the post I made just prior to this one if viewing flat and sorting by date.

Regan.

June 30, 2004
On Tue, 29 Jun 2004 15:58:29 +0000 (UTC), Matthias Becker <Matthias_member@pathlink.com> wrote:

>>> In C++, there is no such thing as an uninitialized vector. Why on
>>> Earth would you want them in D?
>> For the same reason you use null in other situations with reference
>> types. I want accessing an uninitialised member array to give an error.
>> I want to be able to use a null argument to a function to trigger
>> special or default behaviour (optional arguments in any position).
>
> Nope, wrong.
>
> If you use reference types that are allowed to be NULL (in C++, references
> aren't; Nice, for example, also has references that can't be null) you want to
> show that there possibly is no object. At least in languages that allow you to
> use other kinds of references (e.g. C++ or Nice, as mentioned above).
>
> In languages that don't have references that can't be null, you just can't
> express yourself in the code.
>
>
>
> In C++ I never had the wish to pass a container/collection as a pointer. I
> always pass them as C++-reference. So I'm sure there always is a collection
> and I don't have to check for this.
> If there are no values to pass in, I just pass an empty collection.
>
>
> Could you please make some example where it makes sense not to pass a collection
> instead of passing an empty collection?

Pls read my post (2 prior to this one when sorted flat and by date; it is a response to Andy's post). It contains an example. I would like some feedback on how to achieve what I want to do...

Regan.

> -- Matthias Becker
>
>



June 30, 2004
On Wed, 30 Jun 2004 03:20:54 +1200, Sam McCall <tunah.d@tunah.net> wrote:

> Bent Rasmussen wrote:
>
>>> Frankly, yes, I use -1 as a "magic value" all the time, and do all sorts
>>> of ugly things when negative numbers are perfectly valid. This is
>>
>>
>> That's true. In Standard ML you could do
>>
>> val index : 'a -> int option
>>
>> Then if 'a exists return SOME(x), if not, return NONE. If a function has
>> an option type as a domain it has to deal with both cases.
> McCall's Law the First:
> Every feature of a "traditional" language is a special case of a feature of every functional language.
> McCall's Law the Second:
> Every feature of every functional language is a special case of the only feature of Lisp.
>
>> In D, you'd either use a magic value like -1 or encapsulate values in a
>> class; then null is NONE and not null is SOME.
> But this isn't ML. I will get some weird looks, and nobody will touch my libraries ;-)
> Besides, that's exactly equivalent (AFAICS) to a reference type, assuming no pointer arithmetic and casting shenanigans. If this _is_ useful, is dereferencing one more pointer to access arrays really going to kill us? Or is there some case where the value-type-kinda nature of arrays is useful?

I think the current value-type-kinda nature of arrays is good, it just needs the 2 tweaks I mentioned to make it consistent.

>> But you can go ahead and create a class for lists, no problem at all.
>> Neither Phobos nor DTL has fully hatched yet, so we'll see what happens.
> I'm beginning to think this is the only answer. But lists are such a fundamental type, using a non-standard list type would be a pain. I can't see room for another list type, so I guess I'll end up using DTL's list everywhere, and hope everyone does the same. But it does seem a waste of such powerful arrays in the language.
>
> Sam



June 30, 2004
On Tue, 29 Jun 2004 15:39:15 +0000 (UTC), Matthias Becker <Matthias_member@pathlink.com> wrote:
>
>>>>> A 'null array' is a completely arbitrary concept that has been
>>>>> extrapolated from undefined behaviour. :)
>>>> It may be undefined, but I believe it is required.
>>>
>>> Why?  C++ gets along without them just fine, and every C derivative I know
>>> of gets along fine without allowing primitive type returns to signify
>>> nonexistence.
>>>
>>> Functions which returns structs cannot return null either.
>>
>> Thus why just about no-one ever does this (in C). They all return a
>> pointer to a struct.
>
> Because copying a struct costs much more than just copying a pointer to it. In
> C++ you have references for things like this, which can't be NULL.

Which is why I don't use references either when I need the ability to say it's NULL.

>>>> The soln IMO is either to make the current behaviour official and
>>>> consistent, or to change the behaviour, make that official and provide
>>>> another way to tell null apart from an empty string.
>>>
>>> Farmer's test reports pretty consistent results if you suppose that
>>> comparing arrays to null is ill-formed:
>>>
>>>      empty1.length == 0    is true
>>>      empty1 == ""          is true
>>>      empty2.length == 0    is true
>>>      empty2 == ""          is true
>>>      empty3.length == 0    is true
>>>      empty3 == ""          is true
>>>
>>> Don't compare arrays to null.  Don't try to differentiate between empty
>>> and nonexistent.
>>
>> Fine and dandy EXCEPT we *need* to differentiate between empty and
>> non-existent strings.
>>
>>> D arrays simply do not work that way.
>>
>> In that case we need an array specialisation for strings, so I'll have to
>> write my own. This defeats the purpose of char[] in the first place, which
>> was to be a better, more consistent string handling method than is
>> possible in C/C++.
>>
>
> Could you please make some real world examples, where you need empty strings and
> null-strings?

Sure thing, pls see my reply to Andy's post. There has to be an easy way to direct you to a post but I don't know how. I posted it 3 or 4 posts ago if you sort flat and by date.

Regan.

June 30, 2004
On Tue, 29 Jun 2004 16:15:58 +0000 (UTC), Sean Kelly <sean@f4.ca> wrote:

> In article <opsab6o5rl5a2sq9@digitalmars.com>, Regan Heath says...
>>
>> Fine and dandy EXCEPT we *need* to differentiate between empty and
>> non-existent strings.
>
> Why?  It seems to me that this behavior would also require arrays to be
> initialized with new rather than resizing from zero using the .length parameter.

Nope. It already works, except for 2 inconsistencies (see the original post)

> And this would result in a ton of extra coding--either in clauses that errored
> on null arrays or initialization code to handle both cases.  No thanks.

Not true. You can/could still simply check the length vs 0 if you want to treat null and empty the same.
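
A sketch of the two checks side by side, in present-day D. The helper names `isMissing` and `isBlank` are made up for illustration; the identity test `is null` and the `.length` property are the only language features used.

```d
// True only for a null array: "no value was supplied at all".
bool isMissing(const(char)[] s)
{
    return s is null;
}

// True for both null and empty arrays: "there is nothing to read".
bool isBlank(const(char)[] s)
{
    return s.length == 0;
}
```

Code that does not care about the distinction calls `isBlank` (or just tests `.length`) and never sees it; only code that wants the extra state has to ask for it.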

> If this
> happened I'd stil using built-in arrays and write a class for the purpose.

? 'stil' == 'stop' ?

Regan
