July 06, 2004
Arcane Jill wrote:

> In article <ccdd9k$1h4v$1@digitaldaemon.com>, Norbert Nemec says...
> 
>>My only concern about $ is that it really uses up one character that is so far unused and might find some much more important role sometime in the future. (Just like #, which we should really preserve until we find some worthy application for it.)
> 
> 
> Let's use the £ sign or the € sign then. (The dollar is not the only currency on this planet <g> ).

But it is the only currency sign that made it into the ASCII character set. I guess we just have to live with the fact that it is the "American Standard Code for Information Interchange" that we are using internationally now.

And going Unicode for the basic language features would probably not be a good idea. It is nice to allow it for string constants and comments, but the language itself should remain editable with any good old editor.

> PS. The EURO SIGN character is U+20AC (not U+0080 as Windows would have us
> believe).

Is it true that Windows is misbehaving like that? In ISO-8859-15, the EURO SIGN is at position 80, but the Unicode table is based on ISO-8859-1. Maybe, Windows is just mixing up the names of the codes?

July 06, 2004
In article <ccdt8a$2b4n$1@digitaldaemon.com>, Norbert Nemec says...

>Is it true that Windows is misbehaving like that? In ISO-8859-15, the EURO SIGN is at position 80, but the Unicode table is based on ISO-8859-1. Maybe, Windows is just mixing up the names of the codes?

Depends what you mean by "misbehaving". Windows - at least in most English-speaking countries - prefers an encoding called "Windows code page 1252" (aka WINDOWS-1252), an 8-bit encoding which coincides with ISO-8859-1 in the ranges 00-7F and A0-FF, and has some extra characters (which Microsoft imagines will be commonly used by English-speaking users) in the range 80-9F, in place of the C1 controls.

By itself, this is relatively harmless. Where it all goes wrong is that text files, Word documents, web pages, and so on end up getting written in this encoding, and Windows is very, very naughty in that it will often deliberately misrepresent the encoding (presumably in the belief that if it incorrectly declares the encoding to be ISO-8859-1 then other endpoints will get most of the characters right, but if it declares it as WINDOWS-1252 then other endpoints might reject the document because it has an unknown encoding).
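
For reference, the byte positions involved can be checked with a few lines of Python. Python is used here only as a convenient way to look at the code tables through its built-in codecs; nothing below is D or a Windows API, and the helper name euro_in_latin1 is made up for illustration. The euro sign sits at 0x80 in WINDOWS-1252 and at 0xA4 in ISO-8859-15, while ISO-8859-1 has no euro sign and maps byte 0x80 to a C1 control:

```python
# Inspect where the euro sign lives in three 8-bit encodings,
# using only Python's built-in codecs.
EURO = "\u20ac"  # EURO SIGN, U+20AC

cp1252_byte = EURO.encode("cp1252")      # WINDOWS-1252
latin9_byte = EURO.encode("iso8859_15")  # ISO-8859-15 (Latin-9)

print(cp1252_byte.hex())  # 80
print(latin9_byte.hex())  # a4

# A WINDOWS-1252 euro byte read as ISO-8859-1 becomes the C1 control U+0080.
mislabelled = b"\x80".decode("latin-1")
print(f"U+{ord(mislabelled):04X}")  # U+0080


def euro_in_latin1() -> bool:
    """Hypothetical helper: can ISO-8859-1 represent the euro sign at all?"""
    try:
        EURO.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(euro_in_latin1())  # False
```

So a WINDOWS-1252 document that is declared as ISO-8859-1 turns every euro sign into U+0080 on a receiver that takes the declaration literally, which is where the "U+0080" remark comes from.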

Arcane Jill


July 06, 2004
Norbert Nemec wrote:

> Ben Hinkle wrote:
> 
>> Sam McCall wrote:
>> 
>>> It's an interesting idea, but we'd lose the ability to catch some accidental out-of-bounds accesses, which is IMHO enough reason not to do it.
>>> 
>>> Norbert Nemec wrote:
>>>> Gowron wrote:
>>>> The improved idea that was discussed later on did appear in different
>>>> variations. The version I liked most, was to introduce the symbol $ as
>>>> a special symbol only within indexing expressions, meaning: the range
>>>> of this dimension.
>>>> 
>>>> This would allow you to write:
>>>>         a[1..$-1]
>>>> instead of your suggestion.
>>> Hmm, for normal arrays have array.length, for multidimensional arrays,
>>> we're going to need a nice way to get the nth dimension anyway.
>>> Sam
>> 
>> For multi-dim arrays .length could return the total number of elements and another property (for argument's sake call it "dims") could return a static array of the size of each dimension. Then length = dims[0] * dims[1] * ... * dims[N-1].
> 
> In my proposal, .length only exists for 1D-arrays. Here it is assignable with the magic of reallocating and copying the data if necessary.
> 
> What you call "dims" is called .range in my proposal. It has itself the semantics of a fixed array of length N. Assigning to it is possible, but unsafe and without any magic (it boils down to assigning to the raw entries in the array-reference structure).
> 
> The product of all ranges is implemented as a new, read-only property .volume.
> 
> The .size property equals .volume times the size of one entry.

Sounds OK to me. I hadn't checked again with your proposal before posting. MATLAB uses "size" for dims/range and "numel" for length/volume.
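
To make the relationship between the quoted properties concrete, here is a minimal sketch in Python (not D, and not anyone's actual implementation). The class name ArrayShape and the entry_size parameter are assumptions for illustration; only the names range, volume and size come from the proposal, and the assignable, reallocating .length of 1D arrays is deliberately left out:

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple


@dataclass(frozen=True)
class ArrayShape:
    """Hypothetical model of the proposed N-dimensional array properties."""

    range: Tuple[int, ...]  # extent of each dimension ("dims", MATLAB's size)
    entry_size: int         # bytes per element (assumed parameter)

    @property
    def volume(self) -> int:
        # Total number of elements: the product of all ranges (MATLAB's numel).
        return prod(self.range)

    @property
    def size(self) -> int:
        # Physical size in bytes: volume times the size of one entry.
        return self.volume * self.entry_size


shape = ArrayShape(range=(3, 4), entry_size=4)
print(shape.volume)  # 12
print(shape.size)    # 48
```

In this model volume and size are read-only and derived, which matches the "new, read-only property" wording above.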

>> Personally I'm not a huge fan of $ since the properties are more readable and general.
> 
> You are sure about "readable"?
> 
>         S1.A[2..S1.A.range[0]-1,2..S1.A.range[1]-1] =
>             S2.B[2..S2.B.range[0]-1,2..S2.B.range[1]-1];
> vs.
>         S1.A[2..$-1,2..$-1] = S2.B[2..$-1,2..$-1];

yup - more verbose but to me more readable if I don't already know what $
means in D. Readability can be improved in the more verbose one by doing
something like:
 int[2] end = S1.A.range;
 S1.A[2 .. end[0]-1, 2 .. end[1]-1] =
    S2.B[2 .. end[0]-1, 2 .. end[1]-1];

> And don't tell me this is constructed. I'm working with Matlab, where I have code like this all over the place.
> 
> Of course, the $ is not very general, but be honest: the range of an array is used so often in indexing it that it would really make sense to think about a reasonable shorthand.

It's true that MATLAB uses "end" - and even lets you override it in class definitions. But I honestly think $ won't be used as much in D as end is in MATLAB. A shorthand would be nice but it shouldn't stick out too much, I think. Using $ just looks like a random character thrown into the code.

> My only concern about $ is that it really uses up one character that is so far unused and might find some much more important role sometime in the future. (Just like #, which we should really preserve until we find some worthy application for it.)

July 06, 2004
Norbert Nemec wrote:

> Sam McCall wrote:
> 
> 
>>Hmm, you do make a good point.
>>Norbert Nemec wrote:
>>
>>>My only concern about $ is, that it really uses up one character that is
>>>so far unused and might find some much more important role sometime in
>>>the future. (Just like # which we should really preserve until we find
>>>some worthy application for it.)
>>
>>Exactly, we already have a "perfectly functional" Perl ;-)
>>A keyword sounds like the answer, three alternatives:
>>1) Make "range" a keyword, and something like foo[2,range-1] would know
>>that foo[2,foo.range[1]-1] was intended.
> 
> 
> That might be an idea. Of course, with "range" being handled specially here,
> this would really be a mess for the syntax, which Walter definitely will
> not like (and I don't, either)
> 
> What would "range" be?
> 
> * A keyword? That would make it impossible to use it as an identifier in any
> other context. Ugly!
Yes, I was thinking a keyword, simply because (assuming a name like "range" that is a valid identifier) if it wasn't a keyword, then there would be ambiguity with identifiers. For example: make it a function, and then people who do
int range = max-min;
a[foo..foo+range];
get confusing behaviour/confusing error messages.

> * An argument-less function that is predefined only in this context? Maybe.
Personally, I think that's far more ugly - a function that doesn't obey scoping rules?

> There is no problem. "range" would of course always refer to that dimension
> where it is used for indexing.
I agree with you, but the point is, _directly_ used for indexing.
array[range-foo];	// fine
array[foo(range)];	// fine
array[foo[range]];	// bad, should be array[foo[array.length]]

Sam

July 06, 2004
Ben Hinkle wrote:

> MATLAB uses "size" for dims/range and "numel" for length/volume

True, but size is already taken for the physical size and "numel" - well probably just a matter of taste and getting used to it... I just think "volume" is easier to recognize.

> Readability can be improved in the more verbose one by doing
> something like:
>  int[2] end = S1.A.range;
>  S1.A[2 .. end[0]-1, 2 .. end[1]-1] =
>     S2.B[2 .. end[0]-1, 2 .. end[1]-1];

> But I honestly think $ won't be used as much in D as end is in MATLAB.

That's nonsense. Of course - if you don't use arrays, you don't need to access their range. Proportionally, arrays in D will certainly never be as important as in Matlab, since the latter is specialized for them. But what you are effectively saying is similar to: "D is not specialized for array handling, so why should we worry about sophisticated syntax for arrays?"

Anybody doing numerics or string handling will use arrays very intensely. And in all array-handling code, the density of range-accesses in indexing expressions certainly is just as high as in Matlab code...

> A shorthand would be nice but it shouldn't stick out too much,
> I think. Using $ just looks like a random character thrown into the code.

OK, forget about the "$" - how about that alternative proposal: "range" as an argumentless, unqualified identifier defined only within indexing/slicing expressions. I don't know how it would fit into the parser to have this identifier stick out without being a keyword (maybe as an argumentless function?) but there might be some simple way.

July 06, 2004
Norbert Nemec wrote:

> Ben Hinkle wrote:
> 
>>MATLAB uses "size" for dims/range and "numel" for length/volume
> 
> True, but size is already taken for the physical size
It's deprecated (and gives a fatal error IIRC?)

> and "numel" - well
> probably just a matter of taste and getting used to it... I just think
> "volume" is easier to recognize.
Volume is nice, I like the metaphor.

> OK, forget about the "$" - how about that alternative proposal: "range" as
> an argumentless, unqualified identifier defined only within
> indexing/slicing expressions. I don't know how it would fit into the parser
> to have this identifier stick out without being a keyword (maybe as
> argumentless function?) but there might be some simple way.
If it is going to be used in this way (which I think is a good idea), I think that in practice that would mean making it a keyword, because as a de facto variable it could shadow or at least cause ambiguity with local variables (which we can't currently disambiguate) if they are allowed to have the same name.
I don't think making it a keyword is too harsh though, certainly no worse than in/out, and length/size/extent are all valid identifiers.

Sam

July 06, 2004
Sam McCall wrote:

> Norbert Nemec wrote:
> 
>> Ben Hinkle wrote:
>> 
>>>MATLAB uses "size" for dims/range and "numel" for length/volume
>> 
>> True, but size is already taken for the physical size
> It's deprecated (and gives a fatal error IIRC?)

In D? Why is that? The specs say nothing about that.


July 06, 2004
On Wed, 07 Jul 2004 02:03:59 +1200, Sam McCall <tunah.d@tunah.net> wrote:
> Norbert Nemec wrote:
>
>> Ben Hinkle wrote:
>>
>>> MATLAB uses "size" for dims/range and "numel" for length/volume
>>
>> True, but size is already taken for the physical size
> It's deprecated (and gives a fatal error IIRC?)
>
>> and "numel" - well
>> probably just a matter of taste and getting used to it... I just think
>> "volume" is easier to recognize.
> Volume is nice, I like the metaphor.
>
>> OK, forget about the "$" - how about that alternative proposal: "range" as
>> an argumentless, unqualified identifier defined only within
>> indexing/slicing expressions. I don't know how it would fit into the parser
>> to have this identifier stick out without being a keyword (maybe as
>> argumentless function?) but there might be some simple way.
> If it is going to be used in this way (which I think is a good idea), I think that in practice that would mean making it a keyword, because as a de facto variable it could shadow or at least cause ambiguity with local variables (which we can't currently disambiguate) if they are allowed to have the same name.
> I don't think making it a keyword is too harsh though, certainly no worse than in/out, and length/size/extent are all valid identifiers.

I was thinking, if the scope inside [] was the scope of the array, then you could reference its properties/methods without the name of the array, kinda like the 'with' block/statement does.

The same sort of rules that apply to 'with' would apply to this.

eg.

char[] p;
p[0..length];

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

July 07, 2004
"Regan Heath" <regan@netwin.co.nz> wrote in message news:opsaqiouic5a2sq9@digitalmars.com...
> On Wed, 07 Jul 2004 02:03:59 +1200, Sam McCall <tunah.d@tunah.net> wrote:
> > Norbert Nemec wrote:
> >
> >> Ben Hinkle wrote:
> >>
> >>> MATLAB uses "size" for dims/range and "numel" for length/volume
> >>
> >> True, but size is already taken for the physical size
> > It's deprecated (and gives a fatal error IIRC?)
> >
> >> and "numel" - well
> >> probably just a matter of taste and getting used to it... I just think
> >> "volume" is easier to recognize.
> > Volume is nice, I like the metaphor.
> >
> >> OK, forget about the "$" - how about that alternative proposal: "range"
> >> as
> > >> an argumentless, unqualified identifier defined only within
> >> indexing/slicing expressions. I don't know how it would fit into the
> >> parser
> >> to have this identifier stick out without being a keyword (maybe as
> >> argumentless function?) but there might be some simple way.
> > If it is going to be used in this way (which I think is a good idea), I
> > think that in practice that would mean making it a keyword, because as a
> > de facto variable it could shadow or at least cause ambiguity with local
> > variables (which we can't currently disambiguate) if they are allowed to
> > have the same name.
> > I don't think making it a keyword is too harsh though, certainly no
> > worse than in/out, and length/size/extent are all valid identifiers.
>
> I was thinking, if the scope inside [] was the scope of the array, then you could reference its properties/methods without the name of the array, kinda like the 'with' block/statement does.
>
> The same sort of rules that apply to 'with' would apply to this.
>
> eg.
>
> char[] p;
> p[0..length];

nifty. I like it. I'm also starting to like "length" in place of "range" to
be more like dynamic arrays. The length property for an n-dimensional
dynamic array is a static array of the lengths in each dimension. Continuing
your example:
 char[] p;
 p[0..length];
 char[[2]] q; // proposed syntax for N-D dynamic arrays
 q[0..length[0], 0..length[1]];

It looks very natural to me and nicely extends the existing dynamic array
concept.
The only technical challenge I can think of is that "with" blocks are
statements and indexing expressions are expressions so some funky bridging
might be needed inside the compiler.

>
> Regan
>
> -- 
> Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/


July 07, 2004
Ben Hinkle wrote:

>> I was thinking, if the scope inside [] was the scope of the array, then you could reference its properties/methods without the name of the array, kinda like the 'with' block/statement does.
>>
>> The same sort of rules that apply to 'with' would apply to this.
>>
>> eg.
>>
>> char[] p;
>> p[0..length];
> 
> nifty. I like it. I'm also starting to like "length" in place of "range" to be more like dynamic arrays.

One problem I see: for 1-dimensional arrays, range is of type int[1] while length is of type int - this was the original reason for introducing the new name "range".

Furthermore, assignments to "range" are raw assignments, while assignments to "length" contain the magic of reallocation.

Maybe "range" takes a bit to get used to, but mixing it with the existing "length" property would really be oversimplifying a bit.

Anyhow: for the problem at hand, "length" might actually do even better!

If we follow that implicit "with" idea, everything would be fine for 1-dim arrays. For N-dim arrays (with N!=1), the ".length" does not exist, so we could do a bit more magic and introduce it within indexing expressions with an automatic index. Ben Hinkle's example:

>  char[] p;
>  p[0..length];
>  char[[2]] q; // proposed syntax for N-D dynamic arrays
>  q[0..length[0], 0..length[1]]

would thereby be simplified to:

char[] p;
p[0..length];
char[[2]] q; // proposed syntax for N-D dynamic arrays
q[0..length, 0..length];

Effectively, length would have become a replacement for the hated $ in the original idea: intuitively understandable and with clean syntactic and semantic meaning (based on the implicit "with").
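
The "automatic index" part can be illustrated with a small Python model (again not D, and only a sketch of the idea, not of how a compiler would do it). A placeholder named length stands for "the extent of whichever dimension I am used in" and is resolved per slot when the index expression is evaluated. The names _Length, resolve_index and Grid are hypothetical, and only slices with explicit bounds are handled:

```python
class _Length:
    """Deferred value: the extent of the dimension currently being indexed."""

    def __init__(self, offset=0):
        self.offset = offset

    def __add__(self, n):
        return _Length(self.offset + n)

    def __sub__(self, n):
        return _Length(self.offset - n)

    def resolve(self, extent):
        return extent + self.offset


length = _Length()  # plays the role of "length" (or "$") inside [...]


def resolve_index(index, shape):
    """Replace each placeholder by the extent of the dimension it appears in."""
    if not isinstance(index, tuple):
        index = (index,)  # 1-dimensional case: a single slice
    if len(index) != len(shape):
        raise IndexError("expected one slice per dimension")
    resolved = []
    for sl, extent in zip(index, shape):
        lo = sl.start.resolve(extent) if isinstance(sl.start, _Length) else sl.start
        hi = sl.stop.resolve(extent) if isinstance(sl.stop, _Length) else sl.stop
        resolved.append((lo, hi))
    return resolved


class Grid:
    """Stand-in for an N-dimensional array; it only records its shape."""

    def __init__(self, shape):
        self.shape = shape

    def __getitem__(self, index):
        return resolve_index(index, self.shape)


q = Grid((5, 8))
print(q[0:length, 2:length - 1])  # [(0, 5), (2, 7)]
```

The same placeholder yields 5 in the first slot and 8 in the second, which is the behaviour asked for: no explicit dimension index is needed because the position inside the brackets supplies it.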