December 21, 2006
Re: DMD 0.177 release [Length in slice expressions]
Andrei Alexandrescu (See Website For Email) wrote:
> Another way out of it is to ban "length" but stick with "$". But "$" has 
> another bunch of problems. It's a special character used only once, and 
> only in a very particular situation. There is no general concept 
> standing behind its usage: it sticks out like a sore thumb. "$" isn't 
> the last index in an array. It's that only when used inside a slice, and 
> refers only to the innermost index of the array. Quite a waste of a 
> special character out there, and to little usefulness.
> 
> But if we made "$" into an operator identifying the last element of 
> _any_ array, which could refer to the last element of _the left-hand 
> side_ array if we so want, then all of a sudden it becomes useful in a 
> myriad of situations:

Provided that some such expansion path for "$" exists, it would seem to 
be adequate for D 1.0 to just remove "length". And this could be done by 
Jan 1.
December 21, 2006
Re: DMD 0.177 release [Length in slice expressions]
Andrei Alexandrescu (See Website For Email) wrote:

> But if we made "$" into an operator identifying the last element of 
> _any_ array, which could refer to the last element of _the left-hand 
> side_ array if we so want, then all of a sudden it becomes useful in a 
> myriad of situations:
> 
> int i = a[$ - 1]; // get last element
> int i = a[$b - 1]; // get a's element at position b.length - 1
> if (a[$ - 1] == x) { ... }
> if ($a > 0) { ... }
> if ($a == $b) { ... }
> swap(a[0], a[$ - 1]); // swap first and last element

Please give some thought to the case where a and b are of types not 
easily characterized by a single '.length'.  Matrix classes, or more 
generally multidimensional array classes being the canonical examples. 
 For those cases it is desirable to be able to have a '$' with a 
different meaning "per axis".

For those cases we could have a small extension to your proposal. Have 
$b translate to b.length, yes, but also have $[3]b and $(1)b translate 
to b.length[3] and b.length(1), respectively.  Seeing that, it makes 
me think perhaps $ would be better as a post-fix unary operator.  Then 
we'd have b$ --> b.length  and b$[3] --> b.length[3].

Then of course the next step is to have a parameter number automatically 
passed to the length method given an expression like a[$-1,$-1] so that
    a[$-1,$-1]
    ==>  a[$[0]-1,$[1]-1]
    ==>  a[a$[0]-1,a$[1]-1]
    ==>  a[a.length[0],a.length[1]]

The compiler can decide whether to do indexing or not based on whether 
.length results in an indexable value.

Finally, in general I think the choice of name 'length' is unfortunate 
because of its implication of linearity.  But it's not too late.  If $ 
becomes associated with .size rather than .length in user types then 
everything will be ok.  For built-in arrays .length can become a synonym 
for .size, just as it is with std::string in C++.  C++/STL got this one 
right.  For generic containers .size is a much better name.

--bb
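
For readers who want to see the "per axis" idea concretely: present-day D supports it through a templated opDollar, where the template argument is the index position in which '$' appears. The following is only a minimal sketch under that assumption; the Matrix type and its rows, cols and data members are hypothetical names chosen for illustration and are not part of the proposal above.

```d
// Hypothetical 2-D matrix type, used only to illustrate a per-axis '$'.
struct Matrix
{
    size_t rows, cols;
    double[] data;

    this(size_t rows, size_t cols)
    {
        this.rows = rows;
        this.cols = cols;
        data = new double[rows * cols];
        data[] = 0.0; // double.init is NaN, so zero the storage explicitly
    }

    // The compiler instantiates this with the position in which '$' appears:
    // dim == 0 for the first index, dim == 1 for the second.
    size_t opDollar(size_t dim)() const
    {
        static assert(dim < 2, "Matrix has only two axes");
        return dim == 0 ? rows : cols;
    }

    // Row-major element access; returning by ref allows assignment.
    ref double opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }
}

void main()
{
    auto m = Matrix(3, 4);
    // Rewritten to m[m.opDollar!0 - 1, m.opDollar!1 - 1], i.e. m[2, 3].
    m[$ - 1, $ - 1] = 42.0;
    assert(m[2, 3] == 42.0);
}
```

This corresponds to the corrected form of the expansion above, with the "- 1" kept on each axis.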
December 21, 2006
Re: DMD 0.177 release
Benji Smith wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Let me illustrate further why ident is important and what solution we 
>> should have for it. Consider C's response to ident:
>>
>> #define IDENT(e) (e)
>>
>  > ...
>  >
>> ...leading to the following implementation of ident:
>>
>> auto ident(auto x) {
>>   return x;
>> }
> 
> I don't get it.
> 
> Why is it necessary (or even desirable) for functions to return lvalues?
> 
> I can see how it'd be an interesting trick, and I can appreciate the 
> experimental curiosity about how the language (and the implementation) 
> should cope with the explicit handling of lvalues.
> 
> But I can't think of a real-world use case.
> 
> Are there languages where this is currently possible? How do they 
> implement it? And, much more importantly, what do people use it for?
> 
> --benji
FWIW,
In PL/1 the substring function could be used as an lvalue. 
(Analogously, in D one can use array slices as lvalues.)

I can't remember whether PL/1 allowed one to use this feature 
to delete characters, or only to alter them, but in Python one 
can use this feature to delete characters (or replace them 
with something that isn't a character?).
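
As a small illustration of both points (a function whose result is an lvalue, and an array slice used as an lvalue), here is a sketch in present-day D. It assumes the ref return and `return ref` parameter annotations of current compilers, which were not available in the DMD release under discussion; the name ident is borrowed from the quoted post.

```d
// Returns its argument by reference, so the call itself is an lvalue.
// `return ref` tells the compiler that the result may alias the parameter.
ref int ident(return ref int x)
{
    return x;
}

void main()
{
    int n = 1;
    ident(n) = 5;            // assigns through the returned reference
    assert(n == 5);

    int[] a = [1, 2, 3, 4, 5];
    a[1 .. 3] = 0;           // slice as an lvalue: sets a[1] and a[2]
    assert(a == [1, 0, 0, 4, 5]);
}
```

Note that slice assignment alters elements in place; it does not delete them or change the array's length.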
December 21, 2006
Re: DMD 0.177 release
Pragma wrote:
> Stewart Gordon wrote:
>> Pragma wrote:
>> <snip>
>>> But I'd like to echo the other comments in this thread regarding 
>>> structs.  IMO, we're not there yet.  I think folks are looking for a 
>>> solution that does this:
>>>
>>> - A ctor like syntax for creating a new struct
>>> - No more forced copy of the entire struct on creation
>>
>> What do you mean by this?
> 
> I'm glad you asked. :)
> 
> Static opCall() is not a ctor.  It never was.  People have been 
> clamoring to be able to use this() inside of a struct, much like they 
> can with classes and modules.  But the desire here goes beyond mere 
> symmetry between type definitions.
> 
> The forced copy issue is something that is an artifact of emulating a 
> constructor for a struct.  Take the standard approach for example:
> 
> struct Foo{
>   int a,b,c;
> }
> 
> Foo f = {a:1, b:2, c:3};
> Foo f = {1,2,3}; // more succinct version
> 
> So here we create a struct in place, and break encapsulation in the 
> process.  What we really want is an opaque type, that has a little more 
> smarts on creation.  Taking advantage of in/body/out would be nice too. 
>  No problem, we'll just use opCall():
> 
> struct Foo{
>   int a,b,c;
>   static Foo opCall(int a,int b,int c){
>     Foo _this;
>     _this.a = a;
>     _this.b = b;
>     _this.c = c;
>     return _this;
>   }
> }
> 
> Foo f = Foo(1,2,3);
> 
> That's better, but look at what's really happening here.  Inlining and 
> compiler optimization aside, the 'constructor' here creates a Foo on the 
> stack which is then returned and *copied* to the destination 'f'.

If that's not compiled into a direct write (even to the point of keeping 
the value virtual unless it's actually needed in contiguous memory) 
then there's something wrong with the optimiser.

> To most, that won't ever seem like a problem. But for folks who are 
> working with Vector types or Matrix implementations, that's something to 
> scream about.  In a nutshell, any struct wider than a register that is 
> populated in the 100's to 1000's is wasting cycles needlessly.

I've never liked doing that - if you're going to have very large vectors 
or matrices, it's usually better just to switch to a programmatic model 
(run-time sized) rather than keeping it parametric (templated). At some 
point you're spending more time compiling than you are executing.

This requires duplication of a lot of code, which is unacceptable. I've 
been thinking about this problem for a long time and I still have no 
solution to it (mixins are so very not the right way); perhaps we're 
misconsidering how parametric and programmatic types should interact.

Let's say that the values used in the constructor of a type are 
recorded. So we might have (excuse the language):

	#!/usr/bin/moki --version=1

	#using: #moki.(size, static array);

	Matrix := #class
	{
		data : static array of (type, rows, cols);

		// No content needed.
		#this (type, rows : size, cols : size);
	};

Now "rows" and "cols" are both automatically constant properties of any 
created Matrix. If our algorithm requires certain limitations on the 
matrix, we can "specialise" it:

	transpose (matrix : Matrix (type, rows, rows)) : Matrix (type, rows, rows)
	...

And the compiler can generally optimise the Matrix like it were 
parametric - it could even apply discretionary optimisation so that it 
doesn't waste compile time on what doesn't even matter - but you could 
still use it programmatically.

> So that brings us to something like this:
> 
> struct Foo{
>   int a,b,c;
>   this(int a,int b,int c){
>     this.a = a;
>     this.b = b;
>     this.c = c;
>   }
> }
> 
> Foo f = Foo(1,2,3);
> 
> Ambiguity aside, this fixes encapsulation, gives a familiar syntax, and 
> almost fixes the allocation issues.  (see below)

There's still an implied copy, but again, that shouldn't be relevant for 
any properly-working optimiser.

>>
>>> - Something that is disambiguated from static opCall
>> <snip>
>>
>> Do you mean that constructors for structs should have a notation
>> distinct from S(...)?
>>
> 
> Well, I think it's one of the reasons why we don't have ctors for 
> structs right now.  The preferred syntax for a "struct ctor" would 
> probably be this:
> 
> S foo = S(a,b,c);
> 
> Which is indistinct from "static opCall".  Throwing 'new' in there 
> wouldn't work either, since that would be a dynamic allocation:
> 
> S* foo = new S(a,b,c);

"new <struct>" didn't use to parse, and I argued then that it shouldn't 
because everywhere else the "new" operator exactly described the type it 
would create. If this special case were removed, it could work as a 
constructor call.

But really I think static opCall should just be killed off.

> So that leaves us with "something else" that provides both a way to 
> invoke a ctor, yet allocates the data on the stack and doesn't force you 
> to create an additional copy:
> 
> S foo(a,b,c);  // c++ style
> S foo = stackalloc S(a,b,c); // alloca() style (in place of new)
> S foo = new(stack) S(a,b,c); // another idea
>
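
For comparison, later versions of D did gain struct constructors written with this(...), invoked with the same S(a,b,c) syntax discussed above. The sketch below assumes a present-day compiler; the sum method is a hypothetical helper added only so the private fields can be checked. Whether the construction happens in place or through a temporary is left to the compiler and is not demonstrated here.

```d
struct Foo
{
    private int a, b, c;   // fields stay encapsulated

    // Struct constructor: same call syntax as the static opCall workaround.
    this(int a, int b, int c)
    {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    // Hypothetical helper, present only to observe the stored values.
    int sum() const { return a + b + c; }
}

void main()
{
    auto f = Foo(1, 2, 3);        // value on the stack
    Foo* p = new Foo(4, 5, 6);    // heap allocation through the same constructor
    assert(f.sum() == 6);
    assert(p.sum() == 15);
}
```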
December 21, 2006
Re: DMD 0.177 release [Length in slice expressions]
Pragma wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>>
>> A simpler grammar would have been to simply allow:
>>
>> UnaryExpression:
>>     PostfixExpression
>>     & UnaryExpression
>>     ... etc. etc. ...
>>     $ PostfixExpression
>>
>> But this would have been ambiguous. If the compiler sees "$-1", then 
>> the bad grammar says that's a unary use of $ because -1 is a 
>> PostfixExpression. But that's not what we wanted! We wanted $ to be 
>> nullary. That's why I needed to put all the cases in UnaryExpression.
>>
> 
> Nice post, and one heck of an argument!
> 
> FWIW, I advocated something similar during the last round of debates 
> before the '$' operator was introduced.  What I wanted to see was '$' to 
> become like 'this' within slice and array expressions, so that the 
> issues regarding 'length' could be resolved.  In essence one could 
> simply say '$.length' and mean 'the length of the current array':
> 
> b[0 .. $.length];
> a[0 .. $.getIndexOf(';')];
> 
> So in essence, every use of '$' would be a 'nullary' operator - an alias 
> if you will.
> 

I rather like this.  And I think I liked it then, too... if not, oh well.

> I'd imagine that extending things in this manner would simplify things 
> grammatically while allowing for a wider category of uses.  However, it 
> doesn't solve the issue that you brought up, and that I've quoted above.
> 
> c[$-1];
> 
> It looks like it should be an implicit cast of the '$' to a size_t 
> (length), via its use in an expression.  Any thoughts on this?

If $ is like a 'this', then it ought to behave semantically the same, so if $ is a 
class/struct with an opCast to size_t defined, the obvious happens.  If it's anything else, 
it ought to be a compile time error, perhaps suggesting you had meant '$.length' instead.

-- Chris Nicholson-Sauls
December 21, 2006
Re: DMD 0.177 release
Russ Lewis wrote:
> Walter Bright wrote:
> 
>> More ABI changes, and implicit [] => * no longer allowed.
>>
>> http://www.digitalmars.com/d/changelog.html
>>
>> http://ftp.digitalmars.com/dmd.175.zip
> 
> Looks like casts from void* to struct* are broken.
> 
> Russ

Not sure what I did wrong or what I'm doing right now, but they seem to 
be working.  Sorry for any confusion I caused.
December 22, 2006
Re: DMD 0.177 release [Length in slice expressions]
Bill Baxter wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
> 

> Then of course the next step is to have a parameter number automatically 
> passed to the length method given an expression like a[$-1,$-1] so that
>     a[$-1,$-1]
>     ==>  a[$[0]-1,$[1]-1]
>     ==>  a[a$[0]-1,a$[1]-1]
>     ==>  a[a.length[0],a.length[1]]

Slight typo there.  Last line should of course have been:

      ==>  a[a.length[0]-1,a.length[1]-1]

> The compiler can decide whether to do indexing or not based on whether 
> .length results in an indexable value.
> 
> Finally, in general I think the choice of name 'length' is unfortunate 
> because of its implication of linearity.  But it's not too late.  If $ 
> becomes associated with .size rather than .length in user types then 
> everything will be ok.  For built-in arrays .length can become a synonym 
> for .size, just as it is with std::string in C++.  C++/STL got this one 
> right.  For generic containers .size is a much better name.

Another thing which occurred to me is that if the meaning of $ becomes 
tied to "size" rather than "length", then you also have the 
mnemonic of $ looking like an 's' as in 'size'.

I also still think making it a postfix operator makes sense.

--bb
December 22, 2006
Re: DMD 0.177 release [Length in slice expressions]
Chris Nicholson-Sauls wrote:
> Pragma wrote:
>> Andrei Alexandrescu (See Website For Email) wrote:
>>>
>>> A simpler grammar would have been to simply allow:
>>>
>>> UnaryExpression:
>>>     PostfixExpression
>>>     & UnaryExpression
>>>     ... etc. etc. ...
>>>     $ PostfixExpression
>>>
>>> But this would have been ambiguous. If the compiler sees "$-1", then 
>>> the bad grammar says that's a unary use of $ because -1 is a 
>>> PostfixExpression. But that's not what we wanted! We wanted $ to be 
>>> nullary. That's why I needed to put all the cases in UnaryExpression.
>>>
>>
>> Nice post, and one heck of an argument!
>>
>> FWIW, I advocated something similar during the last round of debates 
>> before the '$' operator was introduced.  What I wanted to see was '$' 
>> to become like 'this' within slice and array expressions, so that the 
>> issues regarding 'length' could be resolved.  In essence one could 
>> simply say '$.length' and mean 'the length of the current array':
>>
>> b[0 .. $.length];
>> a[0 .. $.getIndexOf(';')];
>>
>> So in essence, every use of '$' would be a 'nullary' operator - an 
>> alias if you will.

In both of those cases the use seems rather silly to me because a and b 
are both single characters to begin with.  Might as well just type
  b[0 .. b.length];
  a[0 .. a.getIndexOf(';')];
instead.  But I get the point.  Sometimes you have
  g_openSocketHandles[0 .. g_openSocketHandles.getIndexOf()]

But maybe just allowing 'this' in the brackets is enough there, without 
going on and abbreviating it to $.  The $==.length proposal at least has 
the advantage of being backwards compatible.

>>
> 
> I rather like this.  And I think I liked it then, too... if not, oh well.
> 
>> I'd imagine that extending things in this manner would simplify things 
>> grammatically while allowing for a wider category of uses.  However, 
>> it doesn't solve the issue that you brought up, and that I've quoted 
>> above.
>>
>> c[$-1];
>>
>> It looks like it should be an implicit cast of the '$' to a size_t 
>> (length), via its use in an expression.  Any thoughts on this?
> 
> If $ is like a 'this', then it ought to behave semantically the same, 
> so if $ is a class/struct with an opCast to size_t defined, the obvious 
> happens.  If it's anything else, it ought to be a compile time error, 
> perhaps suggesting you had meant '$.length' instead.
> 
> -- Chris Nicholson-Sauls

Not sure I like $==this as much as $==.length.  I have a pressing need for 
a brief syntax for specifying the length, but no such need for a 
shorter form of 'this'.  But anyway, if you're going to allow '$' to 
mean 'this' inside brackets, you first need the language feature 
that allows 'this' to be used inside brackets in the first place.  And 
maybe if you have that you'll find it's sufficient.

Another thing is if you're going to allow 'this' in brackets, then you 
should take the idea to its logical conclusion and allow it in member 
function call parameter lists too.  That might be nice for things like 
enum parameters.

Of course if $ gets translated into a call to a method/property, you 
could have it your way if you prefer for your classes.  Just use
    opDollar() { return this; }
and voila! You can use your $.getIndexOf(';').

--bb
December 22, 2006
Re: DMD 0.177 release [Length in slice expressions]
Bill Baxter wrote:
> Chris Nicholson-Sauls wrote:
>> Pragma wrote:
>>> Andrei Alexandrescu (See Website For Email) wrote:
>>>>
>>>> A simpler grammar would have been to simply allow:
>>>>
>>>> UnaryExpression:
>>>>     PostfixExpression
>>>>     & UnaryExpression
>>>>     ... etc. etc. ...
>>>>     $ PostfixExpression
>>>>
>>>> But this would have been ambiguous. If the compiler sees "$-1", then 
>>>> the bad grammar says that's a unary use of $ because -1 is a 
>>>> PostfixExpression. But that's not what we wanted! We wanted $ to be 
>>>> nullary. That's why I needed to put all the cases in UnaryExpression.
>>>>
>>>
>>> Nice post, and one heck of an argument!
>>>
>>> FWIW, I advocated something similar during the last round of debates 
>>> before the '$' operator was introduced.  What I wanted to see was '$' 
>>> to become like 'this' within slice and array expressions, so that the 
>>> issues regarding 'length' could be resolved.  In essence one could 
>>> simply say '$.length' and mean 'the length of the current array':
>>>
>>> b[0 .. $.length];
>>> a[0 .. $.getIndexOf(';')];
>>>
>>> So in essence, every use of '$' would be a 'nullary' operator - an 
>>> alias if you will.
> 
> In both of those cases the use seems rather silly to me because a and b 
> are both single characters to begin with.  Might as well just type
>   b[0 .. b.length];
>   a[0 .. a.getIndexOf(';')];
> instead.  But I get the point.  Sometimes you have
>   g_openSocketHandles[0 .. g_openSocketHandles.getIndexOf()]
> 
> But maybe just allowing 'this' in the brackets is enough there, without 
> going on and abbreviating it to $.  The $==.length proposal at least has 
> the advantage of being backwards compatible.
> 
>>>
>>
>> I rather like this.  And I think I liked it then, too... if not, oh well.
>>
>>> I'd imagine that extending things in this manner would simplify 
>>> things grammatically while allowing for a wider category of uses.  
>>> However, it doesn't solve the issue that you brought up, and that 
>>> I've quoted above.
>>>
>>> c[$-1];
>>>
>>> It looks like it should be an implicit cast of the '$' to a size_t 
>>> (length), via its use in an expression.  Any thoughts on this?
>>
>> If $ is like a 'this', then it ought to behave semantically the same, 
>> so if $ is a class/struct with an opCast to size_t defined, the 
>> obvious happens.  If it's anything else, it ought to be a compile time 
>> error, perhaps suggesting you had meant '$.length' instead.
>>
>> -- Chris Nicholson-Sauls
> 
> Not sure I like $==this as much as $==.length.  I have a pressing need for 
> a brief syntax for specifying the length, but no such need for a 
> shorter form of 'this'.  But anyway, if you're going to allow '$' to 
> mean 'this' inside brackets, you first need the language feature 
> that allows 'this' to be used inside brackets in the first place.  And 
> maybe if you have that you'll find it's sufficient.

The problem with actually using the 'this' keyword in place of $ is one of ambiguity. 
Given a collection class 'Set' and some other class 'Foo', what to do if a 'this' is used 
within a slice of a 'Set' instance within a member of 'Foo'?  Does it evaluate to the Foo 
reference it would in all other cases?  Or to a Set reference?  And if the latter, how to 
get the Foo reference if that really is what I wanted?

The $ would have to be different from 'this' in the classes' sense.  Perhaps it would be 
better to call it a 'self' or even a 'with' than a 'this'.

-- Chris Nicholson-Sauls
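
To make the scoping question concrete, here is a minimal sketch of how '$' and 'this' coexist when '$' means the length of the array being indexed, as with D's built-in arrays. The class Foo and its members items and picks are hypothetical names used only for this example. Inside the brackets '$' binds to the innermost array being indexed, while 'this' keeps referring to the enclosing Foo.

```d
class Foo
{
    int[] items;
    size_t[] picks;

    this(int[] items, size_t[] picks)
    {
        this.items = items;
        this.picks = picks;
    }

    // '$' is items.length here; 'this' is still the Foo instance.
    int lastItem()
    {
        return this.items[$ - 1];
    }

    // Nested brackets: the inner '$' is picks.length, not items.length.
    int pickedByLast()
    {
        return items[picks[$ - 1]];
    }
}

void main()
{
    auto f = new Foo([10, 20, 30], [2, 0]);
    assert(f.lastItem() == 30);       // items[3 - 1]
    assert(f.pickedByLast() == 10);   // picks[2 - 1] == 0, so items[0]
}
```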
December 22, 2006
Re: DMD 0.177 release [Length in slice expressions]
Pragma wrote:
> b[0 .. $.length];
> a[0 .. $.getIndexOf(';')];
> 
> So in essence, every use of '$' would be a 'nullary' operator - an alias 
> if you will.

This isn't going to be agreeable to most since the purpose of $ in the 
first place was to save typing.

> I'd imagine that extending things in this manner would simplify things 
> grammatically while allowing for a wider category of uses.  However, it 
> doesn't solve the issue that you brought up, and that I've quoted above.
> 
> c[$-1];
> 
> It looks like it should be an implicit cast of the '$' to a size_t 
> (length), via its use in an expression.  Any thoughts on this?

I'd rather have $ defined everywhere to mean length, which is useful 
outside [] as well.

Andrei

P.S. Maybe there's a misunderstanding? The grammar I sent does not have 
a problem w.r.t. unary vs. nullary; it's just a tad more complicated to 
avoid ambiguity.