No household is perfect

December 03, 2013

Re: No household is perfect

Posted by H. S. Teoh
in reply to Brad Anderson

On Tue, Dec 03, 2013 at 09:19:34PM +0100, Brad Anderson wrote:
> On Tuesday, 3 December 2013 at 20:06:49 UTC, Walter Bright wrote:
> >On 12/3/2013 4:41 AM, Russel Winder wrote:
> >>Yes.
> >>
> >>	a + b
> >>
> >>could be set union, logic and, string concatenation. The + is just a message to the LHS object, it determines what to do. This is the whole basis for DSLs.

Ugh. Ugh, ugh, ugh. This harks back to that horrid decision in C++'s <iostream> of overloading << to mean "output" and >> to mean "input". The only redeeming quality about this is that << and >> are relatively rarely used in their original sense (bitwise shifts), so it doesn't cause as much cognitive dissonance as it otherwise might. But still. Ugh. There are just so many things wrong with this choice, not the least of which is the fact that the operator precedence of << and >> makes no sense when used as I/O operators -- because said operators were never intended to be I/O in the first place!! This leads to such fun as:

	int a, b;
	cout << a < b;	// what does this do?
			// (hint: it does NOT output the value of a < b)

Ugh!
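
To make the precedence point concrete, here is a minimal C++ sketch. The helper names are invented for illustration, and it writes to a std::ostringstream so that the result can be inspected as a string:

```cpp
#include <sstream>
#include <string>

// Parentheses force the bitwise shift to happen before the insertion.
std::string shift_then_print(int a, int b) {
    std::ostringstream os;
    os << (a << b);
    return os.str();
}

// Without parentheses this parses as (os << a) << b:
// two insertions and no shift at all.
std::string print_twice(int a, int b) {
    std::ostringstream os;
    os << a << b;
    return os.str();
}

// The comparison must be parenthesized to be printed as a value;
// the unparenthesized form parses as (os << a) < b.
std::string print_comparison(int a, int b) {
    std::ostringstream os;
    os << (a < b);
    return os.str();
}
```

Calling shift_then_print(1, 3) yields "8", print_twice(1, 3) yields "13", and print_comparison(1, 3) yields "1". As far as I know, the unparenthesized `os << a < b` is rejected outright by C++11 and later compilers, because the stream's conversion to bool is explicit and there is nothing left to compare with b.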

> >Using operator overloading to create a DSL is just wrong. Part of the design of operator overloading in D is to deliberately frustrate such attempts.

+1.

> >+ should mean addition, not union, concatenation, etc. Overloading is there to support addition on user defined types, not to invent new meanings for it.

There's a C++ library that overloads the *comma operator* (!!) to allow
you to do things like this:

	// Creates a 3x4 matrix (!)
	A = 1, 2,  3, 4,
	    5, 6,  7, 8,
	    9, 10, 11, 12;

Now, this particular example looks rather cute, but let's say we want to compute matrix elements as we construct it:

	// Creates a 3x4 matrix (what, really?!)
	A = x++, y++, z++, f(x+y),
	    y+2*x-z, 4*y, 5*(z-y*x),
	    f(x)-f(y), f(z), g(x), 0;

Seriously?? Anyone who understands what a comma operator is (which is itself already a Bad Idea) might imagine this is but a needlessly obscure way of setting A to the old value of x (assignment binds tighter than the comma) while performing a whole bunch of side-effects, in a way fitting for an IOCCC entry.

(And just in case you wonder: the dimensions of the matrix are determined beforehand. So technically, you *could* create a 3x4 matrix using this code:

	// Yes this is still a 3x4 matrix... and yes the first row
	// contains 1, 2, 3, 4, and the second row starts with 5.
	// Obvious, isn't it?
	A = 1,  2,  3,
	    4,  5,  6,
	    7,  8,  9,
	    10, 11, 12;

Or, indeed, this:

	// This is a 3x4 matrix too, even though it sure doesn't look
	// anything like it!!
	A = 1, 2, 3, 4,  5,  6,
	    7, 8, 9, 10, 11, 12;

Please, somebody tell me how this can even remotely be construed to be a good thing.)

Not to mention, the meaning of such code depends entirely upon the type of A. What if I have another custom type that also overloads the comma operator, in a slightly different way? Then the semantics of the above snippets would be *completely* different yet again.
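
For readers wondering how such a trick can work at all, here is a rough C++ sketch. Matrix3x4, Filler and make_example are hypothetical names made up for this illustration and are not the API of any real library; the assumption is simply that operator= returns a proxy object whose overloaded comma appends one element per call in row-major order:

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size matrix, for illustration only.
class Matrix3x4 {
public:
    static constexpr std::size_t rows = 3, cols = 4;

    // Proxy returned by operator=; each overloaded comma stores one value.
    class Filler {
    public:
        Filler(Matrix3x4& m, std::size_t next) : m_(m), next_(next) {}
        Filler operator,(double value) {
            if (next_ < rows * cols) m_.data_[next_] = value;  // extras are ignored
            return Filler(m_, next_ + 1);
        }
    private:
        Matrix3x4& m_;
        std::size_t next_;
    };

    // `A = 1` stores the first element and starts the comma chain.
    Filler operator=(double first) {
        data_[0] = first;
        return Filler(*this, 1);
    }

    double at(std::size_t r, std::size_t c) const { return data_[r * cols + c]; }

private:
    std::array<double, rows * cols> data_{};  // row-major storage, zero-filled
};

// The source layout suggests 4 rows of 3, but the type says 3 rows of 4.
Matrix3x4 make_example() {
    Matrix3x4 A;
    A = 1,  2,  3,
        4,  5,  6,
        7,  8,  9,
        10, 11, 12;
    return A;
}
```

With this sketch, make_example().at(0, 3) is 4 and make_example().at(1, 0) is 5: the line breaks in the source have no bearing on where the rows actually end, which is exactly the complaint above.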

Now tell me again, why is C++ code so hard to maintain? Hmmm...

> >Embedded DSLs should be visually distinct, and D provides the ability for that with string mixins and CTFE.

String mixins + CTFE = teh r0ckz when it comes to DSLs.

After having experienced C++ for a decade or two, I've come to decide that operator overloading is a Bad Idea(tm), except when it applies strictly to custom numerical types that are intended to behave like built-in numerical types. All other uses of operator overloading are, strictly speaking, abusive, and lead to unmaintainable code. Yes, it's cute and clever, and lets you write things not supported by the language "directly", but the next person to inherit your code will curse your name when they spend 5 hours trying to figure out exactly why x+y didn't do what they thought it did. And that's just with *one* library that overloads operators in an unusual way. Now add a second, third, fourth library, each of which overloads the operators in an unusual way, and you might as well be submitting your code as IOCCC entries (except that they don't take C++ entries).

OTOH, I completely understand the desire for infix notation for
operators on custom types. If you're writing a set library, it sucks to
have to write a.union(b.intersection(c)) when what you *really* want is
to write: a ∪ (b ∩ c). Here is where D does it right: use a compile-time
string argument to a CTFE function that transforms this string into
code. Then you can write:

	Set a, b, c;
	auto d = mixin(SetExpr!"a ∪ (b ∩ c)");
		// The above line gets turned into:
		// auto d = a.union(b.intersection(c));
		// at compile-time.

So you can write your set expressions the "natural" way, *and* a new reader of your code will know to look for SetExpr's documentation to understand what the string argument does (not to mention it being amply clear that a DSL is involved here, rather than code that looks like normal numerical expressions but actually does something else).

This has even more benefits than fixing C++'s wrong approach, though:

For one thing, overloaded operators can't easily generate optimal code, because they just get translated into nested function calls. In order to be able to optimize, say, a ∪ a ∪ a into just a, in C++'s approach you'd have to resort to arcane black magic like expression templates to coax the compiler to do what you want. In D, you are parsing the expression as a *string*, which means you get to define how the string is parsed, and how it is to be transformed into code, *directly*. You can run the expression tree through an expression simplifier algorithm, for example, factor common subexpressions, reduce it using known identities, etc. All of which, granted, can be done by expression templates, except with many times the pain, proneness to bugs, and unmaintainability.
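
The "nested function calls" point can be seen in a small C++ sketch. IntSet, its call counter and triple_union are assumptions invented for this illustration; the union is spelled `|` only to show what a naive overload does:

```cpp
#include <set>

// Hypothetical set wrapper, for illustration only.
struct IntSet {
    std::set<int> items;
    static int unions_performed;  // counts calls to operator|
};
int IntSet::unions_performed = 0;

// Set union as a plain overloaded operator: every use builds a new set.
IntSet operator|(const IntSet& lhs, const IntSet& rhs) {
    ++IntSet::unions_performed;
    IntSet result = lhs;
    result.items.insert(rhs.items.begin(), rhs.items.end());
    return result;
}

// a | a | a is just operator|(operator|(a, a), a): two full unions
// and two freshly built sets, although the answer is simply a.
IntSet triple_union(const IntSet& a) {
    return a | a | a;
}
```

After one call to triple_union the counter reads 2 and the result equals the input. Nothing in the overload can see the whole expression, so nothing can notice that the work was unnecessary; that whole-expression view is what expression templates, or a string DSL parsed at compile time, have to supply.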

These string DSLs also let you define your own operators (like I did above) without needing to abuse existing operators like + and *, define your own operator precedence rules, and define custom syntax without needing to twist and warp it to conform to host language syntax (like that C++ regex library, which honestly makes me cringe every time I look at its contorted syntax).

> >Part of my opinion for this comes from C++ regexes done using expression templates. It's cute and clever, but it's madness. For one, any sort of errors coming out of it if a mistake is made are awesomely incomprehensible. For another, there's no clue in the source code when one has slipped into DSL-land, and suddenly * doesn't mean pointer dereference, it means "0 or more".
> >
> >Utter madness.

Yeah, that library, while admittedly very clever, is total madness. It looks *nothing* like what regexen normally look like, does something completely unlike what its surface syntax might suggest, and is in pretty much every way very difficult to understand, and therefore hard to maintain and prone to bugs. In today's software development world, where there's too much code to comprehend and too little time to comprehend it, dissociating syntax from its usual meaning is just asking for maintenance nightmares.

> Indeed. I had a regex bottleneck in a C++ program so I figured I'd just convert it to Boost Xpressive as an easy solution. It took me half a day to convert the regular expression into the convoluted single line of code with dozens of operators it became. It did run faster (phew!) so it was worth it but the code is unrecognizable as a regular expression and I have to keep a comment with the original regular expression in the code because nobody (myself included) should have to spend an ungodly amount of time trying to decipher the cryptic source code it became.
> 
> If my program were written in D I would have just replaced "regex(" with "ctRegex!(" and moved on with my day.

Yeah!! Props to std.regex!

T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL

On 12/3/13 7:23 PM, monarch_dodra wrote:
> On Tuesday, 3 December 2013 at 20:09:52 UTC, Ary Borenszweig wrote:
>> On 12/3/13 4:53 PM, Andrei Alexandrescu wrote:
>>> On 12/3/13 4:41 AM, Russel Winder wrote:
>>>> On Tue, 2013-12-03 at 13:29 +0100, Tobias Pankrath wrote:
>>>> […]
>>>>> Does scala have arbitrary operators like Haskell? Looks useless
>>>>> in D. If you have an operator '+' that should not be pronounced
>>>>> 'plus' you are doing it wrong.
>>>>
>>>> Yes.
>>>>
>>>>    a + b
>>>>
>>>> could be set union, logic and, string concatenation. The + is just a
>>>> message to the LHS object
>>>
>>> or RHS :o).
>>
>> How come?
>
> "opBinaryRight":
> http://dlang.org/operatoroverloading.html
>
> It's a "neat" feature that allows operators being member functions, yet
> still resolve to the right hand side if needed. For example:
> auto result = 1 + complex(1, 1);
>
> Will compile, and be re-written as:
> auto result = complex(1, 1).opBinaryRight!"+"(1);
>
> In contrast, C++ has to resort to non-member friend operators to make
> this work.

That's nice.

Of course, it's not needed if you overload "+" for the int type to receive a complex.
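
For comparison, the C++ side of what monarch_dodra describes can be sketched as follows. Complex here is a hypothetical minimal type written for this illustration, not std::complex, and one_plus is an invented helper:

```cpp
// Hypothetical minimal complex type, for illustration only.
class Complex {
public:
    Complex(double re, double im) : re_(re), im_(im) {}
    double re() const { return re_; }
    double im() const { return im_; }

    // Member operator: covers complex + scalar, where the object is on the left.
    Complex operator+(double rhs) const { return Complex(re_ + rhs, im_); }

    // Non-member friend: covers scalar + complex, where the object is on the right.
    // A member function cannot do this, since the left operand is a built-in type.
    friend Complex operator+(double lhs, const Complex& rhs) {
        return Complex(lhs + rhs.re_, rhs.im_);
    }

private:
    double re_, im_;
};

// 1 + z resolves to the friend overload above.
Complex one_plus(const Complex& z) { return 1 + z; }
```

Here one_plus(Complex(1, 1)) gives a value with real part 2 and imaginary part 1. D's opBinaryRight lets the same right-hand-side dispatch be written as a member function instead.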
