April 30, 2012
On 30-04-2012 05:34, Jonathan M Davis wrote:
> On Monday, April 30, 2012 05:20:21 Alex Rønne Petersen wrote:
>> Except we can't do:
>>
>> @test
>> void myTest()
>> {
>>       // ...
>> }
>>
>> due to the current lack of annotations and discovery-based reflection. ;)
>>
>> *hint hint...*
>
> What would that buy you? That's what unittest blocks are for. Yes, getting
> custom annotations would be great, and then you could use @test for whatever
> you need it for, but unittest blocks are the test functions, so I'm not quite
> sure what @test would buy you. You _could_ just put a particular string in a
> name though (e.g. start all unit test functions with test) and use compile-
> time reflection to find them if you really want to though (much as that isn't as
> nice as a user-defined attribute would be).
>
> - Jonathan M Davis

See my other reply to H. S. Teoh for why I'm not a fan of unittest blocks.
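
For reference, the naming-convention workaround Jonathan mentions could look roughly like the sketch below. It is only an illustration: the `Tests` struct, the `runTests` helper and the `test` prefix are hypothetical names chosen for the example, and the functions are grouped in a struct here just to keep the example in one place (the same `__traits(allMembers, ...)` approach should apply to a module).

```d
import std.algorithm : startsWith;

// Hypothetical container for test functions; the "test" prefix is the convention.
struct Tests
{
    static void testAddition() { assert(1 + 1 == 2); }
    static void testConcat()   { assert("a" ~ "b" == "ab"); }
    static void helper()       { } // skipped: name does not start with "test"
}

// Calls every zero-argument member of Tests whose name starts with "test"
// and returns the names of the functions that were run.
string[] runTests()
{
    string[] ran;
    foreach (name; __traits(allMembers, Tests))
    {
        // name is known at compile time, so the filter costs nothing at run time.
        static if (name.startsWith("test") &&
                   is(typeof(__traits(getMember, Tests, name)())))
        {
            __traits(getMember, Tests, name)();
            ran ~= name;
        }
    }
    return ran;
}
```

This finds the functions by name only, which is exactly the part a user-defined `@test` attribute would make nicer.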

-- 
- Alex
April 30, 2012
On Sun, Apr 29, 2012 at 04:40:37PM +0200, Jacob Carlborg wrote: [...]
> * Do-while loops, how useful are those actually?

OK, that got me all riled up about looping constructs, and now I'm provoked to rant:

Since the advent of structured programming, looping constructs have always been somewhat b0rken. Yes, they work very well, they are clean and powerful, they get rid of almost all cases of needing gotos, etc. But the point of a loop is that *the entry point does not always correspond with the exit point (where the loop condition is tested)*. The problem with both do-loops and while-loops is that they are just straitjacketed versions of a more general looping construct.

At its most fundamental level, a loop consists of an entry point, a loop body, and some point within the loop body where the loop exits. Sometimes you need multiple exits, but most cases only need a single exit point. So a loop (unrolled) might look like this:

	codeBeforeLoop();
	// Loop body begins here
	doSomething();
	if (!loopCondition)
		goto exitLoop;
	doSomethingElse();
	doSomething();
	if (!loopCondition)
		goto exitLoop;
	doSomethingElse();
	...
	exitLoop:
	codeAfterLoop();

Note that I deliberately split the loop body into two parts, before the loop condition and after the loop condition. This is because you often have code like this:

	auto line = nextLine();
	while (!eof()) {
		processLine(line);
		line = nextLine();
	}

Notice the duplicated nextLine() call. That's stupid: the initial call to nextLine() is actually already the beginning of the loop. Why do you need to repeat it inside the loop? Because while-loops always test the loop condition at the beginning of the loop. OK, what about using a do-loop instead?

	do {
		auto line = nextLine();
		if (!eof())
			processLine(line);
	} while (!eof());

OK, we got rid of the duplicated nextLine() call, but now we introduced a duplicated eof() test. This is also stupid. The only reason it has to be written in this stupid way is because do-loops always test the loop condition at the end of the loop body, but you need to evaluate the condition BEFORE the second half of the loop body (processLine) is run. But once you've evaluated that condition, the test at the end of the loop body is redundant.

So you might say, OK, just write this then:

	for(;;) {
		auto line = nextLine();
		if (eof())
			break;
		processLine(line);
	}

Well, finally you have something sane. The loop condition now correctly appears in the middle of the loop body, which is where it should've been all along. Only problem is, writing for(;;) is misleading, because you're not looping indefinitely, there's precisely one exit condition. Conveying intent is very important in writing good code, and this code breaks that principle.

So really, what is needed is a sane looping construct that unifies while loops, do-loops, and exit-in-the-middle loops. Something like this:

	loop {
		// first part of loop body
	} exitWhen(!loopCondition) {
		// second part of loop body
	}

It doesn't have to be this exact syntax, but the point is that you need to be able to express loop exits from the middle of a loop body without needing to resort to for(;;) and break, when the loop is a good ole simple loop with a single entry point and a single exit point. (More complex loops that need continue's and break's are, well, more complex, so it's OK to sprinkle if conditions in them like we do now.)
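
As a rough illustration of the idea, here is a sketch that emulates the construct in today's D with a helper function. The names `loopExitWhen` and `processAll` are made up for this example, and the delegate-based form is just one assumed way to approximate the proposed syntax, not a suggestion for how the language should do it.

```d
// Hypothetical helper emulating `loop { ... } exitWhen (cond) { ... }`.
// firstHalf runs before the exit test, secondHalf after it.
void loopExitWhen(scope void delegate() firstHalf,
                  scope bool delegate() exitCondition,
                  scope void delegate() secondHalf)
{
    for (;;)
    {
        firstHalf();
        if (exitCondition())
            break;          // the single exit point, in the middle of the body
        secondHalf();
    }
}

// Reads "lines" from an array and collects the processed ones.
string[] processAll(string[] input)
{
    string[] processed;
    size_t pos = 0;
    string line;
    bool atEof = false;

    loopExitWhen(
        { if (pos < input.length) line = input[pos++]; else atEof = true; },
        () => atEof,
        { processed ~= line; }
    );
    return processed;
}
```

Neither the fetch nor the end-of-input test is duplicated, and the call site states that there is exactly one exit condition.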

</rant>


[...]
> * Infix notation for calling any method taking one argument
> * Basically any symbol is allowed in method names
> 
> That is:
> 
> 1 + 2
> foo bar foo_bar
> 
> Would be translated to:
> 
> 1.+(2)
> foo.bar(foo_bar)
> 
> That is a very general way to handle operators and let the user create new operators, not just overloading existing ones.
[...]

While personally, I like the idea of being able to create new infix operators, it will cause too big a change to D syntax, and probably cause lots of breakages with existing code, as well as make the lexer/parser much harder to implement (given the existing D features).

You also then have to deal with operator precedence between arbitrary user-defined operators, which is non-trivial in general (though it's workable if you impose some constraints -- but it's probably way beyond the scope of D2 or even D3).


T

-- 
Trying to define yourself is like trying to bite your own teeth. -- Alan Watts
April 30, 2012
On Mon, Apr 30, 2012 at 01:57:58AM +0200, Alex Rønne Petersen wrote:
> On 30-04-2012 01:54, Era Scarecrow wrote:
> >On Sunday, 29 April 2012 at 15:07:26 UTC, David Nadlinger wrote:
> >>On Sunday, 29 April 2012 at 14:40:38 UTC, Jacob Carlborg wrote:
[...]
> >>We'd still need a solution for continue and break, though.
> >
> >A thought coming to mind regarding this is a special exception. If the compiler recognizes it's part of a foreach (or other loop), then continue gets converted to return, and break throws an exception (caught by the foreach, of course).
> >
> >But that involves more compiler magic.
> >
> 
> And forced EH which is unacceptable in e.g. a kernel.
[...]

Here's a wild idea: introduce the concept of the multi-return function (or, in this case, delegate). Unlike normal functions, which return to a single point when they end, usually by pushing the return address onto the runtime stack, a multi-return function is called by pushing _multiple_ return addresses onto the runtime stack. The function body decides which return address to use (it can only use one of them).

Implementing break/continue then can be done like this: the loop body delegate will be a multi-return delegate, i.e., a delegate whose caller will provide multiple return addresses: one for each possible break/continue point.  For example:

	outerLoop: foreach (a; containerA) {
		innerLoop: foreach (b; containerB) {
			if (condition1())
				continue outerLoop;

			if (condition2())
				break /* innerLoop */;

			if (condition3())
				break outerLoop;
		}
		if (condition4())
			break /* outerLoop */;
	}

When containerA.opApply is called, it calls the outer loop body with two return addresses: the first is the usual return address from a function call, the second is the cleanup code at the end of opApply that performs any necessary cleanups and then returns. I.e., it simulates break. So when condition4 triggers, the outerLoop delegate returns to the second address.

Now the outerLoop delegate itself calls containerB.opApply with a list of outer loop return addresses, i.e., (1) return to containerA.opApply's caller, that is, break outerLoop, (2) return to outerLoop delegate's cleanup code, i.e. continue outerLoop. Then containerB.opApply prepends two more return addresses to this list, the usual return address to containerB.opApply, and its corresponding break return (prepend because the last return address pushed must correspond with the immediate containing scope of the called delegate, but if the delegate knows about outer scopes then it can return to those).

Now the innerLoop delegate has four return addresses: return to
containerB.opApply normally (continue innerLoop), return to
containerB.opApply's cleanup code (break innerLoop), return to
containerA.opApply normally (continue outerLoop), and return to
containerA.opApply's cleanup code (break outerLoop). So break/continue
works for all cases.

(Of course, containerB.opApply doesn't actually just prepend return addresses to what the outerLoop delegate passes in, since when the innerLoop delegate wants to break outerLoop, it needs to clean up the stack frames of the intervening call to containerB.opApply too. So containerB.opApply needs to insert its own cleanup code into the return address chain.)

This is, at least, the *conceptual* aspect of things. In the actual implementation, the compiler may use the current scheme (return an int to indicate which return is desired) instead of doing what amounts to reinventing stack unwinding, but the important point is, this should be transparent to user code. As far as the user is concerned, they just call the delegate normally, no need to check return codes, etc., with the understanding that the said delegates have multiple return addresses. The compiler inserts the necessary scaffolding, checking for int returns, etc., to actually implement the concept.
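
For comparison, the current scheme looks roughly like the sketch below. `Countdown` and `collectUntil` are hypothetical names for the example; the point is only that the opApply author has to forward the delegate's int result by hand so that break (and labeled jumps) in the foreach body work.

```d
// Hypothetical container iterating start, start-1, ..., 1.
struct Countdown
{
    int start;

    int opApply(scope int delegate(ref int) dg)
    {
        for (int i = start; i > 0; --i)
        {
            // A non-zero result means the loop body wants to leave the loop
            // (break, labeled break/continue, return, goto).
            int result = dg(i);
            if (result != 0)
                return result;   // cleanup would go here, then propagate
        }
        return 0;                // loop ran to completion
    }
}

// Collects values until stopAt is seen, using an ordinary break.
int[] collectUntil(int start, int stopAt)
{
    int[] seen;
    foreach (x; Countdown(start))
    {
        if (x == stopAt)
            break;               // compiler turns this into a non-zero return from dg
        seen ~= x;
    }
    return seen;
}
```

With the multi-return idea, the `if (result != 0) return result;` boilerplate is the part that would become the compiler's job instead of the user's.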


T

-- 
WINDOWS = Will Install Needless Data On Whole System -- CompuMan
April 30, 2012
On Sunday, 29 April 2012 at 20:16:16 UTC, Andrej Mitrovic wrote:
> On 4/29/12, Alex Rønne Petersen <xtzgzorex@gmail.com> wrote:
>> Next up is the issue of op-assign operations. In D, you can't do:
>>
>> obj.foo += 1;
>> obj.foo++;
>>
>> while in C#, you can (it results in a get -> add 1 -> set and get -> inc
>> -> set, etc).
>
> It's great to see another (successful) language implemented this. Do
> we have a proposal open for this somewhere?

I believe DIP 4 is the proposal:

http://www.prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs

But don't forget to check out DIP 5 too.
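
To make the get -> add 1 -> set lowering from the quoted post concrete, here is a small sketch of what has to be written by hand today. `Counter` and `bump` are made-up names for the example, and the commented-out line reflects my understanding that the compiler rejects op-assign on a property whose getter returns by value.

```d
// Hypothetical type with a getter/setter pair for foo.
struct Counter
{
    private int _foo;

    @property int foo() const { return _foo; }      // getter returns an rvalue
    @property void foo(int value) { _foo = value; } // setter
}

int bump()
{
    Counter obj;
    // obj.foo += 1;        // what one would like to write
    obj.foo = obj.foo + 1;  // manual lowering: get -> add 1 -> set
    return obj.foo;
}
```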
April 30, 2012
On Sun, Apr 29, 2012 at 10:26:40PM +0200, Alex Rønne Petersen wrote:
> On 28-04-2012 22:43, H. S. Teoh wrote:
[...]
> >with statements. They make code hard to read, and besides you can (or should be able to) alias long expressions into a short identifier for this purpose anyway.
> 
> I don't think I agree entirely here. If you have very long sequences of statements operating on the same object, with can be very useful. That said, I recognize that the current implementation of with needs some work.
[...]

I think the correct solution here is to use alias. (If that doesn't work, then it should be made to work. It's a lot cleaner and doesn't introduce potentially nasty ambiguities into code, as well as make code more readable without needing to implement nested symbol tables in your brain.)


T

-- 
Help a man when he is in trouble and he will remember you when he is in trouble again.
April 30, 2012
On Mon, Apr 30, 2012 at 05:16:55AM +0200, Alex Rønne Petersen wrote:
> On 30-04-2012 05:03, H. S. Teoh wrote:
[...]
> >Also, unittest is just that: for _unit_ tests. If you start needing an entire framework for them, then you're no longer talking about _unit_ tests, you're talking about module- or package-level testing frameworks, and you should be using something more suitable for that, not unittest.
[...]
> 
> The problem with D's unit test support is that it doesn't integrate well into real world build processes. I usually have debug and release configurations, and that's it. No test configuration; not only does that over-complicate things, but it also isn't really useful. I want my unit testing to be exhaustive; i.e. I want to test my code in debug and release builds, because those are the builds people are going to be using. Not a test build.

I see. Personally, I just use a more flexible build system where I can specify an argument to indicate whether or not to compile with -unittest. But you may have other reasons why this is not a good thing.


> So, this means that writing unit tests inline is a no-go because that
> would require either always building with unit tests in all
> configurations (madness) or having a test configuration (see above).

I wonder if dmd (or rdmd) should have a mode where it *only* compiles
unittest code (i.e., no main() -- the resulting exe just runs unittests
and nothing else).

But then again, it sounds like you have an extensive testing framework in place, and if you have that already, then might as well use it.
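
To illustrate what I mean by a unittest-only build, here is a minimal sketch. The function `add` is just a placeholder; the unittest block is compiled only when -unittest is passed, and as far as I know current toolchains can supply an empty main() for you (`dmd -unittest -main -run file.d` or `rdmd -unittest --main file.d`), so the resulting executable runs the tests and nothing else.

```d
// Placeholder for the code under test.
int add(int a, int b)
{
    return a + b;
}

// Compiled and run only when building with -unittest;
// in a normal debug or release build this block is ignored.
unittest
{
    assert(add(2, 3) == 5);
    assert(add(-1, 1) == 0);
}
```

This still requires a separate compilation pass, of course, so it doesn't address the build-time concern.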


> Given the above, I've resorted to having a "tester" executable which links in all libraries in my project and tests every module. This means that I have to write my unit tests inside this helper executable, making much of the gain in D's unittest blocks go away.
> 
> And no, the fact that I link libraries into the helper executable doesn't mean that I can just write the unit tests in the libraries in the first place. Doing so would require building them twice: Once for the normal build and once for the "tester" executable.
> 
> (And yes, build times matter when your project gets large enough, even
> in D.)
[...]

True.


T

-- 
"I speak better English than this villain Bush" -- Mohammed Saeed al-Sahaf, Iraqi Minister of Information
April 30, 2012
On Monday, 30 April 2012 at 01:08:28 UTC, Jonathan M Davis wrote:
> On Sunday, April 29, 2012 17:50:48 Don wrote:
>> * package. I have no idea how a failed Java experiment got incorporated
>> into D.
>
> Really? In some cases, it's indispensable. For instance, once std.datetime has
> been split up, it will require it, or it would have to duplicate a bunch of
> implementation-specific stuff which has no business in the public API.

But what happens if std.datetime grows so large that you want to have e.g. a std.datetime.system package, the content of which is accessible from std.datetime.*, but not the rest of the world?

This is not an artificial problem, e.g. consider Thrift, where I have e.g. thrift.internal.endian (predating endian stuff in Phobos) which is used from modules in thrift.async, thrift.server, and thrift.protocol, or thrift.internal.socket (containing some more OS abstractions than std.socket does), which is used from modules in thrift.async, thrift.server, and thrift.transport, but both are not part of the public API.

The logical »package« to restrict access to would be »thrift.*« here, but there is no way to restrict access to that, I have to resort to hoping users understand that they should not use »thrift.internal.xyz« directly. Phobos has the same problem with std.internal.

I think in Java this problem was the reason for »super packages« to be discussed, which (I think, I haven't followed the developments closely) ended up being incorporated in the upcoming Module System feature.

David
April 30, 2012
On Sunday, April 29, 2012 21:56:08 H. S. Teoh wrote:
> I wonder if dmd (or rdmd) should have a mode where it *only* compiles
> unittest code (i.e., no main() -- the resulting exe just runs unittests
> and nothing else).

It wouldn't make sense. It's nowhere near as bad as C++, but dmd has to recompile modules all the time unless you compile the entire program at once. When you run a build, every single module on the command line and all of the imported modules get compiled. Object code is generated only for those on the command line, but the others are still compiled. Any imported module which uses a .di file won't have as much to compile, and any templated code that doesn't get used in those modules won't get compiled, but there's still lots of recompilation going on if you compile your program incrementally. And D just isn't set up to compile only a portion of a module.

- Jonathan M Davis
April 30, 2012
On Monday, April 30, 2012 06:58:18 David Nadlinger wrote:
> On Monday, 30 April 2012 at 01:08:28 UTC, Jonathan M Davis wrote:
> > On Sunday, April 29, 2012 17:50:48 Don wrote:
> >> * package. I have no idea how a failed Java experiment got incorporated
> >> into D.
> > 
> > Really? In some cases, it's indispensable. For instance, once std.datetime has
> > been split up, it will require it, or it would have to duplicate a bunch of
> > implementation-specific stuff which has no business in the public API.
> 
> But what happens if std.datetime grows so large that you want to have e.g. a std.datetime.system package, the content of which is accessible from std.datetime.*, but not the rest of the world?
> 
> This is not an artificial problem, e.g. consider Thrift, where I have e.g. thrift.internal.endian (predating endian stuff in Phobos) which is used from modules in thrift.async, thrift.server, and thrift.protocol, or thrift.internal.socket (containing some more OS abstractions than std.socket does), which is used from modules in thrift.async, thrift.server, and thrift.transport, but both are not part of the public API.
> 
> The logical »package« to restrict access to would be »thrift.*« here, but there is no way to restrict access to that, I have to resort to hoping users understand that they should not use »thrift.internal.xyz« directly. Phobos has the same problem with std.internal.
> 
> I think in Java this problem was the reason for »super packages« to be discussed, which (I think, I haven't followed the developments closely) ended up being incorporated in the upcoming Module System feature.

I'm not claiming that package solves all such issues (and you do give some good examples where it would be a problem), but not having it would make the problem even worse. I think that this is one area though where the Windows folks take advantage of their export nonsense and just don't export the internal stuff.

- Jonathan M Davis
April 30, 2012
On Monday, 30 April 2012 at 03:16:09 UTC, H. S. Teoh wrote:
> On Sun, Apr 29, 2012 at 02:26:12PM +0200, David Nadlinger wrote:
>> […]
>>  - Built-in arrays and AAs: They are convenient to use, but as far as
>>  I can see the single biggest GC dependency in the language. Why not
>>  lower array and AA literals to expression tuples (or whatever) to
>>  make the convenient syntax usable with custom (possibly non-GC safe)
>>  containers as well? A GC'd default implementation could then be
>>  provided in druntime, just like today's arrays and AAs.
> [...]
>
> AA's are moving into druntime. Yours truly is supposed to make that
> happen eventually, but lately time hasn't been on my side. :-/

This moves the _implementation_ to druntime, but there is still realistically no way to use AA literals with my own non-GC'd version of hash maps without shipping a custom druntime (and thus modifying the semantics of an existing language construct). What I'm talking about would let you do things like

MyVector!int stuff = [1, 2, 3, 4, 5];

without needing a (temporary) GC'd allocation, and thus please the GC-hater crowd because they can still have all the syntax candy with their own containers, even if they can't or don't want to use the default GC'd constructs.
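
For contrast, the closest thing available today is probably a constructor with a typesafe variadic parameter, sketched below. This `MyVector` is a hypothetical fixed-capacity container written only for the example (the `capacity` parameter and `sum` helper are assumptions), and the claim that no GC allocation happens relies on the implementation being allowed to build the argument array on the stack.

```d
// Hypothetical fixed-capacity vector that never touches the GC heap itself.
struct MyVector(T, size_t capacity = 8)
{
    private T[capacity] storage;
    private size_t len;

    // Typesafe variadic: callable as MyVector!int(1, 2, 3).
    this(const(T)[] values...)
    {
        assert(values.length <= capacity, "too many elements");
        foreach (i, v; values)
            storage[i] = v;
        len = values.length;
    }

    @property size_t length() const { return len; }

    // Slice over the stored elements; valid only while the vector is alive.
    inout(T)[] opSlice() inout { return storage[0 .. len]; }
}

int sum(ref const MyVector!int v)
{
    int total = 0;
    foreach (x; v[])
        total += x;
    return total;
}
```

This gives `auto stuff = MyVector!int(1, 2, 3, 4, 5);` rather than the literal syntax, which is precisely the gap the lowering would close.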

David