April 22, 2009
Don wrote:
> bearophile wrote:
>> Don:
>>> Yes. Actually, marking a nested function as pure doesn't make much sense.
>>> It's entirely equivalent to moving it outside the function; [...]
>>> I'm not sure that nested pure member functions should be legal.
>>
>> It's not fully equivalent to moving it out of the function because once you pull it out you add a name to the outer namespace: nested functions are useful to keep namespaces tidy too.
>> So I'd like to have nested pure functions too.
>>
>> pure int foo(int y) { return y + y; } // outer foo
>> pure void bar(int x) {
>>   pure int foo(int y) { return y * y; }
>>   return foo(x) * .foo(x);
>> }
>>
>> Thank you,
>> bye,
>> bearophile
> 
> That's true, but it seems quite difficult to get right. A pure nested function can in theory access immutable members in the outer function -- but must not access the parameters of the outer function.
> If there are no immutable members in the outer function, the compiler would ideally convert it into an external pure function, so that it
> doesn't need a frame pointer to the outer function. But it would need error messages for any use of mutable outer function members. Etc.
> It seems quite a lot of work for something of very limited use.
> Making it into a external, private pure function is almost the same.
> 
> 
Actually it's not so difficult. I've created a patch for bug 2807 -- it's only 5 lines long! It gives an error message if a nested pure function accesses a mutable variable from an outer scope.
April 22, 2009
Don:
> Actually it's not so difficult. I've created a patch for bug 2807 -- it's only 5 lines long! It gives an error message if a nested pure function accesses a mutable variable from an outer scope.

Thank you very much Don, your work helps a lot.

Every time I try a tiny program I find something I don't understand:

import std.stdio: writeln;
import std.math: sqrt;
import std.conv: to;

void main(string[] args) {
    double x = args.length == 2 ? to!(double)(args[1]) : 4.0;
    writeln(sqrt(x) + sqrt(x));
}

I have also tried with std.math.sin with similar results:

L0:		enter	8,0
		mov	EAX,8[EBP]
		cmp	EAX,2
		jne	L30
		cmp	EAX,1
		ja	L1B
		mov	EAX,6
		call	near ptr _D6test7__arrayZ
L1B:		mov	EDX,0Ch[EBP]
		mov	EAX,8[EBP]
		mov	EAX,8[EDX]
		mov	EDX,0Ch[EDX]
		push	EDX
		push	EAX
		call	near ptr _D3std4conv13__T2toTdTAyaZ2toFAyaZd
		jmp short	L36
L30:		fld	qword ptr FLAT:_DATA[00h]
L36:		fstp	qword ptr -8[EBP]
		fld	qword ptr -8[EBP]
		fsin
		fld	qword ptr -8[EBP]
		fsin
		faddp	ST(1),ST
		sub	ESP,0Ch
		fstp	tbyte ptr [ESP]
		call	near ptr _D3std5stdio14__T7writelnTeZ7writelnFeZv
		xor	EAX,EAX
		leave
		ret

Isn't sin(x)+sin(x) pure? Even if the compiler doesn't want to replace x+x with x*2 because x is a floating point, it can do:

y = sin(x)
y+y

And that gives the same result even with FPs.

Note: with SSE2 it's even possible to perform two sqrt(double) at the same time, so a compiler can implement sin(x)+sin(y) with a single instruction (SQRTSD) (plus eventually some register loading/unloading).

Bye,
bearophile
April 22, 2009
bearophile:
> so a compiler can implement sin(x)+sin(y) with a single instruction (SQRTSD)

I meant sqrt(x)+sqrt(y), sorry.
April 23, 2009
Changed sender name from screen name to real name.

On Thu, 23 Apr 2009 02:55:37 +0900, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>
> On Wed, 22 Apr 2009 13:49:18 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>
>> tama wrote:
>>> On Mon, 20 Apr 2009 16:09:09 +0900, Walter Bright
>>> <newshound1@digitalmars.com> wrote:
>>>
>>>>
>>>> This is a major revision to Phobos, including Andrei's revolutionary
>>>> new range support.
>>>>
>>>> http://www.digitalmars.com/d/2.0/changelog.html
>>>> http://ftp.digitalmars.com/dmd.2.029.zip
>>>
>>> Range is so cool!
>>>
>>> Though...
>>> I tried following code:
>>>
>>> void main()
>>> {
>>>     writeln("Case1");
>>>     {
>>>         Mt19937 gen = Mt19937(0);
>>>         writeln(gen.front);
>>>         gen.popFront;
>>>         writeln(gen.front);
>>>     }
>>>     writeln("---");
>>>     {
>>>         Mt19937 gen = Mt19937(0);
>>>         advance(gen, 1);  // skip 1 element
>>>         writeln(gen.front);
>>>         gen.popFront;
>>>         writeln(gen.front);
>>>     }
>>>     writeln("\nCase2");
>>>     {
>>>         Mt19937 gen;
>>>         writeln(gen.front);
>>>         gen.popFront;
>>>         writeln(gen.front);
>>>     }
>>>     writeln("---");
>>>     {
>>>         Mt19937 gen;
>>>         advance(gen, 1);  // skip 1 element
>>>         writeln(gen.front);
>>>         gen.popFront;
>>>         writeln(gen.front);
>>>     }
>>> }
>>>
>>> Result:
>>>
>>> Case1
>>> 2357136044 (1)
>>> 2546248239 (2)
>>> ---
>>> 2546248239 (2)
>>> 3071714933 (3)
>>>
>>> Case2
>>> 581869302  (1)
>>> 3890346734 (2)
>>> ---
>>> 581869302  (1)?
>>> 3890346734 (2)?
>>>
>>> I think 'Case1' is correct, but 'Case2' is wrong.
>>> Mt19937's bug?
>>>
>>
>> If you initialize the Mersenne twister with no seed it will start with
>> seed 5489u.
>
> I think his point is in the second part of Case2, he skipped one element, but it appears that it didn't skip anything.
Thanks for your comment.
Lack of talk about this problem:|

-- 
tama <repeatedly@gmail.com>
http://profile.livedoor.com/repeatedly/
April 23, 2009
On Thu, 23 Apr 2009 03:12:16 +0900, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Steven Schveighoffer wrote:
>> On Wed, 22 Apr 2009 13:49:18 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> tama wrote:
>>>> On Mon, 20 Apr 2009 16:09:09 +0900, Walter Bright
>>>> <newshound1@digitalmars.com> wrote:
>>>>
>>>>>
>>>>> This is a major revision to Phobos, including Andrei's revolutionary
>>>>> new range support.
>>>>>
>>>>> http://www.digitalmars.com/d/2.0/changelog.html
>>>>> http://ftp.digitalmars.com/dmd.2.029.zip
>>>>
>>>> Range is so cool!
>>>>
>>>> Though...
>>>> I tried following code:
>>>>
>>>> void main()
>>>> {
>>>>     writeln("Case1");
>>>>     {
>>>>         Mt19937 gen = Mt19937(0);
>>>>         writeln(gen.front);
>>>>         gen.popFront;
>>>>         writeln(gen.front);
>>>>     }
>>>>     writeln("---");
>>>>     {
>>>>         Mt19937 gen = Mt19937(0);
>>>>         advance(gen, 1);  // skip 1 element
>>>>         writeln(gen.front);
>>>>         gen.popFront;
>>>>         writeln(gen.front);
>>>>     }
>>>>     writeln("\nCase2");
>>>>     {
>>>>         Mt19937 gen;
>>>>         writeln(gen.front);
>>>>         gen.popFront;
>>>>         writeln(gen.front);
>>>>     }
>>>>     writeln("---");
>>>>     {
>>>>         Mt19937 gen;
>>>>         advance(gen, 1);  // skip 1 element
>>>>         writeln(gen.front);
>>>>         gen.popFront;
>>>>         writeln(gen.front);
>>>>     }
>>>> }
>>>>
>>>> Result:
>>>>
>>>> Case1
>>>> 2357136044 (1)
>>>> 2546248239 (2)
>>>> ---
>>>> 2546248239 (2)
>>>> 3071714933 (3)
>>>>
>>>> Case2
>>>> 581869302  (1)
>>>> 3890346734 (2)
>>>> ---
>>>> 581869302  (1)?
>>>> 3890346734 (2)?
>>>>
>>>> I think 'Case1' is correct, but 'Case2' is wrong.
>>>> Mt19937's bug?
>>>>
>>>
>>> If you initialize the Mersenne twister with no seed it will start with
>>> seed 5489u.
>>  I think his point is in the second part of Case2, he skipped one element, but it appears that it didn't skip anything.
>
> Oh ok, thanks tama and Steve. I know where the problem is (on the Mersenne twister, if you call popFront without ever calling head there's a bug). Could you submit a bug? I'm on battery and on short time, and Walter is bugging me :o).
I submitted this problem to bugzilla.
http://d.puremagic.com/issues/show_bug.cgi?id=2882

-- 
tama <repeatedly@gmail.com>
http://profile.livedoor.com/repeatedly/
April 23, 2009
Don wrote:
> Georg Wrede wrote:
>> Don wrote:
>>> Georg Wrede wrote:
>>>> Walter Bright wrote:
>>>>> Lutger wrote:
>>>>>> what the hell...this code can't be human.
>>>>>
>>>>> I was replaced by Colossus years ago.
>>>>
>>>> Michael A. Jackson wouldn't approve 1175 gotos in 113 files.
>>>
>>> It'd be really funny to pass it through one of those "code quality" metrics, one of the ones with a ridiculously heavy penalty for using goto. I think it'd tell you that DMD source is almost the lowest-quality code on the planet. <g>
>>
>> Yeah. But now I'm getting a bad conscience, this is beginning to look like Walter-bashing... :-)
> 
> I think those code quality metrics are ridiculous. The prejudice against 'goto' is really just brainwashing and totally without basis.

Well, at the time, they had no choice. Unstructured code was the norm, and people who tangled themselves in spaghetti never noticed it's because of goto usage. Instead they just preceived programming as hard.

The language on my first computer was old fashioned BASIC, with line numbers and no named subroutines or structural elements. I felt that the effort needed to program was exponential to the number of lines. In hindsight, that must have been entirely because I'd had no higher education in programming, no notion of structuring the code (in any particular way), etc. But the absolutely worst thing was that one couldn't pass parameters to subroutines. (gosub <lineNumber>) And all variables were global.

In those days most programming was done in COBOL, by (essentially) non-programmers. (The idea of COBOL was, after all, to be usable by business people, not MSc programmers.) Jackson had to change not only the methodology, but before that, the concept and the motivation for using structured programming. That was the hard part, and it needed some serious evangelising and brainwashing, to overcome the inertia.

Unfortunately, that meant murdering, dismembering, and pulverizing any least hint of even thinking about using goto. And as with any paradigm shift, the disciples took it to religious extremes.

> Here's a recent link from Andrew Koenig, who really should know better.
> 
> http://dobbscodetalk.com/index.php?option=com_myblog&show=What-Dijkstra-said-was-harmful-about-goto-statements.html&Itemid=29 
> 
> Can you see his fundamental mistake? He talks about "the program" (just as Dijkstra said), but it's just not relevant for C/C++/D/Pascal/...
> or any other structured programming language. Wherever he says "program" you should substitute "function". And then the force of the argument totally disappears.

Yes.

> The problem with goto in an unstructured language is that when you see a label, you don't know where it came from. It could be anywhere in the entire program (maybe a million lines of code!) And that's a complete disaster. But in a structured language, you know it's from somewhere in the function.
> And this is no different from any other control structure. You ALWAYS have to look at the whole scope you're in.
> 
> void foo(){
>  int n;
> bool b = true;
>   while (b) {
>     if (n==4) {   /* when do we get here? You have to read the whole function to find out. Where does n get modified? */ }
> :
> :
> :
>   }
> }
> 
> Heck, even in inline ASM in D, you can't write the kind of spaghetti code Dijkstra was complaining about. You always have local scope.

OTOH, getting the algorithm may become tedious even in small snippets. For example, here's Algorithm M, from /Fundamental Algorithms/, Knuth 1969, page 95:

    M1: int j=n; int k=n; int m=X[n];
    M2: if(k==0) goto M6;
    M3: if(X[k] <= m) goto M5;
    M4: j=k; m=X[k];
    M5: k--; goto M2;
    M6: ;

That's three gotos for six lines. How long did it take to fully figure out what it's doing? Way longer than if it were written in structured D, anyway.

> Yet, I personally have been so affected by anti-goto brainwashing that I've never used goto in C, C++, or D. But I gradually realised that it was just brainwashing. I've never experienced any problem with goto in ASM.

Same here. Last time I used it was in Pascal, and boy, did I have a bad conscience. It felt like I was doing something Mom has forbidden.

>>> Actually, looking through the DMD source it becomes obvious that goto is really not a problem at all. The lack of comments is much more of a problem. (Especially with files with names like "e2ir.c". What the heck is "fltables.c", "cdxxx.c", "elxxx.c" ?). Even so, it's mostly not that difficult to understand.
>>
>> I guess Walter has to keep alternating between ASM, C and D. And a lot of ASM coding is nothing more than a bunch of MOV and JMP stuff. And file naming conventions here look like what one typically finds in development code (before some pre-publishing guy has tidyed it up with a lot of global search&replaces). And, after all, the C files never were meant to be public anyway.
>>
>> Actually, the one interesting question might be, would rewriting this code in a structured fashion (I mean, removing the gotos) make it slower? (Not that I'd be suggesting Walter should do it. Just an academic question.)
> 
> I doubt it'd affect the speed at all. It'd probably make it longer. There are cases in DMD where it seems to make the control flow simpler.

Simpler, yes. Then again, I'd hate to see folks actually start using goto in D!!!! If Walter uses it, it's different, he's qualified. :-)



Oh, (for interested readers,) here's the above code in a test bench:

import std.stdio;
void main()
{
    enum {DUMMY};
    int[] X = [DUMMY,2,3,4,19,18,17,5,1,6];
    int n = X.length-1;

    M1: int j=n; int k=n; int m=X[n];
    M2: if(k==0) goto M6;
    M3: if(X[k] <= m) goto M5;
    M4: j=k; m=X[k];
    M5: k--; goto M2;
    M6: ;

    writeln("max: ", m, " pos: ", j);
}

The dummy and length-1 are needed because, in the old days, especially text books had arrays 1-based. Also numbering things in general used to be 1-based, because "computing culture" hadn't spread to the general public yet. You didn't want to explain all over again the merits of zero based indexing every time you chalked up an example.

The interesting part is, still today, this algorithm is essentially what the computer ends up doing, no matter how "structured" our source code was.
April 23, 2009
bearophile wrote:
> Don:
>> Actually it's not so difficult. I've created a patch for bug 2807 -- it's only 5 lines long! It gives an error message if a nested pure function accesses a mutable variable from an outer scope.
> 
> Thank you very much Don, your work helps a lot.
> 
> Every time I try a tiny program I find something I don't understand:
> 
> import std.stdio: writeln;
> import std.math: sqrt;
> import std.conv: to;
> 
> void main(string[] args) {
>     double x = args.length == 2 ? to!(double)(args[1]) : 4.0;
>     writeln(sqrt(x) + sqrt(x));
> }
> 
> I have also tried with std.math.sin with similar results:
> 
> L0:		enter	8,0
> 		mov	EAX,8[EBP]
> 		cmp	EAX,2
> 		jne	L30
> 		cmp	EAX,1
> 		ja	L1B
> 		mov	EAX,6
> 		call	near ptr _D6test7__arrayZ
> L1B:		mov	EDX,0Ch[EBP]
> 		mov	EAX,8[EBP]
> 		mov	EAX,8[EDX]
> 		mov	EDX,0Ch[EDX]
> 		push	EDX
> 		push	EAX
> 		call	near ptr _D3std4conv13__T2toTdTAyaZ2toFAyaZd
> 		jmp short	L36
> L30:		fld	qword ptr FLAT:_DATA[00h]
> L36:		fstp	qword ptr -8[EBP]
> 		fld	qword ptr -8[EBP]
> 		fsin
> 		fld	qword ptr -8[EBP]
> 		fsin
> 		faddp	ST(1),ST
> 		sub	ESP,0Ch
> 		fstp	tbyte ptr [ESP]
> 		call	near ptr _D3std5stdio14__T7writelnTeZ7writelnFeZv
> 		xor	EAX,EAX
> 		leave
> 		ret
> 
> Isn't sin(x)+sin(x) pure? Even if the compiler doesn't want to replace x+x with x*2 because x is a floating point, it can do:
> 
> y = sin(x)
> y+y
> 
> And that gives the same result even with FPs.

Yes. From further investigation, I've found that:
(1) the optimisation of pure only happens for int returns, not for floating-point return types. It should convert purefunc(x)+purefunc(x) into 2*purefunc(x) if the return type of purefunc is int, float, or complex.

(2) the back-end isn't smart enough to convert f*2 into f+f.

It's difficult to work out where it's optimising the int return. I can see where it marks pure calls as subexpressions to be potentially eliminated: OTae(op) is true in gflow.c(660) if it's a pure call, false if it's an impure call.
The problem may simply be that the backend doesn't do much common subexpression elimination of floating-point expressions.

I really don't understand the backend. It's quite cryptic. Key acronyms are AE, CP and VBE. Then there's Bin, Bgen, Bkill, etc.
AE *might* be Available Expression (but what does that mean?)
CP might be Copy Propagation info
I've found that VBE = "Very Busy Expression"! (what does that mean?)

Fixing your problem is beyond me at present, I think.
April 23, 2009
Masahiro Nakagawa wrote:
> I submitted this problem to bugzilla. http://d.puremagic.com/issues/show_bug.cgi?id=2882

Thanks! I fixed it and checked in std.random.

Andrei
April 23, 2009
Don:
> I really don't understand the backend. It's quite cryptic. Key acronyms
> are AE, CP and VBE. Then there's Bin, Bgen, Bkill, etc.
> AE *might* be Available Expression (but what does that mean?)
> CP might be Copy Propagation info
> I've found that VBE = "Very Busy Expression"! (what does that mean?)

I suggest to add comments to the DMD source code every time a small mystery is revealed, to rename the files once a much better name is found, to improve and refactor the code every time it a good thing to do it. Every time you put your hand in the code you can improve the code locally and leave it better then before (this is the Scout principle in programming).
Then such patches/changes can be slowly folded back into DMD and with time the sources will improve and will become more readable. Once they can be understood better, further changes become faster and simpler.

Bye,
bearophile
April 23, 2009
bearophile wrote:
> Don:
>> I really don't understand the backend. It's quite cryptic. Key acronyms are AE, CP and VBE. Then there's Bin, Bgen, Bkill, etc.
>> AE *might* be Available Expression (but what does that mean?)
>> CP might be Copy Propagation info
>> I've found that VBE = "Very Busy Expression"! (what does that mean?)
> 
> I suggest to add comments to the DMD source code every time a small mystery is revealed, to rename the files once a much better name is found, to improve and refactor the code every time it a good thing to do it. Every time you put your hand in the code you can improve the code locally and leave it better then before (this is the Scout principle in programming).
> Then such patches/changes can be slowly folded back into DMD and with time the sources will improve and will become more readable. Once they can be understood better, further changes become faster and simpler.

I'd love to do that, but unfortunately it's not easy to get _any_ changes into DMD. Only Walter has write-access to it, and AFAIK he checks every line and enters it manually. Like Linus was famous for doing with the Linux kernel.

> 
> Bye,
> bearophile