April 14, 2011
> On 04/14/2011 01:00 AM, Andrej Mitrovic wrote:
> > I'm trying to understand the design of ranges. Why does popFront only set the front() property to return the next element in the range? Why not return the element in the call to popFront right away?
> > 
> > For example code like this (which doesn't work since popFront doesn't
> > return):
> > 
> > void main()
> > {
> >     int[] a = [1, 2];
> >     auto b = a.popFront;
> >     assert(a == [2]);
> >     assert(b == 1);
> > }
> > 
> > Isn't it wasteful to have to call both popFront() and front() to
> > simultaneously remove an element from a range and return it? I mean it's
> > an extra function call, right?
> 
> I like to have three members (even if not quite necessary, this cleanly separates notions). What I don't understand is why empty and front are methods, not simple data members.

They're properties. They could be member variables or they could be member functions; it's up to the implementation of a particular range, though they're likely to be member functions in most cases. In most cases, there wouldn't be any member variable corresponding to empty (more likely, it checks the range's length or does something else to determine whether there's anything left, rather than reading a member variable that says the range is empty), and even with front, there could easily be a calculation of some kind or additional checks (e.g. throwing if empty is true).

There is no requirement that front or empty be either functions or variables. They're properties, so they can be whatever makes the most sense for that particular range type. And as for arrays, they _have_ to be functions, since they're added by Phobos.
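For illustration, here is a minimal sketch (hypothetical, not Phobos code) of a range whose empty and front are property functions rather than stored members, used with the usual front-then-popFront idiom:

import std.stdio;

struct Counter
{
    int current, limit;

    // empty is computed, not stored
    @property bool empty() const { return current >= limit; }

    // front validates before returning
    @property int front() const
    {
        assert(!empty, "front called on empty Counter");
        return current;
    }

    void popFront() { ++current; }
}

void main()
{
    auto r = Counter(0, 3);
    while (!r.empty)
    {
        writeln(r.front); // read the element...
        r.popFront();     // ...then advance; two calls, as discussed above
    }
}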

- Jonathan M Davis
April 14, 2011
> On Thu, 14 Apr 2011 12:57:10 -0400, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> > This leads me to another question I've always wanted to ask. A call such as:
> > 
> > auto b=map!foo(map!bar1(map!bar2(a)));
> > 
> > This constructs a lazy range. What I'm wondering is if there are any performance issues when constructing long chains of ranges like that, since this basically routes one function to the next, which routes to the next, etc.
> 
> Of course. Ranges are very dependent on inlining for their performance benefit, which means you're relying on the compiler inlining the code in order to get good performance. The compiler doesn't always inline, and I'm not sure how deep it can go.

IIRC, I brought that up at one point and Walter said that it could inline infinitely deep (assuming that it makes sense, of course). However, I don't think that there's much question that the inliner could be improved. If nothing else, there are too many things in the language which currently preclude inlining, and as I understand it, the inliner could also do a better job on the code that it does inline. But I haven't studied it, so I don't know how good it ultimately is.
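As a sketch of what's at stake (a hypothetical wrapper, not Phobos code): each lazy wrapper's front typically just forwards to the wrapped range, so every layer costs one extra call unless the compiler inlines the whole chain down to a single array access.

import std.range;
import std.stdio;

struct Wrap(R)
{
    R inner;
    @property bool empty() { return inner.empty; }
    @property auto front() { return inner.front; } // one more call per layer
    void popFront() { inner.popFront(); }
}

auto wrap(R)(R r) { return Wrap!R(r); }

void main()
{
    int[] a = [1, 2];
    auto w = wrap(wrap(wrap(a)));
    // Without inlining: Wrap.front -> Wrap.front -> Wrap.front -> a[0]
    writeln(w.front);
}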

- Jonathan M Davis
April 14, 2011
Andrej Mitrovic:
> Can the compiler optimize the second case and convert b.front to just
> do one field access?

In simple cases, obviously yes:

import std.range;
import std.stdio;

void main(){
    int[] a=new int[1000];
    auto b=retro(retro(a));
    writeln(b.front);
}

Produces:

.text:08094B64                 public _Dmain
.text:08094B64 _Dmain          proc near               ; CODE XREF: _D2rt6dmain24mainUiPPaZi7runMainMFZv+15p
.text:08094B64                 push    ebp
.text:08094B65                 mov     ebp, esp
.text:08094B67                 mov     eax, offset _D11TypeInfo_Ai6__initZ
.text:08094B6C                 push    3E8h
.text:08094B71                 push    eax
.text:08094B72                 call    _d_newarrayT
.text:08094B77                 add     esp, 8
.text:08094B7A                 mov     eax, offset _D3std5stdio6stdoutS3std5stdio4File
.text:08094B7F                 push    dword ptr [edx]
.text:08094B81                 push    0Ah
.text:08094B83                 call    _D3std5stdio4File14__T5writeTiTaZ5writeMFiaZv
.text:08094B88                 xor     eax, eax
.text:08094B8A                 pop     ebp
.text:08094B8B                 retn
.text:08094B8B _Dmain          endp ; sp = -8

DMD even recognizes that edx contains the pointer to the first element of b.

I do not know about more complex cases; I think it depends on DMD's inlining capabilities, which are (naturally) somewhat poorer than GCC's at the moment.

April 14, 2011
On Thu, 14 Apr 2011 13:36:07 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:

> Andrej Mitrovic:
>> Can the compiler optimize the second case and convert b.front to just
>> do one field access?
>
> In simple cases, obviously yes:
>
> import std.range;
> import std.stdio;
>
> void main(){
>     int[] a=new int[1000];
>     auto b=retro(retro(a));
>     writeln(b.front);
> }
>
> Produces:
>
> [disassembly snipped; same listing as above]
>
> DMD even recognizes that edx contains the pointer to the first element of b.
>
> I do not know about more complex cases; I think it depends on DMD's inlining
> capabilities, which are (naturally) somewhat poorer than GCC's at the moment.
>

Nah, it's simpler than that:

assert(is(typeof(b) == int[]));

which is done by retro (it detects if you're retro-ing a retro range, and just returns the original).
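A quick way to see it (a sketch built around the assert above):

import std.range;

void main()
{
    int[] a = [1, 2, 3];
    auto b = retro(retro(a));
    static assert(is(typeof(b) == int[])); // no wrapper type is involved at all
    assert(b is a);                        // the original slice comes back
}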

-Steve
April 14, 2011
On 04/14/2011 06:57 PM, Andrej Mitrovic wrote:
> This leads me to another question I've always wanted to ask. A call such as:
>
> auto b=map!foo(map!bar1(map!bar2(a)));
>
> This constructs a lazy range. What I'm wondering is if there are any
> performance issues when constructing long chains of ranges like that,
> since this basically routes one function to the next, which routes to
> the next, etc. E.g:
>
> auto a = [1, 2];
> auto b = retro(a);
> auto c = retro(b);
> auto d = retro(c);
>
> Under the surface, I assume calling d.front would mean the following calls:
> d.front()->c.back()->b.front()->a.back()
>
> Or another example:
> auto b = retro(retro(a));
>
> Can the compiler optimize the second case and convert b.front to just
> do one field access?

If it does optimise, then it is definitely a compiler bug. Since you *explicitly* ask for a double reverse, it *must* just do it. Suppressing it here would simply break the language's semantics!

It is not the compiler's role to interpret, meaning to guess your reasons for requesting that; and the compiler is certainly not in a position to do so. Even less to *judge* the request as stupid and thus ignore it (unlike in the army ;-). You are the master; asking for stupid computations is both your right and your problem ;-)
Anyway, there are probably perfectly valid reasons to do a double reverse; at least (0) exploring, (1) teaching, (2) benchmarking.

In a similar vein:
// factorial defined here (a plausible implementation) so the snippet stands alone
long factorial(long x) { return x <= 1 ? 1 : x * factorial(x - 1); }

void main() {
    long n;
    enum N = 10_000;
    foreach (_; 0 .. N) {
        n = factorial(9);
    }
}

Should the compiler optimise by computing n only once (possibly even at compile time)?
Then, what if I'm doing that on purpose? (be it stupid or not)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

April 14, 2011
On Thu, 14 Apr 2011 14:24:41 -0400, spir <denis.spir@gmail.com> wrote:

> On 04/14/2011 06:57 PM, Andrej Mitrovic wrote:
>> This leads me to another question I've always wanted to ask. A call such as:
>>
>> auto b=map!foo(map!bar1(map!bar2(a)));
>>
>> This constructs a lazy range. What I'm wondering is if there are any
>> performance issues when constructing long chains of ranges like that,
>> since this basically routes one function to the next, which routes to
>> the next, etc. E.g:
>>
>> auto a = [1, 2];
>> auto b = retro(a);
>> auto c = retro(b);
>> auto d = retro(c);
>>
>> Under the surface, I assume calling d.front would mean the following calls:
>> d.front()->c.back()->b.front()->a.back()
>>
>> Or another example:
>> auto b = retro(retro(a));
>>
>> Can the compiler optimize the second case and convert b.front to just
>> do one field access?
>
> If it does optimise, then it is definitely a compiler bug. Since you *explicitly* ask for a double reverse, it *must* just do it. Suppressing it here would simply break the language's semantics!

I feel like people aren't looking at my post :)

retro *specifically* returns the source range if you retro it again.

from std.range:

auto retro(Range)(Range r)
if (isBidirectionalRange!(Unqual!Range))
{
    // Check for retro(retro(r)) and just return r in that case
    static if (is(typeof(retro(r.source)) == Range))
    {
        return r.source;
    }
...

-Steve
April 14, 2011
On 4/14/11 8:24 PM, spir wrote:
> If it does optimise, then it is definitely a compiler bug. Since you
> *explicitely* ask for a double reverse, it *must* just do it.
> Suppressing them is here just breaking the language's semantics!
> […]

Sigh… First, as Steve already pointed out, retro() specifically short-circuits reversing a range twice by returning the original range.

Second, your remarks about optimization are not quite correct, at least for most of today's high-level languages. A compiler is free to optimize a program in whatever way possible, as long as it can prove the result equivalent to the original. And since burning CPU cycles is not part of the program's semantics, a modern optimizing compiler is free to optimize the loop and the factorial call away altogether. If you don't believe me, have a look at the attached sample program and the asm generated for main() (some recent Clang version, clang -O3).

David


---
#include <stdio.h>

int factorial(int a) {
  int p = 1;
  for (int i = 2; i <= a; ++i) {
    p *= i;
  }
  return p;
}

int main (int argc, char const *argv[]) {
  int t;
  for (long i = 0; i < 10000000; ++i) {
    t = factorial(9);
  }
  printf("%d\n", t);
  return 0;
}


_main:
0000000100000ee0	pushq	%rbp
0000000100000ee1	movq	%rsp,%rbp
0000000100000ee4	leaq	0x00000041(%rip),%rdi
0000000100000eeb	movl	$0x00058980,%esi ; 9! in hex
0000000100000ef0	xorb	%al,%al
0000000100000ef2	callq	0x100000f02; symbol stub for: _printf
0000000100000ef7	xorl	%eax,%eax
0000000100000ef9	popq	%rbp
0000000100000efa	ret
April 14, 2011
On 04/14/2011 08:33 PM, Steven Schveighoffer wrote:
>> If it does optimise, then it is definitely a compiler bug. Since you
>> *explicitely* ask for a double reverse, it *must* just do it. Suppressing
>> them is here just breaking the language's semantics!
>
> I feel like people aren't looking at my post :)

Sorry, I wrote this reply before reading yours.

denis
-- 
_________________
vita es estrany
spir.wikidot.com

April 14, 2011
> Nah, it's simpler than that:
>
> assert(is(typeof(b) == int[]));
>
> which is done by retro (it detects if you're retro-ing a retro range, and
> just returns the original).
>
> -Steve

Makes sense. But still, I think inlining is responsible for the following code generating machine code identical to the earlier example:

import std.range;
import std.stdio;
import std.algorithm;

int id(int a){return a;}

void main(){
	int[] a=new int[1000];
	auto b=retro(map!id(map!id(retro(a))));
	writeln(b.front);
}

Correct me if I'm wrong, but I think Phobos only handles the special case retro(retro(x)) explicitly.
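A quick check consistent with that reading (a sketch; the static asserts only show that map's wrappers are not unwrapped back to int[] the way retro's are):

import std.algorithm;
import std.range;

int id(int a) { return a; }

void main()
{
    int[] a = [1, 2, 3];

    auto r = retro(retro(a));
    static assert(is(typeof(r) == int[]));  // retro unwraps its own wrapper

    auto m = map!id(map!id(a));
    static assert(!is(typeof(m) == int[])); // map stays a lazy wrapper type
}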

-Timon
April 14, 2011
> Should the compiler optimise by computing n only once (possibly even at compile time)?
> Then, what if I'm doing that on purpose? (be it stupid or not)
>
> Denis

What is the difference? The only thing affected by this optimization is execution speed. Optimizing is about implementing the requested semantics in an efficient manner. Why would you actually want to suppress completely safe optimizations? Also, when benchmarking, you don't usually want to benchmark unoptimized code. Or am I misunderstanding something?

If absolutely necessary, you can always use the volatile keyword to disable all optimizations on a given variable, or call the function explicitly using inline ASM.
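Another option (a sketch, not from the original posts; a sufficiently aggressive optimizer could still precompute everything, so this only raises the bar): make the result observable and the argument loop-dependent, so the as-if rule no longer permits simply deleting the loop.

import std.stdio;

long factorial(long x) { return x <= 1 ? 1 : x * factorial(x - 1); }

void main()
{
    long total;
    foreach (i; 0 .. 10_000)
        total += factorial(9 + (i & 1)); // argument varies per iteration,
                                         // so the call is not loop-invariant
    writeln(total); // observable output keeps the computation live
}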