Jump to page: 1 2 3
Thread overview
So why doesn't popFront return an element?
Apr 13, 2011
Andrej Mitrovic
Apr 13, 2011
Jesse Phillips
Apr 13, 2011
Jonathan M Davis
Apr 13, 2011
Andrej Mitrovic
Apr 14, 2011
bearophile
Apr 14, 2011
Jonathan M Davis
Apr 14, 2011
spir
Apr 14, 2011
Timon Gehr
Apr 14, 2011
Andrej Mitrovic
Apr 14, 2011
Jonathan M Davis
Apr 14, 2011
Timon Gehr
Apr 14, 2011
Timon Gehr
un-requested compiler optimisations
Apr 14, 2011
spir
Apr 14, 2011
spir
Apr 14, 2011
David Nadlinger
Apr 14, 2011
Timon Gehr
Apr 14, 2011
bearophile
Apr 14, 2011
Jonathan M Davis
April 13, 2011
I'm trying to understand the design of ranges. Why does popFront only set the front() property to return the next element in the range? Why not return the element in the call to popFront right away?

For example code like this (which doesn't work since popFront doesn't return):
void main()
{
    int[] a = [1, 2];
    auto b = a.popFront;
    assert(a == [2]);
    assert(b == 1);
}

Isn't it wasteful to have to call both popFront() and front() to simultaneously remove an element from a range and return it? I mean it's an extra function call, right?
April 13, 2011
Andrej Mitrovic Wrote:

> Isn't it wasteful to have to call both popFront() and front() to simultaneously remove an element from a range and return it? I mean it's an extra function call, right?

Isn't also a waste to return something that isn't going to be used? There are times it is important to just advance and either skip it or use the value in another location.

In either situation I think the overhead is minimal.
April 13, 2011
> I'm trying to understand the design of ranges. Why does popFront only set the front() property to return the next element in the range? Why not return the element in the call to popFront right away?
> 
> For example code like this (which doesn't work since popFront doesn't
> return): void main()
> {
> int[] a = [1, 2];
> auto b = a.popFront;
> assert(a == [2]);
> assert(b == 1);
> }
> 
> Isn't it wasteful to have to call both popFront() and front() to
> simultaneously remove an element from a range and return it? I mean it's
> an extra function call, right?

Often, it would be wasteful to have popFront return an element, since it's perfectly legitimate to call popFront without wanting to ever see front. If the type of front is a struct, then it would have to be copied which could cost more than a function call. And both front and popFront would typically be inlined anyway (at least as long as you compile with -inline), so it's unlikely that it really costs you a function call anyway.

Really, however, I believe that it's a matter of functionality. front and popFront do two very different things. One of them gives you the first element; the other removes the first element. You can call them both separately without wanting to do the other. There are plenty of cases where you want to call popFront multiple times without ever calling front, and there are plenty of cases where you want to call front without calling popFront. It _definitely_ would be a problem if front were removed in favor of having popFront return the front element, but even if you kept front and just made popFront return an element, it's conflating two separate operations into one function, which often is a bad idea.

In the case of popFront, I often call popFront without wanting to look at front immediately if at all, and in most cases where you _do_ want to look at front every time, you're probably using a foreach loop and not calling popFront directly anyway. So, making popFront return an element wouldn't help you much.

So, really, there's no need to make popFront return front, it's more likely to be _less_ efficient to do that, rather than more, and it's conflating two separate operations.

I see little to no benefit to making popFront return anything, and it could be detrimental for it to.

- Jonathan M Davis
April 13, 2011
Oh, I thought the compiler could optimize calls to popFront if I'm not assigning its value.

Doesn't a technique exist where if a function is called and I'm not assigning its result anywhere it should simply exit the function without returning anything? It seems like a valuable compiler optimization, if that is even possible.
April 14, 2011
Andrej Mitrovic:

> Oh, I thought the compiler could optimize calls to popFront if I'm not assigning its value.
> 
> Doesn't a technique exist where if a function is called and I'm not assigning its result anywhere it should simply exit the function without returning anything? It seems like a valuable compiler optimization, if that is even possible.

The function may have different calling points, including having its pointer taken and passed around. So you generally need the full compiled function, that generates the output too.

If the compiler is able to perform some kind of "whole program optimization" (today both GCC and LLVM are able to do some of it), and the compiler is able to infer that the result of popFront is never used in the whole program, then it might be able to remove the dead code that generates the output.

If the compiler inlines popFront at a call point, and in this call point the result is not used, all compilers are usually able to remove the useless computations of the result.

Some time ago I have proposed a __used_result boolean flag, usable inside functions. If inside a function you use this value (inside a "static if"), the function becomes a template, and it generally gets compiled two times in the final binary. If at a calling point the result of the function doesn't get used, the compiler calls the template instantiation where __used_result is false, so with the static if you are able to avoid computing the result.

So this program:

int count = 0;

int foo(int x) {
    count++;
    static if (__used_result) {
        int result = x * x;
        return result;
    }
}

void main() {
	foo(10);
	int y = foo(20);
}


Gets rewritten by the compiler to something like:

int count = 0;

auto foo(bool use_return)(int x) {
    count++;
    static if (use_return) {
        int result = x * x;
        return result;
    }
}

void main() {
    foo!false(10);
    int y = foo!true(20);
}


That produces:

_D4test11__T3fooVb0Z3fooFiZv	comdat
		push	EAX
		mov	ECX,FS:__tls_array
		mov	EDX,[ECX]
		inc	dword ptr _D4test5counti[EDX]
		pop	EAX
		ret

_D4test11__T3fooVb1Z3fooFiZi	comdat
		push	EAX
		mov	ECX,FS:__tls_array
		mov	EDX,[ECX]
		inc	dword ptr _D4test5counti[EDX]
		imul	EAX,EAX
		pop	ECX
		ret

Bye,
bearophile
April 14, 2011
> Oh, I thought the compiler could optimize calls to popFront if I'm not assigning its value.
> 
> Doesn't a technique exist where if a function is called and I'm not assigning its result anywhere it should simply exit the function without returning anything? It seems like a valuable compiler optimization, if that is even possible.

That would depend on the type. For anything other than structs, if the function is inlined, it should be able to do that (if it's not inlined, it can't, because the returning is done inside of the function call). For structs, it can probably do that if there's no postblit constructor and none of its member variables have postblit constructors (or have member variables of their own which have postblit constructors, etc.). If there's a postblit constructor, however, it would have to be pure, otherwise it could have side effects, and not doing the return could then alter the behavior of the program. However, if it's pure, it probably can. I don't know what the compiler does exactly.

Regardless, in the general case, the return value can't be optimized out, because that's done in the function call, not at the call site. Inlined functions should be able to get the optimization as long as it doesn't change the behavior of the program to do so, which could require that the postblit constructor is pure if a struct is being returned, and if you're returning an expression, then that expression is still going to have to be evaluated unless the compiler can determine that it has no side effects, which again, could require any functions involved to be pure.

So, don't bet on such an optimization happening. dmd might manage it in some cases, but in the most obvious cases, the cost of the return is pretty low anyway (since it's either a primitive type or a reference).

- Jonathan M Davis
April 14, 2011
On 04/14/2011 01:00 AM, Andrej Mitrovic wrote:
> I'm trying to understand the design of ranges. Why does popFront only set the front() property to return the next element in the range? Why not return the element in the call to popFront right away?
>
> For example code like this (which doesn't work since popFront doesn't return):
> void main()
> {
>      int[] a = [1, 2];
>      auto b = a.popFront;
>      assert(a == [2]);
>      assert(b == 1);
> }
>
> Isn't it wasteful to have to call both popFront() and front() to simultaneously remove an element from a range and return it? I mean it's an extra function call, right?

I like to have three members (even if not quite necessary, this cleanly separates notions). Why I don't understand is why empty and front are methods, not simple data members.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

April 14, 2011
> I'm trying to understand the design of ranges. Why does popFront only set the
front() property to return the next element in the range? Why not return the element in the call to popFront right away?
>
> For example code like this (which doesn't work since popFront doesn't return):
> void main()
> {
>     int[] a = [1, 2];
>     auto b = a.popFront;
>     assert(a == [2]);
>     assert(b == 1);
> }
>
> Isn't it wasteful to have to call both popFront() and front() to simultaneously
remove an element from a range and return it? I mean it's an extra function call, right?

Those ranges are intended for functional programming!
Most of the time, not returning anything means much much less work. Most ranges
you'll encounter are _lazily_ evaluated. Eg:
int[]a=[1,2,...];
auto b=map!foo(map!bar1(map!bar2(a));
auto c=chain(a,b);
auto d=zip(a,b);
auto e=zip(take(repeat(d),c.length));//note infinite range here...
//etc, you get the idea

Note that this code NEVER actually calls any of foo or bar1/bar2. (if this was not
so, the code would enter an infinite loop at repeat(d)) It keeps not calling them
when you call popFront(). (why would you want to do this anyways? I think it has
only few applications, one of them is the implementation of opApply or new lazy
range returning  functions).

When you now do e.front(), suddenly, a lot of stuff gets computed, so that you get
a look at the first element of the range, again you usually don't really want to
do this. It is more likely that you will call "take" or a fold/reduce on that range.







April 14, 2011
This leads me to another question I've always wanted to ask. A call such as:

auto b=map!foo(map!bar1(map!bar2(a));

This constructs a lazy range. What I'm wondering is if there are any performance issues when constructing long chains of ranges like that, since this basically routes one function to the next, which routes to the next, etc. E.g:

auto a = [1, 2];
auto b = retro(a);
auto c = retro(b);
auto d = retro(c);

Under the surface, I assume calling d.front would mean the following calls:
d.front()->c.back()->b.front()->a.back()

Or another example:
auto b = retro(retro(a));

Can the compiler optimize the second case and convert b.front to just do one field access?
April 14, 2011
On Thu, 14 Apr 2011 12:57:10 -0400, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:

> This leads me to another question I've always wanted to ask. A call such as:
>
> auto b=map!foo(map!bar1(map!bar2(a));
>
> This constructs a lazy range. What I'm wondering is if there are any
> performance issues when constructing long chains of ranges like that,
> since this basically routes one function to the next, which routes to
> the next

Of course.  Ranges are very dependent on inlining for their performance benefit.  Which means you are depending on the compiler inlining the code in order to achieve good performance.  The compiler doesn't always inline, and I'm not sure how deep it can go.

> Or another example:
> auto b = retro(retro(a));
>
> Can the compiler optimize the second case and convert b.front to just
> do one field access?

from std.range:

auto retro(Range)(Range r)
if (isBidirectionalRange!(Unqual!Range))
{
    // Check for retro(retro(r)) and just return r in that case
    static if (is(typeof(retro(r.source)) == Range))
    {
        return r.source;
    }
...

So yeah, it can be optimized somewhat.

-Steve
« First   ‹ Prev
1 2 3