pure functions cannot be removed, actually: pure vs. total - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » pure functions cannot be removed, actually: pure vs. total

Thread overview

pure functions cannot be removed, actually: pure vs. total
Jun 05, 2018 FeepingCreature
Jun 05, 2018 Stefan Koch
Jun 05, 2018 Steven Schveighoffer
Jun 05, 2018 FeepingCreature
Jun 05, 2018 Steven Schveighoffer
Jun 07, 2018 Johan Engelen
Jun 07, 2018 Jonathan M Davis
Jun 07, 2018 Steven Schveighoffer
Jun 07, 2018 Stefan Koch
Jun 07, 2018 Steven Schveighoffer
Jun 11, 2018 Timon Gehr
Jun 11, 2018 Timon Gehr
Jun 11, 2018 Steven Schveighoffer
Jun 11, 2018 Timon Gehr
Jun 11, 2018 Steven Schveighoffer

June 05, 2018

pure functions cannot be removed, actually: pure vs. total

Posted by FeepingCreature

FeepingCreature

I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return.

void foo() pure { while (true) { } }

By the way, this led to an amusing Phobos bug involving a pure function (not) returning a struct with an enum member: https://github.com/dlang/dmd/pull/8013#pullrequestreview-110250441 and assert(false.repeat.any == true); :)

When a strongly pure function is called multiple times with the same parameter, all but the first call can be removed; this is because if it was going to not return, it would already have inf-looped the first time. pure lets you go from n to 1, but not from 1 to 0.

A pure function that returns a value for every possible parameter is called a total function. Unfortunately, there is no way to enforce totality in the compiler, due to the halting problem.

June 05, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Stefan Koch
in reply to FeepingCreature

Stefan Koch

Posted in reply to FeepingCreature

On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
> I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return.
>
> [...]

In that instance you can have false negatives. catgorizing total functions an non total.

But no false positives afaics.

June 05, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Steven Schveighoffer
in reply to FeepingCreature

Steven Schveighoffer

Posted in reply to FeepingCreature

On 6/5/18 10:48 AM, FeepingCreature wrote:
> I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return.
> 
> void foo() pure { while (true) { } }
> 
> By the way, this led to an amusing Phobos bug involving a pure function (not) returning a struct with an enum member: https://github.com/dlang/dmd/pull/8013#pullrequestreview-110250441 and assert(false.repeat.any == true); :)
> 
> When a strongly pure function is called multiple times with the same parameter, all but the first call can be removed; this is because if it was going to not return, it would already have inf-looped the first time. pure lets you go from n to 1, but not from 1 to 0.
> 
> A pure function that returns a value for every possible parameter is called a total function. Unfortunately, there is no way to enforce totality in the compiler, due to the halting problem.
> 

I'll repeat what I said in the PR where you made this similar comment, I don't think it is important to ensure a pure function that never returns is always called. Can you explain the benefit of such a thing?

We've all heard of optimizers that reduce "code that does nothing" down to just a return statement, foiling people who are expecting benchmarks to run properly, why is this any different?

I'm asking because it seems like we give up a really easy optimization for pure functions for the sake of pure infinite loop programs, which I suppose have some value, but not much value. Getting around the infinite loop elision would be pretty simple, just return an int.

On the flip side, if the compiler can figure out via introspection that some template function is strong pure and returns void, therefore it doesn't need to call it, that's a much bigger win than preserving the possible infinite loopiness that likely is a bug anyway.

Another observation: under the "infinite loops are important observable behavior" world-view, pure functions cannot be lazily evaluated either:

pure int foo() { /*infinite loop */}

void main(string[] args)
{
   auto a = foo;
   writeln("hi");
   if(args[1] == "printa")
      writeln(a);
}

With some optimizers, this can be rewritten:

writeln("hi");
if(args[1] == "printa")
   writeln(foo);

Which if foo is a *normally returning* function and not an infinite loop one, then this can save cycles in certain cases.

But under your world-view, the optimization is invalid, because foo might have an infinite loop, and then the observable behavior changes (instead of printing nothing and infinite looping, "hi" is printed, and infinite loops).

-Steve

June 05, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by FeepingCreature
in reply to Steven Schveighoffer

FeepingCreature

Posted in reply to Steven Schveighoffer

On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:
> Another observation: under the "infinite loops are important observable behavior" world-view, pure functions cannot be lazily evaluated either:
>
> pure int foo() { /*infinite loop */}
>
> void main(string[] args)
> {
>    auto a = foo;
>    writeln("hi");
>    if(args[1] == "printa")
>       writeln(a);
> }
>
> With some optimizers, this can be rewritten:
>
> writeln("hi");
> if(args[1] == "printa")
>    writeln(foo);
>
> Which if foo is a *normally returning* function and not an infinite loop one, then this can save cycles in certain cases.
>
> But under your world-view, the optimization is invalid, because foo might have an infinite loop, and then the observable behavior changes (instead of printing nothing and infinite looping, "hi" is printed, and infinite loops).
>

That's correct, this optimization is invalid. The only optimization that can arise from foo being pure is *subsequent* calls to foo being removed.

> -Steve

On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:
> I'll repeat what I said in the PR where you made this similar comment, I don't think it is important to ensure a pure function that never returns is always called. Can you explain the benefit of such a thing?
>
> We've all heard of optimizers that reduce "code that does nothing" down to just a return statement, foiling people who are expecting benchmarks to run properly, why is this any different?
>

Frankly speaking, we should not implement optimizations merely on the basis that we cannot immediately think of a case where they fail. For instance, a practical function that loops forever would be a find() call on an infinite range, such as a range returned by .repeat or .generate.

June 05, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Steven Schveighoffer
in reply to FeepingCreature

Steven Schveighoffer

Posted in reply to FeepingCreature

On 6/5/18 5:03 PM, FeepingCreature wrote:
> On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:
>> Another observation: under the "infinite loops are important observable behavior" world-view, pure functions cannot be lazily evaluated either:
>>
>> pure int foo() { /*infinite loop */}
>>
>> void main(string[] args)
>> {
>>    auto a = foo;
>>    writeln("hi");
>>    if(args[1] == "printa")
>>       writeln(a);
>> }
>>
>> With some optimizers, this can be rewritten:
>>
>> writeln("hi");
>> if(args[1] == "printa")
>>    writeln(foo);
>>
>> Which if foo is a *normally returning* function and not an infinite loop one, then this can save cycles in certain cases.
>>
>> But under your world-view, the optimization is invalid, because foo might have an infinite loop, and then the observable behavior changes (instead of printing nothing and infinite looping, "hi" is printed, and infinite loops).
>>
> 
> That's correct, this optimization is invalid. The only optimization that can arise from foo being pure is *subsequent* calls to foo being removed.

I think Haskell would disagree with you: https://wiki.haskell.org/Lazy_evaluation

> On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:
>> I'll repeat what I said in the PR where you made this similar comment, I don't think it is important to ensure a pure function that never returns is always called. Can you explain the benefit of such a thing?
>>
>> We've all heard of optimizers that reduce "code that does nothing" down to just a return statement, foiling people who are expecting benchmarks to run properly, why is this any different?
>>
> 
> Frankly speaking, we should not implement optimizations merely on the basis that we cannot immediately think of a case where they fail. For instance, a practical function that loops forever would be a find() call on an infinite range, such as a range returned by .repeat or .generate.

But a call to find doesn't "do nothing". It takes a range and returns a range.

We are specifically talking about strong-pure functions that return void, or even strong pure functions whose return value is ignored.

And yes, we can actually prove that calls to pure functions do nothing based on the rules of pure functions, which is why the optimization is easy to prove correct. It's one of the reasons pure optimizations are much easier to reason about.

However, if we have a wrinkle of "we have to make sure infinite loops execute their thing", then many pure optimizations get thrown out the window.

-Steve

June 07, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Johan Engelen
in reply to FeepingCreature

Johan Engelen

Posted in reply to FeepingCreature

On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
> I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return.
>
> void foo() pure { while (true) { } }

This is not a valid program, in my opinion. (Still only my opinion, because I have not found it in the D spec, so needs adding.)
A valid C++ program must make progress in each thread. C++ spec states:
"The implementation may assume that any thread will eventually do one of the following:
- terminate,
- make a call to a library I/O function,
- access or modify a volatile object, or
- perform a synchronization operation or an atomic operation."

-Johan

June 07, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Jonathan M Davis
in reply to Johan Engelen

Jonathan M Davis

Posted in reply to Johan Engelen

On Thursday, June 07, 2018 20:14:19 Johan Engelen via Digitalmars-d wrote:
> On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
> > I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return.
> >
> > void foo() pure { while (true) { } }
>
> This is not a valid program, in my opinion. (Still only my
> opinion, because I have not found it in the D spec, so needs
> adding.)
> A valid C++ program must make progress in each thread. C++ spec
> states:
> "The implementation may assume that any thread will eventually do
> one of the following:
> - terminate,
> - make a call to a library I/O function,
> - access or modify a volatile object, or
> - perform a synchronization operation or an atomic operation."

I would be _very_ surprised if the D spec said anything like this simply because it doesn't do a lot of talking about conceptual stuff like this and talks a lot more about what the compiler actually does (or is supposed to do) - though that certainly doesn't mean that the spec shouldn't say something like that, and Walter has previously stated that anything that needs to be in the spec but isn't should be reported in bugzilla. So, if you have anything like this that you really think should be in the spec, please report it in bugzilla.

Regardless, I see zero value in supporting something like

void foo() pure { while(true) {} }

The compiler doesn't actually give an error when using a function like this (though it does give an error when calling a strongly pure function which returns a value, and the return value isn't used), but I could easily see it doing so on the grounds that the function cannot possible be doing any actual work. And I see _zero_ reason to support something like this just so that you can run a loop or do other work that gets thrown away. I suppose that it could be argued that it would be useful for benchmarks, but there are better ways to handle that that don't require allowing functions that clearly do no real work.

- Jonathan M Davis

June 07, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Steven Schveighoffer
in reply to Jonathan M Davis

Steven Schveighoffer

Posted in reply to Jonathan M Davis

On 6/7/18 5:04 PM, Jonathan M Davis wrote:
> On Thursday, June 07, 2018 20:14:19 Johan Engelen via Digitalmars-d wrote:
>> On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
>>> I'm just posting to clear up the misunderstanding that a call
>>> to a pure function can be removed. Actually, even calls to
>>> strongly pure functions cannot always be removed. This is
>>> because there is one thing that a pure function can do that
>>> will change program behavior if it is removed, even if it does
>>> not change any global state whatsoever: it can simply never
>>> return.
>>>
>>> void foo() pure { while (true) { } }
>>
>> This is not a valid program, in my opinion. (Still only my
>> opinion, because I have not found it in the D spec, so needs
>> adding.)
>> A valid C++ program must make progress in each thread. C++ spec
>> states:
>> "The implementation may assume that any thread will eventually do
>> one of the following:
>> - terminate,
>> - make a call to a library I/O function,
>> - access or modify a volatile object, or
>> - perform a synchronization operation or an atomic operation."
> 
> I would be _very_ surprised if the D spec said anything like this simply
> because it doesn't do a lot of talking about conceptual stuff like this and
> talks a lot more about what the compiler actually does (or is supposed to
> do) - though that certainly doesn't mean that the spec shouldn't say
> something like that, and Walter has previously stated that anything that
> needs to be in the spec but isn't should be reported in bugzilla. So, if you
> have anything like this that you really think should be in the spec, please
> report it in bugzilla.
> 
> Regardless, I see zero value in supporting something like
> 
> void foo() pure { while(true) {} }

OK, so FeepingCreature has come up with an interesting case that's *similar* to this, and I think it makes things a little less cut and dry.

Let's say you have an infinite range that returns a single value. But instead of that value being stored in the range itself, it's an enum.

Maybe something like this:

struct Range
{
   enum empty = false;
   enum front = true;
   void popFront() pure nothrow {}
}

Now, let's say you ran find on it:

auto result = Range.init.find(false);

This *should* do an infinite loop. If it returned, it means that find has found where `false` is!

But let's dissect the call to `find`:

* It's a template, and everything that it calls, given this range is pure nothrow, so it is inferred pure nothrow.
* It accepts a type that has no mutable data.
* It returns a type that has no mutable data.
* The return type can be proven to not depend on the input.
* Therefore, it is a strong-pure function.

So logically, the compiler could assume if it thinks infinite loops are invalid, that it must return. Since it doesn't really need to call the function to see what it returns, it can just replace the code with:

auto result = Range.init;

And here, we have found `false` in an infinite range that only contains `true`.

So the danger of assuming no infinite loops is that some convoluted case like true.repeat.find(false) might incorrectly return something. Note that the repeat call wouldn't cause this problem, since it stores a mutable element inside it.

I still think that for void-returning functions we can simply assume to not do anything. Literally, the only thing it can do is return or infinite loop.

But I'm definitely not as sure as I was when I first responded to this thread.

-Steve

June 07, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Stefan Koch
in reply to Steven Schveighoffer

Stefan Koch

Posted in reply to Steven Schveighoffer

On Thursday, 7 June 2018 at 22:23:09 UTC, Steven Schveighoffer wrote:
> {...}

That a function could return does not mean it will.

int fn (int arg) /*strongly*/ pure
{
  if (arg == 42)
    return 42;
  else (arg < 43)
    return fn(--arg);
  else
    return fn(++arg);
}

do you see the problem?
If not you would you expect the compiler to ?

note: I would expected that the clang static analyzer
would be able to determine that depending on the value of arg
this function might never return.

June 07, 2018

Re: pure functions cannot be removed, actually: pure vs. total

Posted by Steven Schveighoffer
in reply to Stefan Koch

Steven Schveighoffer

Posted in reply to Stefan Koch

On 6/7/18 6:57 PM, Stefan Koch wrote:
> On Thursday, 7 June 2018 at 22:23:09 UTC, Steven Schveighoffer wrote:
>> {...}
> 
> That a function could return does not mean it will.
> 
> int fn (int arg) /*strongly*/ pure
> {
>    if (arg == 42)
>      return 42;
>    else (arg < 43)
>      return fn(--arg);
>    else
>      return fn(++arg);
> }
> 
> do you see the problem?
> If not you would you expect the compiler to ?
> 
> note: I would expected that the clang static analyzer
> would be able to determine that depending on the value of arg
> this function might never return.

I'm not talking about the compiler looking at the code to figure out if it will return.

I'm talking about this:

struct S
{
}

S fn() pure;

No matter what is inside fn, it will return S.init, if it returns. The question to answer is, can we assume all functions don't enter infinite loops, and if we can assume this, then why couldn't the compiler simply replace a call to fn with S.init?

And the case FeepingCreature gives is pretty compelling, because it's a real example, not a toy example.

-Steve

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation