Thread overview
D UFCS anti-pattern
Apr 25, 2014
Jonathan M Davis
Apr 25, 2014
monarch_dodra
Apr 25, 2014
Andrej Mitrovic
Apr 25, 2014
Brian Schott
Apr 25, 2014
monarch_dodra
Apr 25, 2014
Dmitry Olshansky
April 25, 2014
Recently, I observed a conversation happening on the github pull request system.

In phobos, we have the notion of output ranges. One is allowed to output to an output range by calling the function 'put'.

Here is the implementation of put:

void put(R, E)(ref R r, E e)
{
    static if(is(PointerTarget!R == struct))
        enum usingPut = hasMember!(PointerTarget!R, "put");
    else
        enum usingPut = hasMember!(R, "put");

    enum usingFront = !usingPut && isInputRange!R;
    enum usingCall = !usingPut && !usingFront;

    static if (usingPut && is(typeof(r.put(e))))
    {
        r.put(e);
    }
    else static if (usingPut && is(typeof(r.put((E[]).init))))
    {
        r.put((&e)[0..1]);
    }
    else static if (usingFront && is(typeof(r.front = e, r.popFront())))
    {
        r.front = e;
        r.popFront();
    }
    else static if ((usingPut || usingFront) && isInputRange!E && is(typeof(put(r, e.front))))
    {
        for (; !e.empty; e.popFront()) put(r, e.front);
    }
    else static if (usingCall && is(typeof(r(e))))
    {
        r(e);
    }
    else static if (usingCall && is(typeof(r((E[]).init))))
    {
        r((&e)[0..1]);
    }
    else
    {
        static assert(false,
                "Cannot put a "~E.stringof~" into a "~R.stringof);
    }
}

There is an interesting issue here -- put can basically be overridden by a member function of the output range, also named put. I will note that this function was designed and written before UFCS came into existence. So most of the machinery here is designed to detect whether a 'put' member function exists.

One nice thing about UFCS, now any range that has a writable front(), can put any other range whose elements can be put into front, via the pseudo-method put.

In other words:

void foo(int[] arr)
{
   int[] result = new int[arr.length];
   result.put(arr); // put arr into result.
}

But there is an issue with this. If the destination range actually implements the put member function, but doesn't implement all of the global function's niceties,
r.put(...) is not as powerful/useful as put(r,...). Therefore, the odd recommendation is to *always* call put(r,...)

I find this, at the very least, to be confusing. Here is a case where UFCS ironically is not usable via a function call that so obviously should be UFCS.

The anti-pattern here is using member functions to override or specialize UFCS behavior. In this case, we even hook the UFCS call with the member function, encouraging the name conflict!

As a possible solution, I would recommend simply change the name of the hook, and have the UFCS function forward to the hook. This way, calling put(r,...) and r.put(...) is always consistent.

Does this make sense? Anyone have any other possible solutions?

A relevant bug report (where I actually advocate for adding more of this horrible behavior): https://issues.dlang.org/show_bug.cgi?id=12583

-Steve
April 25, 2014
On Thu, 24 Apr 2014 22:21:32 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d@puremagic.com>
wrote:

> Recently, I observed a conversation happening on the github pull request system.
> 
> In phobos, we have the notion of output ranges. One is allowed to output to an output range by calling the function 'put'.
> 
> Here is the implementation of put:
> 
> void put(R, E)(ref R r, E e)
> {
> static if(is(PointerTarget!R == struct))
> enum usingPut = hasMember!(PointerTarget!R, "put");
> else
> enum usingPut = hasMember!(R, "put");
> 
> enum usingFront = !usingPut && isInputRange!R;
> enum usingCall = !usingPut && !usingFront;
> 
> static if (usingPut && is(typeof(r.put(e))))
> {
> r.put(e);
> }
> else static if (usingPut && is(typeof(r.put((E[]).init))))
> {
> r.put((&e)[0..1]);
> }
> else static if (usingFront && is(typeof(r.front = e,
> r.popFront()))) {
> r.front = e;
> r.popFront();
> }
> else static if ((usingPut || usingFront) && isInputRange!E &&
> is(typeof(put(r, e.front))))
> {
> for (; !e.empty; e.popFront()) put(r, e.front);
> }
> else static if (usingCall && is(typeof(r(e))))
> {
> r(e);
> }
> else static if (usingCall && is(typeof(r((E[]).init))))
> {
> r((&e)[0..1]);
> }
> else
> {
> static assert(false,
> "Cannot put a "~E.stringof~" into a "~R.stringof);
> }
> }
> 
> There is an interesting issue here -- put can basically be overridden by a member function of the output range, also named put. I will note that this function was designed and written before UFCS came into existence. So most of the machinery here is designed to detect whether a 'put' member function exists.
> 
> One nice thing about UFCS, now any range that has a writable front(), can put any other range whose elements can be put into front, via the pseudo-method put.
> 
> In other words:
> 
> void foo(int[] arr)
> {
> int[] result = new int[arr.length];
> result.put(arr); // put arr into result.
> }
> 
> But there is an issue with this. If the destination range actually
> implements the put member function, but doesn't implement all of the
> global function's niceties,
> r.put(...) is not as powerful/useful as put(r,...). Therefore, the
> odd recommendation is to *always* call put(r,...)
> 
> I find this, at the very least, to be confusing. Here is a case where UFCS ironically is not usable via a function call that so obviously should be UFCS.
> 
> The anti-pattern here is using member functions to override or specialize UFCS behavior. In this case, we even hook the UFCS call with the member function, encouraging the name conflict!
> 
> As a possible solution, I would recommend simply change the name of
> the hook, and have the UFCS function forward to the hook. This way,
> calling put(r,...) and r.put(...) is always consistent.
> 
> Does this make sense? Anyone have any other possible solutions?
> 
> A relevant bug report (where I actually advocate for adding more of this horrible behavior): https://issues.dlang.org/show_bug.cgi?id=12583

If it doesn't work to override a free function with a member
function, I honestly don't see much point to UFCS. The whole idea
behind it is to make it so that you don't have to care whether a
function is a free function or a member function. The current situation
essentially forces you to not use UFCS except in cases where you're
trying to add "member functions" to built-in types. And as such,
calling functions on user-defined types using UFCS runs a high risk of
not compiling, because all it takes is for the user-defined type to
define a function with the same name - even if it takes completely
different arguments - and now the compiler won't even try to use the
free function anymore.

I really think that we should fix it so that stuff like outputRange.put(foo) works - including when types define put themselves. AFAIK, that means changing the overload rules so that member functions conflict with free functions only when they take the same arguments - in which case the member function would be called, as it is now, except that the cases where a free function matches the arguments would also work, allowing us to override free functions with member functions where appropriate and prevent simple name collisions from making UFCS not work (i.e. when the member function takes completely different arguments, UFCS would still use the free function). Without a change along those lines, I'd be strongly inclined to argue against using UFCS in any situation except in those where you need to add "member functions" to the built-in types. And the only common case for that that I'm aware of is making it so that arrays can function as ranges.

But this issue goes far beyond put.

- Jonathan M Davis
April 25, 2014
On Friday, 25 April 2014 at 02:21:32 UTC, Steven Schveighoffer wrote:
> A relevant bug report (where I actually advocate for adding more of this horrible behavior): https://issues.dlang.org/show_bug.cgi?id=12583
>
> -Steve

See also: https://issues.dlang.org/show_bug.cgi?id=9074
Appender!string x;
x.put(repeat(" ").take(4)); //fails
put(x, repeat(" ").take(4)); //works
April 25, 2014
On 4/25/14, Steven Schveighoffer via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> Recently, I observed a conversation happening on the github pull request system.

Another case of hijacking is the std.conv.text function. I've seen two people so far that have accidentally called the global text function by mistake, e.g.:

-----
import std.conv;
import std.stdio;

class C
{
    this(string user, string text) { _user = user; _text = text; }
    string _user;
    string _text;
}

void main()
{
    auto c = new C("John", "this is my message");
    string input = c.text;
    writeln(input);  // test.C, oops!
}
-----

UFCS has its benefits but it also has its drawbacks. And yet I wouldn't want to ban the above code, but maybe we could figure out some system for protecting against UFCS calls in some contexts. The way I worked around the common `.text` pattern was to define a @disable'd field in the class, which would override the global text function and trigger a compilation error.
April 25, 2014
On Friday, 25 April 2014 at 07:52:20 UTC, Andrej Mitrovic via Digitalmars-d wrote:
> Another case of hijacking is the std.conv.text function. I've seen two
> people so far that have accidentally called the global text function
> by mistake, e.g.:

Make that three. The ability to call non-@property functions without parenthesis is another of my favorite anti-features.
April 25, 2014
25-Apr-2014 06:21, Steven Schveighoffer пишет:
> Recently, I observed a conversation happening on the github pull request
> system.
>
> In phobos, we have the notion of output ranges. One is allowed to output
> to an output range by calling the function 'put'.
>
[snip]
>
> The anti-pattern here is using member functions to override or
> specialize UFCS behavior. In this case, we even hook the UFCS call with
> the member function, encouraging the name conflict!
>
> As a possible solution, I would recommend simply change the name of the
> hook, and have the UFCS function forward to the hook. This way, calling
> put(r,...) and r.put(...) is always consistent.

+1



-- 
Dmitry Olshansky
April 25, 2014
On Fri, 25 Apr 2014 00:58:51 -0400, Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

>
> If it doesn't work to override a free function with a member
> function, I honestly don't see much point to UFCS. The whole idea
> behind it is to make it so that you don't have to care whether a
> function is a free function or a member function. The current situation
> essentially forces you to not use UFCS except in cases where you're
> trying to add "member functions" to built-in types. And as such,
> calling functions on user-defined types using UFCS runs a high risk of
> not compiling, because all it takes is for the user-defined type to
> define a function with the same name - even if it takes completely
> different arguments - and now the compiler won't even try to use the
> free function anymore.

Right but the pattern is:

someGlobal(T t, U u)
{
   static if(is(typeof(t.someGlobal(u))))
     t.someGlobal(u);
   else
     // use default implementation.
}

Essentially, someGlobal is encouraging not only overriding itself for a type, but for only a PORTION of the implementation.

If you override put completely, allowing all the mechanisms put uses, that is absolutely fine. But for only implementing a single part, the hook put uses to interface with the type should NOT be named put. This means the member put is far less functional than the global function put.

What I'm saying is that the hook and the global function should not be named the same. Not that we should change the override rules.

> I really think that we should fix it so that stuff like
> outputRange.put(foo) works - including when types define put
> themselves. AFAIK, that means changing the overload rules so that
> member functions conflict with free functions only when they take the
> same arguments - in which case the member function would be called, as
> it is now, except that the cases where a free function matches the
> arguments would also work, allowing us to override free functions with
> member functions where appropriate and prevent simple name collisions
> from making UFCS not work (i.e. when the member function takes
> completely different arguments, UFCS would still use the free
> function). Without a change along those lines, I'd be strongly inclined
> to argue against using UFCS in any situation except in those where you
> need to add "member functions" to the built-in types. And the only
> common case for that that I'm aware of is making it so that arrays can
> function as ranges.

I don't think this is a good idea. A type first and foremost is in charge of its API.

Another possible option is to make sure put(R, ...) is as limited as the member R.put(...). This means, if R.put doesn't support the parameters, put(R, ...) should also reject them.

This at least is consistent, but I think code will break for not much of a good reason.

-Steve
April 25, 2014
On Fri, 25 Apr 2014 03:52:09 -0400, Andrej Mitrovic via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On 4/25/14, Steven Schveighoffer via Digitalmars-d
> <digitalmars-d@puremagic.com> wrote:
>> Recently, I observed a conversation happening on the github pull request
>> system.
>
> Another case of hijacking is the std.conv.text function. I've seen two
> people so far that have accidentally called the global text function
> by mistake, e.g.:
>
> -----
> import std.conv;
> import std.stdio;
>
> class C
> {
>     this(string user, string text) { _user = user; _text = text; }
>     string _user;
>     string _text;
> }
>
> void main()
> {
>     auto c = new C("John", "this is my message");
>     string input = c.text;
>     writeln(input);  // test.C, oops!
> }
> -----
>
> UFCS has its benefits but it also has its drawbacks. And yet I
> wouldn't want to ban the above code, but maybe we could figure out
> some system for protecting against UFCS calls in some contexts. The
> way I worked around the common `.text` pattern was to define a
> @disable'd field in the class, which would override the global text
> function and trigger a compilation error.

I think this really comes down to a poorly named function. textOf, toText, textify even, are better names that wouldn't cause this problem.

-Steve
April 25, 2014
On Friday, 25 April 2014 at 12:21:04 UTC, Steven Schveighoffer wrote:
> I think this really comes down to a poorly named function. textOf, toText, textify even, are better names that wouldn't cause this problem.
>
> -Steve

Well, in this case, yes, because it's a noun. For all functions that are verbs the issue still stands.