auto ref escaping local variable

January 24, 2017
Re: auto ref escaping local variable
Posted by Jonathan M Davis
in reply to Ali Çehreli
On Tuesday, January 24, 2017 11:16:21 Ali Çehreli via Digitalmars-d wrote:
> On 01/24/2017 02:03 AM, Jonathan M Davis via Digitalmars-d wrote:
>  > On Tuesday, January 24, 2017 00:47:31 Ali Çehreli via Digitalmars-d

> Obviously, I know all of that and they are pretty complicated for new programmers.
>
> I just can't imagine what the semantics of a function could be. Do you have an example? So, we're talking about a function that will mutate its argument but the caller sometimes doesn't care. Oh, this sounds like functions from the C era, which take null when the caller does not care.
>
> So, is this the guideline? "Make the argument 'auto ref' when you have something to return in addition to the return value." If so, it's sub-optimal because the 'auto ref' doesn't have the opportunity of bypassing operations like the C function could:
>
>      if (arg) {
>          // Do expensive operation
>      }
>
> If I guessed the semantics right, non-const 'auto ref' does not have that luxury.

In general, I think that the guideline is to not bother with ref at all if you're not explicitly trying to get a value back. If you have a struct that's expensive enough to copy around that you need ref, then maybe it shouldn't be a struct on the stack. And if you care about optimizing stuff enough to use ref to avoid copies, then you should understand it well enough to understand the consequences of using it.

In general, I think that the place for auto ref is when you're trying to forward ref-ness like when you're wrapping a range, and you want the ref-ness of the wrapped front to be passed on to the wrapper range _if_ it returns by ref, but you don't want it to be ref if it's not ref, and you don't want to do a bunch of static ifs to make it work for both.
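As a rough sketch of that forwarding case (Stored, Counter, Wrapper and isLvalue are names made up for this illustration, not anything from Phobos), the wrapper below declares front as auto ref, so it returns by ref only when the wrapped front does:

```d
// Hypothetical source whose front refers to stored data (an lvalue).
struct Stored
{
    int[] data;
    bool empty() { return data.length == 0; }
    ref int front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Hypothetical source whose front is computed on the fly (an rvalue).
struct Counter
{
    int n;
    bool empty() { return false; }
    int front() { return n; }
    void popFront() { ++n; }
}

// Hypothetical wrapper range that passes the ref-ness of front along.
struct Wrapper(R)
{
    R source;
    bool empty() { return source.empty(); }
    void popFront() { source.popFront(); }
    // By ref if source.front() is an lvalue, by value otherwise.
    auto ref front() { return source.front(); }
}

// Helper for the demonstration: true if the argument arrived as an lvalue.
bool isLvalue(T)(auto ref T value) { return __traits(isRef, value); }

void main()
{
    auto arr = [1, 2, 3];
    auto stored = Wrapper!Stored(Stored(arr));
    assert(isLvalue(stored.front()));
    stored.front() = 42;            // writes through to the wrapped data
    assert(arr[0] == 42);

    auto counter = Wrapper!Counter(Counter(5));
    assert(!isLvalue(counter.front()));
    assert(counter.front() == 5);
}
```

Without auto ref, the wrapper would need a static if on whether the wrapped front returns by ref and two separate declarations of front.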

And if you're looking to have a function that accepts both rvalues and lvalues, auto ref makes sense so long as the function doesn't mutate its arguments. Adding const is nice in that it then guarantees that it doesn't, but it's so restrictive that it usually makes no sense for generic code (at least not if it's dealing with arbitrary types as opposed to a specific group of known types like all integer types). So, using auto ref in place of const& in C++ makes sense so long as you're willing to be careful about not mutating the argument, and const auto ref _can_ make sense, but it's restrictive enough that it probably doesn't.
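A minimal sketch of the "accept both lvalues and rvalues" case, assuming a made-up Point type and made-up function names. Note that auto ref parameters only work on templated functions, and that nothing here stops the function from mutating its argument other than the author being careful:

```d
struct Point { int x, y; }   // hypothetical type for illustration

// Helper for the demonstration: true if the argument was bound by ref.
bool passedByRef(T)(auto ref T value) { return __traits(isRef, value); }

// Reads its argument only. Lvalues are bound by ref (no copy),
// rvalues are taken by value.
int manhattan(T)(auto ref T p)
{
    import std.math : abs;
    return abs(p.x) + abs(p.y);
}

void main()
{
    auto p = Point(3, -4);
    assert(manhattan(p) == 7);            // lvalue
    assert(manhattan(Point(1, 1)) == 2);  // rvalue
    assert(passedByRef(p));
    assert(!passedByRef(Point(0, 0)));
}
```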

And as you indicated, using auto ref with a function that either might or will mutate its argument is likely to be rarely useful. It makes sense when you're looking to pass ref-ness along (which makes sense in some generic code but most code isn't going to want to do that), and it makes sense if you're paranoid about unnecessary copies and are willing to force the caller to make a copy if they don't want their variable mutated when it's passed in (since then copying only happens if the caller makes it happen), but that puts an unusual and arguably error-prone burden on the caller. So, in general, I would expect that auto ref would be used when either passing along refness or when the programmer wanted an equivalent to const& and was willing to risk mutation occurring by accident.
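To illustrate the burden on the caller, here is a small sketch with a made-up Buffer type and a made-up bumpAndSum function that mutates its auto ref parameter. The caller has to copy explicitly if the original must stay untouched:

```d
struct Buffer { int[4] data; }   // hypothetical type for illustration

// Hypothetical function: increments every element, then returns the sum.
// With an lvalue argument, the caller's variable is what gets mutated.
int bumpAndSum(T)(auto ref T buf)
{
    int total = 0;
    foreach (ref v; buf.data)
    {
        ++v;
        total += v;
    }
    return total;
}

void main()
{
    auto original = Buffer([1, 2, 3, 4]);

    // The caller pays for a copy only because it asked for one.
    auto scratch = original;
    assert(bumpAndSum(scratch) == 14);
    assert(scratch.data == [2, 3, 4, 5]);   // the copy was mutated
    assert(original.data == [1, 2, 3, 4]);  // the original is untouched

    // An rvalue works too; the mutation is simply discarded.
    assert(bumpAndSum(Buffer([0, 0, 0, 0])) == 4);
}
```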

Skipping auto ref and manually overloading the function like you were suggesting doesn't fix any of these complications though. It just makes them more explicit (and thus possibly more clear to the programmer if they don't understand auto ref enough), and it makes it so that the functions can be virtual. Aside from when you need to pass on ref-ness, what you want in principle is auto ref const, but const is just too restrictive to work in the general case. So, I can't possibly recommend to anyone that they start slapping const on function parameters by default, auto ref or not.
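For comparison, this is roughly what writing the ref and non-ref versions by hand looks like (Shape, Circle and accept are made-up names). Since these are ordinary functions rather than templates, they can be virtual and a derived class can override them:

```d
struct Point { int x, y; }   // hypothetical type for illustration

class Shape
{
    // Chosen for lvalue arguments.
    string accept(ref Point p) { return "lvalue"; }
    // Chosen for rvalue arguments.
    string accept(Point p) { return "rvalue"; }
}

class Circle : Shape
{
    override string accept(ref Point p) { return "circle lvalue"; }
    override string accept(Point p) { return "circle rvalue"; }
}

void main()
{
    Shape s = new Circle;
    auto p = Point(1, 2);
    assert(s.accept(p) == "circle lvalue");
    assert(s.accept(Point(3, 4)) == "circle rvalue");
}
```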

But all of these complications are part of why I would simply recommend _not_ using ref unless you specifically _want_ the argument to be mutated and that's part of the function's API or if you know what you're doing and know that you need to avoid the cost of the copy. And just don't have structs that are expensive to copy. Phobos already tends to assume that - especially for ranges. And it's so incredibly easy to accidentally copy an object if you mess up with ref that relying on getting ref right doesn't seem like a great solution in general. Also, D has move semantics built into the language, making it so that expensive copying is not as big a problem in D as it is in C++ (particularly C++98).
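A small sketch of the move semantics point, using a made-up Tracked struct that counts how often its postblit constructor runs. Passing an rvalue by value moves it, so only the lvalue call pays for a copy:

```d
// Hypothetical struct that counts postblit (copy) calls.
struct Tracked
{
    static int copies;
    int value;
    this(this) { ++copies; }
}

Tracked make(int v) { return Tracked(v); }
int consume(Tracked t) { return t.value; }

void main()
{
    // An rvalue is moved into the parameter, so no postblit runs.
    assert(consume(make(7)) == 7);
    assert(Tracked.copies == 0);

    // Initializing from an rvalue does not copy either.
    auto t = make(1);
    assert(Tracked.copies == 0);

    // Passing an lvalue by value is what triggers the copy.
    assert(consume(t) == 1);
    assert(Tracked.copies == 1);
}
```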

So, I would start by just not using ref, and if profiling indicated that I had a struct that was too expensive to be copying around, I would then look at either putting it on the heap and avoiding the whole problem or using ref and auto ref to avoid copies, but then I'm taking upon myself the burden of making sure that I get ref right enough that I don't end up with unintended copies too frequently.

> const is still engrained in my programming mind due to long exposure to C and C++. I guess D is proving that it's not that essential to be const-correct. This is similar to how private is not as strong and in some cases public is the default.

const is great in principle, but it is so restrictive in D as to be borderline useless. If you're just dealing with built-in types, it works reasonably well. But as soon as you have user-defined types and indirections, then life gets disgusting fast. Postblit constructors don't work with const. Ranges don't work with const. const tends to be viral in that once something is const, you can't get anything non-const out of it, and it's difficult to do anything like tail-const outside of arrays, which the language understands well enough that it makes tail-const work for them. Ref-counting doesn't work with const. If your container is const (or your reference to the container is const), it's going to be really hard to get a range over that container - even more so if you want the range to detect when the container is mutated out from under it and protect you like an iterator would in Java. The list of stuff that doesn't work with const just piles up as your program becomes more complicated.
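Two of those points can be sketched in a few lines, assuming a made-up Countdown range: a const range can't be iterated because popFront has to mutate it, whereas a const array converts to a tail-const slice that can still be advanced:

```d
import std.range.primitives : isInputRange;

// Hypothetical range for illustration.
struct Countdown
{
    int n;
    @property bool empty() const { return n == 0; }
    @property int front() const { return n; }
    void popFront() { --n; }   // has to mutate, so it cannot be const
}

static assert(isInputRange!Countdown);
static assert(!isInputRange!(const Countdown));  // const kills the range

void main()
{
    // Arrays are the exception: the language supports tail-const for them.
    const int[] arr = [1, 2, 3];
    const(int)[] tail = arr;    // mutable slice of const elements
    tail = tail[1 .. $];
    assert(tail == [2, 3]);
    assert(arr.length == 3);

    const Countdown c = Countdown(3);
    static assert(!__traits(compiles, c.popFront()));

    // This copy only works because Countdown has no indirections.
    Countdown m = c;
    m.popFront();
    assert(m.front == 2);
}
```

A user-defined range has no built-in equivalent of the const(int)[] conversion, so it can't turn a const version of itself into something iterable.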

In theory, we might be able to fix some of these problems - like if we could figure out how to get postblit constructors to work with const or if we could figure out some way for a range to indicate how it could be converted to a tail-const variant of itself - but once you have transitive const, it locks everything down way more than occurs in C++, and I think that a number of the problems with const are simply insurmountable as long as const has no backdoors. You're basically getting immutable but without the benefits.

So, I'm all for using const where it works, but I'm not at all in a hurry to slap it on anything where the types aren't well-known, and it's the sort of thing that I expect to have to be removed at some point if I start using it on user-defined types. And if you're using ranges, then const pretty much goes out the window right there. So, while I would love to be able to use const more (in fact, the whole reason that I started off with D2 back in 2008 rather than D1 was because D2 had const, and D1 didn't), experience has shown that D's const is simply too restrictive to be useful in the general case. If you try very hard to use it, you can use it, but odds are that you're simply not going to be able to use it on most code, and I'm almost to the point that I simply wouldn't bother with it aside from making local variables of built-in types const. I do still try and use it on member functions where it's clear that it will work, but increasingly, I just don't bother.

- Jonathan M Davis