"Best" way of passing in a big struct to a function?
October 10, 2012
TL;DR: what should I use if I need C++'s const& for a param?

Long version: I have a big struct provided by a library and I'm trying to pass instances of it to a function that will never need to modify the passed in value. Naturally I want to pass it efficiently, without incurring a copy. I know that I can use "const ref" in D, but is this the preferred way of doing it? How about "in ref"? Or something else?

Related background: I'm a D newbie, I've read TDPL and I loved it and I'm now working on a Markdown processor as a D learning exercise. I've written hundreds of thousands of C++ LOC and this is the perspective from which I look at D (and I love what I see).
October 10, 2012
Oh, and a related question: what is the best way to pass in an associative array like CustomStruct[string]? I can't say I'm too clear on how AA's are managed/implemented. Do they have value semantics or reference semantics? What about lists?

October 10, 2012
On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
> Oh, and a related question: what is the best way to pass in an associative array like CustomStruct[string]? I can't say I'm too clear on how AA's are managed/implemented. Do they have value semantics or reference semantics? What about lists?

Ok, feel free to disregard this question; I just checked TDPL (should have done that first) and it clearly says that AA's follow reference semantics. Dynamic arrays passed to functions actually pass in a light-weight object referring to the same underlying data (I'm guessing that a dynamic array is internally "nothing more" than a struct holding a pointer and a length, right?).


October 10, 2012
On Wednesday, October 10, 2012 06:27:52 Val Markovic wrote:
> TL;DR: what should I use if I need C++'s const& for a param?
> 
> Long version: I have a big struct provided by a library and I'm trying to pass instances of it to a function that will never need to modify the passed in value. Naturally I want to pass it efficiently, without incurring a copy. I know that I can use "const ref" in D, but is this the preferred way of doing it? How about "in ref"? Or something else?
> 
> Related background: I'm a D newbie, I've read TDPL and I loved it and I'm now working on a Markdown processor as a D learning exercise. I've written hundreds of thousands of C++ LOC and this is the perspective from which I look at D (and I love what I see).

Unlike in C++, const ref in D requires that the argument be an lvalue just like with ref, so if you define const ref, it won't work with rvalues. You'd need to create an overload which wasn't ref to do that. e.g.

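// Rvalue overload: takes the argument by value.
auto foo(const S s)
{
    // s is an lvalue here, so this calls the ref overload below.
    return foo(s);
}

// Lvalue overload: no copy is made.
auto foo(const ref S s)
{
    ...
}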

And if you do that, make sure that the constness of the functions matches, or you'll get infinite recursion, because the constness gets matched before the refness of the type when the overload is selected when calling the function.
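To make the overload pair above concrete, here is a minimal sketch. Big, sum and refCalls are made-up names for illustration (Big just stands in for the large library struct), and the counter is only there to show which overload ends up doing the work:

```d
// Hypothetical stand-in for a large struct from a library.
struct Big
{
    int[256] payload;
}

// Hypothetical counter: how often the ref overload has run.
int refCalls;

// Lvalue overload: binds by reference, so Big is not copied.
int sum(const ref Big b)
{
    ++refCalls;
    int total = 0;
    foreach (x; b.payload)
        total += x;
    return total;
}

// Rvalue overload: inside the body b is a named variable (an lvalue),
// so this forwards to the ref overload. Both overloads take const,
// which is what keeps the call from recursing into itself.
int sum(const Big b)
{
    return sum(b);
}
```

Calling sum with a variable goes straight to the ref overload, while calling it with a temporary such as Big() goes through the by-value overload first and then lands in the ref one.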

If your function is templated, then you can use auto ref:

auto foo(S)(auto ref S s)
{
    ...
}

or

auto foo(S)(auto ref const S s)
{
    ...
}

If you want the function templated (to be able to use auto ref) but not its arguments, then do

auto foo()(auto ref const S s)
{
    ...
}

auto ref does essentially the same as the first example, but the compiler generates the overloads for you. But again, it only works with templated functions.
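As a rough illustration of the auto ref variant, the sketch below uses __traits(isRef, ...) to report how the argument was bound in each instantiation. Big and boundByRef are hypothetical names, not anything from a library:

```d
// Hypothetical stand-in for a large struct from a library.
struct Big
{
    int[256] payload;
}

// The empty template parameter list makes this a template, which is
// what auto ref requires. The compiler instantiates a ref version
// for lvalue arguments and a by-value version for rvalue arguments.
bool boundByRef()(auto ref const Big b)
{
    // True in the instantiation where b is a ref parameter.
    return __traits(isRef, b);
}
```

With a variable as the argument, boundByRef should return true; with a temporary such as Big() it should return false.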

This whole topic is a bit of a thorny one in that D's design is trying to avoid some of the problems that allowing const T& to take rvalues in C++ causes, but it makes a situation like what you're trying to do annoying to handle. And auto ref doesn't really fix it (even though that's the whole reason that it was added), because it only works with templated functions. There have been some discussions on how to adjust how ref works in order to fix the problem without introducing the problems that C++ has with it, but nothing has actually been decided on yet, let alone implemented.

- Jonathan M Davis
October 10, 2012
> This whole topic is a bit of a thorny one in that D's design is trying to
> avoid some of the problems that allowing const T& to take rvalues in C++
> causes, but it makes a situation like what you're trying to do annoying to
> handle. And auto ref doesn't really fix it (even though that's the whole reason
> that it was added), because it only works with templated functions. There have
> been some discussions on how to adjust how ref works in order to fix the
> problem without introducing the problems that C++ has with it, but nothing has
> actually been decided on yet, let alone implemented.

So if I don't need to support accepting rvalues, is there an argument for "in ref" over "const ref"? "in ref" looks superior: it's more descriptive and from what the docs say, it gives even more guarantees about the behavior of the function.


October 10, 2012
On Wednesday, October 10, 2012 06:39:51 Val Markovic wrote:
> On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
> > Oh, and a related question: what is the best way to pass in an associative array like CustomStruct[string]? I can't say I'm too clear on how AA's are managed/implemented. Do they have value semantics or reference semantics? What about lists?
> 
> Ok, feel free to disregard this question; I just checked TDPL (should have done that first) and it clearly says that AA's follow reference semantics. Dynamic arrays passed to functions actually pass in a light-weight object referring to the same underlying data (I'm guessing that a dynamic array is internally "nothing more" than a struct holding a pointer and a length, right?).

A dynamic array is effectively

struct DynamicArray(T)
{
    T* ptr;
    size_t length;
}

So, they're sort of reference types, sort of not. Passing a dynamic array by value will slice it, allowing the elements to still be mutated (because both slices point to the same memory) if they're mutable. But if you alter the array itself (e.g. its length), it won't alter the original, and if you alter it enough, it could end up copying the array so that it's not a slice of the original anymore (e.g. appending could require reallocating the array to make room for the new elements, thereby changing the ptr value from what it was originally).
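A small sketch of that difference, using two made-up helper functions (setFirst and appendOne are illustrative names only):

```d
// Writes through the slice. The caller sees this, because the
// parameter and the caller's array refer to the same elements.
void setFirst(int[] arr)
{
    arr[0] = 42;
}

// Appends to the local slice only. The caller's own ptr/length pair
// is a separate copy, so its length does not change.
void appendOne(int[] arr)
{
    arr ~= 99;
}
```

After setFirst(a), a[0] is 42 in the caller. After appendOne(a), a still has its original length, since only the function's local copy of the slice grew.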

You should read this article: http://dlang.org/d-array-article.html

Associative arrays on the other hand are entirely reference types. The one thing that you need to watch out for is that if you pass one to a function, and it's null, then when you add elements to it, it will create a new AA for the local variable in that function but not affect the one passed in (which is still null). But if it's non-null when it's passed in, then anything done to it in the function that it was passed to will affect the original (since they're one and the same).
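Here is a minimal sketch of the null versus non-null case. addEntry and addEntryRef are hypothetical helper names:

```d
// By value: the function gets its own copy of the AA reference.
// If that reference is null, the insertion allocates a new AA that
// only the local variable points to.
void addEntry(string[int] aa)
{
    aa[1] = "hi";
}

// By ref: the function works on the caller's variable itself, so a
// newly allocated AA is visible to the caller too.
void addEntryRef(ref string[int] aa)
{
    aa[1] = "hi";
}
```

Passing a never-initialized AA to addEntry leaves the caller's AA empty, while passing one that already holds an element (or using addEntryRef) makes the new entry visible in the caller.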

- Jonathan M Davis
October 10, 2012
On Wednesday, October 10, 2012 06:51:50 Val Markovic wrote:
> So if I don't need to support accepting rvalues, is there an argument for "in ref" over "const ref"? "in ref" looks superior: it's more descriptive and from what the docs say, it gives even more guarantees about the behavior of the function.

In general, I'd advise against using in. in is an alias for const scope. scope is supposed to make it so that no references to that parameter can escape the function. This makes it utterly pointless for value types (as structs typically are). To make matters worse, it's not even properly implemented for anything beyond delegates. So, while something like

int[] foo(scope int[] arr)
{
    return arr;
}

is supposed to be illegal, the compiler currently allows it. So, if you use scope (or in) very much, you're going to get all kinds of compilation errors once scope has been fixed.

If you want const, use const, but aside from delegates, I wouldn't use scope at this point, so I wouldn't use in either. And for many cases, even if scope worked correctly, using in instead of const would be pointless, because the scope portion wouldn't be applicable and would be ignored.

Personally, I wish that in didn't exist at all. It just causes future bugs at this point and adds no value (since you can use const scope if that's what you want). But D1 had it (albeit with slightly different semantics), so it's still around in D2 for transitional purposes if nothing else.

- Jonathan M Davis
October 10, 2012
On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
> Oh, and a related question: what is the best way to pass in an associative array like CustomStruct[string]? I can't say I'm too clear on how AA's are managed/implemented. Do they have value semantics or reference semantics?

Good question, I'd like to get some clarification on it too, because it doesn't behave like, for example, a class, which surely has reference semantics.

When I've got a class

class C {
  int m;
}

and pass an object of this class to a function,

void mutate_C(C c)
{
  c.m = 5;
}

it follows reference semantics and its contents get changed.

However if I pass an assoc. array to a function which changes its contents

void mutate_AA(string[int] aa)
{
  foreach(i; 0..10)
    aa[i*10] = "hi";
}

Then this code

  string[int] aa;
  mutate_AA(aa);
  writeln(aa);

outputs "[]" - the changes are not applied.
It's only after I change the parameter to "ref string[int] aa" that its value gets changed successfully.
October 10, 2012
On Wednesday, October 10, 2012 08:59:54 thedeemon wrote:
> On Wednesday, 10 October 2012 at 04:55:48 UTC, Val Markovic wrote:
> > Oh, and a related question: what is the best way to pass in an associative array like CustomStruct[string]? I can't say I'm too clear on how AA's are managed/implemented. Do they have value semantics or reference semantics?
> 
> Good question, I'd like to get some clarification on it too. Because it doesn't behave like, for example, class which surely has reference semantics.
> 
> When I've got a class
> 
> class C {
>    int m;
> }
> 
> and pass an object of this class to a function,
> 
> void mutate_C(C c)
> {
>    c.m = 5;
> }
> 
> it follows reference semantics and its contents gets changed.
> 
> However if I pass an assoc. array to a function which changes its contents
> 
> void mutate_AA(string[int] aa)
> {
>    foreach(i; 0..10)
>      aa[i*10] = "hi";
> }
> 
> Then this code
> 
>    string[int] aa;
>    mutate_AA(aa);
>    writeln(aa);
> 
> outputs "[]" - changes are not applied.
> It's only after I change parameter to "ref string[int] aa" its
> value get changed successfully.

The exact same thing would happen with a class. The problem is that the aa that you pass in is null, so if you assign anything to it within the function or otherwise mutate it, it doesn't affect the original. Making it ref fixes the problem, because then anything which affects the AA variable inside of the called function is operating on a reference to the original AA variable rather than just operating on what the original AA variable pointed to. Making sure that the aa has been properly initialized before passing it to a function (which would mean giving it at least one value) would make the ref completely unnecessary.
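To show the class analogue, here is a small sketch with hypothetical functions create and createRef. The one difference from the AA case is that a class object has to be allocated explicitly with new, whereas a null AA is allocated implicitly on the first insertion:

```d
class C
{
    int m;
}

// By value: only the local copy of the reference is rebound, so a
// null reference in the caller stays null.
void create(C c)
{
    c = new C;
    c.m = 5;
}

// By ref: the caller's variable itself is rebound to the new object.
void createRef(ref C c)
{
    c = new C;
    c.m = 5;
}
```

Starting from a null C, create leaves the caller's variable null, while createRef leaves it pointing at an object with m set to 5.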

- Jonathan M Davis
October 10, 2012
On Wednesday, 10 October 2012 at 07:28:55 UTC, Jonathan M Davis wrote:
> Making sure that the aa has been properly initialized before passing it to a function (which would mean giving it at least one value) would make the ref completely unnecessary.
>
> - Jonathan M Davis

Ah, thanks a lot! This behavior of a fresh AA being null and then silently converted to a non-null when being filled confused me.