April 27, 2012
On Friday, 27 April 2012 at 16:02:00 UTC, Steven Schveighoffer wrote:
> On Fri, 27 Apr 2012 11:33:39 -0400, Joseph Rushton Wakeling
>> On 27/04/12 17:18, Steven Schveighoffer wrote:

>>> const should not affect code generation *at all*, except for name mangling
>>> (const MyStruct is a different type from MyStruct), and generating an extra
>>> TypeInfo for const MyStruct and const MyStruct[]. Const is purely a compile-time
>>> concept.
>>>
>>> This cannot account for an extra 2 seconds. Something else is happening.

>> The code is here: https://github.com/WebDrake/Dregs/
>>
>> You'll see that there are about 8 different functions in there which receive as input an array of type Rating!(UserID, ObjectID, Reputation)[].  These were the inputs I was marking as const.
>
> Hm.. you have marked all your functions pure as well.  I don't think this will affect anything in the current implementation, but it might.  However, I'd expect the opposite (const version is faster) if anything.


 Perhaps I'm wrong, but seeing as you're working with a struct I would think the following should be noted.

 First, pure and nothrow. There is nothing wrong with labeling functions as such, but I would recommend removing them and, once the whole thing is in a finished state, adding them back where you think they qualify. I recently added a @safe attribute to a simple function, which of course threw a range exception later. Oddly enough, it was because I didn't check whether the array was null before accessing it.

 Second, you are passing an array of structs. Either a fat pointer is being passed, or it's shallow copying; depending on the scenario either could be true. As an array it should be passing a fat pointer but if the compiler has to make certain guarantees based on your tags it may not. Depending on the size of the array, I would think this might be the case.

 Actually I _think_ it would in this case. Imagine a pure function that gives different results from the input when the input changes during execution. By making a local copy the input can't be changed from the outside and thereby returns that guarantee.

 Last, try adding ref after const; on the off chance it's shallow copying, this should remove that.
April 27, 2012
On Sat, Apr 28, 2012 at 12:29:24AM +0200, Era Scarecrow wrote:
> On Friday, 27 April 2012 at 16:02:00 UTC, Steven Schveighoffer wrote:
[...]
> >Hm.. you have marked all your functions pure as well.  I don't think this will affect anything in the current implementation, but it might.  However, I'd expect the opposite (const version is faster) if anything.
> 
> 
>  Perhaps I'm wrong, but seeing as you're working with a struct I would
>  think the following should be noted.
> 
>  First, pure and nothrow. There is nothing wrong with labeling
>  functions as such, but I would recommend removing them and, once the
>  whole thing is in a finished state, adding them back where you think
>  they qualify. I recently added a @safe attribute to a simple function,
>  which of course threw a range exception later. Oddly enough, it was
>  because I didn't check whether the array was null before accessing
>  it.

I recommend the opposite, actually. Most D code by default should be @safe (don't know about nothrow though). It's good to mark most things as @safe and pure, and let the compiler catch careless mistakes.


>  Second, you are passing an array of structs. Either a fat pointer is
>  being passed, or it's shallow copying; depending on the scenario
>  either could be true. As an array it should be passing a fat pointer
>  but if the compiler has to make certain guarantees based on your tags
>  it may not. Depending on the size of the array, I would think this
>  might be the case.

Dynamic arrays are always passed by reference (i.e. fat pointer). AFAIK the compiler does not change this just because of certain tags on the function.


> Actually I _think_ it would in this case. Imagine a pure function that gives different results from the input when the input changes during execution. By making a local copy the input can't be changed from the outside and thereby returns that guarantee.
[...]

No, that's wrong. The compiler checks the code at compile time to prevent impure code from slipping into the binary. It does not do anything to "patch over" impure code to make it pure.
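As a minimal sketch of what that check looks like (the names `counter` and `twice` are made up for illustration), a function marked pure that touches mutable global state is simply rejected; nothing is copied or patched to make it work:

```d
import std.stdio : writeln;

int counter; // mutable module-level (thread-local) state, hypothetical

// Accepted: the result depends only on the argument.
int twice(int x) pure
{
    return 2 * x;
}

void main()
{
    // A pure function literal that reads mutable global state does not compile.
    // __traits(compiles, ...) lets us observe the rejection without breaking the build.
    static assert(!__traits(compiles, () pure { return counter + 1; }));

    // The same body without `pure` is accepted.
    static assert(__traits(compiles, () { return counter + 1; }));

    assert(twice(21) == 42);
    writeln("purity was checked by the compiler, not at run time");
}
```

Both static asserts are evaluated while compiling, so the binary never contains any purity-related checks.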


T

-- 
Why are you blatanly misspelling "blatant"? -- Branden Robinson
April 27, 2012
On Friday, 27 April 2012 at 22:40:46 UTC, H. S. Teoh wrote:
> I recommend the opposite, actually. Most D code by default should be @safe (don't know about nothrow though). It's good to mark most things as @safe and pure, and let the compiler catch careless mistakes.

 You're probably right.

> Dynamic arrays are always passed by reference (i.e. fat pointer). AFAIK
> the compiler does not change this just because of certain tags on the function.

 That's why I wasn't sure... I was pretty sure it was passed via fat pointer, but if adding @safe and pure makes it slower, something else is going on. Perhaps just a ton more checks during debug mode?

> No, that's wrong. The compiler checks the code at compile time to prevent
> impure code from slipping into the binary. It does not do anything to
> "patch over" impure code to make it pure.

 I'm not going to argue; I don't know enough about the D compiler to know exactly what's going on. Plus, I'd rather be wrong than right :)

 I guess use a profiler and check where your bottlenecks are. The results may surprise you. I've only glanced over the code so I can't offer anything more concrete.
April 28, 2012
On 28/04/12 00:29, Era Scarecrow wrote:
> Last, try adding ref after const; on the off chance it's shallow copying, this
> should remove that.

Ahhh, that works.  Thank you!

Back story: originally the reputation() function just took the array ratings and made an internal copy, ratings_, which was used by the rest of the code.  I took that out in this commit: https://github.com/WebDrake/Dregs/commit/4d2a8a055321c2981a453fc4d82fb781da2ea5c7

... because I found I got about a 2s speedup.  It's exactly the speedup which was removed by adding "const" to the function input, so I presume it's as you say, that this was implicitly creating a local copy.
April 28, 2012
On Saturday, April 28, 2012 01:26:38 Era Scarecrow wrote:
> On Friday, 27 April 2012 at 22:40:46 UTC, H. S. Teoh wrote:
> > I recommend the opposite, actually. Most D code by default should be @safe (don't know about nothrow though). It's good to mark most things as @safe and pure, and let the compiler catch careless mistakes.
> 
> You're probably right.
> 
> > Dynamic arrays are always passed by reference (i.e. fat
> > pointer). AFAIK
> > the compiler does not change this just because of certain tags
> > on the function.
> 
> That's why I wasn't sure... I was pretty sure it was passed via fat pointer, but if adding @safe and pure makes it slower, something else is going on. Perhaps just a ton more checks during debug mode?

pure and @safe have _zero_ effect on the types of function parameters or how they're passed. They allow the compiler to provide additional compile-time checks. The _only_ case that I'm aware of where @safe affects code generation is that array bounds checking is not removed in @safe code with -release like it is in @system and @trusted code (-noboundscheck will remove it in all code). pure can affect code optimizations in _calling_ code (e.g. making it so that a strongly pure function is only called once in an expression when it's called multiple times with the same arguments within that expression), but it doesn't affect the function's code generation at all.
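A small sketch of the first claim, assuming a stand-in Rating struct (the field names and the three sum functions are hypothetical, not taken from Dregs). It asks std.traits for the parameter types and checks that attributes leave them alone, and that const only changes the qualifier, not the size of what gets passed:

```d
import std.traits : Parameters;

// Hypothetical stand-in for the real Rating struct; field names are assumptions.
struct Rating
{
    size_t user;
    size_t object;
    double value;
}

double sumPlain(Rating[] ratings)
{
    double total = 0;
    foreach (r; ratings)
        total += r.value;
    return total;
}

double sumAttributed(Rating[] ratings) pure nothrow @safe
{
    double total = 0;
    foreach (r; ratings)
        total += r.value;
    return total;
}

double sumConst(const Rating[] ratings) pure nothrow @safe
{
    double total = 0;
    foreach (r; ratings)
        total += r.value;
    return total;
}

alias PlainParams = Parameters!sumPlain;
alias AttributedParams = Parameters!sumAttributed;
alias ConstParams = Parameters!sumConst;

// Function attributes do not change the parameter type.
static assert(is(PlainParams[0] == Rating[]));
static assert(is(AttributedParams[0] == Rating[]));

// const changes the qualifier only; it is still a pointer/length pair.
static assert(is(ConstParams[0] == const(Rating[])));
static assert(const(Rating[]).sizeof == (Rating[]).sizeof);

void main()
{
    auto ratings = [Rating(0, 0, 1.5), Rating(1, 0, 2.5)];
    assert(sumPlain(ratings) == 4.0);
    assert(sumAttributed(ratings) == 4.0);
    assert(sumConst(ratings) == 4.0);
}
```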

_Nothing_ affects how arrays are passed beyond ref, out, and in. The type of an array doesn't change due to attributes on the function that it's a parameter for. Arrays are literally structs that look something like

struct DynamicArray(T)
{
 T* ptr;
 size_t length;
}

and their semantics for being passed to a function are identical to what you'd expect from any struct with such a definition.
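A minimal sketch of those semantics (the function names are made up for illustration): writing through the slice reaches the caller's elements, while rebinding the slice itself only changes the callee's copy of the pointer/length pair unless the parameter is ref.

```d
// Writes through the pointer inside the slice: the caller sees the change.
void setFirst(int[] a)
{
    a[0] = 99;
}

// Rebinds the callee's copy of the pointer/length pair: the caller sees nothing.
void rebind(int[] a)
{
    a = [7, 8, 9];
}

// With ref, the caller's own pointer/length pair is rebound.
void rebindRef(ref int[] a)
{
    a = [7, 8, 9];
}

void main()
{
    int[] data = [1, 2, 3];

    setFirst(data);
    assert(data == [99, 2, 3]);

    rebind(data);
    assert(data == [99, 2, 3]);

    rebindRef(data);
    assert(data == [7, 8, 9]);
}
```

In none of the three calls are the elements themselves copied.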

http://dlang.org/d-array-article.html

- Jonathan M Davis
April 28, 2012
On Saturday, 28 April 2012 at 00:04:05 UTC, Joseph Rushton Wakeling wrote:
> On 28/04/12 00:29, Era Scarecrow wrote:
>> On the off chance it's shallow copying, this should remove that.
>
> Ahhh, that works.  Thank you!
>
> ... because I found I got about a 2s speedup.  It's exactly the speedup which was removed by adding "const" to the function input, so I presume it's as you say, that this was implicitly creating a local copy.

 Well, glad you found it :) Probably the different 'type' made it force a (useless) conversion while inside the template.

 Odd that the wrong answer was also the right answer :)
April 28, 2012
On 27/04/12 20:26, Steven Schveighoffer wrote:
> No, it can't. There can easily be another non-const reference to the same data.
> Pure functions can make more assumptions, based on the types, but it would be a
> very complex determination in the type system to see if two parameters alias the
> same data.
>
> Real optimization benefits come into play when immutable is there.

Question on the const/immutable distinction.  Given that my function has inputs of (size_t, size_t, Rating[]), how come the size_t's can be made immutable, but the Rating[] can only be made const/const ref?

Is it because the size_t's can't conceivably be changed from outside while the function is running, whereas the values in the array in principle can be?
April 28, 2012
Joseph Rushton Wakeling:

> ... because I found I got about a 2s speedup.  It's exactly the speedup which was removed by adding "const" to the function input, so I presume it's as you say, that this was implicitly creating a local copy.

I suggest taking a look at the asm in both cases and comparing them. Adding "const" doesn't cause local copies.
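Short of reading asm, a quick sanity check is to compare pointers. This is only a sketch with a hypothetical Rating struct and helper name, not the Dregs code: if a const parameter were silently copied, its .ptr would differ from the caller's.

```d
// Hypothetical stand-in for the real Rating struct; field names are assumptions.
struct Rating
{
    size_t user;
    size_t object;
    double value;
}

// True when the const slice the callee received points at the caller's own elements.
bool sharesStorage(const Rating[] seen, const(Rating)* callerPtr) pure nothrow
{
    return seen.ptr is callerPtr;
}

void main()
{
    auto ratings = new Rating[](1000);

    // Passing as const: same memory, no copy.
    assert(sharesStorage(ratings, ratings.ptr));

    // An explicit .dup is what a real copy looks like: different memory.
    auto copy = ratings.dup;
    assert(!sharesStorage(copy, ratings.ptr));
}
```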

Bye,
bearophile
April 28, 2012
On 28/04/12 14:53, bearophile wrote:
> I suggest taking a look at the asm in both cases and comparing them. Adding "const"
> doesn't cause local copies.

I'm afraid I have no idea how to do that, or what to look for.
April 28, 2012
On Saturday, April 28, 2012 13:17:36 Joseph Rushton Wakeling wrote:
> On 27/04/12 20:26, Steven Schveighoffer wrote:
> > No, it can't. There can easily be another non-const reference to the same data. Pure functions can make more assumptions, based on the types, but it would be a very complex determination in the type system to see if two parameters alias the same data.
> > 
> > Real optimization benefits come into play when immutable is there.
> 
> Question on the const/immutable distinction.  Given that my function has inputs of (size_t, size_t, Rating[]), how come the size_t's can be made immutable, but the Rating[] can only be made const/const ref?
> 
> Is it because the size_t's can't conceivably be changed from outside while the function is running, whereas the values in the array in principle can be?

size_t is a value type. Arrays are reference types. So, when you pass a size_t, you get a copy, but when you pass an array, you get a slice of the array. You could make that slice const, but you can't make it immutable, because immutable stuff must _always_ be immutable. If you want an immutable array (or array of immutable elements), you must make a copy. If you call idup on an array, you'll get a copy of that array where each of its elements is immutable.

And yes, because size_t is copied, it can't be changed from the outside, whereas because Rating[] is only sliced, it can be. const protects against it being altered within the function (for both of them), but doesn't protect the original array from being modified. immutable, on the other hand, is _always_ immutable and can never be mutated. But because of that, you can't convert something to immutable without making a copy (which happens automatically with value types but must be done explicitly with reference types).
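A minimal sketch of the difference (the function name is made up for illustration): a const slice is just a read-only view of the original, so it sees later writes, while an idup'd array is a separate copy that never changes.

```d
// Returns what a const view and an immutable copy each observe
// after the original array has been modified.
int[2] observeAfterWrite()
{
    int[] original = [1, 2, 3];

    const(int)[] view = original;            // no copy: read-only view of the same elements
    immutable(int)[] frozen = original.idup; // explicit copy with immutable elements

    // A mutable array cannot implicitly become immutable, and a const view cannot be written to.
    static assert(!__traits(compiles, { immutable(int)[] bad = original; }));
    static assert(!__traits(compiles, view[0] = 5));

    // A value type is copied on initialization, so making it immutable is fine.
    immutable size_t n = original.length;
    assert(n == 3);

    original[0] = 99; // modify through the mutable reference

    int[2] result;
    result[0] = view[0];   // the const view sees the change
    result[1] = frozen[0]; // the immutable copy does not
    return result;
}

void main()
{
    assert(observeAfterWrite() == [99, 1]);
}
```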

- Jonathan M Davis