Thread overview
[Issue 8026] New: Fix or disallow randomShuffle() on fixed-sized arrays
May 03, 2012
Jonathan M Davis
May 04, 2012
Jonathan M Davis
May 11, 2012
Vidar Wahlberg
May 11, 2012
Jonathan M Davis
May 03, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8026

           Summary: Fix or disallow randomShuffle() on fixed-sized arrays
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: bearophile_hugs@eml.cc


--- Comment #0 from bearophile_hugs@eml.cc 2012-05-03 13:45:51 PDT ---
This comes after a report by Vidar Wahlberg: http://forum.dlang.org/thread/jnu1an$rjr$1@digitalmars.com

Several functions of std.algorithm don't work with fixed-sized arrays, and require you to slice them first to turn them into ranges. But std.random.randomShuffle() accepts fixed-sized arrays as well:


import std.stdio: writeln;
import std.random: randomShuffle;
void main() {
    int[6] data = [1, 2, 3, 4, 5, 6];
    randomShuffle(data);
    writeln(data);
}


This prints the unshuffled array:
[1, 2, 3, 4, 5, 6]


This is bug-prone, and in my opinion it's not acceptable.

I see two alternative solutions:
1) To make randomShuffle() properly support fixed-sized arrays, taking them by
reference;
2) To make randomShuffle() refuse a fixed-sized array at compile-time.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 03, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8026


Jonathan M Davis <jmdavisProg@gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg@gmx.com
           Severity|enhancement                 |normal


--- Comment #1 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-05-03 14:07:29 PDT ---
Well, that's certainly weird. Range-based functions don't normally take static arrays, and I'd argue that they shouldn't, given the problems surrounding slicing static arrays (it's fine to do it, but you need to be aware of what you're doing) - though randomShuffle doesn't have the same problem as most range-based functions do with static arrays given that it's void. Still, I'd argue that it's probably better for it to require slicing like all the rest.

It looks like the problem is that randomShuffle doesn't have a template constraint (obviously not good), so I very much doubt that it was ever intended to work with static arrays without slicing. uninformDistribution (which is right above randomShuffle) appears to have the same problem.

I'd say that this is definitely a bug, not an enhancement. If I remember, I'll probably throw together a pull request for it tonight, since it should be a quick fix.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 04, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8026



--- Comment #2 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-05-03 19:55:52 PDT ---
https://github.com/D-Programming-Language/phobos/pull/565

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8026


Vidar Wahlberg <canidae@exent.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |canidae@exent.net


--- Comment #3 from Vidar Wahlberg <canidae@exent.net> 2012-05-11 13:27:59 PDT ---
(In reply to comment #1)
> Well, that's certainly weird. Range-based functions don't normally take static arrays, and I'd argue that they shouldn't, given the problems surrounding slicing static arrays (it's fine to do it, but you need to be aware of what you're doing) - though randomShuffle doesn't have the same problem as most range-based functions do with static arrays given that it's void. Still, I'd argue that it's probably better for it to require slicing like all the rest.

Are the problems surrounding slicing static arrays easily explainable?

From my point of view (fairly new to the language) it's confusing that you can
pass a dynamic array to a Range-based function, while you'll need to slice a
static array if you want to pass that data to a such function.
To me it seems more user friendly if you could pass even static arrays to such
method, but I presume there's a good reason why this is avoided. I haven't
quite managed to figure out this yet, any pointers would be appreciated.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 11, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8026



--- Comment #4 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-05-11 15:39:26 PDT ---
http://stackoverflow.com/questions/8873265/is-a-static-array-a-forward-range

A static array is a value type. It owns its own memory. It's a container, not a range.

A dynamic array is a reference type. It does _not_ own its own memory (the runtime does). It is _not_ a container. It _is_ a range.

If I do

int[] func()
{
    auto arr = [1, 2, 3, 4, 5];
    return find(arr, 3);
}

there is _zero_ difference between passing arr and passing arr[]. In either case, you're slicing the whole array. And the array being returned is a slice of arr ([3, 4, 5] to be exact) whose memory is on the heap, owned by the runtime.

If I do

int[] func()
{
    int[5] arr = [1, 2, 3, 4, 5];
    return find(arr[], 3);
}

I've now allocated a static array _on the stack_. Passing arr to a function would copy it (since it's a value type). Passing arr[] to a function would slice it, which means passing a reference to the static array. Now look at what func returns. It's returning a slice of arr, which is a _local variable_ _on the stack_. What you've done is the equivalent of

int* func()
{
    int i;
    return &i;
}

You've returned a reference/pointer to a local variable which no longer exists. Bad things will happen if you do this. So, the semantics of dealing with dynamic and static arrays are _very_ different. Slicing a static array in the wrong place can lead to major bugs, whereas slicing a dynamic array is perfectly safe.


Now, as to range-based functions in general. They're templated functions. That means that they use IFTI (implicit function template instantiation) - i.e. it infers the type. So, if you have

auto func(T)(T foo)
{
    ...
}

int[] dArr;
int[5] sArr;

foo(dArr);
foo(sArr);

T is going to be inferred as the _exact type_ that you pass in. So, in the first case, func gets instantiated as

auto func(int[] foo)
{
    ...
}

whereas in the second, it gets instantiated as

auto func(int[5] foo)
{
    ...
}

int[] is a range, int[5] is not. So, if func is a range-based function, it should have a template constraint on it, and the static array will fail that constraint.

auto func(T)(T foo)
    if(isInputRange!T)
{
    ...
}

Static arrays are _not_ ranges and _cannot_ be ranges. At their must basic level (the input range), ranges must implement empty, front, and popFront, and static arrays _cannot_ implement popFront, because you cannot change their length. The _only_ way that a static array can be passed to a range-based function is if it's sliced so that you have a dynamic array (which _is_ a range). And as templates take the _exact_ type, no templated function will automatically slice a static array for you. The language _could_ be changed so that IFTI treated static arrays as dynamic arrays and automatically sliced them, but then you'd have problems creating templated functions that actually wanted to take static arrays rather than dynamic ones.

You _could_ overload every range-based function with an overload specifically for dynamic arrays (with the idea that the static array would be sliced when it's passed to it as happens with non-templated functions which take dynamic arrays), but that would be a royal pain. It would also significantly increase the risk of using static arrays, because you'd end up returning ranges which reference the static array from whatever range-based function you called, and the fact that you're dealing with a static array could very quickly become buried (easily resulting in returning a range to a static array which then no longer exists, because it was a local variable). It's much better to require that the programmer explicitly slice the static array, because then they know that they're slicing it and then can know that they have to watch to make sure that no reference to that static array escapes.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 20, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8026



--- Comment #5 from github-bugzilla@puremagic.com 2012-05-20 15:40:25 PDT ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/76def7af856dfa898b78c0cf67228dac87dd4d83 Fix Issue# 8026.

I added template constraints to all of the templates missing template constraints in std.random - including randomShuffle, per the bug.

https://github.com/D-Programming-Language/phobos/commit/28bca62304c0ec21a9062b4dc4fcb64b80c1c5d8 Merge pull request #565 from jmdavis/8026

Fix issue 8026

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 22, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8026


bearophile_hugs@eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


--- Comment #6 from bearophile_hugs@eml.cc 2012-05-22 15:29:22 PDT ---
I think few Phobos functions (like walkLength) should accept fixed-sized arrays
too, but now randomShuffle() statically refuses fixed sized arrays and this is
acceptable. So I close this bug report.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------