June 07, 2012
On 06/07/2012 10:53 PM, Peter Alexander wrote:
> On Thursday, 7 June 2012 at 20:04:56 UTC, Timon Gehr wrote:
>> On 06/07/2012 08:34 PM, Peter Alexander wrote:
>>>
>>> I find this very ugly. To be honest, I would be much happier without all
>>> that mostNegative and common type stuff.
>>> If I want to get the min
>>> between a short and an int I'll just cast them appropriately.
>>
>> The mostNegative and common type stuff is there to enable correct
>> semantics for taking the minimum of signed and unsigned values.
>> Minimum of short and int would work fine without that.
>
> I know what it's there for, I don't think it is necessary. I don't think
> people should be taking the min of signed and unsigned values.

What is important is that they can. ;)

> I don't think it's worth cluttering the implementation for this minor convenience.
>

There is not much there to clutter in this case. Therefore cluttering
does not have a significant drawback. Is it just about aesthetics?

>
>>> I'd much rather have a simpler standard library than a complicated
>>> one for the
>>> sake of a little convenience.
>>>
>>
>> 'min' is not complicated.
>
> It's more complicated than it could/should be:
>
> T min(T)(T x, T y) { return x < y ? x : y; }
>

This implementation is unstable. If you fix that, then I'll agree that
it is sufficient for most cases. There is not really a reason to
restrict the generality of the implementation to this though. What we
have is a strict superset.

> To be honest, I don't think the variadic argument versions are necessary
> either as it is just duplicating the functionality of reduce.
>

It is not really duplicating the functionality. Still, I haven't used
these versions so far either.
June 07, 2012
On 06/07/12 22:46, Dmitry Olshansky wrote:
> 2) Immutable strings have performance cost ... like tracking ownership and reference counting - unless you have GC by your side, in which case it's cheap and legal :)

Until you have enough of them and your program starts to spend most of its time trying to GC them. [1]

artur

[1] OK, so you might need 13M+ strings totalling 450M+. Still... ;)
June 07, 2012
On 6/7/12 3:04 PM, Timon Gehr wrote:
> On 06/07/2012 08:34 PM, Peter Alexander wrote:
>>
>> I find this very ugly. To be honest, I would be much happier without all
>> that mostNegative and common type stuff.
>> If I want to get the min
>> between a short and an int I'll just cast them appropriately.
>
> The mostNegative and common type stuff is there to enable correct
> semantics for taking the minimum of signed and unsigned values. Minimum
> of short and int would work fine without that.
>
>> I'd much rather have a simpler standard library than a complicated one
>> for the
>> sake of a little convenience.
>>
>
> 'min' is not complicated.

I agree. BTW Howard Hinnant's proposal N2199 (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2199.html) introduces correct C++ min/max functions that take 175 lines and introduce 10 types and 12 specializations. (Proposal was rejected.) The power of min/max is about on a par with D's.

Andrei
June 07, 2012
On Thursday, 7 June 2012 at 21:22:57 UTC, Timon Gehr wrote:
> There is not much there to clutter in this case. Therefore cluttering
> does not have a significant drawback. Is it just about aesthetics?

Aesthetics is part of it, but it's more than that.

It makes the code more difficult to read. I would need to step through the definition of MinType to find out what min(uint, int) returns.

It adds to compile times and compiler memory usage (it all adds up...)

It adds more code paths that need to be tested/debugged. Bugs like this for example: http://d.puremagic.com/issues/show_bug.cgi?id=8158 (note: it compiles with my simple min). Also note how the error message leaks all the clutter from min, making it difficult to tell exactly what the error is. There would be no complicated errors from my version, no matter what you did.

It makes it difficult to reason about how different types work with the function (e.g. with Date). With the complicated version in std.algorithm I have to now worry about what isIntegral returns for my type, and what mostNegative returns. If I create a BigInteger type and a BigUnsignedInteger type, how do I make sure it works correctly if I use those two as arguments to min? I'm sure the answer is simple, but finding that answer is non-trivial.


>> It's more complicated than it could/should be:
>>
>> T min(T)(T x, T y) { return x < y ? x : y; }
>>
>
> This implementation is unstable. If you fix that, then I'll agree that
> it is sufficient for most cases. There is not really a reason to
> restrict the generality of the implementation to this though. What we
> have is a strict superset.

Good spot on the stability.

There's nothing wrong with a strict superset, we just have to admit that there is a non-zero cost to this extra generality. I have listed some of the costs above. Some are arguable, but it is inarguable that at least one bug has arisen as a direct result of its complexity. I suspect there will be more over time. Multiply that by all the functions that try to be excessively general.
June 07, 2012
On 6/7/12 5:19 PM, Peter Alexander wrote:
> On Thursday, 7 June 2012 at 21:22:57 UTC, Timon Gehr wrote:
>> There is not much there to clutter in this case. Therefore cluttering
>> does not have a significant drawback. Is it just about aesthetics?
>
> Aesthetics is part of it, but it's more than that.
>
> It makes the code more difficult to read.
[snip]

Great points, example could be a lot better. Maybe it's time you do get started on find(). Destroy or be destroyed.

Andrei


June 07, 2012
On 08.06.2012 0:46, Dmitry Olshansky wrote:
> On 07.06.2012 20:04, bearophile wrote:
>> The slide packs of the conference C++Now! 2012 are available:
>> https://github.com/boostcon/cppnow_presentations_2012
>>
>
> Thanks, nice stuff for my brain to chew on.
>

Continuing to look through the slides, I note the apparent decadence of C++11. Yeah, they did an amazing job, but there are a ton of problems yet to fix.
(Most of them amount to "we should have had this long ago, but here it is...")

Now imagine the size of the language once it has modules, new concepts and some fresh kitchen-sink library additions... On top of all that, a sign: "here be backwards compatibility".

Anyway, other thoughts:
- C++11 allocators - interesting, how do our upcoming compare with it? At least we should have more freedom in design.

- It's fun to see an M$ guy give a talk about regex. Last time I checked their STL, it was a bleak shadow of the Boost one in terms of performance (why not just copy the good stuff if you can't get it right?)

- About metaparse - it's quite delightful to see how these guys have to bend themselves into suicidal acrobatics to get C++ to parse a DSL at compile time. Best slides ever :)


-- 
Dmitry Olshansky
June 07, 2012
On Thursday, 7 June 2012 at 22:08:10 UTC, Andrei Alexandrescu wrote:
> On 6/7/12 3:04 PM, Timon Gehr wrote:
>> 'min' is not complicated.
>
> I agree.

Then how come it has a bug where it doesn't work with user-defined types?

Maybe it isn't complicated per your definition, but it's complicated enough that it has a bug for a prominent use case.
June 07, 2012
On 6/7/12 5:47 PM, Peter Alexander wrote:
> On Thursday, 7 June 2012 at 22:08:10 UTC, Andrei Alexandrescu wrote:
>> On 6/7/12 3:04 PM, Timon Gehr wrote:
>>> 'min' is not complicated.
>>
>> I agree.
>
> Then how come it has a bug where it doesn't work with user-defined types?

I just disagree. It's not complicated for what it does. It simply sets out to solve a number of situations, and it solves them in pretty much as many words as the English description thereof.

> Maybe it isn't complicated per your definition, but it's complicated
> enough that it has a bug for a prominent use case.

Complication is not necessarily the reason for bugs. Your one-liner had a bug.


Andrei
June 07, 2012
On Thursday, 7 June 2012 at 22:29:09 UTC, Andrei Alexandrescu wrote:
> Great points, example could be a lot better. Maybe it's time you do get started on find(). Destroy or be destroyed.

Ok...

This overload:

R find(alias pred = "a == b", R, E)(R haystack, E needle)
if (isInputRange!R &&
        is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{
    for (; !haystack.empty; haystack.popFront())
    {
        if (binaryFun!pred(haystack.front, needle)) break;
    }
    return haystack;
}


Is fine. In fact it's practically perfect. It's the obvious solution to a simple problem. The biggest problem is the "is typeof" syntax, but that's not your fault.

Second overload:

R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
if (isForwardRange!R1 && isForwardRange!R2
        && is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool)
        && !isRandomAccessRange!R1)
{
    static if (is(typeof(pred == "a == b")) && pred == "a == b" && isSomeString!R1 && isSomeString!R2
            && haystack[0].sizeof == needle[0].sizeof)
    {
        //return cast(R1) find(representation(haystack), representation(needle));
        // Specialization for simple string search
        alias Select!(haystack[0].sizeof == 1, ubyte[],
                Select!(haystack[0].sizeof == 2, ushort[], uint[]))
            Representation;
        // Will use the array specialization
        return cast(R1) .find!(pred, Representation, Representation)
            (cast(Representation) haystack, cast(Representation) needle);
    }
    else
    {
        return simpleMindedFind!pred(haystack, needle);
    }
}

As far as I can tell, this is a proxy for various substring search implementations.

Problem 1: Why is find overloaded to do substring search? Why does it do substring and not subset or subsequence? substring is a completely different algorithm from linear search and even has different asymptotic running time. This is needlessly overloaded, it adds nothing.

Problem 2: The attempted specialisation is atrocious. It compares the predicate string with "a == b". I just did a quick check, and this means that these two calls DO NOT use the same algorithm:

string a = ..., b = ...;
auto fast = find!"a == b"(a, b);
auto slow = find!"a==b"(a, b);

I have to be honest, I have only just noticed this, but that's really sloppy.

It's also a direct symptom of over-complicated code. As a user, I would 100% expect these calls to be the same. If I accidentally used the second version, I would have no warning: my code would just run slower and I'd be none the wiser. Only upon careful inspection of the source could you discover what this "simple" code does, and you would be horrified like I am now.

The next two overloads of find appear to implement a couple of reasonably fast substring search algorithms. Again, it would take me a minute or so to figure out exactly which algorithm is being used.

After that there's a multi-range find. Seems simple enough, but it's yet another overload to consider and I'm not sure it's commonly used enough to even warrant existence. I'd hate to think what would happen if I wanted the single-range search but accidentally added an extra parameter.

In summary: the specialisation detection is shocking, which leads me to question what other static ifs are accidentally firing or failing. If the code was more simple, and less overloaded it would be easy to reason about all this, but it's not. The various find implementations span a few hundred lines, and all have numerous compile time branches and checks. The cyclomatic complexity must be huge.

How I'd do things:

- The first version of find would be the only one. No overloads, no specialisation.
- substring search would be a separate, non-predicated function. It would be specialised for strings. I'm too tired now to think about the variety of range specialisations, but I'm sure there's a more elegant way to handle the combinations.
- Drop the variadic find altogether.
- Get rid of BoyerMooreFinder + find overload, replace with a boyerMoore function.

June 08, 2012
On Friday, June 08, 2012 00:47:06 Peter Alexander wrote:
> On Thursday, 7 June 2012 at 22:08:10 UTC, Andrei Alexandrescu
> 
> wrote:
> > On 6/7/12 3:04 PM, Timon Gehr wrote:
> >> 'min' is not complicated.
> > 
> > I agree.
> 
> Then how come it has a bug where it doesn't work with user-defined types?

Because it wasn't properly tested for that. Anything can have a bug in it. It's true that the more complicated something is, the more likely it is that it's going to have bugs, but the fact that something has a bug doesn't necessarily mean that it's overly complicated.

I admit that it surprised me how complicated std.algorithm.min is, but I don't think that it's all that bad. That extra complication does allow it to handle comparing signed and unsigned types more correctly. It's just a matter of fixing this one bug and adding proper tests so that it doesn't break like this again. And there's already a pull request to fix that (though it appears that I forgot to put a comment for it in the bug):

https://github.com/D-Programming-Language/phobos/pull/612

- Jonathan M Davis