May 16, 2012
Since this post is about inuition, I'll raise my voice even though I'm a total D newbie (with background in C++, C#, Haskell, etc.).

From my point of view, the AssumeSorted way makes sense, but without getting a hit on it when searching for BinarySearch it's a bit cryptic. On the other hand, with search hits on the subject in the documenation, anyone should be able to use it. It's a matter of choosing if you want to express what should be done or how it should be done.

The concern I can see is about the difference in contract. BinarySearch explains how the operation is being performed and _promises_ to do a binary search. Contain does not promise how it's performed (that's the whole point, right?) and you're at the mercy of the library writer. For carefully profiled code, that can be a disadvantage, since any implementation change could lead to hidden changes in performance characteristics. Then you might want the control over exactly how the operation is performed.

The way I see it, both AssumeSorted.Contains and BinarySearch are overlapping in functionality (i.e. a fast search) but not equivalent in contract.
May 16, 2012
Kirill:

> How about users who don't know what binary search is. binary search is an intuitive concept for people who have good programming experience but assumeSorted(r).contains(x) is more intuitive to higher level users.

Even if you don't know the name "binary search" (something that all CS students learn very early in their studies, a concept that is also taught in "Unplugged CS" to little children) how can you be sure that "assumeSorted(r).contains(x)" will perform a "fast search" like when you search people names in a sorted sequence?


> So in short, I support that Andrei aims for larger audience.

Then why is Andrei using the name std.algorithm.schwartzSort? For every 20 CS students and programmers that know what a "binary search" is, you probably get only 1-4 that know what a "Schwartz Sorting" is. So I think your explanation of Andrei aim doesn't hold water.

Bye,
bearophile
May 16, 2012
On Wednesday, 16 May 2012 at 07:05:29 UTC, Anders Sjögren wrote:
> For carefully profiled code, that can be a disadvantage

Either that, or just the plain algorithm. Then the issue isn't profiling -- it's the time complexity.

e.g. You can't simply "pretend" it's doing the best thing possible, because libraries never get it right *all* the time. So IMHO saying /how/ to do it is a lot more useful than /what/ to do, since it lets you choose whether your algorithm is O(log(n)) vs. O(n) vs. O(1).

(Yes, this means I don't like C++'s std::map or std::unordered_set either, because they tell you nothing about the time complexity. Java did it correct there.)
May 16, 2012
On 05/16/2012 09:29 AM, bearophile wrote:
> Kirill:
>
>> How about users who don't know what binary search is. binary search is
>> an intuitive concept for people who have good programming experience
>> but assumeSorted(r).contains(x) is more intuitive to higher level users.
>
> Even if you don't know the name "binary search" (something that all CS
> students learn very early in their studies, a concept that is also
> taught in "Unplugged CS" to little children) how can you be sure that
> "assumeSorted(r).contains(x)" will perform a "fast search" like when you
> search people names in a sorted sequence?
>

By assuming the library writer is sane, or by reading the documentation.

>
>> So in short, I support that Andrei aims for larger audience.
>
> Then why is Andrei using the name std.algorithm.schwartzSort? For every
> 20 CS students and programmers that know what a "binary search" is, you
> probably get only 1-4 that know what a "Schwartz Sorting" is. So I think
> your explanation of Andrei aim doesn't hold water.
>

I agree, that the explanation does not hold water, not because of other parts of the API, but simply because binary search is such a basic concept that I wouldn't trust a programmer who does not understand it.
May 16, 2012
On 5/16/12 2:29 AM, bearophile wrote:
> Kirill:
>
>> How about users who don't know what binary search is. binary search is
>> an intuitive concept for people who have good programming experience
>> but assumeSorted(r).contains(x) is more intuitive to higher level users.
>
> Even if you don't know the name "binary search" (something that all CS
> students learn very early in their studies, a concept that is also
> taught in "Unplugged CS" to little children) how can you be sure that
> "assumeSorted(r).contains(x)" will perform a "fast search" like when you
> search people names in a sorted sequence?
>
>
>> So in short, I support that Andrei aims for larger audience.
>
> Then why is Andrei using the name std.algorithm.schwartzSort? For every
> 20 CS students and programmers that know what a "binary search" is, you
> probably get only 1-4 that know what a "Schwartz Sorting" is. So I think
> your explanation of Andrei aim doesn't hold water.

Just trying my best to do the right thing by everyone. Not much else to read into it.

Andrei
May 16, 2012
Andrei you are against an in-operator for array, because it would provide a uniform interface for arrays and AA with different complexity.  Is contains with different complexity for ranges and for SortedRange not the same?


May 16, 2012
On Wed, 16 May 2012 10:47:23 -0400, Tobias Pankrath <panke@tzi.de> wrote:

> Andrei you are against an in-operator for array, because it would provide a uniform interface for arrays and AA with different complexity.  Is contains with different complexity for ranges and for SortedRange not the same?

No it's not the same.

It all depends on what the function advertises as its complexity.

It's OK for a function that advertises O(n) complexity to be applied to a type that's optimized into O(lgn) complexity.

But it's NOT OK for a function that advertises O(lgn) complexity to be applied to a type that requires O(n) complexity.

It all depends on what the existing situation is.  Generic code is written against the documentation, because it doesn't know what's actually going to be implemented underneath.

-Steve
May 16, 2012
On 16-05-2012 17:28, Steven Schveighoffer wrote:
> On Wed, 16 May 2012 10:47:23 -0400, Tobias Pankrath <panke@tzi.de> wrote:
>
>> Andrei you are against an in-operator for array, because it would
>> provide a uniform interface for arrays and AA with different
>> complexity. Is contains with different complexity for ranges and for
>> SortedRange not the same?
>
> No it's not the same.
>
> It all depends on what the function advertises as its complexity.
>
> It's OK for a function that advertises O(n) complexity to be applied to
> a type that's optimized into O(lgn) complexity.
>
> But it's NOT OK for a function that advertises O(lgn) complexity to be
> applied to a type that requires O(n) complexity.
>
> It all depends on what the existing situation is. Generic code is
> written against the documentation, because it doesn't know what's
> actually going to be implemented underneath.
>
> -Steve

People aren't using 'in' in generic code at all, so I'm not sure this comparison makes sense anyway.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
May 16, 2012
On 5/16/12 9:47 AM, Tobias Pankrath wrote:
> Andrei you are against an in-operator for array, because it would
> provide a uniform interface for arrays and AA with different complexity.
> Is contains with different complexity for ranges and for SortedRange not
> the same?

It is. That's why there's no contains with linear complexity.

Andrei

May 16, 2012
> It is. That's why there's no contains with linear complexity.
>
> Andrei

I missed that. However SortedRange needs better documentation.
Before I wrote this, I tried to look it up and it took some time,
because I was looking in std.algorithm but SortedRange resides in
std.range.

The only reference to SortedRange from std.algorithm I could
find, was the use of assumeSorted (with no link).

So I think the first line of the description of
std.algorithm.find should be sometihng like: for binarySearch
look at SortedRange. Or include SortedRange in the table at the
top, but link to std.range.