October 07, 2010
Tomek Sowiński <just@ask.me> wrote:

> __traits(getAttribute, opIn, @complexity).bigOh == O.constant

How does this test for things like N log M?

__traits(getAttribute, opIn, @complexity).bigOh == tuple( O.linear, O.logarithmic )

?

-- 
Simen
October 07, 2010
2010/10/8 Tomek Sowiński <just@ask.me>:
> bearophile napisał:
>
>> Another solution is to support the "in" operator for dynamic arrays too, and define a new attribute, like @complexity(), plus an enum that allows specifying the worst-case complexity. So associative arrays are annotated with @complexity(O.constant), while the function that searches for items/substrings in arrays/strings is @complexity(O.linear). At compile time the generic code is then able to query the computational complexity of the "in" operator implementation. The problem is that the compiler today can't enforce that functions annotated with @complexity(O.constant) actually don't perform a linear search (but I don't think it's a problem; today, if the Range protocol asks an operation to not be worse than O(n ln n), the compiler doesn't enforce it).
>
> Good idea. It would make a nice use case for user-defined attributes in D3. Making the language aware of @complexity specifically doesn't buy much, all you need is:
>
> __traits(getAttribute, opIn, @complexity).bigOh == O.constant
>
> --
> Tomek
>

Doesn't the language have to be aware of this if it's supposed to work with ordinary arrays?
October 07, 2010
On Thursday, October 07, 2010 14:44:50 Stanislav Blinov wrote:
> Andrei Alexandrescu wrote:
> > Complexity composes very badly. Issues tend to manifest at moderate sizes and may make things unbearable at large sizes. I'm really grateful I'm at a workplace where the exclamation "Damn! I was waiting like an idiot for this quadratic append!" is met with understanding nods from workmates who've made the same mistake before.
> > 
> > As an example, only last week I was working on cramming a sort of an index of the entire Wikipedia on one machine. I was waiting for the indexer which ran slower and slower and slower. In the end I figured there was only _one_ quadratic operation - appending to a vector<size_t> that held document lengths. That wasn't even the bulk of the data and it was the last thing I looked at! Yet it made the run time impossible to endure.
> 
> But that is not a matter of library interface, is it? It's a matter of algorithm/container choice. It's not the push_back that was slow in the end, it was std::vector (yes, that's arguable, but the point is that the container defines the rules for its methods, not vice versa).

Except that when you're dealing with generic code which has to work with multiple container types (like std.algorithm), you _need_ certain complexity guarantees about an operation, since it could be performed on any container it's given. Using the in operator in an algorithm could be perfectly reasonable if in had O(1) or O(log n) complexity but completely unreasonable if it had O(n) complexity. So the algorithm ends up being reasonable with some containers and not others, whereas if in were restricted to O(1) or O(log n), the algorithm could use it without worrying that some container would make it an order of complexity greater, or worse.

If it were strictly a matter of writing code which directly used a particular algorithm and container type, then you could know the complexities of the algorithm and of the container's operations, but once you're dealing with generic code, operations need complexity guarantees regardless of the container, or the complexity of the algorithm will vary with the container type. And if that algorithm is used in yet another algorithm, it gets that much worse. It quickly becomes easy to bury the point where your code jumps an order of complexity, because the container you selected is an order of complexity worse than other containers on an operation used by some algorithm buried in code somewhere.
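
To make this concrete, here's a minimal sketch (countMembers is a made-up name, not anything in Phobos): the algorithm does one lookup per needle, so its total cost is O(needles.length * cost(in)). Hand it an associative array and it's linear in the number of needles; hand it a container whose in is a linear search and the very same code silently becomes O(needles.length * haystack.length).

import std.stdio;

size_t countMembers(Container, Range)(Container haystack, Range needles)
{
    size_t hits;
    foreach (n; needles)
        if (n in haystack)   // the whole algorithm's complexity hinges on this
            ++hits;
    return hits;
}

void main()
{
    bool[int] set = [1: true, 3: true, 5: true];
    writeln(countMembers(set, [1, 2, 3, 4, 5]));   // 3 -- each lookup is O(1) here
}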

- Jonathan M Davis
October 07, 2010
Daniel Gibson napisał:

> 2010/10/8 Tomek Sowiński <just@ask.me>:
>> bearophile napisał:
>>
>>> Another solution is to support the "in" operator for dynamic arrays too, and define a new attribute, like @complexity(), plus an enum that allows specifying the worst-case complexity. So associative arrays are annotated with @complexity(O.constant), while the function that searches for items/substrings in arrays/strings is @complexity(O.linear). At compile time the generic code is then able to query the computational complexity of the "in" operator implementation. The problem is that the compiler today can't enforce that functions annotated with @complexity(O.constant) actually don't perform a linear search (but I don't think it's a problem; today, if the Range protocol asks an operation to not be worse than O(n ln n), the compiler doesn't enforce it).
>>
>> Good idea. It would make a nice use case for user-defined attributes in D3. Making the language aware of @complexity specifically doesn't buy much, all you need is:
>>
>> __traits(getAttribute, opIn, @complexity).bigOh == O.constant
>>
>> --
>> Tomek
>>
> 
> Doesn't the language have to be aware of this if it's supposed to work with ordinary arrays?

I don't think so; there can always be uniform-syntax wrappers (there's already a bunch of them in std.array):

@complexity(O.constant)
size_t length(T)(T[] arr) {
     return arr.length;
}

or special cases similar to hasLength!string.
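
For instance, a rough sketch of the special-case route in today's language (hasFastIn is a made-up trait, not part of Phobos): generic code tests the trait instead of reading a @complexity attribute, and containers known to have sub-linear lookup are whitelisted, much like hasLength special-cases things.

import std.traits : isAssociativeArray;

template hasFastIn(T)
{
    // whitelist of types whose `in` is known to be sub-linear;
    // user containers would have to be added here (or declare it themselves)
    enum bool hasFastIn = isAssociativeArray!T;
}

static assert( hasFastIn!(bool[string]));
static assert(!hasFastIn!(int[]));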

-- 
Tomek
October 07, 2010
On 10/7/2010 14:33, Steven Schveighoffer wrote:
> On Thu, 07 Oct 2010 16:23:47 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
>> I can't say I've ever cared about the big-O complexity of an operation.
> 
> Then you don't understand how important it is.

Let me rephrase that.  I care about performance.  Big-O complexity can obviously have a significant effect on performance, so I do care about it, but only to the extent that it affects performance.  Low big-O complexity is a means to an end, not a goal in and of itself.  If 'n' is low enough, then an O(2**n) algorithm may well be faster than an O(1) algorithm.

I also believe that, in the absence of a sophisticated system that actually verifies performance guarantees, the language and standard library should trust the programmer to know what he is doing.  The standard library should only provide transitive performance guarantees, e.g. this algorithm calls function 'f' 'n' times, so the algorithm's performance is O(n * complexity(f)).  If 'f' runs in constant time, the algorithm runs in linear time.  If 'f' runs in exponential time, the algorithm still runs.
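
As a tiny illustration of such a transitive guarantee (nTimes is hypothetical, not a proposed library function): the wrapper promises only that it calls f exactly n times, so its cost is O(n * complexity(f)) and nothing more is claimed.

void nTimes(alias f)(size_t n)
{
    foreach (i; 0 .. n)
        f(i);   // one call per iteration -- that's the whole guarantee
}

unittest
{
    size_t sum;
    nTimes!((size_t i) { sum += i; })(5);   // linear overall: the callback is O(1)
    assert(sum == 10);
}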

> big O complexity is very important when you are writing libraries.  Not so much when you are writing applications -- if you can live with it in your application, then fine.  But Phobos should not have these problems for people who *do* care.
> 
> What I'd suggest is to write your own function that uses in when possible and find when not possible.  Then use that in your code.

The issue is that algorithms may use 'in' internally, so I may have to rewrite large parts of Phobos.  (And the issue isn't about 'in' specifically, but complexity guarantees in general.)


-- 
Rainer Deyke - rainerd@eldwood.com
October 07, 2010
Simen kjaeraas napisał:

> Tomek Sowiński <just@ask.me> wrote:
> 
>> __traits(getAttribute, opIn, @complexity).bigOh == O.constant
> 
> How does this test for things like N log M?
> 
> __traits(getAttribute, opIn, @complexity).bigOh == tuple( O.linear,
> O.logarithmic )

Or:

__traits(getAttribute, opIn, @complexity).bigOh == O.linear * O.log

bigOh could be, e.g., a struct with an overloaded multiplication operator.
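
A very rough sketch of what I mean (no claim this is the right design; the names are just for illustration):

struct BigO
{
    string[] factors;   // e.g. ["linear", "log"] for N log M

    BigO opBinary(string op : "*")(BigO rhs)
    {
        import std.algorithm : sort;
        // keep the factors sorted so the default == treats N*logM and logM*N alike
        string[] combined = factors ~ rhs.factors;
        sort(combined);
        return BigO(combined);
    }
}

enum oConstant    = BigO(["constant"]);
enum oLogarithmic = BigO(["log"]);
enum oLinear      = BigO(["linear"]);

unittest
{
    assert(oLinear * oLogarithmic == oLogarithmic * oLinear);   // both "N log M"
    assert(oLinear * oLogarithmic != oConstant);
}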

But you brought up something interesting -- how to bind N and M to different properties of the function's arguments; big-oh expressions can get quite complex, e.g.

void completeSort(alias less = "a < b", SwapStrategy ss = SwapStrategy.unstable, Range1, Range2)(SortedRange!(Range1,less) lhs, Range2 rhs);

"[...] Performs Ο(lhs.length + rhs.length * log(rhs.length)) (best case) to Ο((lhs.length + rhs.length) * log(lhs.length + rhs.length)) (worst-case) evaluations of swap."

Even if the attribute properties could see the arguments, how would you deal with things like lhs.length + rhs.length? It has to be inspectable at compile time. One idea is to store the expression's abstract syntax tree (we want AST macros in D3 anyway)... But I've got a feeling we're heading for overkill :)

-- 
Tomek
October 07, 2010
Tomek Sowiński <just@ask.me> wrote:
> Even if the attribute properties could see the arguments, how to deal with things like
> lhs.length + rhs.length? It has to be inspectable at compile-time. One idea is to store the
> expression's abstract syntax tree (we want AST macros in D3 anyway)... But I got a feeling
> we're heading for an overkill :)

Just as soon as we solve the halting problem, eh?

-- 
Simen
October 07, 2010
Tomek Sowiński napisał:

> Simen kjaeraas napisał:
> 
>> Tomek Sowiński <just@ask.me> wrote:
>> 
>>> __traits(getAttribute, opIn, @complexity).bigOh == O.constant
>> 
>> How does this test for things like N log M?
>> 
>> __traits(getAttribute, opIn, @complexity).bigOh == tuple( O.linear,
>> O.logarithmic )
> 
> Or:
> 
> __traits(getAttribute, opIn, @complexity).bigOh == O.linear * O.log
> 
> bigOh could be e.g. a struct with an overloaded multiplier.
> 
> But you brought up something interesting -- how to bind N, M with different properties of function arguments; big oh expressions can get quite complex, e.g.
> 
> void completeSort(alias less = "a < b", SwapStrategy ss =
> SwapStrategy.unstable, Range1, Range2)(SortedRange!(Range1,less) lhs,
> Range2 rhs);
> 
> "[...] Performs Ο(lhs.length + rhs.length * log(rhs.length)) (best case)
> to Ο((lhs.length + rhs.length) * log(lhs.length + rhs.length))
> (worst-case) evaluations of swap."
> 
> Even if the attribute properties could see the arguments, how to deal with things like lhs.length + rhs.length? It has to be inspectable at compile-time. One idea is to store the expression's abstract syntax tree (we want AST macros in D3 anyway)... But I got a feeling we're heading for an overkill :)

If O.linear was a template, it could take an alias...

alias completeSort f;
__traits(getAttribute, f, @complexity).bigOh == (O.linear!(f.args.lhs.length) + O.linear!
(f.args.rhs.length)) * O.log!(O.linear!(f.args.lhs.length) + O.linear!(f.args.rhs.length))

:)

-- 
Tomek
October 07, 2010
Simen kjaeraas napisał:

> Tomek Sowiński <just@ask.me> wrote:
>> Even if the attribute properties could see the arguments, how to deal
>> with things like
>> lhs.length + rhs.length? It has to be inspectable at compile-time. One
>> idea is to store the
>> expression's abstract syntax tree (we want AST macros in D3 anyway)...
>> But I got a feeling
>> we're heading for an overkill :)
> 
> Just as soon as we solve the halting problem, eh?

What's the halting problem?

-- 
Tomek
October 07, 2010
On Thursday, October 07, 2010 16:39:15 Tomek Sowiński wrote:
> Simen kjaeraas napisał:
> > Tomek Sowiński <just@ask.me> wrote:
> >> Even if the attribute properties could see the arguments, how to deal
> >> with things like
> >> lhs.length + rhs.length? It has to be inspectable at compile-time. One
> >> idea is to store the
> >> expression's abstract syntax tree (we want AST macros in D3 anyway)...
> >> But I got a feeling
> >> we're heading for an overkill :)
> > 
> > Just as soon as we solve the halting problem, eh?
> 
> What's the halting problem?

http://en.wikipedia.org/wiki/Halting_problem

It's a classic problem in computer science. Essentially what it comes down to is that you can't determine when - or even if - a program will halt until it actually has. It's why stuff like file transfer dialogs can never be totally accurate. At best, you can estimate how long a file transfer will take based on the current progress, but you can't _know_ when it will complete. It's even worse for algorithms where you can't even estimate how much work there is left. And of course, you don't even necessarily know that the program _will_ halt. It could be an infinite loop for all you know.
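
As a small illustration (this is the Collatz loop - not the halting problem itself, just a famously unpredictable computation): the only known way to find out how long it runs, or whether an arbitrary program finishes at all, is to run it.

import std.stdio;

size_t steps(ulong n)
{
    size_t count;
    while (n != 1)
    {
        n = (n & 1) ? 3 * n + 1 : n / 2;
        ++count;
    }
    return count;
}

void main()
{
    writeln(steps(27));   // 111 -- you only learn that by running the loop
}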

- Jonathan M Davis