August 28, 2008
Chris R. Miller Wrote:

> superdan wrote:
> > Chris R. Miller Wrote:
> >> Furthermore, you're faulting linked lists as having a bad opIndex.  Why not implement a cursor (Java LinkedList-like iterator) in the opIndex function?  Thus you could retain the reference to the last indexed location, and simply use that instead of the root node when calling opIndex.  Granted that whenever the contents of the list are modified that reference would have to be considered invalid (start from the root node again), but it'd work with an O(1) efficiency for sequential accesses from 0 to length.  True, it'll add another pointer to a node in memory, as well as an integer representing the position of that node reference.
> > 
> > you: "this scent will make skunk farts stink less."
> > me: "let's kick the gorram skunk outta here!"
> 
> I would imagine that you'll have a hard time convincing others that linked-lists are evil when you apparently have two broken shift keys.
> 
> >> Wrong.  It works.  That it's not precisely what the spec for sort dictates (which is probably in error, since no spec can guarantee a precise efficiency if it doesn't know the precise container type).
> > 
> > sure it can. in big oh.
> 
> Which is simply identifying the algorithm used by its efficiency.  If you're not familiar with the types of algorithms, it tells you the approximate efficiency of the algorithm used.  If you are familiar with algorithms, then you can identify the type of algorithm used so you can better leverage it to do what you want.
> 
> >>  You
> >> are also misinterpreting the spec.  It is saying that it uses a specific
> >> efficiency of algorithm, not that you can arbitrarily expect a certain
> >> efficiency out of it regardless of how dumb you might be with the choice
> >> of container you use.
> > 
> > in stl the spec says as i say. in d the spec is not precise. it should.
> 
> Yes, it probably should explicitly say that "sort uses the xxxxx algorithm, which gives an approximate efficiency of O(n log n) when used with optimal data structures."
> 
> You honestly cannot write a spec for generic programming and expect uniform performance.

But this is what STL did. Sorry, Dee Girl
August 28, 2008
"Fawzi Mohamed" <fmohamed@mac.com> wrote in message news:g94k2b$2a1e$1@digitalmars.com...
> On 2008-08-27 23:27:52 +0200, "Nick Sabalausky" <a@a.a> said:
>
>> "superdan" <super@dan.org> wrote in message news:g94g3e$20e9$1@digitalmars.com...
>>> Nick Sabalausky Wrote:
>>>
>>>> "Dee Girl" <deegirl@noreply.com> wrote in message news:g943oi$11f4$1@digitalmars.com...
>>>>> Benji Smith Wrote:
>>>>>
>>>>>> Dee Girl wrote:
>>>>>>> Michiel Helvensteijn Wrote:
>>>>>>>> That's simple. a[i] looks much nicer than a.nth(i).
>>>>>>>
>>>>>>> It is not nicer. It is more deceiving (correct spell?). If you look
>>>>>>> at
>>>>>>> code it looks like array code.
>>>>>>>
>>>>>>> foreach (i; 0 .. a.length)
>>>>>>> {
>>>>>>> a[i] += 1;
>>>>>>> }
>>>>>>>
>>>>>>> For array works nice. But for list it is terrible! Many operations
>>>>>>> for
>>>>>>> incrementing only small list.
>>>>>>
>>>>>> Well, that's what you get with operator overloading.
>>>>>
>>>>> I am sorry. I disagree. I think that is what you get with bad design.
>>>>>
>>>>>> The same thing could be said for "+" or "-". They're inherently
>>>>>> deceiving, because they look like builtin operations on primitive
>>>>>> data
>>>>>> types.
>>>>>>
>>>>>> For expensive operations (like performing division on an
>>>>>> unlimited-precision decimal object), should the author of the code
>>>>>> use
>>>>>> "opDiv" or should he implement a separate "divide" function?
>>>>>
>>>>> The cost of + and - is proportional to digits in number. For small
>>>>> number
>>>>> of digits computer does fast in hardware. For many digits the cost
>>>>> grows.
>>>>> The number of digits is log n. I think + and - are fine for big
>>>>> integer. I
>>>>> am not surprise.
>>>>>
>>>>>> Forget opIndex for a moment, and ask the more general question about
>>>>>> all
>>>>>> overloaded operators. Should they imply any sort of asymptotic
>>>>>> complexity guarantee?
>>>>>
>>>>> I think depends on good design. For example I think ++ or -- for
>>>>> iterator.
>>>>> If it is O(n) it is bad design. Bad design make people say like you
>>>>> "This
>>>>> is what you get with operator overloading".
>>>>>
>>>>>> Personally, I don't think so.
>>>>>>
>>>>>> I don't like "nth".
>>>>>>
>>>>>> I'd rather use the opIndex. And if I'm using a linked list, I'll be
>>>>>> aware of the fact that it'll exhibit linear-time indexing, and I'll
>>>>>> be
>>>>>> cautious about which algorithms to use.
>>>>>
>>>>> But inside algorithm you do not know if you use a linked list or a
>>>>> vector.
>>>>> You lost that information in bad abstraction. Also abstraction is bad
>>>>> because if you change data structure you have concept errors that
>>>>> still
>>>>> compile. And run until tomorrow ^_^.
>>>>>
>>>>
>>>> A generic algorithm has absolutely no business caring about the
>>>> complexity
>>>> of
>>>> the collection it's operating on.
>>>
>>> absolutely. desperate. need. of. chanel.
>>>
>>>> If it does, then you've created a concrete
>>>> algorithm, not a generic one.
>>>
>>> sure you don't know what you're talking about. it is generic insofar as
>>> it
>>> abstracts away the primitives it needs from its iterator. run don't walk
>>> and get a dose of stl.
>>>
>>>> If an algorithm uses [] and doesn't know the
>>>> complexity of the []...good! It shouldn't know, and it shouldn't care.
>>>
>>> nonsense. so wrong i won't even address it.
>>>
>>>> It's
>>>> the code that sends the collection to the algorithm that knows and
>>>> cares.
>>>> Why? Because "what algorithm is best?" depends on far more than just
>>>> what
>>>> type of collection is used. It depends on "Will the collection ever be
>>>> larger than X elements?". It depends on "Is it a standard textbook
>>>> list,
>>>> or
>>>> does it use trick 1 and/or trick 2?". It depends on "Is it usually
>>>> mostly
>>>> sorted or mostly random?". It depends on "What do I do with it most
>>>> often?
>>>> Sort, append, search, insert or delete?". And it depends on other
>>>> things,
>>>> too.
>>>
>>> sure it does. problem is you have it backwards. types and algos tell you
>>> the theoretical properties. then you in knowledge of what's goin' on use
>>> the algorithm that does it for you in knowledge that the complexity
>>> would
>>> work for your situation. or you encode your own specialized algorithm.
>>>
>>> thing is stl encoded the most general linear search. you can use it for
>>> linear searching everything. moreover it exactly said what's needed for
>>> a
>>> linear search: a one-pass forward iterator aka input iterator.
>>>
>>> now to tie it with what u said: you know in your situation whether
>>> linear
>>> find cuts the mustard. that don't change the nature of that fundamental
>>> algorithm. so you use it or use another. your choice. but find remains
>>> universal so long as it has access to the basics of one-pass iteration.
>>>
>>>> Using "[]" versus "nth()" can't tell the algoritm *any* of those things.
>>>
>>> doesn't have to.
>>>
>>>> But
>>>> those things *must* be known in order to make an accurate decision of
>>>> "Is
>>>> this the right algorithm or not?"
>>>
>>> sure. you make them decision on the call site.
>>>
>>>> Therefore, a generic algorithm *cannot* ever
>>>> know for certain if it's the right algorithm *even* if you say "[]"
>>>> means
>>>> "O(log n) or better".
>>>
>>> utterly wrong. poppycock. gobbledygook. nonsense. this is so far off i
>>> don't have time to even address. if you want to learn stuff go learn
>>> stl.
>>> then we talk. if you want to teach me i guess we're done.
>>>
>>>> Therefore, the algorithm should not be designed to
>>>> only work with certain types of collections.
>>>
>>> it should be designed to work with certain iterator categories.
>>>
>>>> The code that sends the
>>>> collection to the algorithm is the *only* code that knows the answers to
>>>> all
>>>> of the questions above, therefore it is the only code that should ever
>>>> decide "I should use this algorithm, I shouldn't use that algorithm."
>>>
>>> correct. you just have all of your other hypotheses jumbled.
>>>
>>> sorry dood don't be hatin' but there's so much you don't know i ain't gonna continue this. last word is yours. call me a pompous prick if you want. go ahead.
>>
>> I'll agree to drop this issue. There's little point in debating with
>> someone
>> whose arguments frequently consist of things like "You are wrong", "I'm
>> not
>> going to explain my point", and "dood don't be hatin'".
>
> I am with dan dee_girl & co on this issue, the problem is that a generic algorithm "knows" the types he is working on and can easily check the operations they have, and based on this decide the strategy to use. This choice works well if the presence of a given operation is also connected with some performance guarantee.
>

IMO, a better way to do that would be via C#-style attributes or equivalent named interfaces. I'm not sure if this is what you're referring to below or not.

> Concepts (or better categories (aldor concept not C++), that are interfaces for types, but interfaces that have to be explicitly assigned to a type) might relax this situation a little, but the need for some guarantees will remain.
>

If this "guarantee" (or mechanism for checking the types of operations that a collection supports) takes the form of a style guideline that says "don't implement opIndex for a collection if it would be O(n) or worse", then that, frankly, is absolutely no guarantee at all.

If you *really* need that sort of guarantee (and I can imagine it may be useful in some cases), then the implementation of the guarantee does *not* belong in the realm of "implements vs doesn't-implement a particular operator overload". Doing so is an abuse of operator overloading, since operator overloading is there for defining syntactic sugar, not for acting as a makeshift contract.

The correct mechanism for such guarantees is with named interfaces or C#-style attributes, as I mentioned above. True, that can still be abused if the collection author wants to, but they have to actually try (ie, they have to lie and say "implements IndexingInConstantTime" in addition to implementing opIndex). If you instead try to implement that guarantee with the "don't implement opIndex for a collection if it would be O(n) or worse" style-guideline, then it's far too easy for a collection to come along that is ignorant of that "pseudo-contract" and accidentally breaks it. Proper use of interfaces/attributes instead of relying on the existence or absence of an overloaded operator fixes that problem.
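
Something like this rough D sketch (all names, including IndexingInConstantTime from above, are purely illustrative, not an existing API):

// Hypothetical sketch: the guarantee lives in a named interface
// that the collection author must opt into explicitly.
interface IndexingInConstantTime(T)
{
    T opIndex(size_t i);
    size_t length();
}

class ArrayList(T) : IndexingInConstantTime!(T)
{
    private T[] data;
    this(T[] d) { data = d; }
    T opIndex(size_t i) { return data[i]; } // O(1), so declaring the interface is honest
    size_t length() { return data.length; }
}

// An algorithm that needs cheap indexing asks for the named guarantee,
// not merely for the existence of opIndex.
T middleElement(T)(IndexingInConstantTime!(T) c)
{
    return c[c.length() / 2]; // assumes a non-empty collection
}

A linked list simply wouldn't declare IndexingInConstantTime, so passing it to middleElement() fails at compile time instead of silently degrading to O(n).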


August 28, 2008
Dee Girl wrote:
> Chris R. Miller Wrote:
>> You honestly cannot write a spec for generic programming and expect uniform performance.
> 
> But this is what STL did. Sorry, Dee Girl

Reading back through the STL intro, it seems that all this STL power comes from the iterator.  Supposing I wrote a horrid iterator (sort of like the annoying O(n) opIndex previously discussed), I don't see why STL is "immune" to the same weakness of a slower data structure.

I can see how STL is more powerful in that you can pick and choose the algorithm to use, but at this point I think we're discussing changing the nature of the sort property in D at a fundamental level.

I still just don't see the (apparently obvious) advantage of STL. Disclaimer: I do not /know/ STL that well at all.  I came from Java with a ___brief___ dabbling in C/C++.  So I'm not trying to be annoying, stupid, or blind - I'm just ignorant of what you see that I don't.



August 28, 2008
"Dee Girl" <deegirl@noreply.com> wrote in message news:g94j7a$2875$1@digitalmars.com...
> Nick Sabalausky Wrote:
>
>> "Dee Girl" <deegirl@noreply.com> wrote in message news:g943oi$11f4$1@digitalmars.com...
>> > Benji Smith Wrote:
>> >
>> >> Dee Girl wrote:
>> >> > Michiel Helvensteijn Wrote:
>> >> >> That's simple. a[i] looks much nicer than a.nth(i).
>> >> >
>> >> > It is not nicer. It is more deceiving (correct spell?). If you look
>> >> > at
>> >> > code it looks like array code.
>> >> >
>> >> > foreach (i; 0 .. a.length)
>> >> > {
>> >> >     a[i] += 1;
>> >> > }
>> >> >
>> >> > For array works nice. But for list it is terrible! Many operations
>> >> > for
>> >> > incrementing only small list.
>> >>
>> >> Well, that's what you get with operator overloading.
>> >
>> > I am sorry. I disagree. I think that is what you get with bad design.
>> >
>> >> The same thing could be said for "+" or "-". They're inherently deceiving, because they look like builtin operations on primitive data types.
>> >>
>> >> For expensive operations (like performing division on an unlimited-precision decimal object), should the author of the code use "opDiv" or should he implement a separate "divide" function?
>> >
>> > The cost of + and - is proportional to digits in number. For small
>> > number
>> > of digits computer does fast in hardware. For many digits the cost
>> > grows.
>> > The number of digits is log n. I think + and - are fine for big
>> > integer. I
>> > am not surprise.
>> >
>> >> Forget opIndex for a moment, and ask the more general question about
>> >> all
>> >> overloaded operators. Should they imply any sort of asymptotic
>> >> complexity guarantee?
>> >
>> > I think depends on good design. For example I think ++ or -- for
>> > iterator.
>> > If it is O(n) it is bad design. Bad design make people say like you
>> > "This
>> > is what you get with operator overloading".
>> >
>> >> Personally, I don't think so.
>> >>
>> >> I don't like "nth".
>> >>
>> >> I'd rather use the opIndex. And if I'm using a linked list, I'll be aware of the fact that it'll exhibit linear-time indexing, and I'll be cautious about which algorithms to use.
>> >
>> > But inside algorithm you do not know if you use a linked list or a
>> > vector.
>> > You lost that information in bad abstraction. Also abstraction is bad
>> > because if you change data structure you have concept errors that still
>> > compile. And run until tomorrow ^_^.
>> >
>>
>> A generic algorithm has absolutely no business caring about the complexity
>> of
>> the collection it's operating on. If it does, then you've created a
>> concrete
>> algorithm, not a generic one.
>
> I appreciate your view point. Please allow me explain. The view point is in opposition with STL. In STL each algorithm defines what kind of iterator it operates with. And it requires what iterator complexity.
>
> I agree that other design can be made. But STL has that design. In my opinion is much part of what make STL so successful.
>
> I disagree that algorithm that knows complexity of iterator is concrete. I think exactly contrary. Maybe it is good that you read book about STL by Josuttis. STL algorithms are the most generic I ever find in any language. I hope std.algorithm in D will be better. But right now std.algorithm works only with array.
>
>> If an algorithm uses [] and doesn't know the
>> complexity of the []...good! It shouldn't know, and it shouldn't care.
>> It's
>> the code that sends the collection to the algorithm that knows and cares.
>
> I think this is mistake. Algorithm should know. Otherwise "linear find" is not "linear find"! It is "cuadratic find" (spell?). If you want to define something called linear find then you must know iterator complexity.
>

If a generic algorithm describes itself as "linear find" then I know damn well that it's referring to the behavior of *just* the function itself, and is not a statement that the function *combined* with the behavior of the collection and/or a custom comparison is always going to be O(n).

A question about STL: If I create a collection that, internally, is like a linked list, but starts each indexing operation from the position of the last indexing operation (so that a "find first" would run in O(n) instead of O(n*n)), is it possible to send that collection to STL's generic "linear find first"? I would argue that it should somehow be possible *even* if the STL's generic "linear find first" guarantees a *total* performance of O(n) (Since, in this case, it would still be O(n) anyway). Because otherwise, the STL wouldn't be very extendable, which would be a bad thing for a library of "generic" algorithms.
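
For what it's worth, here's roughly the kind of collection I mean, sketched in D (all names made up; this says nothing about what STL itself accepts, which is exactly the question):

class CachedList(T)
{
    private static class Node { T value; Node next; }
    private Node head;
    private Node lastNode;    // last node handed out by opIndex
    private size_t lastIndex; // its position

    void prepend(T value)
    {
        auto n = new Node;
        n.value = value;
        n.next = head;
        head = n;
        invalidate(); // any mutation makes the cached position stale
    }

    // Sequential access a[0], a[1], ... resumes from the cached node,
    // so a forward scan is O(n) total instead of O(n*n).
    T opIndex(size_t i)
    {
        Node n = head;
        size_t pos = 0;
        if (lastNode !is null && i >= lastIndex)
        {
            n = lastNode;
            pos = lastIndex;
        }
        for (; pos < i; ++pos)
            n = n.next;
        lastNode = n;
        lastIndex = i;
        return n.value;
    }

    private void invalidate() { lastNode = null; lastIndex = 0; }
}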

Another STL question: Is it possible to use STL to do a "linear find" using a custom comparison? If so, is it possible to make STL's "linear find" function use a comparison that just happens to be O(n)? If so, doesn't that violate the linear-time guarantee, too? If not, how does it know that the custom comparison is O(n) instead of O(1) or O(log n)?


August 28, 2008
"Nick Sabalausky" wrote
>> Concepts (or better categories (aldor concept not C++), that are interfaces for types, but interfaces that have to be explicitly assigned to a type) might relax this situation a little, but the need for some guarantees will remain.
>>
>
> If this "guarantee" (or mechanism for checking the types of operations that a collection supports) takes the form of a style guideline that says "don't implement opIndex for a collection if it would be O(n) or worse", then that, frankly, is absolutely no guarantee at all.

The guarantee is not enforced, but the expectation and convention are implicit.  When someone sees an index operator the first thought is that it is a quick lookup.  You can force yourself to think differently, but the reality is that most people think that because of the universal usage of square brackets (except for VB, and I feel pity for anyone who needs to use VB) to mean 'lookup by key', and usually this is only useful on objects where the lookup is quick ( < O(n) ).  Although there is no requirement, nor enforcement, the 'quick' contract is expected by the user, no matter how much docs you throw at them.

Look, for instance, at Tango's now-deprecated LinkMap, which uses a linked-list of key/value pairs (copied from Doug Lea's implementation). Nobody in their right mind would use link map because lookups are O(n), and it's just as easy to use a TreeMap or HashMap.  Would you ever use it?

> If you *really* need that sort of guarantee (and I can imagine it may be useful in some cases), then the implementation of the guarantee does *not* belong in the realm of "implements vs doesn't-implement a particular operator overload". Doing so is an abuse of operator overloading, since operator overloading is there for defining syntactic sugar, not for acting as a makeshift contract.

I don't think anybody is asking for a guarantee from the compiler or any specific tool.  I think what we are saying is that violating the 'opIndex is fast' notion is bad design because you end up with users thinking they are doing something that's quick.  You end up with people posting benchmarks on your containers saying 'why does python beat the pants off your list implementation?'.  You can say 'hey, it's not meant to be used that way', but then why can the user use it that way?  A better design is to nudge the user into using a correct container for the job by only supporting operations that make sense on the collections.

And as far as operator semantic meaning, D's operators are purposely named after what they are supposed to do.  Notice that the operator for + is opAdd, not opPlus.  This is because opAdd is supposed to mean you are performing an addition operation.  Assigning a different semantic meaning is not disallowed by the compiler, but is considered bad design.  opIndex is supposed to be an index function, not a linear search.  It's not called opSearch for a reason.  Sure you can redefine it however you want semantically, but it's considered bad design.  That's all we're saying.

-Steve


August 28, 2008
"Steven Schveighoffer" <schveiguy@yahoo.com> wrote in message news:g95b89$1jii$1@digitalmars.com...
>
> When someone sees an index operator the first thought is that it is a quick lookup.

This seems to be a big part of the disagreement. Personally, I think it's insane to look at [] and just assume it's a cheap lookup. That was true in pure C where, as someone else said, it was nothing more than a shorthand for a specific pointer arithmetic operation, but carrying that idea over to more modern languages (especially one that also uses [] for associative arrays) is a big mistake.

The '+' operator means "add". Addition is typically O(1). But vectors can be added, and that's an O(n) operation. Should opAdd never be used for vectors?
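
In D terms, a throwaway sketch (assuming a plain double[] representation; the type is made up for illustration):

// opAdd is the natural, expected overload for '+', yet it is O(n)
// in the number of elements -- and nobody considers that bad design.
struct Vector
{
    double[] elems;

    Vector opAdd(Vector other)
    {
        assert(elems.length == other.elems.length);
        Vector result;
        result.elems = new double[elems.length];
        foreach (i, e; elems)
            result.elems[i] = e + other.elems[i];
        return result;
    }
}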

> You can force yourself to think differently, but the reality is that most people think that because of the universal usage of square brackets (except for VB, and I feel pity for anyone who needs to use VB) to mean 'lookup by key', and usually this is only useful on objects where the lookup is quick ( < O(n) ).  Although there is no requirement, nor enforcement, the 'quick' contract is expected by the user, no matter how much docs you throw at them.
>
> Look, for instance, at Tango's now-deprecated LinkMap, which uses a linked-list of key/value pairs (copied from Doug Lea's implementation). Nobody in their right mind would use link map because lookups are O(n), and it's just as easy to use a TreeMap or HashMap.  Would you ever use it?
>
>> If you *really* need that sort of guarantee (and I can imagine it may be useful in some cases), then the implementation of the guarantee does *not* belong in the realm of "implements vs doesn't-implement a particular operator overload". Doing so is an abuse of operator overloading, since operator overloading is there for defining syntactic sugar, not for acting as a makeshift contract.
>
> I don't think anybody is asking for a guarantee from the compiler or any specific tool.  I think what we are saying is that violating the 'opIndex is fast' notion is bad design because you end up with users thinking they are doing something that's quick.  You end up with people posting benchmarks on your containers saying 'why does python beat the pants off your list implementation?'.  You can say 'hey, it's not meant to be used that way', but then why can the user use it that way?  A better design is to nudge the user into using a correct container for the job by only supporting operations that make sense on the collections.
>
> And as far as operator semantic meaning, D's operators are purposely named after what they are supposed to do.  Notice that the operator for + is opAdd, not opPlus.  This is because opAdd is supposed to mean you are performing an addition operation.  Assigning a different semantic meaning is not disallowed by the compiler, but is considered bad design.  opIndex is supposed to be an index function, not a linear search.  It's not called opSearch for a reason.  Sure you can redefine it however you want semantically, but it's considered bad design.  That's all we're saying.
>

Nobody is suggesting using [] to invoke a search (Although we have talked about using [] *in the search function's implementation*). Search means you want to get the position of a given element, or in other words, "element" -> search -> "key/index". What we're talking about is the reverse: getting the element at a given position, ie, "key/index" -> [] -> "element". It doesn't matter if it's an array, a linked list, or a super-duper-collection-from-2092: that's still indexing, not searching.


August 28, 2008
Chris R. Miller wrote:
> Dee Girl wrote:
>> Chris R. Miller Wrote:
>>> You honestly cannot write a spec for generic programming and expect
>>> uniform performance.
>> But this is what STL did. Sorry, Dee Girl
> 
> Reading back through the STL intro, it seems that all this STL power
> comes from the iterator.  Supposing I wrote a horrid iterator (sort of
> like the annoying O(n) opIndex previously discussed) I don't see why STL
> is "immune" to the same weakness of a slower data structure.
> 
> I can see how STL is more powerful in that you can pick and choose the
> algorithm to use, but at this point I think we're discussing changing
> the nature of the sort property in D at a fundamental level.
> 
> I still just don't see the (apparently obvious) advantage of STL.

Read Alexander Stepanov's notes. They are fantastic.

http://www.stepanovpapers.com/notes.pdf
August 28, 2008
Nick Sabalausky wrote:
> "Dee Girl" <deegirl@noreply.com> wrote in message news:g94j7a$2875$1@digitalmars.com...
>> Nick Sabalausky Wrote:
>>
>>> "Dee Girl" <deegirl@noreply.com> wrote in message
>>> news:g943oi$11f4$1@digitalmars.com...
>>>> Benji Smith Wrote:
>>>>
>>>>> Dee Girl wrote:
>>>>>> Michiel Helvensteijn Wrote:
>>>>>>> That's simple. a[i] looks much nicer than a.nth(i).
>>>>>> It is not nicer. It is more deceiving (correct spell?). If you look at
>>>>>> code it looks like array code.
>>>>>>
>>>>>> foreach (i; 0 .. a.length)
>>>>>> {
>>>>>>     a[i] += 1;
>>>>>> }
>>>>>>
>>>>>> For array works nice. But for list it is terrible! Many operations for
>>>>>> incrementing only small list.
>>>>> Well, that's what you get with operator overloading.
>>>> I am sorry. I disagree. I think that is what you get with bad design.
>>>>
>>>>> The same thing could be said for "+" or "-". They're inherently
>>>>> deceiving, because they look like builtin operations on primitive data
>>>>> types.
>>>>>
>>>>> For expensive operations (like performing division on an
>>>>> unlimited-precision decimal object), should the author of the code use
>>>>> "opDiv" or should he implement a separate "divide" function?
>>>> The cost of + and - is proportional to digits in number. For small number
>>>> of digits computer does fast in hardware. For many digits the cost grows.
>>>> The number of digits is log n. I think + and - are fine for big integer. I
>>>> am not surprise.
>>>>
>>>>> Forget opIndex for a moment, and ask the more general question about all
>>>>> overloaded operators. Should they imply any sort of asymptotic
>>>>> complexity guarantee?
>>>> I think depends on good design. For example I think ++ or -- for iterator.
>>>> If it is O(n) it is bad design. Bad design make people say like you "This
>>>> is what you get with operator overloading".
>>>>
>>>>> Personally, I don't think so.
>>>>>
>>>>> I don't like "nth".
>>>>>
>>>>> I'd rather use the opIndex. And if I'm using a linked list, I'll be
>>>>> aware of the fact that it'll exhibit linear-time indexing, and I'll be
>>>>> cautious about which algorithms to use.
>>>> But inside algorithm you do not know if you use a linked list or a vector.
>>>> You lost that information in bad abstraction. Also abstraction is bad
>>>> because if you change data structure you have concept errors that still
>>>> compile. And run until tomorrow ^_^.
>>>>
>>> A generic algorithm has absolutely no business caring about the complexity of
>>> the collection it's operating on. If it does, then you've created a concrete
>>> algorithm, not a generic one.
>> I appreciate your view point. Please allow me explain. The view point is in opposition with STL. In STL each algorithm defines what kind of iterator it operates with. And it requires what iterator complexity.
>>
>> I agree that other design can be made. But STL has that design. In my opinion is much part of what make STL so successful.
>>
>> I disagree that algorithm that knows complexity of iterator is concrete. I think exactly contrary. Maybe it is good that you read book about STL by Josuttis. STL algorithms are the most generic I ever find in any language. I hope std.algorithm in D will be better. But right now std.algorithm works only with array.
>>
>>> If an algorithm uses [] and doesn't know the
>>> complexity of the []...good! It shouldn't know, and it shouldn't care. It's
>>> the code that sends the collection to the algorithm that knows and cares.
>> I think this is mistake. Algorithm should know. Otherwise "linear find" is not "linear find"! It is "cuadratic find" (spell?). If you want to define something called linear find then you must know iterator complexity.
>>
> 
> If a generic algorithm describes itself as "linear find" then I know damn well that it's referring to the behavior of *just* the function itself, and is not a statement that the function *combined* with the behavior of the collection and/or a custom comparison is always going to be O(n).
> 
> A question about STL: If I create a collection that, internally, is like a linked list, but starts each indexing operation from the position of the last indexing operation (so that a "find first" would run in O(n) instead of O(n*n)), is it possible to send that collection to STL's generic "linear find first"? I would argue that it should somehow be possible *even* if the STL's generic "linear find first" guarantees a *total* performance of O(n) (Since, in this case, it would still be O(n) anyway). Because otherwise, the STL wouldn't be very extendable, which would be a bad thing for a library of "generic" algorithms.

Yes, it will work.

> Another STL question: Is it possible to use STL to do a "linear find" using a custom comparison? If so, is it possible to make STL's "linear find" function use a comparison that just happens to be O(n)? If so, doesn't that violate the linear-time guarantee, too? If not, how does it know that the custom comparison is O(n) instead of O(1) or O(log n)?

This will work too.

IF you follow the conventions THEN the STL gives you the guarantees.
August 28, 2008
"Don" <nospam@nospam.com.au> wrote in message news:g95ks5$2aon$1@digitalmars.com...
> Nick Sabalausky wrote:
>> "Dee Girl" <deegirl@noreply.com> wrote in message news:g94j7a$2875$1@digitalmars.com...
>>> I appreciate your view point. Please allow me explain. The view point is in opposition with STL. In STL each algorithm defines what kind of iterator it operates with. And it requires what iterator complexity.
>>>
>>> I agree that other design can be made. But STL has that design. In my opinion is much part of what make STL so successful.
>>>
>>> I disagree that algorithm that knows complexity of iterator is concrete. I think exactly contrary. Maybe it is good that you read book about STL by Josuttis. STL algorithms are the most generic I ever find in any language. I hope std.algorithm in D will be better. But right now std.algorithm works only with array.
>>>
>>>> If an algorithm uses [] and doesn't know the
>>>> complexity of the []...good! It shouldn't know, and it shouldn't care.
>>>> It's
>>>> the code that sends the collection to the algorithm that knows and
>>>> cares.
>>> I think this is mistake. Algorithm should know. Otherwise "linear find" is not "linear find"! It is "cuadratic find" (spell?). If you want to define something called linear find then you must know iterator complexity.
>>>
>>
>> If a generic algorithm describes itself as "linear find" then I know damn well that it's referring to the behavior of *just* the function itself, and is not a statement that the function *combined* with the behavior of the collection and/or a custom comparison is always going to be O(n).
>>
>> A question about STL: If I create a collection that, internally, is like a linked list, but starts each indexing operation from the position of the last indexing operation (so that a "find first" would run in O(n) instead of O(n*n)), is it possible to send that collection to STL's generic "linear find first"? I would argue that it should somehow be possible *even* if the STL's generic "linear find first" guarantees a *total* performance of O(n) (Since, in this case, it would still be O(n) anyway). Because otherwise, the STL wouldn't be very extendable, which would be a bad thing for a library of "generic" algorithms.
>
> Yes, it will work.
>
>> Another STL question: Is it possible to use STL to do a "linear find" using a custom comparison? If so, is it possible to make STL's "linear find" function use a comparison that just happens to be O(n)? If so, doesn't that violate the linear-time guarantee, too? If not, how does it know that the custom comparison is O(n) instead of O(1) or O(log n)?
>
> This will work too.
>
> IF you follow the conventions THEN the STL gives you the guarantees.

I'm not sure that's really a "guarantee" per se, but that's splitting hairs.

In any case, it sounds like we're all arguing more or less the same point:

Setting aside the issue of "should opIndex be used and when?", suppose I have the following collection interface and find function (not guaranteed to compile):

interface ICollection(T)
{
    T getElement(int index);
    int getSize();
}

// Linear search: its total cost depends on the cost of getElement().
int find(T)(ICollection!(T) c, T elem)
{
    for (int i = 0; i < c.getSize(); i++)
    {
        if (c.getElement(i) == elem)
            return i;
    }
    return -1; // not found
}

It sounds like STL's approach is to do something roughly like that and say:

"find()'s parameter 'c' should be an ICollection for which getElement() is O(1), in which case find() is guaranteed to be O(n)"

What I've been advocating is, again, doing something like the code above and saying:

"find()'s complexity is dependant on the complexity of the ICollection's getElement(). If getElement()'s complexity is O(m), then find()'s complexity is guaranteed to be O(m * n). Of course, this means that the only way to get ideal complexity from find() is to use an ICollection for which getElement() is O(1)".

But, you see, those two statements are effectively equivalent.


August 28, 2008
On 2008-08-28 06:06:11 +0200, "Nick Sabalausky" <a@a.a> said:

> "Fawzi Mohamed" <fmohamed@mac.com> wrote in message
> news:g94k2b$2a1e$1@digitalmars.com...
>> 
>> I am with dan dee_girl & co on this issue, the problem is that a generic
>> algorithm "knows" the types he is working on and can easily check the
>> operations they have, and based on this decide the strategy to use. This
>> choice works well if the presence of a given operation is also connected
>> with some performance guarantee.
>> 
> 
> IMO, a better way to do that would be via C#-style attributes or equivalent
> named interfaces. I'm not sure if this is what you're referring to below or
> not.

yes, categories are basically named interfaces for types; unlike constraints (which just check whether something is implemented), one has to explicitly say that T implements Interface (and obviously T has to have all the requested functions/methods).
This should be available for all types, and you should also be able to request the existence of free functions, not only of methods.
Attributes are a simplified version of this (basically no checks for a given interface).
The important thing is that presence or absence of attributes in a given type is not automatically inferred from the presence of given functions.
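
In (hypothetical) D-ish terms, the difference is between a constraint inferred from the mere presence of an operation and one that the type's author has to state explicitly (names are illustrative):

interface ConstantTimeIndexing {} // explicit "category": the author opts in

// inferred: accepts anything that happens to compile with c[0], fast or slow
void algoInferred(C)(C c)
    if (is(typeof(C.init[0])))
{
    // ...
}

// declared: accepts only types whose author claimed the guarantee
void algoDeclared(C)(C c)
    if (is(C : ConstantTimeIndexing))
{
    // ...
}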

>> Concepts (or better categories (aldor concept not C++), that are
>> interfaces for types, but interfaces that have to be explicitly assigned
>> to a type) might relax this situation a little, but the need for some
>> guarantees will remain.
>> 
> 
> If this "guarantee" (or mechanism for checking the types of operations that
> a collection supports) takes the form of a style guideline that says "don't
> implement opIndex for a collection if it would be O(n) or worse", then that,
> frankly, is absolutely no guarantee at all.

well if it is in the spec and everybody knows it, then breaking it and getting bad behavior is your own fault.

> If you *really* need that sort of guarantee (and I can imagine it may be
> useful in some cases), then the implementation of the guarantee does *not*
> belong in the realm of "implements vs doesn't-implement a particular
> operator overload". Doing so is an abuse of operator overloading, since
> operator overloading is there for defining syntactic sugar, not for acting
> as a makeshift contract.
> 
> The correct mechanism for such guarantees is with named interfaces or
> C#-style attributes, as I mentioned above. True, that can still be abused if
> the collection author wants to, but they have to actually try (ie, they have
> to lie and say "implements IndexingInConstantTime" in addition to
> implementing opIndex). If you instead try to implement that guarantee with
> the "don't implement opIndex for a collection if it would be O(n) or worse"
> style-guideline, then it's far too easy for a collection to come along that
> is ignorant of that "pseudo-contract" and accidentally breaks it. Proper
> use of interfaces/attributes instead of relying on the existence or absence
> of an overloaded operator fixes that problem.

I fully agree that with interfaces (or categories or attributes) the correct thing is to use them to enforce the extra constraints, so that overloading (or the naming of functions) really is just syntactic sugar. But even in this case, respecting a social contract about the speed of operations makes it easier to read code, to understand its complexity (speed), and to write reasonably fast code from the start.

Please note that these "guarantees" are not such that one cannot break them; their purpose is to make life easier for those who know and follow them, and for the whole community if it chooses to adopt them. As Steve just argued much more fully than me.

STL does it, and I think D should do it too. Should D get categories or attributes, these things could be relaxed a little, but I think there will still be cases where expecting a given function to have a given complexity will be a good thing to have.

It just makes thinking about the code easier, and makes it simpler to stay at a high level without surprises that you only discover by looking at the code in detail, and so it makes a programmer more productive.

Fawzi