August 28, 2008
Re: Why Strings as Classes?
Chris R. Miller Wrote:

> superdan wrote:
> > Chris R. Miller Wrote:
> >> Furthermore, you're faulting linked lists as having a bad opIndex.  Why
> >> not implement a cursor (Java LinkedList-like iterator) in the opIndex
> >> function?  Thus you could retain the reference to the last indexed
> >> location, and simply use that instead of the root node when calling
> >> opIndex.  Granted that whenever the contents of the list are modified
> >> that reference would have to be considered invalid (start from the root
> >> node again), but it'd work with an O(1) efficiency for sequential
> >> accesses from 0 to length.  True, it'll add another pointer to a node in
> >> memory, as well as a integer representing the position of that node
> >> reference.
> > 
> > you: "this scent will make skunk farts stink less."
> > me: "let's kick the gorram skunk outta here!"
> 
> I would imagine that you'll have a hard time convincing others that
> linked-lists are evil when you apparently have two broken shift keys.
> 
> >> Wrong.  It works.  That it's not precisely what the spec for sort
> >> dictates (which is probably in error, since no spec can guarantee a
> >> precise efficiency if it doesn't know the precise container type).
> > 
> > sure it can. in big oh.
> 
> Which is simply identifying the algorithm used by its efficiency.  If
> you're not familiar with the types of algorithms, it tells you the
> proximate efficiency of the algorithm used.  If you are familiar with
> algorithms, then you can identify the type of algorithm used so you can
> better leverage it to do what you want.
> 
> >>  You
> >> are also misinterpreting the spec.  It is saying that it uses a specific
> >> efficiency of algorithm, not that you can arbitrarily expect a certain
> >> efficiency out of it regardless of how dumb you might be with the choice
> >> of container you use.
> > 
> > in stl the spec says as i say. in d the spec is not precise. it should.
> 
> Yes, it probably should explicitly say that "sort uses the xxxxx
> algorithm, which gives a proximate efficiency of O(n log n) when used
> with optimal data structures."
> 
> You honestly cannot write a spec for generic programming and expect
> uniform performance.

But this is what STL did. Sorry, Dee Girl
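A concrete illustration of the point Dee Girl is making (a sketch, not code from the thread): the STL states sort's O(n log n) bound by demanding random-access iterators, so std::sort will not even compile for a linked list; std::list ships its own member sort() with the same bound instead.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// std::sort requires random-access iterators, which is how the spec
// encodes its O(n log n) promise. Handing it list iterators is a
// compile error, not a silent performance surprise.
std::vector<int> sort_vector(std::vector<int> v) {
    std::sort(v.begin(), v.end());   // O(n log n), guaranteed by the spec
    return v;
}

std::list<int> sort_list(std::list<int> l) {
    // std::sort(l.begin(), l.end()); // would not compile: bidirectional iterators
    l.sort();                        // the list's own O(n log n) merge sort
    return l;
}
```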
August 28, 2008
Re: Why Strings as Classes?
"Fawzi Mohamed" <fmohamed@mac.com> wrote in message 
news:g94k2b$2a1e$1@digitalmars.com...
> On 2008-08-27 23:27:52 +0200, "Nick Sabalausky" <a@a.a> said:
>
>> "superdan" <super@dan.org> wrote in message
>> news:g94g3e$20e9$1@digitalmars.com...
>>> Nick Sabalausky Wrote:
>>>
>>>> "Dee Girl" <deegirl@noreply.com> wrote in message
>>>> news:g943oi$11f4$1@digitalmars.com...
>>>>> Benji Smith Wrote:
>>>>>
>>>>>> Dee Girl wrote:
>>>>>>> Michiel Helvensteijn Wrote:
>>>>>>>> That's simple. a[i] looks much nicer than a.nth(i).
>>>>>>>
>>>>>>> It is not nicer. It is more deceiving (correct spell?). If you look
>>>>>>> at
>>>>>>> code it looks like array code.
>>>>>>>
>>>>>>> foreach (i; 0 .. a.length)
>>>>>>> {
>>>>>>> a[i] += 1;
>>>>>>> }
>>>>>>>
>>>>>>> For array works nice. But for list it is terrible! Many operations
>>>>>>> for
>>>>>>> incrementing only small list.
>>>>>>
>>>>>> Well, that's what you get with operator overloading.
>>>>>
>>>>> I am sorry. I disagree. I think that is what you get with bad design.
>>>>>
>>>>>> The same thing could be said for "+" or "-". They're inherently
>>>>>> deceiving, because they look like builtin operations on primitive 
>>>>>> data
>>>>>> types.
>>>>>>
>>>>>> For expensive operations (like performing division on an
>>>>>> unlimited-precision decimal object), should the author of the code 
>>>>>> use
>>>>>> "opDiv" or should he implement a separate "divide" function?
>>>>>
>>>>> The cost of + and - is proportional to digits in number. For small
>>>>> number
>>>>> of digits computer does fast in hardware. For many digits the cost
>>>>> grows.
>>>>> The number of digits is log n. I think + and - are fine for big
>>>>> integer. I
>>>>> am not surprise.
>>>>>
>>>>>> Forget opIndex for a moment, and ask the more general question about
>>>>>> all
>>>>>> overloaded operators. Should they imply any sort of asymptotic
>>>>>> complexity guarantee?
>>>>>
>>>>> I think depends on good design. For example I think ++ or -- for
>>>>> iterator.
>>>>> If it is O(n) it is bad design. Bad design make people say like you
>>>>> "This
>>>>> is what you get with operator overloading".
>>>>>
>>>>>> Personally, I don't think so.
>>>>>>
>>>>>> I don't like "nth".
>>>>>>
>>>>>> I'd rather use the opIndex. And if I'm using a linked list, I'll be
>>>>>> aware of the fact that it'll exhibit linear-time indexing, and I'll 
>>>>>> be
>>>>>> cautious about which algorithms to use.
>>>>>
>>>>> But inside algorithm you do not know if you use a linked list or a
>>>>> vector.
>>>>> You lost that information in bad abstraction. Also abstraction is bad
>>>>> because if you change data structure you have concept errors that 
>>>>> still
>>>>> compile. And run until tomorrow ^_^.
>>>>>
>>>>
>>>> A generic algorithm has absolutely no business caring about the 
>>>> complexity
>>>> of
>>>> the collection it's operating on.
>>>
>>> absolutely. desperate. need. of. chanel.
>>>
>>>> If it does, then you've created a concrete
>>>> algorithm, not a generic one.
>>>
>>> sure you don't know what you're talking about. it is generic insofar as 
>>> it
>>> abstracts away the primitives it needs from its iterator. run don't walk
>>> and get a dose of stl.
>>>
>>>> If an algorithm uses [] and doesn't know the
>>>> complexity of the []...good! It shouldn't know, and it shouldn't care.
>>>
>>> nonsense. so wrong i won't even address it.
>>>
>>>> It's
>>>> the code that sends the collection to the algorithm that knows and 
>>>> cares.
>>>> Why? Because "what algorithm is best?" depends on far more than just 
>>>> what
>>>> type of collection is used. It depends on "Will the collection ever be
>>>> larger than X elements?". It depends on "Is it a standard textbook 
>>>> list,
>>>> or
>>>> does it use trick 1 and/or trick 2?". It depends on "Is it usually 
>>>> mostly
>>>> sorted or mostly random?". It depends on "What do I do with it most
>>>> often?
>>>> Sort, append, search, insert or delete?". And it depends on other 
>>>> things,
>>>> too.
>>>
>>> sure it does. problem is you have it backwards. types and algos tell you
>>> the theoretical properties. then you in knowledge of what's goin' on use
>>> the algorithm that does it for you in knowledge that the complexity 
>>> would
>>> work for your situation. or you encode your own specialized algorithm.
>>>
>>> thing is stl encoded the most general linear search. you can use it for
>>> linear searching everything. moreover it exactly said what's needed for 
>>> a
>>> linear search: a one-pass forward iterator aka input iterator.
>>>
>>> now to tie it with what u said: you know in your situation whether 
>>> linear
>>> find cuts the mustard. that don't change the nature of that fundamental
>>> algorithm. so you use it or use another. your choice. but find remains
>>> universal so long as it has access to the basics of one-pass iteration.
>>>
>>>> Using "[]" versus "nth()" can't tell the algorithm *any* of those 
>>>> things.
>>>
>>> doesn't have to.
>>>
>>>> But
>>>> those things *must* be known in order to make an accurate decision of 
>>>> "Is
>>>> this the right algorithm or not?"
>>>
>>> sure. you make them decision on the call site.
>>>
>>>> Therefore, a generic algorithm *cannot* ever
>>>> know for certain if it's the right algoritm *even* if you say "[]" 
>>>> means
>>>> "O(log n) or better".
>>>
>>> utterly wrong. poppycock. gobbledygook. nonsense. this is so far off i
>>> don't have time to even address. if you want to learn stuff go learn 
>>> stl.
>>> then we talk. if you want to teach me i guess we're done.
>>>
>>>> Therefore, the algorithm should not be designed to
>>>> only work with certain types of collections.
>>>
>>> it should be designed to work with certain iterator categories.
>>>
>>>> The code that sends the
>>>> collection to the algorithm is the *only* code that knows the answers to
>>>> all
>>>> of the questions above, therefore it is the only code that should ever
>>>> decide "I should use this algorithm, I shouldn't use that algorithm."
>>>
>>> correct. you just have all of your other hypotheses jumbled.
>>>
>>> sorry dood don't be hatin' but there's so much you don't know i ain't
>>> gonna continue this. last word is yours. call me a pompous prick if you
>>> want. go ahead.
>>
>> I'll agree to drop this issue. There's little point in debating with 
>> someone
>> whose arguments frequently consist of things like "You are wrong", "I'm 
>> not
>> going to explain my point", and "dood don't be hatin'".
>
> I am with dan dee_girl & co on this issue, the problem is that a generic 
> algorithm "knows" the types it is working on and can easily check the 
> operations they have, and based on this decide the strategy to use. This 
> choice works well if the presence of a given operation is also connected 
> with some performance guarantee.
>

IMO, a better way to do that would be via C#-style attributes or equivalent 
named interfaces. I'm not sure if this is what you're referring to below or 
not.

> Concepts (or better categories (aldor concept not C++), that are 
> interfaces for types, but interfaces that have to be explicitly assigned 
> to a type) might relax this situation a little, but the need for some 
> guarantees will remain.
>

If this "guarantee" (or mechanism for checking the types of operations that 
a collection supports) takes the form of a style guideline that says "don't 
implement opIndex for a collection if it would be O(n) or worse", then that, 
frankly, is absolutely no guarantee at all.

If you *really* need that sort of guarantee (and I can imagine it may be 
useful in some cases), then the implementation of the guarantee does *not* 
belong in the realm of "implements vs doesn't-implement a particular 
operator overload". Doing so is an abuse of operator overloading, since 
operator overloading is there for defining syntactic sugar, not for acting 
as a makeshift contract.

The correct mechanism for such guarantees is with named interfaces or 
C#-style attributes, as I mentioned above. True, that can still be abused if 
the collection author wants to, but they have to actually try (ie, they have 
to lie and say "implements IndexingInConstantTime" in addition to 
implementing opIndex). If you instead try to implement that guarantee with 
the "don't implement opIndex for a collection if it would be O(n) or worse" 
style-guideline, then it's far too easy for a collection to come along that 
is ignorant of that "pseudo-contract" and accidentally breaks it. Proper 
use of interfaces/attributes instead of relying on the existence or absence 
of an overloaded operator fixes that problem.
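The named-interface idea above could be sketched in C++ along these lines. The trait name is hypothetical, not an existing library facility; the point is that a container author must explicitly claim constant-time indexing rather than having it inferred from the mere presence of operator[].

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opt-in trait: the default is "no claim made".
template <typename C>
struct indexes_in_constant_time { static const bool value = false; };

// std::vector's author (here, us) explicitly makes the claim.
template <typename T>
struct indexes_in_constant_time<std::vector<T>> { static const bool value = true; };

// An algorithm that relies on cheap indexing can demand the claim,
// turning an accidental O(n) opIndex into a compile error.
template <typename C>
typename C::value_type middle_element(const C& c) {
    static_assert(indexes_in_constant_time<C>::value,
                  "container does not promise O(1) indexing");
    return c[c.size() / 2];
}
```

A container that lies in its claim can still break the contract, but as the post says, it has to lie actively rather than break the convention by accident.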
August 28, 2008
Re: Why Strings as Classes?
Dee Girl wrote:
> Chris R. Miller Wrote:
>> You honestly cannot write a spec for generic programming and expect
>> uniform performance.
> 
> But this is what STL did. Sorry, Dee Girl

Reading back through the STL intro, it seems that all this STL power
comes from the iterator.  Supposing I wrote a horrid iterator (sort of
like the annoying O(n) opIndex previously discussed) I don't see why STL
is "immune" to the same weakness of a slower data structure.

I can see how STL is more powerful in that you can pick and choose the
algorithm to use, but at this point I think we're discussing changing
the nature of the sort property in D at a fundamental level.

I still just don't see the (apparently obvious) advantage of STL.
Disclaimer: I do not /know/ STL that well at all.  I came from Java with
a ___brief___ dabbling in C/C++.  So I'm not trying to be annoying,
stupid, or blind - I'm just ignorant of what you see that I don't.
August 28, 2008
Re: Why Strings as Classes?
"Dee Girl" <deegirl@noreply.com> wrote in message 
news:g94j7a$2875$1@digitalmars.com...
> Nick Sabalausky Wrote:
>
>> "Dee Girl" <deegirl@noreply.com> wrote in message
>> news:g943oi$11f4$1@digitalmars.com...
>> > Benji Smith Wrote:
>> >
>> >> Dee Girl wrote:
>> >> > Michiel Helvensteijn Wrote:
>> >> >> That's simple. a[i] looks much nicer than a.nth(i).
>> >> >
>> >> > It is not nicer. It is more deceiving (correct spell?). If you look 
>> >> > at
>> >> > code it looks like array code.
>> >> >
>> >> > foreach (i; 0 .. a.length)
>> >> > {
>> >> >     a[i] += 1;
>> >> > }
>> >> >
>> >> > For array works nice. But for list it is terrible! Many operations 
>> >> > for
>> >> > incrementing only small list.
>> >>
>> >> Well, that's what you get with operator overloading.
>> >
>> > I am sorry. I disagree. I think that is what you get with bad design.
>> >
>> >> The same thing could be said for "+" or "-". They're inherently
>> >> deceiving, because they look like builtin operations on primitive data
>> >> types.
>> >>
>> >> For expensive operations (like performing division on an
>> >> unlimited-precision decimal object), should the author of the code use
>> >> "opDiv" or should he implement a separate "divide" function?
>> >
>> > The cost of + and - is proportional to digits in number. For small 
>> > number
>> > of digits computer does fast in hardware. For many digits the cost 
>> > grows.
>> > The number of digits is log n. I think + and - are fine for big 
>> > integer. I
>> > am not surprise.
>> >
>> >> Forget opIndex for a moment, and ask the more general question about 
>> >> all
>> >> overloaded operators. Should they imply any sort of asymptotic
>> >> complexity guarantee?
>> >
>> > I think depends on good design. For example I think ++ or -- for 
>> > iterator.
>> > If it is O(n) it is bad design. Bad design make people say like you 
>> > "This
>> > is what you get with operator overloading".
>> >
>> >> Personally, I don't think so.
>> >>
>> >> I don't like "nth".
>> >>
>> >> I'd rather use the opIndex. And if I'm using a linked list, I'll be
>> >> aware of the fact that it'll exhibit linear-time indexing, and I'll be
>> >> cautious about which algorithms to use.
>> >
>> > But inside algorithm you do not know if you use a linked list or a 
>> > vector.
>> > You lost that information in bad abstraction. Also abstraction is bad
>> > because if you change data structure you have concept errors that still
>> > compile. And run until tomorrow ^_^.
>> >
>>
>> A generic algorithm has absolutely no business caring about the complexity 
>> of
>> the collection it's operating on. If it does, then you've created a 
>> concrete
>> algorithm, not a generic one.
>
> I appreciate your view point. Please allow me explain. The view point is 
> in opposition with STL. In STL each algorithm defines what kind of 
> iterator it operates with. And it requires what iterator complexity.
>
> I agree that other design can be made. But STL has that design. In my 
> opinion is much part of what make STL so successful.
>
> I disagree that algorithm that knows complexity of iterator is concrete. I 
> think exactly contrary. Maybe it is good that you read book about STL by 
> Josuttis. STL algorithms are the most generic I ever find in any language. 
> I hope std.algorithm in D will be better. But right now std.algorithm 
> works only with array.
>
>> If an algorithm uses [] and doesn't know the
>> complexity of the []...good! It shouldn't know, and it shouldn't care. 
>> It's
>> the code that sends the collection to the algorithm that knows and cares.
>
> I think this is mistake. Algorithm should know. Otherwise "linear find" is 
> not "linear find"! It is "cuadratic find" (spell?). If you want to define 
> something called linear find then you must know iterator complexity.
>

If a generic algorithm describes itself as "linear find" then I know damn 
well that it's referring to the behavior of *just* the function itself, and 
is not a statement that the function *combined* with the behavior of the 
collection and/or a custom comparison is always going to be O(n).

A question about STL: If I create a collection that, internally, is like a 
linked list, but starts each indexing operation from the position of the 
last indexing operation (so that a "find first" would run in O(n) instead of 
O(n*n)), is it possible to send that collection to STL's generic "linear 
find first"? I would argue that it should somehow be possible *even* if the 
STL's generic "linear find first" guarantees a *total* performance of O(n) 
(Since, in this case, it would still be O(n) anyway). Because otherwise, the 
STL wouldn't be very extendable, which would be a bad thing for a library of 
"generic" algorithms.

Another STL question: It is possible to use STL to do a "linear find" using 
a custom comparison? If so, it is possible to make STL's "linear find" 
function use a comparison that just happens to be O(n)? If so, doesn't that 
violate the linear-time guarantee, too? If not, how does it know that the 
custom comparison is O(n) instead of O(1) or O(log n)?
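On the second question, a sketch of how this plays out with the real std::find_if: the predicate's own cost is invisible to the algorithm, because the linear guarantee is stated in *applications of the predicate* (at most last - first of them), not in total machine operations. An O(n) predicate therefore makes the whole call quadratic without violating the stated contract.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// std::find_if accepts any predicate; it promises at most a linear
// number of predicate applications and says nothing about what each
// application costs. That is how the complexity spec stays honest
// even with arbitrarily expensive user comparisons.
bool contains_long_string(const std::vector<std::string>& v, std::size_t min_len) {
    return std::find_if(v.begin(), v.end(),
                        [min_len](const std::string& s) { return s.size() >= min_len; })
           != v.end();
}
```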
August 28, 2008
Re: Why Strings as Classes?
"Nick Sabalausky" wrote
>> Concepts (or better categories (aldor concept not C++), that are 
>> interfaces for types, but interfaces that have to be explicitly assigned 
>> to a type) might relax this situation a little, but the need for some 
>> guarantees will remain.
>>
>
> If this "guarantee" (or mechanism for checking the types of operations 
> that a collection supports) takes the form of a style guideline that says 
> "don't implement opIndex for a collection if it would be O(n) or worse", 
> then that, frankly, is absolutely no guarantee at all.

The guarantee is not enforced, but the expectation and convention are 
implicit.  When someone sees an index operator the first thought is that it 
is a quick lookup.  You can force yourself to think differently, but the 
reality is that most people think that because of the universal usage of 
square brackets (except for VB, and I feel pity for anyone who needs to use 
VB) to mean 'lookup by key', and usually this is only useful on objects 
where the lookup is quick ( < O(n) ).  Although there is no requirement, nor 
enforcement, the 'quick' contract is expected by the user, no matter how 
much docs you throw at them.

Look, for instance, at Tango's now-deprecated LinkMap, which uses a 
linked-list of key/value pairs (copied from Doug Lea's implementation). 
Nobody in their right mind would use link map because lookups are O(n), and 
it's just as easy to use a TreeMap or HashMap.  Would you ever use it?

> If you *really* need that sort of guarantee (and I can imagine it may be 
> useful in some cases), then the implementation of the guarantee does *not* 
> belong in the realm of "implements vs doesn't-implement a particular 
> operator overload". Doing so is an abuse of operator overloading, since 
> operator overloading is there for defining syntactic sugar, not for acting 
> as a makeshift contract.

I don't think anybody is asking for a guarantee from the compiler or any 
specific tool.  I think what we are saying is that violating the 'opIndex is 
fast' notion is bad design because you end up with users thinking they are 
doing something that's quick.  You end up with people posting benchmarks on 
your containers saying 'why does python beat the pants off your list 
implementation?'.  You can say 'hey, it's not meant to be used that way', 
but then why can the user use it that way?  A better design is to nudge the 
user into using a correct container for the job by only supporting 
operations that make sense on the collections.

And as far as operator semantic meaning, D's operators are purposely named 
after what they are supposed to do.  Notice that the operator for + is 
opAdd, not opPlus.  This is because opAdd is supposed to mean you are 
performing an addition operation.  Assigning a different semantic meaning is 
not disallowed by the compiler, but is considered bad design.  opIndex is 
supposed to be an index function, not a linear search.  It's not called 
opSearch for a reason.  Sure you can redefine it however you want 
semantically, but it's considered bad design.  That's all we're saying.

-Steve
August 28, 2008
Re: Why Strings as Classes?
"Steven Schveighoffer" <schveiguy@yahoo.com> wrote in message 
news:g95b89$1jii$1@digitalmars.com...
>
> When someone sees an index operator the first thought is that it is a 
> quick lookup.

This seems to be a big part of the disagreement. Personally, I think it's 
insane to look at [] and just assume it's a cheap lookup. That was true in 
pure C where, as someone else said, it was nothing more than a shorthand for 
a specific pointer arithmetic operation, but carrying that idea over to 
more modern languages (especially one that also uses [] for associative 
arrays) is a big mistake.

The '+' operator means "add". Addition is typically O(1). But vectors can be 
added, and that's an O(n) operation. Should opAdd never be used for vectors?

> You can force yourself to think differently, but the reality is that most 
> people think that because of the universal usage of square brackets 
> (except for VB, and I feel pity for anyone who needs to use VB) to mean 
> 'lookup by key', and usually this is only useful on objects where the 
> lookup is quick ( < O(n) ).  Although there is no requirement, nor 
> enforcement, the 'quick' contract is expected by the user, no matter how 
> much docs you throw at them.
>
> Look, for instance, at Tango's now-deprecated LinkMap, which uses a 
> linked-list of key/value pairs (copied from Doug Lea's implementation). 
> Nobody in their right mind would use link map because lookups are O(n), 
> and it's just as easy to use a TreeMap or HashMap.  Would you ever use it?
>
>> If you *really* need that sort of guarantee (and I can imagine it may be 
>> useful in some cases), then the implementation of the guarantee does 
>> *not* belong in the realm of "implements vs doesn't-implement a 
>> particular operator overload". Doing so is an abuse of operator 
>> overloading, since operator overloading is there for defining syntactic 
>> sugar, not for acting as a makeshift contract.
>
> I don't think anybody is asking for a guarantee from the compiler or any 
> specific tool.  I think what we are saying is that violating the 'opIndex 
> is fast' notion is bad design because you end up with users thinking they 
> are doing something that's quick.  You end up with people posting 
> benchmarks on your containers saying 'why does python beat the pants off 
> your list implementation?'.  You can say 'hey, it's not meant to be used 
> that way', but then why can the user use it that way?  A better design is 
> to nudge the user into using a correct container for the job by only 
> supporting operations that make sense on the collections.
>
> And as far as operator semantic meaning, D's operators are purposely named 
> after what they are supposed to do.  Notice that the operator for + is 
> opAdd, not opPlus.  This is because opAdd is supposed to mean you are 
> performing an addition operation.  Assigning a different semantic meaning 
> is not disallowed by the compiler, but is considered bad design.  opIndex 
> is supposed to be an index function, not a linear search.  It's not called 
> opSearch for a reason.  Sure you can redefine it however you want 
> semantically, but it's considered bad design.  That's all we're saying.
>

Nobody is suggesting using [] to invoke a search (Although we have talked 
about using [] *in the search function's implementation*). Search means you 
want to get the position of a given element, or in other words, "element" -> 
search -> "key/index". What we're talking about is the reverse: getting the 
element at a given position, ie, "key/index" -> [] -> "element". It doesn't 
matter if it's an array, a linked list, or a 
super-duper-collection-from-2092: that's still indexing, not searching.
August 28, 2008
Re: Why Strings as Classes?
Chris R. Miller wrote:
> Dee Girl wrote:
>> Chris R. Miller Wrote:
>>> You honestly cannot write a spec for generic programming and expect
>>> uniform performance.
>> But this is what STL did. Sorry, Dee Girl
> 
> Reading back through the STL intro, it seems that all this STL power
> comes from the iterator.  Supposing I wrote a horrid iterator (sort of
> like the annoying O(n) opIndex previously discussed) I don't see why STL
> is "immune" to the same weakness of a slower data structure.
> 
> I can see how STL is more powerful in that you can pick and choose the
> algorithm to use, but at this point I think we're discussing changing
> the nature of the sort property in D at a fundamental level.
> 
> I still just don't see the (apparently obvious) advantage of STL.

Read Alexander Stepanov's notes. They are fantastic.

http://www.stepanovpapers.com/notes.pdf
August 28, 2008
Re: Why Strings as Classes?
Nick Sabalausky wrote:
> "Dee Girl" <deegirl@noreply.com> wrote in message 
> news:g94j7a$2875$1@digitalmars.com...
>> Nick Sabalausky Wrote:
>>
>>> "Dee Girl" <deegirl@noreply.com> wrote in message
>>> news:g943oi$11f4$1@digitalmars.com...
>>>> Benji Smith Wrote:
>>>>
>>>>> Dee Girl wrote:
>>>>>> Michiel Helvensteijn Wrote:
>>>>>>> That's simple. a[i] looks much nicer than a.nth(i).
>>>>>> It is not nicer. It is more deceiving (correct spell?). If you look 
>>>>>> at
>>>>>> code it looks like array code.
>>>>>>
>>>>>> foreach (i; 0 .. a.length)
>>>>>> {
>>>>>>     a[i] += 1;
>>>>>> }
>>>>>>
>>>>>> For array works nice. But for list it is terrible! Many operations 
>>>>>> for
>>>>>> incrementing only small list.
>>>>> Well, that's what you get with operator overloading.
>>>> I am sorry. I disagree. I think that is what you get with bad design.
>>>>
>>>>> The same thing could be said for "+" or "-". They're inherently
>>>>> deceiving, because they look like builtin operations on primitive data
>>>>> types.
>>>>>
>>>>> For expensive operations (like performing division on an
>>>>> unlimited-precision decimal object), should the author of the code use
>>>>> "opDiv" or should he implement a separate "divide" function?
>>>> The cost of + and - is proportional to digits in number. For small 
>>>> number
>>>> of digits computer does fast in hardware. For many digits the cost 
>>>> grows.
>>>> The number of digits is log n. I think + and - are fine for big 
>>>> integer. I
>>>> am not surprise.
>>>>
>>>>> Forget opIndex for a moment, and ask the more general question about 
>>>>> all
>>>>> overloaded operators. Should they imply any sort of asymptotic
>>>>> complexity guarantee?
>>>> I think depends on good design. For example I think ++ or -- for 
>>>> iterator.
>>>> If it is O(n) it is bad design. Bad design make people say like you 
>>>> "This
>>>> is what you get with operator overloading".
>>>>
>>>>> Personally, I don't think so.
>>>>>
>>>>> I don't like "nth".
>>>>>
>>>>> I'd rather use the opIndex. And if I'm using a linked list, I'll be
>>>>> aware of the fact that it'll exhibit linear-time indexing, and I'll be
>>>>> cautious about which algorithms to use.
>>>> But inside algorithm you do not know if you use a linked list or a 
>>>> vector.
>>>> You lost that information in bad abstraction. Also abstraction is bad
>>>> because if you change data structure you have concept errors that still
>>>> compile. And run until tomorrow ^_^.
>>>>
>>> A generic algorithm has absolutely no business caring about the complexity 
>>> of
>>> the collection it's operating on. If it does, then you've created a 
>>> concrete
>>> algorithm, not a generic one.
>> I appreciate your view point. Please allow me explain. The view point is 
>> in opposition with STL. In STL each algorithm defines what kind of 
>> iterator it operates with. And it requires what iterator complexity.
>>
>> I agree that other design can be made. But STL has that design. In my 
>> opinion is much part of what make STL so successful.
>>
>> I disagree that algorithm that knows complexity of iterator is concrete. I 
>> think exactly contrary. Maybe it is good that you read book about STL by 
>> Josuttis. STL algorithms are the most generic I ever find in any language. 
>> I hope std.algorithm in D will be better. But right now std.algorithm 
>> works only with array.
>>
>>> If an algorithm uses [] and doesn't know the
>>> complexity of the []...good! It shouldn't know, and it shouldn't care. 
>>> It's
>>> the code that sends the collection to the algorithm that knows and cares.
>> I think this is mistake. Algorithm should know. Otherwise "linear find" is 
>> not "linear find"! It is "cuadratic find" (spell?). If you want to define 
>> something called linear find then you must know iterator complexity.
>>
> 
> If a generic algorithm describes itself as "linear find" then I know damn 
> well that it's referring to the behavior of *just* the function itself, and 
> is not a statement that the function *combined* with the behavior of the 
> collection and/or a custom comparison is always going to be O(n).
> 
> A question about STL: If I create a collection that, internally, is like a 
> linked list, but starts each indexing operation from the position of the 
> last indexing operation (so that a "find first" would run in O(n) instead of 
> O(n*n)), is it possible to send that collection to STL's generic "linear 
> find first"? I would argue that it should somehow be possible *even* if the 
> STL's generic "linear find first" guarantees a *total* performance of O(n) 
> (Since, in this case, it would still be O(n) anyway). Because otherwise, the 
> STL wouldn't be very extendable, which would be a bad thing for a library of 
> "generic" algorithms.

Yes, it will work.

> Another STL question: It is possible to use STL to do a "linear find" using 
> a custom comparison? If so, it is possible to make STL's "linear find" 
> function use a comparison that just happens to be O(n)? If so, doesn't that 
> violate the linear-time guarantee, too? If not, how does it know that the 
> custom comparison is O(n) instead of O(1) or O(log n)?

This will work too.

IF you follow the conventions THEN the STL gives you the guarantees.
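Don's "yes, it will work" can be demonstrated with a minimal homemade list iterator (names are illustrative, not from any library). std::find needs only dereference, increment, and comparison, plus the usual iterator_traits typedefs; it neither knows nor cares what data structure sits behind the iterator.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// A bare-bones singly linked list node and just enough of a forward
// iterator for std::find to accept it.
struct Node { int value; Node* next; };

struct NodeIter {
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = const int&;

    Node* p;
    const int& operator*() const { return p->value; }
    NodeIter& operator++() { p = p->next; return *this; }
    NodeIter operator++(int) { NodeIter t = *this; p = p->next; return t; }
    bool operator==(const NodeIter& o) const { return p == o.p; }
    bool operator!=(const NodeIter& o) const { return p != o.p; }
};

// The standard algorithm, applied to a container the STL has never heard of.
bool list_contains(Node* head, int x) {
    return std::find(NodeIter{head}, NodeIter{nullptr}, x) != NodeIter{nullptr};
}
```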
August 28, 2008
Re: Why Strings as Classes?
"Don" <nospam@nospam.com.au> wrote in message 
news:g95ks5$2aon$1@digitalmars.com...
> Nick Sabalausky wrote:
>> "Dee Girl" <deegirl@noreply.com> wrote in message 
>> news:g94j7a$2875$1@digitalmars.com...
>>> I appreciate your view point. Please allow me explain. The view point is 
>>> in opposition with STL. In STL each algorithm defines what kind of 
>>> iterator it operates with. And it requires what iterator complexity.
>>>
>>> I agree that other design can be made. But STL has that design. In my 
>>> opinion is much part of what make STL so successful.
>>>
>>> I disagree that algorithm that knows complexity of iterator is concrete. 
>>> I think exactly contrary. Maybe it is good that you read book about STL 
>>> by Josuttis. STL algorithms are the most generic I ever find in any 
>>> language. I hope std.algorithm in D will be better. But right now 
>>> std.algorithm works only with array.
>>>
>>>> If an algorithm uses [] and doesn't know the
>>>> complexity of the []...good! It shouldn't know, and it shouldn't care. 
>>>> It's
>>>> the code that sends the collection to the algorithm that knows and 
>>>> cares.
>>> I think this is mistake. Algorithm should know. Otherwise "linear find" 
>>> is not "linear find"! It is "cuadratic find" (spell?). If you want to 
>>> define something called linear find then you must know iterator 
>>> complexity.
>>>
>>
>> If a generic algorithm describes itself as "linear find" then I know damn 
>> well that it's referring to the behavior of *just* the function itself, 
>> and is not a statement that the function *combined* with the behavior of 
>> the collection and/or a custom comparison is always going to be O(n).
>>
>> A question about STL: If I create a collection that, internally, is like 
>> a linked list, but starts each indexing operation from the position of 
>> the last indexing operation (so that a "find first" would run in O(n) 
>> instead of O(n*n)), is it possible to send that collection to STL's 
>> generic "linear find first"? I would argue that it should somehow be 
>> possible *even* if the STL's generic "linear find first" guarantees a 
>> *total* performance of O(n) (Since, in this case, it would still be O(n) 
>> anyway). Because otherwise, the STL wouldn't be very extendable, which 
>> would be a bad thing for a library of "generic" algorithms.
>
> Yes, it will work.
>
>> Another STL question: It is possible to use STL to do a "linear find" 
>> using a custom comparison? If so, it is possible to make STL's "linear 
>> find" function use a comparison that just happens to be O(n)? If so, 
>> doesn't that violate the linear-time guarantee, too? If not, how does it 
>> know that the custom comparison is O(n) instead of O(1) or O(log n)?
>
> This will work too.
>
> IF you follow the conventions THEN the STL gives you the guarantees.

I'm not sure that's really a "guarantee" per se, but that's splitting hairs.

In any case, it sounds like we're all arguing more or less the same point:

Setting aside the issue of "should opIndex be used and when?", suppose I 
have the following collection interface and find function (not guaranteed to 
compile):

interface ICollection(T)
{
    T getElement(int index);
    int getSize();
}

int find(T)(ICollection!(T) c, T elem)
{
    for (int i = 0; i < c.getSize(); i++)
    {
        if (c.getElement(i) == elem)
            return i;
    }
    return -1; // not found
}

It sounds like STL's approach is to do something roughly like that and say:

"find()'s parameter 'c' should be an ICollection for which getElement() is 
O(1), in which case find() is guaranteed to be O(n)"

What I've been advocating is, again, doing something like the code above and 
saying:

"find()'s complexity is dependent on the complexity of the ICollection's 
getElement(). If getElement()'s complexity is O(m), then find()'s complexity 
is guaranteed to be O(m * n). Of course, this means that the only way to get 
ideal complexity from find() is to use an ICollection for which getElement() 
is O(1)".

But, you see, those two statements are effectively equivalent.
August 28, 2008
Re: Why Strings as Classes?
On 2008-08-28 06:06:11 +0200, "Nick Sabalausky" <a@a.a> said:

> "Fawzi Mohamed" <fmohamed@mac.com> wrote in message
> news:g94k2b$2a1e$1@digitalmars.com...
>> 
>> I am with dan dee_girl & co on this issue, the problem is that a generic
> algorithm "knows" the types it is working on and can easily check the
>> operations they have, and based on this decide the strategy to use. This
>> choice works well if the presence of a given operation is also connected
>> with some performance guarantee.
>> 
> 
> IMO, a better way to do that would be via C#-style attributes or equivalent
> named interfaces. I'm not sure if this is what you're referring to below or
> not.

Yes, categories are basically named interfaces for types. Unlike 
constraints (which just check whether something is implemented), one 
has to explicitly say that T implements the interface (and obviously T 
has to have all the requested functions/methods).
This should be available for all types, and you should also be able to 
request the existence of free functions, not only of methods.
Attributes are a simplified version of this (basically a named 
interface with no checks).
The important thing is that the presence or absence of an attribute on 
a given type is not automatically inferred from the presence of given 
functions.

>> Concepts (or better categories (aldor concept not C++), that are
>> interfaces for types, but interfaces that have to be explicitly assigned
>> to a type) might relax this situation a little, but the need for some
>> guarantees will remain.
>> 
> 
> If this "guarantee" (or mechanism for checking the types of operations that
> a collection supports) takes the form of a style guideline that says "don't
> implement opIndex for a collection if it would be O(n) or worse", then that,
> frankly, is absolutely no guarantee at all.

Well, if it is in the spec and everybody knows it, then breaking it and 
getting bad behavior is your own fault.

> If you *really* need that sort of guarantee (and I can imagine it may be
> useful in some cases), then the implementation of the guarantee does *not*
> belong in the realm of "implements vs doesn't-implement a particular
> operator overload". Doing so is an abuse of operator overloading, since
> operator overloading is there for defining syntactic sugar, not for acting
> as a makeshift contract.
> 
> The correct mechanism for such guarantees is with named interfaces or
> C#-style attributes, as I mentioned above. True, that can still be abused if
> the collection author wants to, but they have to actually try (ie, they have
> to lie and say "implements IndexingInConstantTime" in addition to
> implementing opIndex). If you instead try to implement that guarantee with
> the "don't implement opIndex for a collection if it would be O(n) or worse"
> style-guideline, then it's far too easy for a collection to come along that
> is ignorant of that "pseudo-contract" and accidentally breaks it. Proper
> use of interfaces/attributes instead of relying on the existence or absence
> of an overloaded operator fixes that problem.

I fully agree that with interfaces (or categories or attributes) the 
correct thing is to use them to enforce extra constraints, so that 
overloading (or naming the functions) is really just syntactic sugar. 
But even then, if some social contract about the speed of operations is 
respected, it makes code easier to read, easier to write reasonably 
fast from the beginning, and easier to understand the complexity 
(speed) of just by reading it.

Please note that these "guarantees" are not such that one cannot break 
them; their purpose is to make life easier for those who know and 
follow them, and for the whole community if it chooses to adopt them, 
as Steve just argued much more fully than I did.

STL does it, and I think D should do it too. Should D get categories or 
attributes, these things could be relaxed a little, but I think there 
will still be cases where expecting a given function to have a given 
complexity will be a good thing.

It just makes thinking about the code easier, and makes it simpler to 
stay at a high level without surprises that you have to dig into the 
code to find, and so it makes a programmer more productive.

Fawzi