View mode: basic / threaded / horizontal-split · Log in · Help
August 27, 2008
Re: Why Strings as Classes?
Chris R. Miller Wrote:

> superdan wrote:
> > Steven Schveighoffer Wrote:
> > 
> >> "superdan" wrote
> >>> Christopher Wright Wrote:
> >>>
> >>>> superdan wrote:
> >>>>>> class LinkedList(T)
> >>>>>> {
> >>>>>> Node!(T) head;
> >>>>>>
> >>>>>> /** Gets the ith item of the list. Throws: SIGSEGV if i >= length of
> >>>>>> the list. Time complexity: O(N) for a list of length N. This operation
> >>>>>> is provided for completeness and not recommended for frequent use in
> >>>>>> large lists. */
> >>>>>> T opIndex(int i)
> >>>>>> {
> >>>>>> auto current = head;
> >>>>>> while (i)
> >>>>>> {
> >>>>>> current = current.next;
> >>>>>> }
> >>>>>> return current.value;
> >>>>>> }
> >>>>>> }
> >>>>> oopsies. houston we got a problem here. problem is all that pluggable 
> >>>>> sort business works only if it can count on a constant time opIndex. 
> >>>>> why? because sort has right in its spec that it takes o(n log n) time. 
> >>>>> if u pass LinkedList to it you obtain a nonsensical design that 
> >>>>> compiles but runs afoul of the spec. because with that opIndex sort 
> >>>>> will run in quadratic time and no amount of commentin' is gonna save it 
> >>>>> from that.
> >>>> WRONG!
> >>>> Those sorting algorithms are correct. Their runtime is now O(n^2 log n)
> >>>> for this linked list.
> >>> you got the wrong definition of correct. sort became a non-scalable algo 
> >>> from a scalable algo. whacha gonna say next. bubble sort is viable?!?
> >> O(n^2 log n) is still considered solved.
> > 
> > of course. for traveling salesman that is. just not sorting. o(n^2 log n) sorting algo, that's  a no-go. notice also i didn't say solved/unsolved. i said scalable. stuff beyond n log n don't scale.
> 
> Why are people unable to have the brains to understand that they're
> using data structures in an suboptimal nature?
> 
> Furthermore, you're faulting linked lists as having a bad opIndex.  Why
> not implement a cursor (Java LinkedList-like iterator) in the opIndex
> function?  Thus you could retain the reference to the last indexed
> location, and simply use that instead of the root node when calling
> opIndex.  Granted that whenever the contents of the list are modified
> that reference would have to be considered invalid (start from the root
> node again), but it'd work with an O(1) efficiency for sequential
> accesses from 0 to length.  True, it'll add another pointer to a node in
> memory, as well as a integer representing the position of that node
> reference.

I am sorry to enter discussion. But I have some thing to say. Please do not scare me ^_^.

I think Super Dan choose wrong example sort. Because sort is O(n log n) even for list. But good example is find. Taken collection that gives length and opIndex abstraction. Then I write find easy with index. It is slow O(n*n) for list. But with optimization from Chris it is fast again. But if I want to write findLast. Find last element equal to some thing. Then I go back. But going back the optimization never works. I am again to O(n*n)!

This is important because abstraction. You want to write find abstract. Also you want write container abstract. And you want both to work together well. If you choose algorithm manually "you cheat". You break abstraction. Because you want abstract algorithm work on abstract container. Not concrete algorithm on concrete container.

Also is not only detail. When call findLast on container I expect better or worse depending on optimization of library. But I expect proportional with number of elements. If I know is O(n*n) maybe I want redesign. O(n*n) is really bad. 1000 elements is not many. But 1000000 operations is many.

I took two data structures classes. What each structure gives fast is essential. Not detail. I am not sure is clear what I say. Structures are special for certain operations. For example there is suffix tree. It is for fast common substring. Suffix tree must not have same interface as O(n*n) search. Because algorithm should not accept both. If you say list has random access it is naive I think (sorry!). Everybody in class could laugh. To find index in list is linear search. A similar example an array string[] can define overload a["abc"] to do linear search for "abc". But search is not indexing. It must be name search find or linearSearch.
August 27, 2008
Re: Why Strings as Classes?
Michiel Helvensteijn Wrote:

> superdan wrote:
> 
> >> This of course means that a linked list cannot define opIndex, since a
> >> random access operation on it will take O(n) (there are tricks that can
> >> make it faster in most practical cases, but I digress).
> > 
> > it oughtn't. & you digress in the wrong direction. you can't prove a
> > majority of "practical cases" will not suffer a performance hit.
> 
> Perhaps. It's been a while since I've worked with data-structures on this
> level, but I seem to remember there are ways.

I write example with findLast a little time a go. But there are many examples. For example move to front algorithm. It is linear but with "trick" opIndex it is O(n*n) even with optimization. Bad!

> What if you maintain a linked list of small arrays? Say each node in the
> list contains around log(n) of the elements in the entire list. Wouldn't
> that bring random access down to O(log n)? Of course, this would also bring
> insertions up to O(log n).
> 
> And what if you save an index/pointer pair after each access. Then with each
> new access request, you can choose from three locations to start walking:
> * The start of the list.
> * The end of the list.
> * The last access-point of the list.
> 
> In a lot of practical cases a new access is close to the last access. Of
> course, the general case would still be O(n).

Michiel-san, this is new data structure very different from list! If I want list I never use this structure. It is like joking. Because when you write this you agree that list is not vector.

> >> That, in turn, means that a linked list and a dynamic array can not share
> >> a common interface that includes opIndex.
> > 
> > so what. they can share a common interface that includes nth(). what
> > exactly is yer problem with that.
> 
> That's simple. a[i] looks much nicer than a.nth(i).

It is not nicer. It is more deceiving (correct spell?). If you look at code it looks like array code.

foreach (i; 0 .. a.length)
{
   a[i] += 1;
}

For array works nice. But for list it is terrible! Many operations for incrementing only small list.

> By the way, I suspect that if opIndex is available only on arrays and nth()
> is available on all sequence types, algorithm writers will forget about
> opIndex and use nth(), to make their algorithm more widely compatible. And
> I wouldn't blame them, though I guess you would.

I do not agree with this. I am sorry! I think nobody should write find() that uses nth().

> >> * A list has faster insertions and growth.
> > 
> > o(1) insertion if u have the insertion point.  both list and vector have
> > o(1) growth.
> 
> Yeah, but dynamic arrays have to re-allocate once in a while. Lists don't.

Lists allocate memory each insert. Array allocate memory some time. With doubling cost of allocation+copy converge to zero. 

> > we get to the point where we realize the two r fundamentally different
> > structures built around fundamentally different tradeoffs. they do satisfy
> > the same interface. just ain't the vector interface. it's a
> > sequential-access interface. not a random-access interface.
> 
> I believe we agree in principle, but are just confused about each others
> definitions. If the "random-access interface" guarantees O(1) for
> nth/opIndex/whatever, of course you are right.
> 
> But if time-complexity is not taken into consideration, the
> sequential-access interface and the random-access interface are equivalent,
> no?

I think it is mistake to not taken into consideration time complexity. For basic data structures specially. I do not think you can call them equivalent.

> I'm not opposed to complexity guarantees in public contracts. Far from it,
> in fact. Just introduce both interfaces and let the algorithm writers
> choose which one to accept. But give both interfaces opIndex, since it's
> just good syntax.

I think is convenient syntax. Maybe too convenient ^_^.

> I do think it's a good idea for algorithms to support the interface with the
> weakest constraints (sequential-access). As long as they specify their
> time-complexity in terms of the complexities of the interface operations,
> not in absolute terms.
> 
> Then, when a programmer writes 'somealgorithm(someinterfaceobject)', a
> hypothetical analysis tool could tell him the time-complexity of the the
> resulting operation. The programmer might even assert a worst-case
> complexity at that point and the compiler could bail out if it doesn't
> match.

The specification I think is with types. If that works tool is the compiler.

> > even if u don't use c++ for its many faults. understand stl coz it's d
> > shiznit.
> 
> I use C++. I use STL. I love both. But that doesn't mean there is no room
> for improvement. The STL is quite complex, and maybe it doesn't have to be.

Many things in STL can be better with D. But iterators and complexity is beautiful in STL.
August 27, 2008
Re: Why Strings as Classes?
Dee Girl wrote:

>> What if you maintain a linked list of small arrays? Say each node in the
>> list contains around log(n) of the elements in the entire list. Wouldn't
>> that bring random access down to O(log n)? Of course, this would also
>> bring insertions up to O(log n).
>> 
>> And what if you save an index/pointer pair after each access. Then with
>> each new access request, you can choose from three locations to start
>> walking: * The start of the list.
>> * The end of the list.
>> * The last access-point of the list.
>> 
>> In a lot of practical cases a new access is close to the last access. Of
>> course, the general case would still be O(n).
> 
> Michiel-san, this is new data structure very different from list! If I
> want list I never use this structure. It is like joking. Because when you
> write this you agree that list is not vector.

Yes, the first 'trick' makes it a different datastructure. The second does
not. Would you still be opposed to using opIndex if its time-complexity is
O(log n)?

>> That's simple. a[i] looks much nicer than a.nth(i).
> 
> It is not nicer. It is more deceiving (correct spell?). If you look at
> code it looks like array code.
> 
> foreach (i; 0 .. a.length)
> {
>     a[i] += 1;
> }
> 
> For array works nice. But for list it is terrible! Many operations for
> incrementing only small list.

With that second trick the loop would have the same complexity for lists.

But putting that aside for the moment, are you saying you would allow
yourself to be deceived by a syntax detail? No, mentally attaching O(1) to
the *subscripting operator* is simply a legacy from C, where it is
syntactic sugar for pointer arithmetic.

>> By the way, I suspect that if opIndex is available only on arrays and
>> nth() is available on all sequence types, algorithm writers will forget
>> about opIndex and use nth(), to make their algorithm more widely
>> compatible. And I wouldn't blame them, though I guess you would.
> 
> I do not agree with this. I am sorry! I think nobody should write find()
> that uses nth().

Of course not. Find should be written with an iterator, which has optimal
complexity for both data-structures. My point is that an algorithm should
be generic first and foremost. Then you use the operations that have the
lowest complexity over all targeted data-structures if possible.

>> >> * A list has faster insertions and growth.
>> > 
>> > o(1) insertion if u have the insertion point.  both list and vector
>> > have o(1) growth.
>> 
>> Yeah, but dynamic arrays have to re-allocate once in a while. Lists
>> don't.
> 
> Lists allocate memory each insert. Array allocate memory some time. With
> doubling cost of allocation+copy converge to zero.

Lists allocate memory for bare nodes, but never have to copy their elements.
Arrays have to move their whole content to a larger memory location each
time they are outgrown. For more complex data-types that means potentially
very expensive copies.

>> But if time-complexity is not taken into consideration, the
>> sequential-access interface and the random-access interface are
>> equivalent, no?
> 
> I think it is mistake to not taken into consideration time complexity. For
> basic data structures specially. I do not think you can call them
> equivalent.

I did say 'if'. You have to agree that if you disregard complexity issues
(for the sake of argument), the two ARE equivalent.

>> I do think it's a good idea for algorithms to support the interface with
>> the weakest constraints (sequential-access). As long as they specify
>> their time-complexity in terms of the complexities of the interface
>> operations, not in absolute terms.
>> 
>> Then, when a programmer writes 'somealgorithm(someinterfaceobject)', a
>> hypothetical analysis tool could tell him the time-complexity of the the
>> resulting operation. The programmer might even assert a worst-case
>> complexity at that point and the compiler could bail out if it doesn't
>> match.
> 
> The specification I think is with types. If that works tool is the
> compiler.

But don't you understand that if this tool did exist, and the language had a
standard notation for time/space-complexity, I could simply write:

sequence<T> s;
/* fill sequence */
sort(s);

And the compiler (in cooperation with this 'tool') could automatically find
the most effective combination of data-structure and algorithm. The code
would be more readable and efficient.

-- 
Michiel
August 27, 2008
Re: Why Strings as Classes?
Michiel Helvensteijn Wrote:

> Dee Girl wrote:
> 
> >> What if you maintain a linked list of small arrays? Say each node in the
> >> list contains around log(n) of the elements in the entire list. Wouldn't
> >> that bring random access down to O(log n)? Of course, this would also
> >> bring insertions up to O(log n).
> >> 
> >> And what if you save an index/pointer pair after each access. Then with
> >> each new access request, you can choose from three locations to start
> >> walking: * The start of the list.
> >> * The end of the list.
> >> * The last access-point of the list.
> >> 
> >> In a lot of practical cases a new access is close to the last access. Of
> >> course, the general case would still be O(n).
> > 
> > Michiel-san, this is new data structure very different from list! If I
> > want list I never use this structure. It is like joking. Because when you
> > write this you agree that list is not vector.
> 
> Yes, the first 'trick' makes it a different datastructure. The second does
> not. Would you still be opposed to using opIndex if its time-complexity is
> O(log n)?

This is different question. And tricks are not answer for problem. Problem is list has other access method than array.  

> >> That's simple. a[i] looks much nicer than a.nth(i).
> > 
> > It is not nicer. It is more deceiving (correct spell?). If you look at
> > code it looks like array code.
> > 
> > foreach (i; 0 .. a.length)
> > {
> >     a[i] += 1;
> > }
> > 
> > For array works nice. But for list it is terrible! Many operations for
> > incrementing only small list.
> 
> With that second trick the loop would have the same complexity for lists.

Not for singly linked lists. I think name "trick" is very good. It is trick like prank to a friend. It does not do real thing. It only fools for few cases.

> But putting that aside for the moment, are you saying you would allow
> yourself to be deceived by a syntax detail? No, mentally attaching O(1) to
> the *subscripting operator* is simply a legacy from C, where it is
> syntactic sugar for pointer arithmetic.

I do not think so. I am sorry. If a[n] is not allowed then other array access primitive is allowed. Give index(a, n) as example. If language say index(a, n) is array access then it is big mistake for list to also define index(a, n). List maybe should define findAt(a, n). Then array also can define findAt(a, n). It is not mistake.

> >> By the way, I suspect that if opIndex is available only on arrays and
> >> nth() is available on all sequence types, algorithm writers will forget
> >> about opIndex and use nth(), to make their algorithm more widely
> >> compatible. And I wouldn't blame them, though I guess you would.
> > 
> > I do not agree with this. I am sorry! I think nobody should write find()
> > that uses nth().
> 
> Of course not. Find should be written with an iterator, which has optimal
> complexity for both data-structures. My point is that an algorithm should
> be generic first and foremost. Then you use the operations that have the
> lowest complexity over all targeted data-structures if possible.

Maybe I think "generic" word different than you. For me generic is that algorithm asks minimum from structure to do its work. For example find ask only one forward pass. Input iterator does one forward pass. It is mistake if find ask for index. It is also mistake if structure makes an algorithm think it has index as primitive operation.

> >> >> * A list has faster insertions and growth.
> >> > 
> >> > o(1) insertion if u have the insertion point.  both list and vector
> >> > have o(1) growth.
> >> 
> >> Yeah, but dynamic arrays have to re-allocate once in a while. Lists
> >> don't.
> > 
> > Lists allocate memory each insert. Array allocate memory some time. With
> > doubling cost of allocation+copy converge to zero.
> 
> Lists allocate memory for bare nodes, but never have to copy their elements.
> Arrays have to move their whole content to a larger memory location each
> time they are outgrown. For more complex data-types that means potentially
> very expensive copies.

I think this is mistake. I think you should google "amortized complexity". Maybe that can help much.

> >> But if time-complexity is not taken into consideration, the
> >> sequential-access interface and the random-access interface are
> >> equivalent, no?
> > 
> > I think it is mistake to not taken into consideration time complexity. For
> > basic data structures specially. I do not think you can call them
> > equivalent.
> 
> I did say 'if'. You have to agree that if you disregard complexity issues
> (for the sake of argument), the two ARE equivalent.

But it is useless comparison. Comparison can not forget important aspect. If we ignore fractionary floating point is integer. If organism is not alive it is mostly water.

> >> I do think it's a good idea for algorithms to support the interface with
> >> the weakest constraints (sequential-access). As long as they specify
> >> their time-complexity in terms of the complexities of the interface
> >> operations, not in absolute terms.
> >> 
> >> Then, when a programmer writes 'somealgorithm(someinterfaceobject)', a
> >> hypothetical analysis tool could tell him the time-complexity of the the
> >> resulting operation. The programmer might even assert a worst-case
> >> complexity at that point and the compiler could bail out if it doesn't
> >> match.
> > 
> > The specification I think is with types. If that works tool is the
> > compiler.
> 
> But don't you understand that if this tool did exist, and the language had a
> standard notation for time/space-complexity, I could simply write:
> 
> sequence<T> s;
> /* fill sequence */
> sort(s);
> 
> And the compiler (in cooperation with this 'tool') could automatically find
> the most effective combination of data-structure and algorithm. The code
> would be more readable and efficient.

Michiel-san, STL does that. Or I misunderstand you?
August 27, 2008
Re: Why Strings as Classes?
On 2008-08-27 16:16:49 +0200, bearophile <bearophileHUGS@lycos.com> said:

> Fawzi Mohamed:
>> it also depend on
>> the language, if in a language it is clear that using the default
>> structure you are not supposed to append often, and if you really have
>> to do it you use a special method if you do, then the programs written
>> in that language will use that.
> 
> I agree. But then the D specs have to be updated to say that the append 
> to built-in D arrays is a slow or very slow operation (not amortized 
> O(1)), so people coming from the C++ STL, Python, Ruby, TCl, Lua, Lisp, 
> Clean, Oz, ecc will not receive a bite from it.
> 
> 
>> D tries to do different things to mitigate the fact that appending is
>> difficult, maybe it could do more, but it will *never* be as efficient
>> as a structure that gives up that fact that an array has to be just a
>> chunk of contiguous memory.
> 
> I agree, a deque will probably be always faster in appending than a 
> dynamic array.
> But I think D may do more here :-)
> 
> 
>> Now I find the choice of having the basic array being just a chunk of
>> contiguous memory, and that the overhead of the structure should be
>> minimal very reasonable for a system programming language that has to
>> interact with C
> 
> Note that both vector of the C++ STL and "list" of Python are generally 
> (or always) implemented with a chunk of contiguous memory.
> 
> 
>> Clearly someone coming from lisp, any functional language or even
>> python, might disagree about what is requested from the "default" array
>> container.
>> It isn't that one is right and the other wrong, it is just a question
>> of priorities of the language, the feel of it, its style...
> 
> I agree, that's why I have suggested to collect statistics from real D 
> code, and not from Lisp programs, to see what's the a good performance 
> profile compromise for D built-in dynamic arrays :-) This means that 
> I'll stop caring for a fast array append in D if most D programmers 
> don't need fast appends much.
> 
> 
>> This does not mean that if improvements can be done without
>> compromising too much it shouldn't be done, just that failing some
>> benchmarks might be ok :)
> 
> Well, in my opinion that's okay, but there's a limit. So I think a 
> built-in data structure has to be optimized for being flexible, so 
> while not being very good in anything, it has to be not terrible in any 
> commonly done operation.
> On the other hand, I can see what you say, that in a low level language 
> a too much flexible data structure may be not fit as built-in, while a 
> simpler and less flexible one may be fitter. You may be right and I may 
> be wrong on this point :-)

well it is funny because now I am not too sure anymore that adding an 
extra field (pointing to the end of the reserved data, if the actual 
array is the "owner" of it and to before the start if it isn't and one 
should reallocate (i.e. for slices) is such a bad idea.

I started out with the idea "slightly improved C characteristics", and 
so for me it was clear that appending would be bad, and I was actually 
surprised (using it) in seeing that it was less bad that I supposed.
The way it is has the advantage of allowing bound checks, and a very 
little overhead, but it has no concept of "extra grow space".
So for me it was clear that if I wanted to insert something I had to 
make place first, and then insert it (keeping in mind the old size).

The vector approach (do know about the capacity) can be useful in high 
level code where one wants that a.length always is the length of the 
array (not maybe longer because of some reserved memory).

So maybe it is worthwhile, even if it will for sure add some bloat, I 
do not know...
For most of my code it will just add bloat, but probably a tolerable one

Fawzi
August 27, 2008
Re: Why Strings as Classes?
Michiel Helvensteijn <nomail@please.com> wrote:

> a[i] looks much nicer than a.nth(i).

To me, this is one of the most important points here. I want a
language that seems to make sense, more than I want a language
that is by default very fast. When I write a short example
program, I want to write a[i] not a.getElementAtPosition(i).

D is known as a language that does the safe thing by default,
and you have to jump through some hoops to do the fast, unsafe
thing. I will claim that a[i] is the default, as it is what
we're used to, and looks better. a.nth(i),
a.getElementAtPosition(i) and whatever other ways one might
come up with, is jumping through hoops.

Just my 0.02 kr.

-- 
Simen
August 27, 2008
Re: Why Strings as Classes?
Dee Girl wrote:

>> Yes, the first 'trick' makes it a different datastructure. The second
>> does not. Would you still be opposed to using opIndex if its
>> time-complexity is O(log n)?
> 
> This is different question. And tricks are not answer for problem. Problem
> is list has other access method than array.

And what's the answer?

>> With that second trick the loop would have the same complexity for lists.
> 
> Not for singly linked lists.

Yeah, also for singly linked lists.

>> But putting that aside for the moment, are you saying you would allow
>> yourself to be deceived by a syntax detail? No, mentally attaching O(1)
>> to the *subscripting operator* is simply a legacy from C, where it is
>> syntactic sugar for pointer arithmetic.
> 
> I do not think so. I am sorry. If a[n] is not allowed then other array
> access primitive is allowed. Give index(a, n) as example. If language say
> index(a, n) is array access then it is big mistake for list to also define
> index(a, n). List maybe should define findAt(a, n). Then array also can
> define findAt(a, n). It is not mistake.

Yes, I agree for "array access". That term implies O(1), since it uses the
word "array". But I was argueing against the subscripting operator to be
forced into O(1).

>> Lists allocate memory for bare nodes, but never have to copy their
>> elements. Arrays have to move their whole content to a larger memory
>> location each time they are outgrown. For more complex data-types that
>> means potentially very expensive copies.
> 
> I think this is mistake. I think you should google "amortized complexity".
> Maybe that can help much.

Amortized complexity has nothing to do with it. Dynamic arrays have to copy
their elements and lists do not. It's as simple as that.

>> But don't you understand that if this tool did exist, and the language
>> had a standard notation for time/space-complexity, I could simply write:
>> 
>> sequence<T> s;
>> /* fill sequence */
>> sort(s);
>> 
>> And the compiler (in cooperation with this 'tool') could automatically
>> find the most effective combination of data-structure and algorithm. The
>> code would be more readable and efficient.
> 
> Michiel-san, STL does that. Or I misunderstand you?

STL will choose the right sorting algorithm, given a specific
data-structure. But I am saying it may be possible also for the
data-structure to be automatically chosen, based on what the programmer
does with it.

-- 
Michiel
August 27, 2008
Re: Why Strings as Classes?
Simen Kjaeraas Wrote:

> Michiel Helvensteijn <nomail@please.com> wrote:
> 
> > a[i] looks much nicer than a.nth(i).
> 
> To me, this is one of the most important points here. I want a
> language that seems to make sense, more than I want a language
> that is by default very fast. When I write a short example
> program, I want to write a[i] not a.getElementAtPosition(i).

sure. you do so with arrays. i think you confuse "optimized-fast" with "complexity-fast".

> D is known as a language that does the safe thing by default,
> and you have to jump through some hoops to do the fast, unsafe
> thing. I will claim that a[i] is the default, as it is what
> we're used to, and looks better. a.nth(i),
> a.getElementAtPosition(i) and whatever other ways one might
> come up with, is jumping through hoops.

guess i missed the lesson teachin' array indexing was unsafe. you are looking at them wrong tradeoffs. it's not about slow and safe vs. fast and unsafe. safety's nothin' to do with all this. you're lookin' at bad design vs. good design of algos and data structs.
August 27, 2008
Re: Why Strings as Classes?
Dee Girl Wrote:

> Michiel Helvensteijn Wrote:
> 
> > Dee Girl wrote:
> > 
> > >> What if you maintain a linked list of small arrays? Say each node in the
> > >> list contains around log(n) of the elements in the entire list. Wouldn't
> > >> that bring random access down to O(log n)? Of course, this would also
> > >> bring insertions up to O(log n).
> > >> 
> > >> And what if you save an index/pointer pair after each access. Then with
> > >> each new access request, you can choose from three locations to start
> > >> walking: * The start of the list.
> > >> * The end of the list.
> > >> * The last access-point of the list.
> > >> 
> > >> In a lot of practical cases a new access is close to the last access. Of
> > >> course, the general case would still be O(n).
> > > 
> > > Michiel-san, this is new data structure very different from list! If I
> > > want list I never use this structure. It is like joking. Because when you
> > > write this you agree that list is not vector.
> > 
> > Yes, the first 'trick' makes it a different datastructure. The second does
> > not. Would you still be opposed to using opIndex if its time-complexity is
> > O(log n)?
> 
> This is different question. And tricks are not answer for problem. Problem is list has other access method than array.  
> 
> > >> That's simple. a[i] looks much nicer than a.nth(i).
> > > 
> > > It is not nicer. It is more deceiving (correct spell?). If you look at
> > > code it looks like array code.
> > > 
> > > foreach (i; 0 .. a.length)
> > > {
> > >     a[i] += 1;
> > > }
> > > 
> > > For array works nice. But for list it is terrible! Many operations for
> > > incrementing only small list.
> > 
> > With that second trick the loop would have the same complexity for lists.
> 
> Not for singly linked lists. I think name "trick" is very good. It is trick like prank to a friend. It does not do real thing. It only fools for few cases.

guess i'll risk telling which. forward iteration. backward iteration. accessing first k. accessing last k. that's pretty much it. and first/last k are already available in standard list. all else is linear time. so forget about using that as an index table. a naive design at best.

> > But putting that aside for the moment, are you saying you would allow
> > yourself to be deceived by a syntax detail? No, mentally attaching O(1) to
> > the *subscripting operator* is simply a legacy from C, where it is
> > syntactic sugar for pointer arithmetic.
> 
> I do not think so. I am sorry. If a[n] is not allowed then other array access primitive is allowed. Give index(a, n) as example. If language say index(a, n) is array access then it is big mistake for list to also define index(a, n). List maybe should define findAt(a, n). Then array also can define findAt(a, n). It is not mistake.

boils down to what's primitive access vs. what's actual algorithm. indexing in array is primitive. indexing in list is same algorithm as finding nth element anywhere - singly, doubly, file, you name it. so can't claim indexing is primitive for list.

> > >> By the way, I suspect that if opIndex is available only on arrays and
> > >> nth() is available on all sequence types, algorithm writers will forget
> > >> about opIndex and use nth(), to make their algorithm more widely
> > >> compatible. And I wouldn't blame them, though I guess you would.
> > > 
> > > I do not agree with this. I am sorry! I think nobody should write find()
> > > that uses nth().
> > 
> > Of course not. Find should be written with an iterator, which has optimal
> > complexity for both data-structures. My point is that an algorithm should
> > be generic first and foremost. Then you use the operations that have the
> > lowest complexity over all targeted data-structures if possible.
> 
> Maybe I think "generic" word different than you. For me generic is that algorithm asks minimum from structure to do its work. For example find ask only one forward pass. Input iterator does one forward pass. It is mistake if find ask for index. It is also mistake if structure makes an algorithm think it has index as primitive operation.
> 
> > >> >> * A list has faster insertions and growth.
> > >> > 
> > >> > o(1) insertion if u have the insertion point.  both list and vector
> > >> > have o(1) growth.
> > >> 
> > >> Yeah, but dynamic arrays have to re-allocate once in a while. Lists
> > >> don't.
> > > 
> > > Lists allocate memory each insert. Array allocate memory some time. With
> > > doubling cost of allocation+copy converge to zero.
> > 
> > Lists allocate memory for bare nodes, but never have to copy their elements.
> > Arrays have to move their whole content to a larger memory location each
> > time they are outgrown. For more complex data-types that means potentially
> > very expensive copies.
> 
> I think this is mistake. I think you should google "amortized complexity". Maybe that can help much.

to expand: array append is o(1) averaged over many appends if you double the capacity each time you need. interesting if you only add k complexity jumps to quadratic.

> > >> But if time-complexity is not taken into consideration, the
> > >> sequential-access interface and the random-access interface are
> > >> equivalent, no?
> > > 
> > > I think it is mistake to not taken into consideration time complexity. For
> > > basic data structures specially. I do not think you can call them
> > > equivalent.
> > 
> > I did say 'if'. You have to agree that if you disregard complexity issues
> > (for the sake of argument), the two ARE equivalent.
> 
> But it is useless comparison. Comparison can not forget important aspect. If we ignore fractionary floating point is integer. If organism is not alive it is mostly water.

pwned if u ask me :D
August 27, 2008
Re: Why Strings as Classes?
On Wed, Aug 27, 2008 at 9:49 AM, Michiel Helvensteijn <nomail@please.com> wrote:
> Dee Girl wrote:
>
>>> Yes, the first 'trick' makes it a different datastructure. The second
>>> does not. Would you still be opposed to using opIndex if its
>>> time-complexity is O(log n)?
>>
>> This is different question. And tricks are not answer for problem. Problem
>> is list has other access method than array.
>
> And what's the answer?
>

The complexity of STL's std::map indexing operator is O(lg N).
So it is not the case even in the STL that [] *always* means O(1).

Plus, if the element is not found in the std::map when using [], it
triggers an insertion which can mean an allocation, which means the
upper bound for time required for an index operation is whatever the
upper bound for 'new' is on your system.

But std::map is kind of an oddball case.  I think a lot of people are
surprised to find that merely accessing an element can trigger
allocation.  Not a great design in my opinion, precisely because it
fails to have the behavior one would expect out of an [] operator.

--bb
10 11 12 13 14 15 16 17 18
Top | Discussion index | About this forum | D home