Why Strings as Classes? (page 14) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Why Strings as Classes? (page 14)

August 27, 2008

Re: Why Strings as Classes?

Posted by Dee Girl
in reply to Chris R. Miller

Dee Girl

Posted in reply to Chris R. Miller

Chris R. Miller Wrote:

> superdan wrote:
> > Steven Schveighoffer Wrote:
> > 
> >> "superdan" wrote
> >>> Christopher Wright Wrote:
> >>>
> >>>> superdan wrote:
> >>>>>> class LinkedList(T)
> >>>>>> {
> >>>>>> Node!(T) head;
> >>>>>>
> >>>>>> /** Gets the ith item of the list. Throws: SIGSEGV if i >= length of
> >>>>>> the list. Time complexity: O(N) for a list of length N. This operation
> >>>>>> is provided for completeness and not recommended for frequent use in
> >>>>>> large lists. */
> >>>>>> T opIndex(int i)
> >>>>>> {
> >>>>>> auto current = head;
> >>>>>> while (i)
> >>>>>> {
> >>>>>> current = current.next;
> >>>>>> }
> >>>>>> return current.value;
> >>>>>> }
> >>>>>> }
> >>>>> oopsies. houston we got a problem here. problem is all that pluggable sort business works only if it can count on a constant time opIndex. why? because sort has right in its spec that it takes o(n log n) time. if u pass LinkedList to it you obtain a nonsensical design that compiles but runs afoul of the spec. because with that opIndex sort will run in quadratic time and no amount of commentin' is gonna save it from that.
> >>>> WRONG!
> >>>> Those sorting algorithms are correct. Their runtime is now O(n^2 log n)
> >>>> for this linked list.
> >>> you got the wrong definition of correct. sort became a non-scalable algo from a scalable algo. whacha gonna say next. bubble sort is viable?!?
> >> O(n^2 log n) is still considered solved.
> > 
> > of course. for traveling salesman that is. just not sorting. o(n^2 log n) sorting algo, that's  a no-go. notice also i didn't say solved/unsolved. i said scalable. stuff beyond n log n don't scale.
> 
> Why are people unable to have the brains to understand that they're using data structures in an suboptimal nature?
> 
> Furthermore, you're faulting linked lists as having a bad opIndex.  Why not implement a cursor (Java LinkedList-like iterator) in the opIndex function?  Thus you could retain the reference to the last indexed location, and simply use that instead of the root node when calling opIndex.  Granted that whenever the contents of the list are modified that reference would have to be considered invalid (start from the root node again), but it'd work with an O(1) efficiency for sequential accesses from 0 to length.  True, it'll add another pointer to a node in memory, as well as a integer representing the position of that node reference.

I am sorry to enter discussion. But I have some thing to say. Please do not scare me ^_^.

I think Super Dan choose wrong example sort. Because sort is O(n log n) even for list. But good example is find. Taken collection that gives length and opIndex abstraction. Then I write find easy with index. It is slow O(n*n) for list. But with optimization from Chris it is fast again. But if I want to write findLast. Find last element equal to some thing. Then I go back. But going back the optimization never works. I am again to O(n*n)!

This is important because abstraction. You want to write find abstract. Also you want write container abstract. And you want both to work together well. If you choose algorithm manually "you cheat". You break abstraction. Because you want abstract algorithm work on abstract container. Not concrete algorithm on concrete container.

Also is not only detail. When call findLast on container I expect better or worse depending on optimization of library. But I expect proportional with number of elements. If I know is O(n*n) maybe I want redesign. O(n*n) is really bad. 1000 elements is not many. But 1000000 operations is many.

I took two data structures classes. What each structure gives fast is essential. Not detail. I am not sure is clear what I say. Structures are special for certain operations. For example there is suffix tree. It is for fast common substring. Suffix tree must not have same interface as O(n*n) search. Because algorithm should not accept both. If you say list has random access it is naive I think (sorry!). Everybody in class could laugh. To find index in list is linear search. A similar example an array string[] can define overload a["abc"] to do linear search for "abc". But search is not indexing. It must be name search find or linearSearch.

August 27, 2008

Re: Why Strings as Classes?

Posted by Dee Girl
in reply to Michiel Helvensteijn

Dee Girl

Posted in reply to Michiel Helvensteijn

Michiel Helvensteijn Wrote:

> superdan wrote:
> 
> >> This of course means that a linked list cannot define opIndex, since a
> >> random access operation on it will take O(n) (there are tricks that can
> >> make it faster in most practical cases, but I digress).
> > 
> > it oughtn't. & you digress in the wrong direction. you can't prove a majority of "practical cases" will not suffer a performance hit.
> 
> Perhaps. It's been a while since I've worked with data-structures on this level, but I seem to remember there are ways.

I write example with findLast a little time a go. But there are many examples. For example move to front algorithm. It is linear but with "trick" opIndex it is O(n*n) even with optimization. Bad!

> What if you maintain a linked list of small arrays? Say each node in the
> list contains around log(n) of the elements in the entire list. Wouldn't
> that bring random access down to O(log n)? Of course, this would also bring
> insertions up to O(log n).
> 
> And what if you save an index/pointer pair after each access. Then with each
> new access request, you can choose from three locations to start walking:
> * The start of the list.
> * The end of the list.
> * The last access-point of the list.
> 
> In a lot of practical cases a new access is close to the last access. Of course, the general case would still be O(n).

Michiel-san, this is new data structure very different from list! If I want list I never use this structure. It is like joking. Because when you write this you agree that list is not vector.

> >> That, in turn, means that a linked list and a dynamic array can not share a common interface that includes opIndex.
> > 
> > so what. they can share a common interface that includes nth(). what exactly is yer problem with that.
> 
> That's simple. a[i] looks much nicer than a.nth(i).

It is not nicer. It is more deceiving (correct spell?). If you look at code it looks like array code.

foreach (i; 0 .. a.length)
{
    a[i] += 1;
}

For array works nice. But for list it is terrible! Many operations for incrementing only small list.

> By the way, I suspect that if opIndex is available only on arrays and nth() is available on all sequence types, algorithm writers will forget about opIndex and use nth(), to make their algorithm more widely compatible. And I wouldn't blame them, though I guess you would.

I do not agree with this. I am sorry! I think nobody should write find() that uses nth().

> >> * A list has faster insertions and growth.
> > 
> > o(1) insertion if u have the insertion point.  both list and vector have
> > o(1) growth.
> 
> Yeah, but dynamic arrays have to re-allocate once in a while. Lists don't.

Lists allocate memory each insert. Array allocate memory some time. With doubling cost of allocation+copy converge to zero.

> > we get to the point where we realize the two r fundamentally different structures built around fundamentally different tradeoffs. they do satisfy the same interface. just ain't the vector interface. it's a sequential-access interface. not a random-access interface.
> 
> I believe we agree in principle, but are just confused about each others definitions. If the "random-access interface" guarantees O(1) for nth/opIndex/whatever, of course you are right.
> 
> But if time-complexity is not taken into consideration, the sequential-access interface and the random-access interface are equivalent, no?

I think it is mistake to not taken into consideration time complexity. For basic data structures specially. I do not think you can call them equivalent.

> I'm not opposed to complexity guarantees in public contracts. Far from it, in fact. Just introduce both interfaces and let the algorithm writers choose which one to accept. But give both interfaces opIndex, since it's just good syntax.

I think is convenient syntax. Maybe too convenient ^_^.

> I do think it's a good idea for algorithms to support the interface with the weakest constraints (sequential-access). As long as they specify their time-complexity in terms of the complexities of the interface operations, not in absolute terms.
> 
> Then, when a programmer writes 'somealgorithm(someinterfaceobject)', a hypothetical analysis tool could tell him the time-complexity of the the resulting operation. The programmer might even assert a worst-case complexity at that point and the compiler could bail out if it doesn't match.

The specification I think is with types. If that works tool is the compiler.

> > even if u don't use c++ for its many faults. understand stl coz it's d shiznit.
> 
> I use C++. I use STL. I love both. But that doesn't mean there is no room for improvement. The STL is quite complex, and maybe it doesn't have to be.

Many things in STL can be better with D. But iterators and complexity is beautiful in STL.

August 27, 2008

Re: Why Strings as Classes?

Posted by Michiel Helvensteijn
in reply to Dee Girl

Michiel Helvensteijn

Posted in reply to Dee Girl

Dee Girl wrote:

>> What if you maintain a linked list of small arrays? Say each node in the
>> list contains around log(n) of the elements in the entire list. Wouldn't
>> that bring random access down to O(log n)? Of course, this would also
>> bring insertions up to O(log n).
>> 
>> And what if you save an index/pointer pair after each access. Then with
>> each new access request, you can choose from three locations to start
>> walking: * The start of the list.
>> * The end of the list.
>> * The last access-point of the list.
>> 
>> In a lot of practical cases a new access is close to the last access. Of course, the general case would still be O(n).
> 
> Michiel-san, this is new data structure very different from list! If I want list I never use this structure. It is like joking. Because when you write this you agree that list is not vector.

Yes, the first 'trick' makes it a different datastructure. The second does not. Would you still be opposed to using opIndex if its time-complexity is O(log n)?

>> That's simple. a[i] looks much nicer than a.nth(i).
> 
> It is not nicer. It is more deceiving (correct spell?). If you look at
> code it looks like array code.
> 
> foreach (i; 0 .. a.length)
> {
>     a[i] += 1;
> }
> 
> For array works nice. But for list it is terrible! Many operations for incrementing only small list.

With that second trick the loop would have the same complexity for lists.

But putting that aside for the moment, are you saying you would allow yourself to be deceived by a syntax detail? No, mentally attaching O(1) to the *subscripting operator* is simply a legacy from C, where it is syntactic sugar for pointer arithmetic.

>> By the way, I suspect that if opIndex is available only on arrays and nth() is available on all sequence types, algorithm writers will forget about opIndex and use nth(), to make their algorithm more widely compatible. And I wouldn't blame them, though I guess you would.
> 
> I do not agree with this. I am sorry! I think nobody should write find()
> that uses nth().

Of course not. Find should be written with an iterator, which has optimal complexity for both data-structures. My point is that an algorithm should be generic first and foremost. Then you use the operations that have the lowest complexity over all targeted data-structures if possible.

>> >> * A list has faster insertions and growth.
>> > 
>> > o(1) insertion if u have the insertion point.  both list and vector
>> > have o(1) growth.
>> 
>> Yeah, but dynamic arrays have to re-allocate once in a while. Lists don't.
> 
> Lists allocate memory each insert. Array allocate memory some time. With doubling cost of allocation+copy converge to zero.

Lists allocate memory for bare nodes, but never have to copy their elements. Arrays have to move their whole content to a larger memory location each time they are outgrown. For more complex data-types that means potentially very expensive copies.

>> But if time-complexity is not taken into consideration, the sequential-access interface and the random-access interface are equivalent, no?
> 
> I think it is mistake to not taken into consideration time complexity. For basic data structures specially. I do not think you can call them equivalent.

I did say 'if'. You have to agree that if you disregard complexity issues (for the sake of argument), the two ARE equivalent.

>> I do think it's a good idea for algorithms to support the interface with the weakest constraints (sequential-access). As long as they specify their time-complexity in terms of the complexities of the interface operations, not in absolute terms.
>> 
>> Then, when a programmer writes 'somealgorithm(someinterfaceobject)', a hypothetical analysis tool could tell him the time-complexity of the the resulting operation. The programmer might even assert a worst-case complexity at that point and the compiler could bail out if it doesn't match.
> 
> The specification I think is with types. If that works tool is the compiler.

But don't you understand that if this tool did exist, and the language had a standard notation for time/space-complexity, I could simply write:

sequence<T> s;
/* fill sequence */
sort(s);

And the compiler (in cooperation with this 'tool') could automatically find the most effective combination of data-structure and algorithm. The code would be more readable and efficient.

-- 
Michiel

August 27, 2008

Re: Why Strings as Classes?

Posted by Dee Girl
in reply to Michiel Helvensteijn

Dee Girl

Posted in reply to Michiel Helvensteijn

Michiel Helvensteijn Wrote:

> Dee Girl wrote:
> 
> >> What if you maintain a linked list of small arrays? Say each node in the
> >> list contains around log(n) of the elements in the entire list. Wouldn't
> >> that bring random access down to O(log n)? Of course, this would also
> >> bring insertions up to O(log n).
> >> 
> >> And what if you save an index/pointer pair after each access. Then with
> >> each new access request, you can choose from three locations to start
> >> walking: * The start of the list.
> >> * The end of the list.
> >> * The last access-point of the list.
> >> 
> >> In a lot of practical cases a new access is close to the last access. Of course, the general case would still be O(n).
> > 
> > Michiel-san, this is new data structure very different from list! If I want list I never use this structure. It is like joking. Because when you write this you agree that list is not vector.
> 
> Yes, the first 'trick' makes it a different datastructure. The second does not. Would you still be opposed to using opIndex if its time-complexity is O(log n)?

This is different question. And tricks are not answer for problem. Problem is list has other access method than array.

> >> That's simple. a[i] looks much nicer than a.nth(i).
> > 
> > It is not nicer. It is more deceiving (correct spell?). If you look at
> > code it looks like array code.
> > 
> > foreach (i; 0 .. a.length)
> > {
> >     a[i] += 1;
> > }
> > 
> > For array works nice. But for list it is terrible! Many operations for incrementing only small list.
> 
> With that second trick the loop would have the same complexity for lists.

Not for singly linked lists. I think name "trick" is very good. It is trick like prank to a friend. It does not do real thing. It only fools for few cases.

> But putting that aside for the moment, are you saying you would allow yourself to be deceived by a syntax detail? No, mentally attaching O(1) to the *subscripting operator* is simply a legacy from C, where it is syntactic sugar for pointer arithmetic.

I do not think so. I am sorry. If a[n] is not allowed then other array access primitive is allowed. Give index(a, n) as example. If language say index(a, n) is array access then it is big mistake for list to also define index(a, n). List maybe should define findAt(a, n). Then array also can define findAt(a, n). It is not mistake.

> >> By the way, I suspect that if opIndex is available only on arrays and nth() is available on all sequence types, algorithm writers will forget about opIndex and use nth(), to make their algorithm more widely compatible. And I wouldn't blame them, though I guess you would.
> > 
> > I do not agree with this. I am sorry! I think nobody should write find()
> > that uses nth().
> 
> Of course not. Find should be written with an iterator, which has optimal complexity for both data-structures. My point is that an algorithm should be generic first and foremost. Then you use the operations that have the lowest complexity over all targeted data-structures if possible.

Maybe I think "generic" word different than you. For me generic is that algorithm asks minimum from structure to do its work. For example find ask only one forward pass. Input iterator does one forward pass. It is mistake if find ask for index. It is also mistake if structure makes an algorithm think it has index as primitive operation.

> >> >> * A list has faster insertions and growth.
> >> > 
> >> > o(1) insertion if u have the insertion point.  both list and vector
> >> > have o(1) growth.
> >> 
> >> Yeah, but dynamic arrays have to re-allocate once in a while. Lists don't.
> > 
> > Lists allocate memory each insert. Array allocate memory some time. With doubling cost of allocation+copy converge to zero.
> 
> Lists allocate memory for bare nodes, but never have to copy their elements. Arrays have to move their whole content to a larger memory location each time they are outgrown. For more complex data-types that means potentially very expensive copies.

I think this is mistake. I think you should google "amortized complexity". Maybe that can help much.

> >> But if time-complexity is not taken into consideration, the sequential-access interface and the random-access interface are equivalent, no?
> > 
> > I think it is mistake to not taken into consideration time complexity. For basic data structures specially. I do not think you can call them equivalent.
> 
> I did say 'if'. You have to agree that if you disregard complexity issues (for the sake of argument), the two ARE equivalent.

But it is useless comparison. Comparison can not forget important aspect. If we ignore fractionary floating point is integer. If organism is not alive it is mostly water.

> >> I do think it's a good idea for algorithms to support the interface with the weakest constraints (sequential-access). As long as they specify their time-complexity in terms of the complexities of the interface operations, not in absolute terms.
> >> 
> >> Then, when a programmer writes 'somealgorithm(someinterfaceobject)', a hypothetical analysis tool could tell him the time-complexity of the the resulting operation. The programmer might even assert a worst-case complexity at that point and the compiler could bail out if it doesn't match.
> > 
> > The specification I think is with types. If that works tool is the compiler.
> 
> But don't you understand that if this tool did exist, and the language had a standard notation for time/space-complexity, I could simply write:
> 
> sequence<T> s;
> /* fill sequence */
> sort(s);
> 
> And the compiler (in cooperation with this 'tool') could automatically find the most effective combination of data-structure and algorithm. The code would be more readable and efficient.

Michiel-san, STL does that. Or I misunderstand you?

August 27, 2008

Re: Why Strings as Classes?

Posted by Fawzi Mohamed
in reply to bearophile

Fawzi Mohamed

Posted in reply to bearophile

On 2008-08-27 16:16:49 +0200, bearophile <bearophileHUGS@lycos.com> said:

> Fawzi Mohamed:
>> it also depend on
>> the language, if in a language it is clear that using the default
>> structure you are not supposed to append often, and if you really have
>> to do it you use a special method if you do, then the programs written
>> in that language will use that.
> 
> I agree. But then the D specs have to be updated to say that the append to built-in D arrays is a slow or very slow operation (not amortized O(1)), so people coming from the C++ STL, Python, Ruby, TCl, Lua, Lisp, Clean, Oz, ecc will not receive a bite from it.
> 
> 
>> D tries to do different things to mitigate the fact that appending is
>> difficult, maybe it could do more, but it will *never* be as efficient
>> as a structure that gives up that fact that an array has to be just a
>> chunk of contiguous memory.
> 
> I agree, a deque will probably be always faster in appending than a dynamic array.
> But I think D may do more here :-)
> 
> 
>> Now I find the choice of having the basic array being just a chunk of
>> contiguous memory, and that the overhead of the structure should be
>> minimal very reasonable for a system programming language that has to
>> interact with C
> 
> Note that both vector of the C++ STL and "list" of Python are generally (or always) implemented with a chunk of contiguous memory.
> 
> 
>> Clearly someone coming from lisp, any functional language or even
>> python, might disagree about what is requested from the "default" array
>> container.
>> It isn't that one is right and the other wrong, it is just a question
>> of priorities of the language, the feel of it, its style...
> 
> I agree, that's why I have suggested to collect statistics from real D code, and not from Lisp programs, to see what's the a good performance profile compromise for D built-in dynamic arrays :-) This means that I'll stop caring for a fast array append in D if most D programmers don't need fast appends much.
> 
> 
>> This does not mean that if improvements can be done without
>> compromising too much it shouldn't be done, just that failing some
>> benchmarks might be ok :)
> 
> Well, in my opinion that's okay, but there's a limit. So I think a built-in data structure has to be optimized for being flexible, so while not being very good in anything, it has to be not terrible in any commonly done operation.
> On the other hand, I can see what you say, that in a low level language a too much flexible data structure may be not fit as built-in, while a simpler and less flexible one may be fitter. You may be right and I may be wrong on this point :-)

well it is funny because now I am not too sure anymore that adding an extra field (pointing to the end of the reserved data, if the actual array is the "owner" of it and to before the start if it isn't and one should reallocate (i.e. for slices) is such a bad idea.

I started out with the idea "slightly improved C characteristics", and so for me it was clear that appending would be bad, and I was actually surprised (using it) in seeing that it was less bad that I supposed.
The way it is has the advantage of allowing bound checks, and a very little overhead, but it has no concept of "extra grow space".
So for me it was clear that if I wanted to insert something I had to make place first, and then insert it (keeping in mind the old size).

The vector approach (do know about the capacity) can be useful in high level code where one wants that a.length always is the length of the array (not maybe longer because of some reserved memory).

So maybe it is worthwhile, even if it will for sure add some bloat, I do not know...
For most of my code it will just add bloat, but probably a tolerable one

Fawzi

August 27, 2008

Re: Why Strings as Classes?

Posted by Simen Kjaeraas
in reply to Michiel Helvensteijn

Simen Kjaeraas

Posted in reply to Michiel Helvensteijn

Michiel Helvensteijn <nomail@please.com> wrote:

> a[i] looks much nicer than a.nth(i).

To me, this is one of the most important points here. I want a
language that seems to make sense, more than I want a language
that is by default very fast. When I write a short example
program, I want to write a[i] not a.getElementAtPosition(i).

D is known as a language that does the safe thing by default,
and you have to jump through some hoops to do the fast, unsafe
thing. I will claim that a[i] is the default, as it is what
we're used to, and looks better. a.nth(i),
a.getElementAtPosition(i) and whatever other ways one might
come up with, is jumping through hoops.

Just my 0.02 kr.

-- 
Simen

August 27, 2008

Re: Why Strings as Classes?

Posted by Michiel Helvensteijn
in reply to Dee Girl

Michiel Helvensteijn

Posted in reply to Dee Girl

Dee Girl wrote:

>> Yes, the first 'trick' makes it a different datastructure. The second does not. Would you still be opposed to using opIndex if its time-complexity is O(log n)?
> 
> This is different question. And tricks are not answer for problem. Problem is list has other access method than array.

And what's the answer?

>> With that second trick the loop would have the same complexity for lists.
> 
> Not for singly linked lists.

Yeah, also for singly linked lists.

>> But putting that aside for the moment, are you saying you would allow yourself to be deceived by a syntax detail? No, mentally attaching O(1) to the *subscripting operator* is simply a legacy from C, where it is syntactic sugar for pointer arithmetic.
> 
> I do not think so. I am sorry. If a[n] is not allowed then other array
> access primitive is allowed. Give index(a, n) as example. If language say
> index(a, n) is array access then it is big mistake for list to also define
> index(a, n). List maybe should define findAt(a, n). Then array also can
> define findAt(a, n). It is not mistake.

Yes, I agree for "array access". That term implies O(1), since it uses the
word "array". But I was argueing against the subscripting operator to be
forced into O(1).

>> Lists allocate memory for bare nodes, but never have to copy their elements. Arrays have to move their whole content to a larger memory location each time they are outgrown. For more complex data-types that means potentially very expensive copies.
> 
> I think this is mistake. I think you should google "amortized complexity". Maybe that can help much.

Amortized complexity has nothing to do with it. Dynamic arrays have to copy their elements and lists do not. It's as simple as that.

>> But don't you understand that if this tool did exist, and the language had a standard notation for time/space-complexity, I could simply write:
>> 
>> sequence<T> s;
>> /* fill sequence */
>> sort(s);
>> 
>> And the compiler (in cooperation with this 'tool') could automatically find the most effective combination of data-structure and algorithm. The code would be more readable and efficient.
> 
> Michiel-san, STL does that. Or I misunderstand you?

STL will choose the right sorting algorithm, given a specific data-structure. But I am saying it may be possible also for the data-structure to be automatically chosen, based on what the programmer does with it.

-- 
Michiel

August 27, 2008

Re: Why Strings as Classes?

Posted by superdan
in reply to Simen Kjaeraas

superdan

Posted in reply to Simen Kjaeraas

Simen Kjaeraas Wrote:

> Michiel Helvensteijn <nomail@please.com> wrote:
> 
> > a[i] looks much nicer than a.nth(i).
> 
> To me, this is one of the most important points here. I want a language that seems to make sense, more than I want a language that is by default very fast. When I write a short example program, I want to write a[i] not a.getElementAtPosition(i).

sure. you do so with arrays. i think you confuse "optimized-fast" with "complexity-fast".

> D is known as a language that does the safe thing by default, and you have to jump through some hoops to do the fast, unsafe thing. I will claim that a[i] is the default, as it is what we're used to, and looks better. a.nth(i), a.getElementAtPosition(i) and whatever other ways one might come up with, is jumping through hoops.

guess i missed the lesson teachin' array indexing was unsafe. you are looking at them wrong tradeoffs. it's not about slow and safe vs. fast and unsafe. safety's nothin' to do with all this. you're lookin' at bad design vs. good design of algos and data structs.

August 27, 2008

Re: Why Strings as Classes?

Posted by superdan
in reply to Dee Girl

superdan

Posted in reply to Dee Girl

Dee Girl Wrote:

> Michiel Helvensteijn Wrote:
> 
> > Dee Girl wrote:
> > 
> > >> What if you maintain a linked list of small arrays? Say each node in the
> > >> list contains around log(n) of the elements in the entire list. Wouldn't
> > >> that bring random access down to O(log n)? Of course, this would also
> > >> bring insertions up to O(log n).
> > >> 
> > >> And what if you save an index/pointer pair after each access. Then with
> > >> each new access request, you can choose from three locations to start
> > >> walking: * The start of the list.
> > >> * The end of the list.
> > >> * The last access-point of the list.
> > >> 
> > >> In a lot of practical cases a new access is close to the last access. Of course, the general case would still be O(n).
> > > 
> > > Michiel-san, this is new data structure very different from list! If I want list I never use this structure. It is like joking. Because when you write this you agree that list is not vector.
> > 
> > Yes, the first 'trick' makes it a different datastructure. The second does not. Would you still be opposed to using opIndex if its time-complexity is O(log n)?
> 
> This is different question. And tricks are not answer for problem. Problem is list has other access method than array.
> 
> > >> That's simple. a[i] looks much nicer than a.nth(i).
> > > 
> > > It is not nicer. It is more deceiving (correct spell?). If you look at
> > > code it looks like array code.
> > > 
> > > foreach (i; 0 .. a.length)
> > > {
> > >     a[i] += 1;
> > > }
> > > 
> > > For array works nice. But for list it is terrible! Many operations for incrementing only small list.
> > 
> > With that second trick the loop would have the same complexity for lists.
> 
> Not for singly linked lists. I think name "trick" is very good. It is trick like prank to a friend. It does not do real thing. It only fools for few cases.

guess i'll risk telling which. forward iteration. backward iteration. accessing first k. accessing last k. that's pretty much it. and first/last k are already available in standard list. all else is linear time. so forget about using that as an index table. a naive design at best.

> > But putting that aside for the moment, are you saying you would allow yourself to be deceived by a syntax detail? No, mentally attaching O(1) to the *subscripting operator* is simply a legacy from C, where it is syntactic sugar for pointer arithmetic.
> 
> I do not think so. I am sorry. If a[n] is not allowed then other array access primitive is allowed. Give index(a, n) as example. If language say index(a, n) is array access then it is big mistake for list to also define index(a, n). List maybe should define findAt(a, n). Then array also can define findAt(a, n). It is not mistake.

boils down to what's primitive access vs. what's actual algorithm. indexing in array is primitive. indexing in list is same algorithm as finding nth element anywhere - singly, doubly, file, you name it. so can't claim indexing is primitive for list.

> > >> By the way, I suspect that if opIndex is available only on arrays and nth() is available on all sequence types, algorithm writers will forget about opIndex and use nth(), to make their algorithm more widely compatible. And I wouldn't blame them, though I guess you would.
> > > 
> > > I do not agree with this. I am sorry! I think nobody should write find()
> > > that uses nth().
> > 
> > Of course not. Find should be written with an iterator, which has optimal complexity for both data-structures. My point is that an algorithm should be generic first and foremost. Then you use the operations that have the lowest complexity over all targeted data-structures if possible.
> 
> Maybe I think "generic" word different than you. For me generic is that algorithm asks minimum from structure to do its work. For example find ask only one forward pass. Input iterator does one forward pass. It is mistake if find ask for index. It is also mistake if structure makes an algorithm think it has index as primitive operation.
> 
> > >> >> * A list has faster insertions and growth.
> > >> > 
> > >> > o(1) insertion if u have the insertion point.  both list and vector
> > >> > have o(1) growth.
> > >> 
> > >> Yeah, but dynamic arrays have to re-allocate once in a while. Lists don't.
> > > 
> > > Lists allocate memory each insert. Array allocate memory some time. With doubling cost of allocation+copy converge to zero.
> > 
> > Lists allocate memory for bare nodes, but never have to copy their elements. Arrays have to move their whole content to a larger memory location each time they are outgrown. For more complex data-types that means potentially very expensive copies.
> 
> I think this is mistake. I think you should google "amortized complexity". Maybe that can help much.

to expand: array append is o(1) averaged over many appends if you double the capacity each time you need. interesting if you only add k complexity jumps to quadratic.

> > >> But if time-complexity is not taken into consideration, the sequential-access interface and the random-access interface are equivalent, no?
> > > 
> > > I think it is mistake to not taken into consideration time complexity. For basic data structures specially. I do not think you can call them equivalent.
> > 
> > I did say 'if'. You have to agree that if you disregard complexity issues (for the sake of argument), the two ARE equivalent.
> 
> But it is useless comparison. Comparison can not forget important aspect. If we ignore fractionary floating point is integer. If organism is not alive it is mostly water.

pwned if u ask me :D

August 27, 2008

Re: Why Strings as Classes?

Posted by Bill Baxter
in reply to Michiel Helvensteijn

Bill Baxter

Posted in reply to Michiel Helvensteijn

On Wed, Aug 27, 2008 at 9:49 AM, Michiel Helvensteijn <nomail@please.com> wrote:
> Dee Girl wrote:
>
>>> Yes, the first 'trick' makes it a different datastructure. The second does not. Would you still be opposed to using opIndex if its time-complexity is O(log n)?
>>
>> This is different question. And tricks are not answer for problem. Problem is list has other access method than array.
>
> And what's the answer?
>

The complexity of STL's std::map indexing operator is O(lg N).
So it is not the case even in the STL that [] *always* means O(1).

Plus, if the element is not found in the std::map when using [], it triggers an insertion which can mean an allocation, which means the upper bound for time required for an index operation is whatever the upper bound for 'new' is on your system.

But std::map is kind of an oddball case.  I think a lot of people are surprised to find that merely accessing an element can trigger allocation.  Not a great design in my opinion, precisely because it fails to have the behavior one would expect out of an [] operator.

--bb

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation