View mode: basic / threaded / horizontal-split · Log in · Help
August 26, 2008
Re: Why Strings as Classes?
Walter Bright wrote:
> Yigal Chripun wrote:
>> a) people here said that a virtual call will make it slow. How much
>> slow? how much of an overhead is it on modern hardware considering also
>> that this is a place where hardware manufacturers spend time on
>> optimizations?
> 
> Virtual function calls have been a problem for hardware optimization. 
> Direct function calls can be speculatively executed, but not virtual 
> ones, because the hardware cannot predict where it will go. This means 
> virtual calls can be much slower than direct function calls.

What about for software optimization?

I seem to remember reading something about the Objective-C compiler 
maybe six or eight months ago talking about some of its optimization 
techniques.

Obj-C uses a message-passing idiom, and all messages use dynamic 
dispatch, since the list of messages an object can receive is not fixed 
at compile-time.

If I remember correctly, this article said that the dynamic dispatch 
expense only had to be incurred once, upon the first invocation of each 
message type. After that, the address of the appropriate function was 
re-written in memory, so that it pointed directly to the correct code. 
No more dynamic dispatch. Although the message handlers aren't resolved 
until runtime, once invoked, they'll always use the same target.

Or some thing like that.

It was an interesting read. I'll see if I can find it.

--benji
August 26, 2008
Re: Why Strings as Classes?
superdan wrote:
> Christopher Wright Wrote:
> 
>> superdan wrote:
>>>> class LinkedList(T)
>>>> {
>>>> 	Node!(T) head;
>>>>
>>>> 	/** Gets the ith item of the list. Throws: SIGSEGV if i >= length of 
>>>> the list. Time complexity: O(N) for a list of length N. This operation 
>>>> is provided for completeness and not recommended for frequent use in 
>>>> large lists. */
>>>> 	T opIndex(int i)
>>>> 	{
>>>> 		auto current = head;
>>>> 		while (i)
>>>> 		{
>>>> 			current = current.next;
>>>> 		}
>>>> 		return current.value;
>>>> 	}
>>>> }
>>> oopsies. houston we got a problem here. problem is all that pluggable sort business works only if it can count on a constant time opIndex. why? because sort has right in its spec that it takes o(n log n) time. if u pass LinkedList to it you obtain a nonsensical design that compiles but runs afoul of the spec. because with that opIndex sort will run in quadratic time and no amount of commentin' is gonna save it from that.
>> WRONG!
>> Those sorting algorithms are correct. Their runtime is now O(n^2 log n) 
>> for this linked list.
> 
> you got the wrong definition of correct. sort became a non-scalable algo from a scalable algo. whacha gonna say next. bubble sort is viable?!?

No, YOU got the wrong definition of correct. "Correct" and "scalabale" 
are different words. As are "correct" and "viable". In Java, I've been 
known to index into linked lists... usually ones with ~5 elements, but 
I've done it.

>>> this design is a stillborn.
>>>
>>> what needs done is to allow different kinds of containers implement different interfaces. in fact a better way to factor things is via iterators, as stl has shown.
>> You didn't ask for an O(1) opIndex on a linked list. You asked for a 
>> correct opIndex on a linked list.
> 
> the correct opIndex runs in o(1).

No, the SCALABLE opIndex runs in O(1). The CORRECT opIndex can run in 
O(n^n) and still be correct.

>> Any sorting algorithm you name that 
>> would work on an array would also work on this linked list. Admittedly, 
>> insertion sort would be much faster than qsort, but if your library 
>> provides that, you, knowing that you are using a linked list, would 
>> choose the insertion sort algorithm.
> 
> no. the initial idea was for a design that allows that cool mixing and matching gig thru interfaces without knowing what is where. but ur design leads to a lot of unworkable mixing and matching.

Again, WORKABLE, just not SCALABLE. You should wrap your head around 
this concept, since it's been around for about 30 years now.

> it is a stillborn design.

Tell that to Java, the world's most used programming language for new 
projects. Or C#, the world's fastest growing programming language. Or 
Tango, one of D's standard libraries.

> my point was opIndex should not be written for a list to begin with.

Yes it should be. Here's a fairly good example: Say you have a GUI 
control that displays a list and allows the user to insert or remove 
items from the list. It also allows the user to double-click on an item 
at a given position. Looking up what position maps to what item is an 
opIndex. Would this problem be better solved using an array (Vector)? 
Maybe. Luckily, if you used a List interface throughout your code, you 
can change one line, and it'll work wither way.

> wrong. you only need to define your abstract types appropriately. e.g. stl defines forward and random iterators. a forward iterators has ++ but no []. random iterator has both. so a random iterator can be substituted for a forward iterator. but not the other way. bottom line, sort won't compile on a forward iterator. your design allowed it to compile. which makes the design wrong.

STL happens to be one design and one world-view. It's a good one, but 
it's not the only one. My main problem with the STL is that it takes 
longer to learn than the Java/.NET standard libraries -- and thus the 
cost of a programmer who knows it is higher. But there are language 
considerations in there too, and this is a topic for another day.

> btw my respect for tango improved when i found no opIndex in their list container.

http://www.dsource.org/projects/tango/browser/trunk/tango/util/collection/LinkSeq.d#L176
August 26, 2008
Re: Why Strings as Classes?
Robert Fraser wrote:

> superdan wrote:

>> btw my respect for tango improved when i found no opIndex in their list
>> container.
> 
>
http://www.dsource.org/projects/tango/browser/trunk/tango/util/collection/LinkSeq.d#L176

This particular collection package is deprecated.

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
August 26, 2008
Re: Why Strings as Classes?
Lars Ivar Igesund wrote:
> Robert Fraser wrote:
> 
>> superdan wrote:
> 
>>> btw my respect for tango improved when i found no opIndex in their list
>>> container.
>>
> http://www.dsource.org/projects/tango/browser/trunk/tango/util/collection/LinkSeq.d#L176
> 
> This particular collection package is deprecated.

The new package has this feature too:
http://www.dsource.org/projects/tango/browser/trunk/tango/util/container/Slink.d#L248

It's a good feature to have (I wouldn't consider a list class complete 
without it), it just shouldn't be abused.
August 26, 2008
Re: Why Strings as Classes?
Reply to Robert,

> BCS wrote:
> 
>> Reply to Benji,
>> 
>>> The new JSON parser in the Tango library operates on templated
>>> string arrays. If I want to read from a file or a socket, I have to
>>> first slurp the whole thing into a character array, even though the
>>> character-streaming would be more practical.
>>> 
>> Unless you are only going to parse the start of the file or are going
>> to
>> be throwing away most of it *while you parse it, not after* The best
>> way
>> to parse a file is to load it all in one OS system call and then run
>> a
>> slicing parser (like the Tango XML parser) on that.
>> One memory allocation and one load or a mmap, and then only the meta
>> structures get allocated later.
> There are cases where you might want to parse an XML file that won't
> fit easily in main memory. I think a stream processing SAX parser
> would be a good addition (perhaps not replacement for) the exiting
> one.
> 

If you can't fit the data file in memory the I find it hard to believe you 
will be able to hold the parsed file in memory. If you can program the parser 
to dump unneeded data on the fly or process and discard the data, that might 
make a difference.
August 26, 2008
Re: Why Strings as Classes?
BCS wrote:
> Reply to Robert,
> 
>> BCS wrote:
>>
>>> Reply to Benji,
>>>
>>>> The new JSON parser in the Tango library operates on templated
>>>> string arrays. If I want to read from a file or a socket, I have to
>>>> first slurp the whole thing into a character array, even though the
>>>> character-streaming would be more practical.
>>>>
>>> Unless you are only going to parse the start of the file or are going
>>> to
>>> be throwing away most of it *while you parse it, not after* The best
>>> way
>>> to parse a file is to load it all in one OS system call and then run
>>> a
>>> slicing parser (like the Tango XML parser) on that.
>>> One memory allocation and one load or a mmap, and then only the meta
>>> structures get allocated later.
>> There are cases where you might want to parse an XML file that won't
>> fit easily in main memory. I think a stream processing SAX parser
>> would be a good addition (perhaps not replacement for) the exiting
>> one.
>>
> 
> If you can't fit the data file in memory the I find it hard to believe 
> you will be able to hold the parsed file in memory. If you can program 
> the parser to dump unneeded data on the fly or process and discard the 
> data, that might make a difference.

Well, for something like a DOM parser, it's pretty much impossible to 
parse a file that won't fit into memory. But a SAX parser doesn't 
actually create any objects. It just calls events, while processing XML 
data from a stream. A good SAX parser can operate without ever 
allocating anything on the heap, leaving the consumer to create any 
necessary objects from the parse process.

--benji
August 26, 2008
Re: Why Strings as Classes?
BCS wrote:
> Reply to Robert,
> 
>> BCS wrote:
>>
>>> Reply to Benji,
>>>
>>>> The new JSON parser in the Tango library operates on templated
>>>> string arrays. If I want to read from a file or a socket, I have to
>>>> first slurp the whole thing into a character array, even though the
>>>> character-streaming would be more practical.
>>>>
>>> Unless you are only going to parse the start of the file or are going
>>> to
>>> be throwing away most of it *while you parse it, not after* The best
>>> way
>>> to parse a file is to load it all in one OS system call and then run
>>> a
>>> slicing parser (like the Tango XML parser) on that.
>>> One memory allocation and one load or a mmap, and then only the meta
>>> structures get allocated later.
>> There are cases where you might want to parse an XML file that won't
>> fit easily in main memory. I think a stream processing SAX parser
>> would be a good addition (perhaps not replacement for) the exiting
>> one.
>>
> 
> If you can't fit the data file in memory the I find it hard to believe 
> you will be able to hold the parsed file in memory. If you can program 
> the parser to dump unneeded data on the fly or process and discard the 
> data, that might make a difference.

I think that's one of the reasons to use a streaming parser -- so you 
can dump data on the fly.
August 26, 2008
Re: Why Strings as Classes?
"Robert Fraser" <fraserofthenight@gmail.com> wrote in message 
news:g91fva$1if4$1@digitalmars.com...
> Lars Ivar Igesund wrote:
>> Robert Fraser wrote:
>>
>>> superdan wrote:
>>
>>>> btw my respect for tango improved when i found no opIndex in their list
>>>> container.
>>>
>> http://www.dsource.org/projects/tango/browser/trunk/tango/util/collection/LinkSeq.d#L176
>>
>> This particular collection package is deprecated.
>
> The new package has this feature too:
> http://www.dsource.org/projects/tango/browser/trunk/tango/util/container/Slink.d#L248
>
> It's a good feature to have (I wouldn't consider a list class complete 
> without it), it just shouldn't be abused.

First, Slink is not really the public interface, it is the unit that 
LinkedList (and other containers) use to build linked lists.

Second, LinkedList implements a 'lookup by index', through the get function, 
but note that it is not implemented as an opIndex function.  an opIndex 
implies fast lookup (at least < O(n)).

I don't think these functions were intended to be used in sorting routines.

-Steve
August 26, 2008
Re: Why Strings as Classes?
Reply to Benji,

> BCS wrote:
> 
>> Reply to Robert,
>> 
>>> BCS wrote:
>>> 
>>>> Reply to Benji,
>>>> 
>>>>> The new JSON parser in the Tango library operates on templated
>>>>> string arrays. If I want to read from a file or a socket, I have
>>>>> to first slurp the whole thing into a character array, even though
>>>>> the character-streaming would be more practical.
>>>>> 
>>>> Unless you are only going to parse the start of the file or are
>>>> going
>>>> to
>>>> be throwing away most of it *while you parse it, not after* The
>>>> best
>>>> way
>>>> to parse a file is to load it all in one OS system call and then
>>>> run
>>>> a
>>>> slicing parser (like the Tango XML parser) on that.
>>>> One memory allocation and one load or a mmap, and then only the
>>>> meta
>>>> structures get allocated later.
>>> There are cases where you might want to parse an XML file that won't
>>> fit easily in main memory. I think a stream processing SAX parser
>>> would be a good addition (perhaps not replacement for) the exiting
>>> one.
>>> 
>> If you can't fit the data file in memory the I find it hard to
>> believe you will be able to hold the parsed file in memory. If you
>> can program the parser to dump unneeded data on the fly or process
>> and discard the data, that might make a difference.
>> 
> Well, for something like a DOM parser, it's pretty much impossible to
> parse a file that won't fit into memory. But a SAX parser doesn't
> actually create any objects. It just calls events, while processing
> XML data from a stream. A good SAX parser can operate without ever
> allocating anything on the heap, leaving the consumer to create any
> necessary objects from the parse process.
> 
> --benji
> 

Interesting, I've worked with parsers* that function something like that 
but never thought of them in that way. OTOH I can think of only very limited 
domain where this would be useful. If I needed to process that much data 
I'd load it into a database and go from there.

*In fact my parser generator could be used that way.
August 26, 2008
Re: Why Strings as Classes?
Robert Fraser Wrote:

> superdan wrote:
> > Christopher Wright Wrote:
> > 
> >> superdan wrote:
> >>>> class LinkedList(T)
> >>>> {
> >>>> 	Node!(T) head;
> >>>>
> >>>> 	/** Gets the ith item of the list. Throws: SIGSEGV if i >= length of 
> >>>> the list. Time complexity: O(N) for a list of length N. This operation 
> >>>> is provided for completeness and not recommended for frequent use in 
> >>>> large lists. */
> >>>> 	T opIndex(int i)
> >>>> 	{
> >>>> 		auto current = head;
> >>>> 		while (i)
> >>>> 		{
> >>>> 			current = current.next;
> >>>> 		}
> >>>> 		return current.value;
> >>>> 	}
> >>>> }
> >>> oopsies. houston we got a problem here. problem is all that pluggable sort business works only if it can count on a constant time opIndex. why? because sort has right in its spec that it takes o(n log n) time. if u pass LinkedList to it you obtain a nonsensical design that compiles but runs afoul of the spec. because with that opIndex sort will run in quadratic time and no amount of commentin' is gonna save it from that.
> >> WRONG!
> >> Those sorting algorithms are correct. Their runtime is now O(n^2 log n) 
> >> for this linked list.
> > 
> > you got the wrong definition of correct. sort became a non-scalable algo from a scalable algo. whacha gonna say next. bubble sort is viable?!?
> 
> No, YOU got the wrong definition of correct. "Correct" and "scalabale" 
> are different words. As are "correct" and "viable". In Java, I've been 
> known to index into linked lists... usually ones with ~5 elements, but 
> I've done it.

yeah. for starters my dictionary fails to list "scalabale".

> >>> this design is a stillborn.
> >>>
> >>> what needs done is to allow different kinds of containers implement different interfaces. in fact a better way to factor things is via iterators, as stl has shown.
> >> You didn't ask for an O(1) opIndex on a linked list. You asked for a 
> >> correct opIndex on a linked list.
> > 
> > the correct opIndex runs in o(1).
> 
> No, the SCALABLE opIndex runs in O(1). The CORRECT opIndex can run in 
> O(n^n) and still be correct.

stepanov has shown that for composable operations the complexity must be part of the specification. otherwise composition easily leads to high-order polynomial that fail to terminate in reasonable time. opIndex is an indexing operator expected to run in constant time, and algorithms rely on that. so no. opIndex running in o(n^n) is incorrect because it fails it spec.

> >> Any sorting algorithm you name that 
> >> would work on an array would also work on this linked list. Admittedly, 
> >> insertion sort would be much faster than qsort, but if your library 
> >> provides that, you, knowing that you are using a linked list, would 
> >> choose the insertion sort algorithm.
> > 
> > no. the initial idea was for a design that allows that cool mixing and matching gig thru interfaces without knowing what is where. but ur design leads to a lot of unworkable mixing and matching.
> 
> Again, WORKABLE, just not SCALABLE. You should wrap your head around 
> this concept, since it's been around for about 30 years now.

guess would do you good to entertain just for a second the idea that i know what i'm talking about and you don't get me.

> > it is a stillborn design.
> 
> Tell that to Java, the world's most used programming language for new 
> projects. Or C#, the world's fastest growing programming language. Or 
> Tango, one of D's standard libraries.

here you hint you don't understand what i'm talking about indeed. neither of java, c#, or tango define a[n] to run in o(n). they define named functions, which i'm perfectly fine with.

> > my point was opIndex should not be written for a list to begin with.
> 
> Yes it should be. Here's a fairly good example: Say you have a GUI 
> control that displays a list and allows the user to insert or remove 
> items from the list. It also allows the user to double-click on an item 
> at a given position. Looking up what position maps to what item is an 
> opIndex. Would this problem be better solved using an array (Vector)? 
> Maybe. Luckily, if you used a List interface throughout your code, you 
> can change one line, and it'll work wither way.

funny you should mention that. window manager in windows 3.1 worked exactly like that. users noticed that the more windows the opened, the longer it took to open a new window. with new systems and more memory people would have many windows. before long this became a big issue. windows 95 fixed that.

never misunderestimate scalability.

> > wrong. you only need to define your abstract types appropriately. e.g. stl defines forward and random iterators. a forward iterators has ++ but no []. random iterator has both. so a random iterator can be substituted for a forward iterator. but not the other way. bottom line, sort won't compile on a forward iterator. your design allowed it to compile. which makes the design wrong.
> 
> STL happens to be one design and one world-view. It's a good one, but 
> it's not the only one. My main problem with the STL is that it takes 
> longer to learn than the Java/.NET standard libraries -- and thus the 
> cost of a programmer who knows it is higher. But there are language 
> considerations in there too, and this is a topic for another day.

cool. don't see how this all relates to the problem at hand.

> > btw my respect for tango improved when i found no opIndex in their list container.
> 
> http://www.dsource.org/projects/tango/browser/trunk/tango/util/collection/LinkSeq.d#L176

here you gently provide irrefutable proof you don't get what i'm sayin'. schveiguy did. the page fails to list opIndex. there is a function called 'get'. better yet. the new package lists a function 'nth' suggesting a linear walk to the nth element. good job tango fellas.
3 4 5 6 7 8 9 10 11
Top | Discussion index | About this forum | D home