Faster Virtual Method Dispatch (page 4) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Faster Virtual Method Dispatch (page 4)

April 25, 2006

Re: Faster Virtual Method Dispatch

Posted by Craig Black
in reply to Sean Kelly

Craig Black

Posted in reply to Sean Kelly

"Sean Kelly" <sean@f4.ca> wrote in message news:e2lpjj$hvv$1@digitaldaemon.com...
> In article <e2lhhq$6qv$1@digitaldaemon.com>, Craig Black says...
>>
>>
>>"Sean Kelly" <sean@f4.ca> wrote in message news:e2lgsh$5tf$1@digitaldaemon.com...
>>> Craig Black wrote:
>>>>
>>>> If you think this is doable, then you should write a C++ version of
>>>> wordcount based on this principal.  If you can get comparable
>>>> performance
>>>> to the D version then you may be on to something.
>>>
>>> I'd meant to do this and never got around to it.  However, the wordcount
>>> example is meant to compare performance using typical programming
>>> techniques for each language, and writing your own specialized
>>> containers
>>> isn't typical :-)
>>
>>That's not the point.  It demonstrates that the same performance can be
>>achieved without GC, and all the baggage that accompanies it.  So
>>performance-wise it would be like having your cake and eating it too.  If
>>this was possible, perhaps D could be streamlined to take advantage of
>>this
>>somehow.
>
> I merely meant that that's not the point of the wordcount example.
> However,
> I'll try and find the time to recode it using slices and hashtables in the
> next
> few days.  IIRC that brings performance in line with the D version, so I
> suppose
> that means you can "have your cake and eat it too."  It's worth
> mentioning,

Great!  I look forward to seeing the results.

> however, that slicing with explicit memory management doesn't have quite
> the
> utility it does with garbage collection.  It works great for apps where
> the
> entire buffer can be loaded en masse, as with the wordcount example, but
> in
> other cases it's a bit more complicated to manage the lifetime of the
> referenced
> data.  Thus the GC version is still a clear win in terms of ease of use,
> whether
> or not performance can be matched with specialized code elsewhere.

Different programmers have different problems that require different solutions.  I doubt that we will ever find one approach to make everyone happy about everything.  That's why I think GC should be completely optional.  Ideally, turning of GC could be as easy as specifying a compiler switch.

But maybe I'm wrong.  Maybe GC can be faster than manual memory management, but I doubt it.  Either way, we need to optimize this bottleneck somehow.

Although D currenty delivers competitive performance, and is in some ways is faster than C++, I still believe we have a few more steps to take before we can say with confidence that D provides performance superior to C++.

IMO, the fastest way to get extreme performance from D is (1) make GC optional and (2) provide a safe, easy way to take advantage of concurrency on multi-core systems.

(There is already work being done on #2 for C++, so we should try to beat them to the punch, and provide a superior solution for D.)

This would make D very attractive for performance-demanding applications.  I am especially thinking of the game industry, with which I am indirectly involved.  This is the kind of stuff that they are very concerned about.  If we do not provide a solution to these issues, I fear it will be a big deterrent for many, especially since migrating large projects to D from C++ requires a large investment.

-Craig

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by Walter Bright
in reply to Mike Capp

Walter Bright

Posted in reply to Mike Capp

Mike Capp wrote:
> In article <e2k12i$19bi$1@digitaldaemon.com>, Walter Bright says...
>> 1) for every allocation, shared_ptr<> allocates an *extra* block of memory to hold its accounting information. So you've got, out of the box, twice as many allocations/deallocations happening.
> Not true for intrusive refcounting. (This isn't always feasible if you're
> dealing with third-party classes, which is why shared_ptr isn't intrusive, but
> if you _can_ use it it's very close to ideal.)

True, but shared_ptr<> is billed as the C++ answer to gc, and shared_ptr<> is not intrusive.

>> 2) for every copy of the shared_ptr<>, like passing one to a function, returning one, etc., the reference count has to be incremented, then decremented.
> Again, not true for intrusive refcounting. You can assign an intrusive
> smartpointer from a raw pointer or reference, so most of the time you can use
> those to pass/return. Refcount twiddling is only needed when changing an
> "owning" pointer, plus the rare "safety" case where you have a raw pointer but
> are calling code which might drop the last "real" owning ref. 

But that means just going back to the old way of being very, very careful about who "owns" the pointer and who doesn't. Refcounting is supposed to eliminate that nuisance.

>> And even with all that falderol, shared_ptr<> still can't do slicing, which are an important efficiency improvement.
> Agreed, but couldn't you do something very similar with a simple ptr class
> containing a pointer to the (refcounted) base string plus a couple of range end
> indices? That's worked well for me in the past, but there may be benefits of D
> slices that I'm missing.

Sure you could do that, but then slices and shared_ptr's are not interchangeable. In D, you can build a data structure that contains slices, references to the heap, and references to static data willy-nilly. In C++, because each of those would be a different type, you'd have to build a wall between each.

I think that's one of the key benefits of D slicing - the user of the slice need not know or care that its a slice. I don't have to have two versions of toupper(), one for shared_ptr and one for slices.

>> Another advantage of gc (that isn't implemented, but could be,
>> in D's gc) is that allocated memory can be compacted.
> Has there ever been a language that successfully combined compacting GC with raw
> pointers? How would this work? And it's going to be tough to drop raw pointers
> without breaking ease-of-use for C libraries.

Oh, it's quite doable <g>.

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by Mike Capp
in reply to Walter Bright

Mike Capp

Posted in reply to Walter Bright

In article <e2mfjd$1f3d$1@digitaldaemon.com>, Walter Bright says...
>
>True, but shared_ptr<> is billed as the C++ answer to gc, and shared_ptr<> is not intrusive.

Shrug. I'm finding myself increasingly unconcerned these days with what's "billed as the C++ answer". I mean, iostreams?

>> Refcount twiddling is only needed when changing an
>> "owning" pointer, plus the rare "safety" case

>But that means just going back to the old way of being very, very careful about who "owns" the pointer and who doesn't. Refcounting is supposed to eliminate that nuisance.

I'm not sure that's true. Smartpointer refcounting is supposed to make ownership rules explicit in code where they can be a) seen and b) automated, instead of hidden in documentation.

In practice, I still make daft array-overrun and fencepost errors from time to time, but I've never had a problem with refcounting and ownership. As a first approximation: member variables, globals and new'ed locals are owners, and everything else isn't. Exceptions to this rule (e.g. double-linked lists) tend to be glaringly obvious.

>Sure you could do that, but then slices and shared_ptr's are not interchangeable.

True, but again, I'm not especially interested in shared_ptr, and this isn't a fundamental problem - you can write intrusive interchangeable string and slice classes in C++.

>> Has there ever been a language that successfully combined compacting GC with raw pointers? How would this work?
>
>Oh, it's quite doable <g>.

I find your ideas intriguing, and would like to subscribe to your newsletter.

Can you elaborate, or is this a trade secret?

cheers
Mike

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by Walter Bright
in reply to Mike Capp

Walter Bright

Posted in reply to Mike Capp

Mike Capp wrote:
> In article <e2mfjd$1f3d$1@digitaldaemon.com>, Walter Bright says...
>> True, but shared_ptr<> is billed as the C++ answer to gc, and shared_ptr<> is not intrusive.
> 
> Shrug. I'm finding myself increasingly unconcerned these days with what's
> "billed as the C++ answer". I mean, iostreams?

I know what you mean.


>>> Refcount twiddling is only needed when changing an
>>> "owning" pointer, plus the rare "safety" case 
>> But that means just going back to the old way of being very, very careful about who "owns" the pointer and who doesn't. Refcounting is supposed to eliminate that nuisance.
> I'm not sure that's true. Smartpointer refcounting is supposed to make ownership
> rules explicit in code where they can be a) seen and b) automated, instead of
> hidden in documentation.

I can see that point, but I can see that once you stray into such optimizations, there's no help from the compiler, and you'll need to be very careful. A corollary to that is extra time is spent, and bugs are likely.

> In practice, I still make daft array-overrun and fencepost errors from time to
> time, but I've never had a problem with refcounting and ownership. As a first
> approximation: member variables, globals and new'ed locals are owners, and
> everything else isn't. Exceptions to this rule (e.g. double-linked lists) tend
> to be glaringly obvious.

Andrei Alexandrescu has proposed a series of modifications to C++ to enable the compiler to enforce such rules, but nobody else seemed much interested in it.

>> Sure you could do that, but then slices and shared_ptr's are not interchangeable.
> 
> True, but again, I'm not especially interested in shared_ptr, and this isn't a
> fundamental problem - you can write intrusive interchangeable string and slice
> classes in C++.

Perhaps, but I've never seen anyone pull it off.


>>> Has there ever been a language that successfully combined compacting GC with raw pointers? How would this work? 
>> Oh, it's quite doable <g>.
> 
> I find your ideas intriguing, and would like to subscribe to your newsletter.
> 
> Can you elaborate, or is this a trade secret?

I'd rather not, until I get it working.

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by James Dunne
in reply to BCS

James Dunne

Posted in reply to BCS

BCS wrote:
> 
> [snip] I think the argument is that in any non trivial project, manually freeing
> all of the memory without any memory leaks or axing something to soon would
> require more overhead than using GC. The trade-off is not in the freeing time
> but in the can-I-free-this check.
> 

Not true.  A hierachical memory management system is great!  It sure beats reference counting, manual memory management, and conservative GC.

What you do is allocate some memory, but also provide a "parent" memory pointer to relate the new memory to.  When all is said and done you have a nice tree of related memory pointers (hopefully with just one root, provided you designed correctly).

To free some memory, simply walk all the branches of that node, freeing their memory, then free the original memory.  You can even get predictable destructors for objects this way.

I'm currently developing a project in C which uses this method.  I have it allocating lots of small blocks of memory to perform a job.  The total comes to 32KB.  When the program is finished I do one "free" call to the root structure and I get back ALL of my memory.  Not a single byte is leaked.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O M--@ V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by xs0
in reply to James Dunne

xs0

Posted in reply to James Dunne

> Not true.  A hierachical memory management system is great!  It sure beats reference counting, manual memory management, and conservative GC.
> 
> What you do is allocate some memory, but also provide a "parent" memory pointer to relate the new memory to.  When all is said and done you have a nice tree of related memory pointers (hopefully with just one root, provided you designed correctly).
> 
> To free some memory, simply walk all the branches of that node, freeing their memory, then free the original memory.  You can even get predictable destructors for objects this way.
> 
> I'm currently developing a project in C which uses this method.  I have it allocating lots of small blocks of memory to perform a job.  The total comes to 32KB.  When the program is finished I do one "free" call to the root structure and I get back ALL of my memory.  Not a single byte is leaked.

But how can you tell _when_ to delete the root? AFAIK, the main benefit of GC is not that you don't have to manually free the memory, but that you don't have to know when it is safe to do so. I don't see how organizing memory blocks into a tree solves that...


xs0

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by James Dunne
in reply to xs0

James Dunne

Posted in reply to xs0

xs0 wrote:
> 
>> Not true.  A hierachical memory management system is great!  It sure beats reference counting, manual memory management, and conservative GC.
>>
>> What you do is allocate some memory, but also provide a "parent" memory pointer to relate the new memory to.  When all is said and done you have a nice tree of related memory pointers (hopefully with just one root, provided you designed correctly).
>>
>> To free some memory, simply walk all the branches of that node, freeing their memory, then free the original memory.  You can even get predictable destructors for objects this way.
>>
>> I'm currently developing a project in C which uses this method.  I have it allocating lots of small blocks of memory to perform a job.  The total comes to 32KB.  When the program is finished I do one "free" call to the root structure and I get back ALL of my memory.  Not a single byte is leaked.
> 
> 
> But how can you tell _when_ to delete the root? AFAIK, the main benefit of GC is not that you don't have to manually free the memory, but that you don't have to know when it is safe to do so. I don't see how organizing memory blocks into a tree solves that...
> 
> 
> xs0

That's a different problem and depends on the nature of the application.  If you're in a server environment, you free memory when the client request has been completely processed.  If you're in an interactive user-interface environment, you free memory when you close documents, close windows, etc.  When the data is no longer needed, you'll know when to free and what to free.

I think you thought I was implying you should only have one memory root which all things are related to and that is the only thing that should ever be freed.  The first part is right, but not the second.  You should keep all the memory related to each other, but it's not that you have to free it in an "all or nothing" circumstance.  Then again, that depends on your application type.

In my example application I have lots of memory free operations on the individual nodes when I know I won't need that memory anymore, such as for temporary strings.  If I forget to include an explicit free call to that temporary memory, it's okay since I know that memory has been related to another parent memory block that _will_ be freed.

Can you provide me a few legitimate examples of when you won't know what to free and when to free it?  I'm having a tough time coming up with examples on my own...

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O M--@ V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by Walter Bright
in reply to James Dunne

Walter Bright

Posted in reply to James Dunne

James Dunne wrote:
> Can you provide me a few legitimate examples of when you won't know what to free and when to free it?  I'm having a tough time coming up with examples on my own...

One example that has plagued me in the past is having two distinct tree data structures, and it becomes necessary for a node in one tree to refer to data in the other.

Or when you have a third data structure, and you need to point to data in one or the other or both of the first two trees.

A typical example of that is symbol tables inside the compiler. In the DMD front end (which is garbage collected) the data has all kinds of cross-connections in it.

I.e. the problem crops up when the data in memory just doesn't fit into a strictly heirarcical format.

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by James Dunne
in reply to Walter Bright

James Dunne

Posted in reply to Walter Bright

Walter Bright wrote:
> James Dunne wrote:
> 
>> Can you provide me a few legitimate examples of when you won't know what to free and when to free it?  I'm having a tough time coming up with examples on my own...
> 
> 
> One example that has plagued me in the past is having two distinct tree data structures, and it becomes necessary for a node in one tree to refer to data in the other.
> 
> Or when you have a third data structure, and you need to point to data in one or the other or both of the first two trees.
> 
> A typical example of that is symbol tables inside the compiler. In the DMD front end (which is garbage collected) the data has all kinds of cross-connections in it.
> 
> I.e. the problem crops up when the data in memory just doesn't fit into a strictly heirarcical format.

Your example is having two trees _point_ to data within each other.  The original data is still _owned_ by the original tree which the data resides in.  You wouldn't be swapping ownership of the data like you're talking about.  In this system, when memory is allocated it is assigned an owner and that ownership does not ever change.  So the system resembles a tree (one parent only), not a directed graph.

Funny you mention compiler, because that is the example application I was referring to. :)

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O M--@ V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne

April 26, 2006

Re: Faster Virtual Method Dispatch

Posted by James Dunne
in reply to James Dunne

James Dunne

Posted in reply to James Dunne

James Dunne wrote:
> Walter Bright wrote:
> 
>> James Dunne wrote:
>>
>>> Can you provide me a few legitimate examples of when you won't know what to free and when to free it?  I'm having a tough time coming up with examples on my own...
>>
>>
>>
>> One example that has plagued me in the past is having two distinct tree data structures, and it becomes necessary for a node in one tree to refer to data in the other.
>>
>> Or when you have a third data structure, and you need to point to data in one or the other or both of the first two trees.
>>
>> A typical example of that is symbol tables inside the compiler. In the DMD front end (which is garbage collected) the data has all kinds of cross-connections in it.
>>
>> I.e. the problem crops up when the data in memory just doesn't fit into a strictly heirarcical format.
> 
> 
> Your example is having two trees _point_ to data within each other.  The original data is still _owned_ by the original tree which the data resides in.  You wouldn't be swapping ownership of the data like you're talking about.  In this system, when memory is allocated it is assigned an owner and that ownership does not ever change.  So the system resembles a tree (one parent only), not a directed graph.
> 
> Funny you mention compiler, because that is the example application I was referring to. :)
> 

I should also mention that my compiler implementation makes use of context structures, so every memory block that the compiler will allocate is somehow parented to that context structure.  So when the compile job is complete, free the context structure and all the memory goes away.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O M--@ V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation