curiosity (OOC) (page 4)

On Sat, 24 Aug 2002 17:20:05 -0700 Burton Radons <loth@users.sourceforge.net> wrote: > I don't want to get into this debate, as the only result will be that refcounting's a bloody huge problem that needs more investigation. It's straightforward for interpreted languages and not so for native code. Unprecedented to my knowledge, which is far from broad, due to things like: > > counted class foo { int y; } > > foo h = new foo; > int *z = &h.y; > h = null; Well, D isn't sandbox C#. You can do weird things even now, for example return a pointer to a local variable. Not much different - and, in fact, happens quite often, compared to your example... pointers are dangerous by themselves, and as long as one avoids them (which _is_ possible in high-level D - dynamic arrays and references really help), there shouldn't be any problem with refcounting.

August 25, 2002

Re: curiosity (OOC)

Posted by Mac Reiter
in reply to Sean L. Palmer

Permalink

Mac Reiter

Posted in reply to Sean L. Palmer

Permalink

>In any case I have never ever seen refcounting overhead show up in a performance profiler.  It could cost you some valuable memory though, but in most cases I think it's worth it.  You wouldn't want to refcount, say, a point or rectangle.  But anything bigger than a matrix would be a good candidate.
>
>Sean

Most performance profilers (VTune, AMD Code Analyst, MSVC profiler, and even gprof, I think) are sampling profilers -- they kick in every so many milliseconds and see where the program counter is aimed.  Reference counting is one of those insidious performance problems where each individual reference counting operation takes a very small amount of time (although it is usually considerably longer than the amount of time a non-refcounted thing would take), so the chances of the sampling profiler catching it are almost zero.  One of the larger concerns about refcounting involves multithreading.  Every reference counting operation must be threadsafe, which involves a mutex lock/unlock for every increment/decrement.  OK, on an x86 you can use the "interlocked increment/decrement" instructions to do this much more efficiently, but not all architectures have these.  In the presence of SMP architectures, having to force all CPUs to agree on a value can introduce significant synchronization overhead.

If your profiler takes samples every 10ms, I could burn 9ms out of every 10 and never show up in its report.  I have seen this happen.  Another cost, similar to refcounting for difficulty pinning down, is array bounds checking.  I have run many programs through a profiler and never seen the bounds checking code show up as a meaningful percentage of execution time (maybe 1-2% in the report).  But if I remove it, suddenly my program runs between 50-100% faster.

The other profiling option is to instrument the code.  Usually only function entry/exit is instrumented (reported out to the external profiler).  Very small functions then consume an inordinate amount of time, because it takes two moderate sized logging functions for every call to the small function.  This tends to corrupt the report accuracy.  I know that TrueTime from NuMega used to claim that they could subtract out this logging time and give you a real report. The problem I always had was that it was hopelessly tied into particular sub-versions of MSVC, and just applying a service pack could cause your profiler to stop working...

The only real way to know where your program is spending all of its time, unfortunately, is an ICE and a super fast, mega memory logic analyzer.  Not really worthwhile for most development.  But any profiler that is sharing space on the same computer as the profiled code will have certain kinds of performance problems that it simply cannot see, or cannot accurately report.

Profilers are good things.  But it is always important to remember exactly what your meter is really showing you, and that what you are shown is rarely a complete picture.

Also, reference counting is a good thing in the right place.  I'm the one who suggested 'counted', so obviously I would like to see reference counting supported.  But it is also important to remember that, on the whole, garbage collection will manage memory for a much lower overall overhead than reference counting.  Reference counting is useful for deterministic finalization, but it comes at an additional price.  (OK, refcounting is also useful for distributing the cost of memory management thinly over your whole application run.  If it is better to have 20% memory overhead consistently than it is to have 10% memory overhead all bunched together in a half-second interval, then you want refcounting.  This is the kind of decision that real-time and game programmers have to make.)

Mac

>I don't want to get into this debate, Too late! (sorry...) >as the only result will be that >refcounting's a bloody huge problem that needs more investigation. It's >straightforward for interpreted languages and not so for native code. >Unprecedented to my knowledge, which is far from broad, due to things like: > > counted class foo { int y; } > > foo h = new foo; > int *z = &h.y; > h = null; Let me first state that I understand the problem of trying to come up with a short, clear example on the spur of the moment. My comments may be focusing too much on the particular example than on the root problem. I have attempted to think about and address the issue in a wider scope, but I have also not been awake for very long... If, when reading the following comments, you think of a real example piece of code that exhibits the root problem without having the bad design problems that I will discuss below, I would be very interested in seeing your code. Most garbage collectors will also hork on this problem (in the general case). The Boehm GC will consider h to be garbage after the NULL assignment, unless you specifically turn on support for tracking pointers to partial objects. (In this particular instance, the address of h.y is probably the same as the address of h, so it would happen to work even in a 'whole object only' GC. But let us assume a more general case where y is not the first member of foo.) If you do try to handle pointers to members, the cost for _any_ automatic memory management scheme increases by several hundred percent. For the Boehm collector, I believe the _additional_ penalty multiplier is roughly equivalent to the average size of your GC'ed objects. Boehm has no type information, and thus cannot perform faster operations to locate object starts -- it must scan through nearby memory. I may be misremembering this, so I would suggest reading the documentation for the Boehm GC rather than taking my word for it. I feel I should also point out that the above code deserves to explode violently. It is logically equivalent to returning a pointer or reference to an automatic local variable from a function in C or C++. By the time the pointer/reference is used, it is invalid. First, in most cases you shouldn't make pointers that point to members. Just make a foo* and aim it at h, then pull the member out normally. In C/C++, pointers to members would be used as an optimization in a loop (saves the indirect field access for each loop -- a good optimizer should have done it automatically, but we won't go there. Also, in D, you can use the 'with' keyword for the same effect). However, you would never point at a piece of an object, delete the object, and then start a loop that worked with a member. There are times, though they occur relatively infrequently, when a pointer/reference to a member is needed rather than just "convenient". But if you don't hold a pointer/reference to the full object somewhere else, the results are going to be undefined. This is not a problem specific to GC or refcounting. Imagine doing the same thing in C++. At the end of the code sequence above, you would have leaked a foo. Since you no longer have the address of the actual object, you can't delete it. As a side note: If you *REALLY* have to do this, it is possible to cast 0 to the class type, ask for the address of the member to which you have a pointer (this will give you the offset of that member inside that class), and then subtract this offset from your member pointer to get the original object address. Any time you find yourself writing code that is that horked up, you should seriously consider if your architecture has been designed properly... Mac

>I'm not pointing this out as an impossible problem; I'm pointing it out as an unobvious one with big consequences, and there are many others, certainly ones I have no knowledge of. Refcounting requires an in-depth review of its effects to find out what the damage really will be, what language changes will be needed, or if it's simply not possible to have both it and what's recognisably D. Something that has been bothering me a bit more is the interaction of GC and RC (refcounting). The GC must not simply ignore RCed objects, assuming that the RC system will properly dispose of them. If it does, cyclic references amongst RC'ed objects would leak. I think that if the GC treats RCed objects just like every other object, everything will be fine. The GC should never delete an object that is still validly in use, because the user should be visible to D and thus to the GC. The only issue I can see is if you tried to work with RCed objects that were being manipulated by external non-D libraries, but you have similar problems with the GC. The safest bet all around is to make a D reference to the object and keep it valid while the external library is using it. I can't think of any problems with the GC behind RC method (that wouldn't exist individually for each of GC and RC anyway). Am I overlooking something? Mac

Mac Reiter <Mac_member@pathlink.com> wrote in news:akaovp$1l7g$1@digitaldaemon.com: > Something that has been bothering me a bit more is the interaction of GC and RC (refcounting). The GC must not simply ignore RCed objects, assuming that the RC system will properly dispose of them. If it does, cyclic references amongst RC'ed objects would leak. I think that reference counting should not activly reclaim memory when refcount == 0. Instead leave it mainly as a method of deterministice finalization. If the programmer truly desires that the memory be reclamed now he can explicitly suggest it to to the gc in the finializer. Otherwise just let the gc reclain it in its next mark and sweep pass.

"Patrick Down" <pat@codemoon.com> wrote in message news:Xns9275760177718patcodemooncom@63.105.9.61... > I think that reference counting should not activly reclaim memory when refcount == 0. Instead leave it mainly as a method of deterministice finalization. Of course. Once I'm done with a piece of memory, all I want is for it to behave as if it's been freed, wether that's the case or not. As a general precept, I would never care what the program does behind my back (caching, reorganizing my data, etc...), as long as it behaves predictably the way it's intended to behave. Salutaciones, JCAB

Forums