January 28, 2005
In article <ctcbun$v1o$1@digitaldaemon.com>, Matthew says...
>What about an application, written in D and statically linked to a GC, that may or may not load a DDL to get some D classes, depending on its cmd-line params?

Right.

Well, I had assumed that any developer going to the trouble of supporting dynamically loadable modules (providing a 'container') would have statically linked to the DLL version GC, since all those lovely little loadable modules will be using said DLL anyway.

There are certain considerations that apply to containers, particularly those of a dynamic variety. As such, I don't think it's much of a stretch to note that such designs should use the DLL GC instead. In the end, that takes care of all those hideously complex issues you noted prior, in a robust manner, and it's simpler than /consistently/ following all those little details Walter added to the DMD doc :-)

Having said that, Walter has at least provided the bare-bones. I'll utilize that to provide a means of hiding the grubby details, such that both dynamic & static linking of DLLs will be both thoroughly transparent and painless.

IMO, this kind of thing should ideally be left to the O/S; not re-invented by the language runtime (someone else had noted this, also). Sometimes one has to sidestep the O/S, but in this case I don't feel the complexity tradeoffs are reasonable.

That is; I believe containers will be simpler and probably more robust if they avoid trying to do some fancy internal sharing of multiple GC instances. Just going with a single, shared GC instance, managed by the O/S is the better option. That simplicity might hopefully lead to more people writing dynamically loadable code, such as D Servlets. It also makes it easier for others to write alternate GC implementations, without the added complexity of re-implementing and thoroughly testing all that GC-sharing 'stuff'!

That's just my opinion, but it is the manner in which I will personally awaken the two containers currently slumbering within Mango; along with the mobile-code to go with them :-)

Lastly, I should note that this is just for the dynamic 'containment' style of programming (the specific case we're talking about). Other types of programs would link the GC in whatever means was appropriate to them (where static linking of the static-library GC would be the default, static, behavior).

Thoughts, Matthew? And how many times can one legitimately say 'static' in a single sentence?

- Kris

p.s. Pragma is building a container also, so I'd like to get his perspective on this too.





January 28, 2005
"Kris" <Kris_member@pathlink.com> wrote in message news:ctcgoa$13ro$1@digitaldaemon.com...
> Having said that, Walter has at least provided the bare-bones. I'll utilize that to provide a means of hiding the grubby details, such that both dynamic & static linking of DLLs will be both thoroughly transparent and painless.
>
> IMO, this kind of thing should ideally be left to the O/S; not re-invented by the language runtime (someone else had noted this, also). Sometimes one has to sidestep the O/S, but in this case I don't feel the complexity tradeoffs are reasonable.

Most of the time, all you need to do is cut & paste from the examples given. One reason the details are shown is because D is a systems programming language, and knowing the how & the why of the details means one is much more likely to use it successfully. It also enables one to modify it for special purposes.

I also agree that the OS should provide gc services. But I am not in a position to design an OS <g>, so we must work with what we have.


January 28, 2005
In article <ctcgoa$13ro$1@digitaldaemon.com>, Kris says...
>
>
>p.s. Pragma is building a container also, so I'd like to get his perspective on this too.

Hey, sure thing.

Not to dilute Kris' argument here, but I think that Walter has given us what was needed for GC management between dll's and processes.  I haven't thought around all the corners of the problem space yet, but it looks more and more to me that using a separate dll for the GC may actually further complicate things.  At first, I didn't think this was so.  But the updated model now creates a 1-to-1 mapping between GC's and processes, irrespective of how many dll's are in use. To me, that seems a damn fine solution, if not a step in the right direction.

That aside, the bigger issue is class management across dll boundaries.

Most applications do not need to worry about the validity of v-tables and delegates, since the dll is usually freed at program termination (this goes especially for static linking).  It is a problem that is not covered by the GC at all, so it requires additional management; hence Kris' notion of "Containers".

For those not familiar with the problem, here's what can easily happen.  Say I have an export from a dll that returns an object of class "Foobar".  I then free the dll since it's no longer needed.  Finally, I attempt to print the contents of Foobar.

> // given: mylibrary represents a dll
> void makeSegFault(){
>    Foobar foo = mylibrary.getNewFoobar();
>    mylibrary.unload();
>    writeln(foo.toString());
> }

This will segfault since the vtable for 'foo' was a part of the dll.

Thankfully, the recent GC enhancements allow us to at least keep foo's memory footprint intact, but the methods are history.  Also, reloading the dll cannot be reliably used to 'magically' restore that vtable.

This pattern is easier to create than one would think, especially when one is cramming data into generic AA's and references become widely dispersed inside a large system.


For DSP, the solution I'm going to use involves a combination of object-proxies and reference counting of said proxies per dll.  A dll reload will not break code, since the proxies can be prodded to re-constitute their dll-bound counterparts.  This way, the proxies can be freely referenced throughout the application, save the dll they're interfacing with (feedback would be *bad*).
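
A minimal sketch of the proxy idea described above, written against present-day D rather than the 2005 toolchain. Nothing here comes from DSP itself: `Library`, `LibFoobar`, `FoobarProxy` and every member name are hypothetical, and the "dll" is simulated by an ordinary class so the example stays self-contained. A real version would wrap the OS loader and look up an exported factory function.

```d
import std.conv : to;

interface Foobar { string describe(); }

/// Stand-in for a dll (hypothetical). A real version would wrap the OS
/// loader and call an exported factory instead of `makeFoobar`.
final class Library
{
    private int generation;   // bumped on every (re)load
    private bool loaded;
    int proxyCount;           // per-library count of live proxies

    void load()   { loaded = true; ++generation; }
    void unload() { loaded = false; }
    bool isLoaded() const { return loaded; }
    int currentGeneration() const { return generation; }

    Foobar makeFoobar()
    {
        assert(loaded, "library is not loaded");
        return new LibFoobar(generation);
    }
}

/// Pretend this class's code (and vtable) lives inside the dll.
private final class LibFoobar : Foobar
{
    private int gen;
    this(int gen) { this.gen = gen; }
    string describe() { return "Foobar v" ~ gen.to!string; }
}

/// Lives in the host application; the only thing clients ever hold.
final class FoobarProxy : Foobar
{
    private Library lib;
    private Foobar target;        // the dll-bound counterpart
    private int boundGeneration;  // generation `target` was created under

    this(Library lib) { this.lib = lib; ++lib.proxyCount; }

    /// Explicitly gives up this proxy's claim on the library.
    void release()
    {
        if (lib is null) return;
        --lib.proxyCount;
        lib = null;
        target = null;
    }

    string describe()
    {
        assert(lib !is null, "proxy already released");
        if (!lib.isLoaded) lib.load();
        // Re-constitute the counterpart if the library was reloaded.
        if (target is null || boundGeneration != lib.currentGeneration)
        {
            target = lib.makeFoobar();
            boundGeneration = lib.currentGeneration;
        }
        return target.describe();
    }
}
```

In this sketch the proxy simply rebuilds its counterpart after a reload, so any state held by the old dll-side object is lost; carrying state across a reload would need an extra serialization step that is not shown. When `proxyCount` drops to zero the host knows it can unload the library without leaving dangling vtables behind.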


The only other airtight solution I can think of, would be to apply the GC pattern to dll's.  This means that a dll is not unloaded until the heap is free from all references into a dll's address space (lazy unloading via garbage collection).  Adding a given dll's address space as a root to the GC should cover this.  The only drawback here is that it's effectively the same as the present situation given that you cannot force a dll unload without potentially breaking something; the real advantage of dll's is to load and unload at will.


Aside: does anyone know what happens if you touch a used .class file while a Java app is running?  Can Java's ClassLoader be told to unload or reload a class file that's in use?  I'm curious since I'd like to know how other platforms have handled this space.


- EricAnderton at yahoo
January 28, 2005
In article <ctdqsl$30sc$1@digitaldaemon.com>, pragma says...
>
>In article <ctcgoa$13ro$1@digitaldaemon.com>, Kris says...
>>
>>
>>p.s. Pragma is building a container also, so I'd like to get his perspective on this too.
>
>Hey, sure thing.
>
>Not to dilute Kris' argument here, but I think that Walter has given us what was needed for GC management between dll's and processes.  I haven't thought around all the corners of the problem space yet, but it looks more and more to me that using a separate dll for the GC may actually further complicate things.  At first, I didn't think this was so.  But the updated model now creates a 1-to-1 mapping between GC's and processes, irrespective of how many dll's are in use. To me, that seems a damn fine solution, if not a step in the right direction.
>
>That aside, the bigger issue is class management across dll boundaries.
>
>Most applications do not need to worry about the validity of v-tables and delegates, since the dll is usually freed at program termination (this goes especially for static linking).  It is a problem that is not covered by the GC at all, so it requires additional management; hence Kris' notion of "Containers".
>
>For those not familiar with the problem, here's what can easily happen.  Say I have an export from a dll that returns an object of class "Foobar".  I then free the dll since it's no longer needed.  Finally, I attempt to print the contents of Foobar.
>
>> // given: mylibrary represents a dll
>> void makeSegFault(){
>>    Foobar foo = mylibrary.getNewFoobar();
>>    mylibrary.unload();
>>    writeln(foo.toString());
>> }
>
>This will segfault since the vtable for 'foo' was a part of the dll.
>
>Thankfully, the recent GC enhancements allow us to at least keep foo's memory footprint intact, but the methods are history.  Also, reloading the dll cannot be reliably used to 'magically' restore that vtable.
>
>This pattern is easier to create than one would think, especially when one is cramming data into generic AA's and references become widely dispersed inside a large system.
>
>
>For DSP, the solution I'm going to use involves a combination of object-proxies and reference counting of said proxies per dll.  A dll reload will not break code, since the proxies can be prodded to re-constitute their dll-bound counterparts.  This way, the proxies can be freely referenced throughout the application, save the dll they're interfacing with (feedback would be *bad*).
>
>
>The only other airtight solution I can think of, would be to apply the GC pattern to dll's.  This means that a dll is not unloaded until the heap is free from all references into a dll's address space (lazy unloading via garbage collection).  Adding a given dll's address space as a root to the GC should cover this.  The only drawback here is that it's effectively the same as the present situation given that you cannot force a dll unload without potentially breaking something; the real advantage of dll's is to load and unload at will.
>
>
>Aside: does anyone know what happens if you touch a used .class file while a Java app is running?  Can Java's ClassLoader be told to unload or reload a class file that's in use?  I'm curious since I'd like to know how other platforms have handled this space.
>
>
>- EricAnderton at yahoo

Good points, Pragma. Another thing to consider, regarding the explicit unloading of DLLs, is the 'version' issue. If one replaces an existing instance of some dynamically-loaded module with another, newer version, then the contract between the container and any existing (remote) clients has effectively been broken.

I note this because each newer version should be loaded as such; as a distinct and separate instance in addition to any prior version instances. Doing so leads to long-term stability.

The upshot is that such a container would not have a regular need to /explicitly/ drop any particular (and previously loaded) module. Therefore, your approach of using the GC to manage module 'liveness' is rather suitable. Placing the GC within a DLL does not complicate this, as far as I can tell.

There's at least one tricky part there: how to know whether or not each dynamically loaded-module is still actually loaded. I think soft-references would alleviate that problem, and there are some ways to do that in D, although there's a subtle danger of deadlock since it appears that the GC halts all other threads when it reaps the heap :-(

Perhaps Walter could enlighten us on how to construct robust soft-references?

Thinking about this brings up another issue to consider; starting a thread from within a DLL will potentially cause the GC, and the process, to fail. Something to be careful of.



January 28, 2005
In article <ctd1v8$2239$1@digitaldaemon.com>, Walter says...
>
>
>"Kris" <Kris_member@pathlink.com> wrote in message news:ctcgoa$13ro$1@digitaldaemon.com...
>> Having said that, Walter has at least provided the bare-bones. I'll utilize that to provide a means of hiding the grubby details, such that both dynamic & static linking of DLLs will be both thoroughly transparent and painless.
>>
>> IMO, this kind of thing should ideally be left to the O/S; not re-invented by the language runtime (someone else had noted this, also). Sometimes one has to sidestep the O/S, but in this case I don't feel the complexity tradeoffs are reasonable.
>
>Most of the time, all you need to do is cut & paste from the examples given. One reason the details are shown is because D is a systems programming language, and knowing the how & the why of the details means one is much more likely to use it successfully. It also enables one to modify it for special purposes.

It's /great/ that you documented all the details!

>
>I also agree that the OS should provide gc services. But I am not in a position to design an OS <g>, so we must work with what we have.

We're misunderstanding each other, Walter. But there's nothing unusual about that :-)

Thanks for addressing the issue. Everyone has their own idea of how to skin the proverbial cat, but the end result is typically the same: one dead cat.

Can you perhaps enlighten us on how to construct robust "soft-references"? It appears that the GC disables all threads whilst reaping allocations, which could then lead to deadlock between the GC and a soft-reference manager.

Are all threads (except the GC) halted when a destructor is invoked?


January 28, 2005
In article <cte18s$82o$1@digitaldaemon.com>, Kris says...

>Good points, Pragma. Another thing to consider, regarding the explicit unloading of DLLs, is the 'version' issue. If one replaces an existing instance of some dynamically-loaded module with another, newer version, then the contract between the container and any existing (remote) clients has effectively been broken.

Yep.  This is why I've advocated that we all get into the habit of naming our dlls with the version number as a part of the name.  It solves the majority of these problems.  The other techniques I've proposed in the past, may very well be suitable in an application-to-application manner.
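
As a rough illustration of the naming habit above combined with Kris' point about loading each version as a distinct instance, here is a small sketch in present-day D. `ModuleId`, `fileNameFor` and `ModuleTable` are hypothetical names, the `name-major.minor.dll` pattern is only one possible convention, and the table merely records what would be loaded rather than calling the OS loader.

```d
import std.format : format;

/// Identifies one version of a dynamically loaded module (hypothetical).
struct ModuleId
{
    string name;
    uint major;
    uint minor;
}

/// Builds a file name with the version baked in, e.g. "servlet-1.2.dll".
string fileNameFor(ModuleId id, string ext = "dll")
{
    return format("%s-%d.%d.%s", id.name, id.major, id.minor, ext);
}

/// Keeps every loaded version side by side; nothing is replaced or unloaded.
final class ModuleTable
{
    private ModuleId[string] byFile;

    /// Returns false if that exact version is already present.
    bool add(ModuleId id)
    {
        auto file = fileNameFor(id);
        if (file in byFile) return false;
        byFile[file] = id;   // real code would load the library here
        return true;
    }

    /// Number of distinct versions of `name` currently held.
    size_t versionsOf(string name)
    {
        size_t n;
        foreach (id; byFile)
            if (id.name == name) ++n;
        return n;
    }
}
```

Because a newer version gets its own file name and its own table entry, existing clients of the older version keep the contract they started with.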

Overall, this is an area where sufficient (and justified) pushback from Walter would have us forge an open standard for this kind of thing.

>
>The upshot is that such a container would not have a regular need to /explicitly/ drop any particular (and previously loaded) module. Therefore, your approach of using the GC to manage module 'liveness' is rather suitable.

I see where you're going with this.  Assuming that the only reason for a reload is to grab a newer version, you don't need to unload the old one at all.

>Placing the GC within a DLL does not complicate this, as far as I can tell.

I'm confused.  Did you mean "manage the dll with the GC" instead?


>There's at least one tricky part there: how to know whether or not each dynamically loaded-module is still actually loaded. I think soft-references would alleviate that problem, and there are some ways to do that in D, although there's a subtle danger of deadlock since it appears that the GC halts all other threads when it reaps the heap :-(
>
>Perhaps Walter could enlighten us on how to construct robust soft-references?

You're talking about having a soft (weak?) reference to the library in question, correct?  Constructing weak-references in D should be as easy as writing a wrapper class that tells the GC to ignore the weak-pointer's address when checking for roots.  Now, checking their validity is tough to solve, since the GC doesn't expose any way to check if a pointer is under its control (sure you could use win32, but it's not portable).

And as for deadlock: what if the call to unload a library is called on the GC's thread via a destructor?  Would that fix the problem?  I suppose if the dll held some kind of mutex inside of dllmain, that it would cause trouble.  But this may come back to "Best Practices" for managing such a mechanism.

>Thinking about this brings up another issue to consider; starting a thread from within a DLL will potentially cause the GC, and the process, to fail. Something to be careful of.

I'll have to take your word for this.  Perhaps you can furnish me with a more concrete example?  Unless you're inside of dllMain, there shouldn't be any side effects that I'm aware of.  Also, the MSDN library has a slew of articles of what to do and not to do inside of dllMain.  The gist of it all is that you should do the absolute minimum needed inside that routine as to avoid problems just within win32 itself.

- EricAnderton at yahoo
January 28, 2005
"Kris" <Kris_member@pathlink.com> wrote in message news:cte22g$97t$1@digitaldaemon.com...
> Can you perhaps enlighten us on how to construct robust "soft-references"?

The way to do it is construct a pool of those soft references, so the gc won't reap them.
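
One possible reading of the pool suggestion, sketched in present-day D. This is an interpretation, not Walter's code: `SoftRef`, `SoftRefPool` and their members are hypothetical. The pool holds the only strong references, clients hold small handles instead of pointers, and a generation counter lets a client detect that its handle has gone stale.

```d
/// Handle given to clients in place of a pointer (hypothetical).
struct SoftRef
{
    size_t index;
    uint generation;
}

/// The pool owns the strong references, so the GC never reaps a pooled
/// object behind a client's back; the pool owner decides when to drop it.
final class SoftRefPool(T) if (is(T == class))
{
    private static struct Slot
    {
        T target;
        uint generation;
    }

    private Slot[] slots;

    /// Stores `obj` and returns a handle to it, reusing a free slot if any.
    SoftRef put(T obj)
    {
        foreach (i, ref s; slots)
        {
            if (s.target is null)
            {
                s.target = obj;
                return SoftRef(i, s.generation);
            }
        }
        slots ~= Slot(obj, 0);
        return SoftRef(slots.length - 1, 0);
    }

    /// Returns the object, or null if the handle is stale or out of range.
    T get(SoftRef r)
    {
        if (r.index >= slots.length) return null;
        auto s = &slots[r.index];
        return s.generation == r.generation ? s.target : null;
    }

    /// Releases the pool's strong reference and invalidates old handles.
    void drop(SoftRef r)
    {
        if (get(r) is null) return;
        slots[r.index].target = null;
        ++slots[r.index].generation;
    }
}
```

Validity checks become a plain integer comparison, which sidesteps the question of asking the GC whether a pointer is still under its control. The sketch is not thread-safe; a container would need to guard the pool itself.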

> It appears that the GC disables all threads whilst reaping allocations, which could then lead to deadlock between the GC and a soft-reference manager.
>
> Are all threads (except the GC) halted when a destructor is invoked?

All the threads that the gc knows about (via std.thread). If you create a thread directly, not using std.thread, the gc won't stop it, scan it, or know anything about it.
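
For readers on a current compiler, the same rule can be sketched with druntime's `core.thread`, which is assumed here to be the modern counterpart of the `std.thread` module named above. The names `worker`, `workerRan` and `runRegisteredThread` are hypothetical.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;

shared bool workerRan;

/// Body of the worker thread.
void worker()
{
    atomicStore(workerRan, true);
}

/// Creates the thread through the runtime's Thread class, so the runtime
/// registers it and the GC can suspend and scan it during a collection.
void runRegisteredThread()
{
    auto t = new Thread(&worker);
    t.start();
    t.join();
}
```

A thread created directly through the OS API is invisible to the collector unless it registers itself; in present-day druntime that is what `thread_attachThis` is for.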


January 28, 2005
In article <cte4c0$c8n$1@digitaldaemon.com>, pragma says...
>
>In article <cte18s$82o$1@digitaldaemon.com>, Kris says...
>>Placing the GC within a DLL does not complicate this, as far as I can tell.
>
>I'm confused.  Did you mean "manage the dll with the GC" instead?

Ahh; I was just referring to the earlier assertion that placing the GC itself within a separate DLL might actually increase complexity. I don't think it does, but I could be wrong.


>>There's at least one tricky part there: how to know whether or not each dynamically loaded-module is still actually loaded. I think soft-references would alleviate that problem, and there are some ways to do that in D, although there's a subtle danger of deadlock since it appears that the GC halts all other threads when it reaps the heap :-(

>And as for deadlock: what if the call to unload a library is called on the GC's thread via a destructor?  Would that fix the problem?  I suppose if the dll held some kind of mutex inside of dllmain, that it would cause trouble.  But this may come back to "Best Practices" for managing such a mechanism.


Deadlock could occur if (a) the destructor is used to unload the module, (b) all
threads are halted whilst the GC runs (and hence during the destructor call),
and (c) the mutex protecting the "module is currently loaded" flag is held by
one of the stalled threads; one which was 'concurrently' asking for a handle to
that specific module. The GC thread would stall on that same mutex.

One way around this would be to utilize a mutex-free queue, to stack up destructor requests for unloading reaped module instances -- thereby decoupling the GC from aforementioned mutex.
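
A minimal sketch of such a decoupling, assuming present-day D and `core.atomic`. It uses a fixed-size lock-free mailbox rather than a true queue (ordering is not preserved), and all names (`capacity`, `slots`, `requestUnload`, `drain`) are hypothetical. Module handles are represented as non-zero `size_t` values, with zero meaning "empty slot".

```d
import core.atomic : atomicLoad, atomicOp, cas;

enum capacity = 64;

shared size_t[capacity] slots;   // 0 means "empty"
shared size_t nextSlot;          // rotating hint for the next slot to try

/// Safe to call from a destructor: takes no lock and allocates nothing.
/// Returns false if every slot was occupied.
bool requestUnload(size_t handle) nothrow @nogc
{
    assert(handle != 0, "0 is reserved for empty slots");
    foreach (attempt; 0 .. capacity)
    {
        immutable i = atomicOp!"+="(nextSlot, 1) % capacity;
        // Claim the slot only if it is currently empty.
        if (cas(&slots[i], size_t(0), handle))
            return true;
    }
    return false;
}

/// Called later from an ordinary thread, outside any collection.
/// Passes each pending handle to `unload` and returns how many were found.
size_t drain(scope void delegate(size_t) unload)
{
    size_t n;
    foreach (i; 0 .. capacity)
    {
        auto h = atomicLoad(slots[i]);
        // Empty the slot atomically before acting on the handle.
        if (h != 0 && cas(&slots[i], h, size_t(0)))
        {
            unload(h);
            ++n;
        }
    }
    return n;
}
```

The destructor only deposits the handle; the thread that calls `drain` is free to take the "module is currently loaded" mutex, because it runs when the collector is not holding the other threads suspended.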


>>Thinking about this brings up another issue to consider; starting a thread from within a DLL will potentially cause the GC, and the process, to fail. Something to be careful of.
>
>I'll have to take your word for this.  Perhaps you can furnish me with a more concrete example?


If one assumes that the GC has a valid reason for halting all threads during a sweep, then any thread it does not know about is a potential threat to stability. Since Phobos (and thus std.thread) is still linked statically, all DLLs will have their own std.thread instance, yet will be sharing a single GC.

The single GC only knows about one instance of std.thread, and subsequently can only halt those threads created via that particular instance. Any thread created via a DLL will be noted only within that DLL's std.thread pool, and thus will not be stalled during a GC sweep. Therein lies trouble :-)

Full resolution is conceptually trivial, but apparently controversial.



January 28, 2005
"Walter" <newshound@digitalmars.com> wrote in message news:ctd1v8$2239$1@digitaldaemon.com...
>
> "Kris" <Kris_member@pathlink.com> wrote in message news:ctcgoa$13ro$1@digitaldaemon.com...
>> Having said that, Walter has at least provided the bare-bones. I'll utilize that to provide a means of hiding the grubby details, such that both dynamic & static linking of DLLs will be both thoroughly transparent and painless.
>>
<snip>
>
> Most of the time, all you need to do is cut & paste from the examples
> given.
> One reason the details are shown is because D is a systems programming
> language, and knowing the how & the why of the details means one is much
> more likely to use it successfully. It also enables one to modify it for
> special purposes.
>
<snip>
>

Walter - thank you for the DLL/GC addition!!

I gotta add my $0.02 on this though..

If the code inside DllMain, MyDLL_Initialize and MyDLL_Terminate can be handled by some sort of boiler-plate wrapper for 8 of 10 uses, I think it would be a /very/ good thing to provide it (while still allowing the developer to use the detailed version).

This would be especially true if it would make shared library development more portable between Win and the 'nix's for the majority of cases where the code in MyDLL_Initialize and MyDLL_Terminate can be handled by an import and a few wrapper functions like:

import std.gc;
import std.slinit;        // extern(C) { _minit(), etc... }

version (Windows) {

HINSTANCE g_hInst;

extern (Windows)
    BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved)
{
    return SL_DllMain(hInstance,ulReason,g_hInst);
}

} // version (Windows)

export void MySharedLib_Initialize(void* gc)
{
    SL_Init(gc);
}

export void MySharedLib_Terminate()
{
    SL_Term();
}

I think it worth the effort just to minimize the code overhead (and learning curve and clutter) needed for most shared libs. But if it also turns out that the standard copy and paste code (of your example) needs to be different between Win and Linux, wrapper functions will make things that much more elegant for portable library development, IMHO.

To me, this would coincide with the D philosophy of hiding the messy details for the general case while still providing for their use if needed.

- Dave

