January 26, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to frame
On Monday, 25 January 2021 at 17:11:37 UTC, frame wrote:
> Wrong way?
Please, someone correct me if I'm getting this wrong:
Structure:
EXE/Main Thread:
- GC: manual
- requests DLL 1 object A
- GC knows about object A
DLL/Thread 1:
- GC: conservative
- allocates new object A -> addRoot(object A), return to EXE (out param)
- requests DLL 2 object B
- GC knows about object A and object B
- requests sub objects of object B later
DLL/Thread 2:
- GC: manual
- allocates new object B -> addRoot(object B), return to DLL 1 (out param)
- GC knows about object B
- allocates sub objects over object B when DLL 1 requests it, return to DLL 1 (out param)
- sub objects are stored in object B
- memory of object B's sub objects gets corrupted after DLL 1 becomes the active thread again
In this scenario only DLL 1 can cause the corruption, as it does not occur if all GCs are set to manual.
At this point I am confused about how memory allocation is ensured. Each thread should have been assigned its own memory area. Each GC adopts the root via the returned object and knows about that area too. But when DLL 1 becomes active, it writes into the sub memory of DLL 2. It can only do so because it has adopted the root of object B, but why then does DLL 1 not see that the sub objects of B are still alive?
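For reference, the "allocate, addRoot, return through an out param" step in the structure above could look roughly like the sketch below. It follows the pattern shown in the core.memory documentation for GC.addRoot; the names Thing, createThing and releaseThing are hypothetical and not taken from the thread. As the later replies point out, rooting an object this way is not sufficient on its own when several GC instances are involved.

```d
import core.memory : GC;

// Hypothetical object handed across the DLL boundary.
class Thing
{
    int value = 42;
}

// Hypothetical export: allocates the object, roots it in the GC of
// *this* binary, and returns it through an out parameter.
extern (C) void createThing(void** result)
{
    auto obj = new Thing;
    GC.addRoot(cast(void*) obj);                     // keep it alive
    GC.setAttr(cast(void*) obj, GC.BlkAttr.NO_MOVE); // and keep it in place
    *result = cast(void*) obj;
}

// Hypothetical counterpart: undoes the rooting once the caller is done.
extern (C) void releaseThing(void* handle)
{
    GC.removeRoot(handle);
    GC.clrAttr(handle, GC.BlkAttr.NO_MOVE);
}

void main()
{
    void* handle;
    createThing(&handle);
    GC.collect();                       // the rooted object must survive this
    assert((cast(Thing) handle).value == 42);
    releaseThing(handle);
}
```

Every addRoot needs a matching removeRoot, otherwise the object is never collected.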
January 27, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to frame
On Tuesday, 26 January 2021 at 14:31:58 UTC, frame wrote:
> but why does not see DLL 1 then that sub objects of B are still alive?
I may be fooling myself, but could it be caused by slice data that is already gone? It very much looks like only one specific string property is corrupted: the one that was assigned the same slice data that came in as an input parameter.
I thought the slice data would stay referenced through the persistent object anyway, but the GC does not seem smart enough to detect this.
So far, the error can be prevented with .dup.
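A minimal sketch of the .dup workaround, assuming the persistent object simply stored the incoming slice before. The Record class and setName method are hypothetical stand-ins, not code from the thread. The point is that the object keeps a private copy allocated by its own GC instead of a slice into memory owned by the caller.

```d
import std.stdio : writeln;

// Hypothetical persistent object living on the DLL side.
class Record
{
    string name;

    // Stores a private copy, so the object no longer depends on the
    // caller's buffer staying alive or unchanged.
    void setName(const(char)[] input)
    {
        name = input.idup; // for a mutable char[] field, .dup does the same
    }
}

void main()
{
    char[] buffer = "hello".dup; // stand-in for memory owned by the caller
    auto rec = new Record;
    rec.setName(buffer);

    buffer[] = 'x'; // simulate the caller's memory being reused

    assert(rec.name == "hello"); // the copy is unaffected
    writeln(rec.name);
}
```

The copy costs one allocation per call, which is the price of not sharing the caller's memory.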
January 27, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to frame
On 1/26/21 6:31 AM, frame wrote:
> all GCs
Multiple D runtimes? That might work I guess but I've never heard of anybody talking about having multiple runtimes. Does rt_init() initialize *a* D runtime or *the* D runtime? If it indeed works we definitely need much better documentation.
I load my libraries with loadLibrary[1] so that "[if] the library contains a D runtime it will be integrated with the current runtime."
Ali
[1] https://dlang.org/library/core/runtime/runtime.load_library.html
January 27, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to Ali Çehreli
On Wednesday, 27 January 2021 at 17:41:05 UTC, Ali Çehreli wrote:
> On 1/26/21 6:31 AM, frame wrote:
>
> > all GCs
>
> Multiple D runtimes? That might work I guess but I've never heard of anybody talking about having multiple runtimes. Does rt_init() initialize *a* D runtime or *the* D runtime? If it indeed works we definitely need much better documentation.
>
> I load my libraries with loadLibrary[1] so that "[if] the library contains a D runtime it will be integrated with the current runtime."
>
> Ali
>
> [1] https://dlang.org/library/core/runtime/runtime.load_library.html
I have no idea if there are multiple runtimes. I just use the mixin SimpleDllMain. But there must be multiple instances of GCs running because
1) the command line argument --DRT-gcopt=gc:manual was seen by the EXE but ignored by the DLL, and it still crashed
2) after "burning in" gc:manual in the DLL, observing GC.profileStats.numCollections shows 0 in one DLL thread and > 0 in the other DLL thread, and thus it crashed. Or my debugger lied to me.
I also use loadLibrary.
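One way to cross-check the debugger is to log the counter from code. The sketch below reads GC.profileStats().numCollections, the same value mentioned above; the helper name collectionsSoFar is hypothetical. Calling such a helper from inside the EXE and from inside each DLL would show whether they report independent counters, which is what separate GC instances would produce.

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical helper: number of collections run so far by the GC
// instance linked into *this* binary.
size_t collectionsSoFar()
{
    return GC.profileStats().numCollections;
}

void main()
{
    immutable before = collectionsSoFar();
    GC.collect(); // request a collection
    immutable after = collectionsSoFar();

    // The counter never goes backwards.
    assert(after >= before);
    writefln("collections before: %s, after: %s", before, after);
}
```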
January 27, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to frame
On Wednesday, 27 January 2021 at 18:09:39 UTC, frame wrote:
> there must be multiple instances of GCs running because
Sharing data between multiple threads that each use a different instance of the D GC will definitely not work right, because each GC will only know to pause the threads and scan the roots that it has been directly informed of.
There is supposed to only be one instance of the D GC running per process. If you have more than one running then either you aren't linking and loading the DLLs correctly, or you have run into a serious bug in the D tooling.
> Or my debugger lied to me.
I have found the gdb debugger on Linux often breaks horribly on my D code, especially when it is multi-threaded, and the debugger is only semi-usable. Maybe the Windows debugger is better now? (I wouldn't know, since I haven't used it in a while.) I think skepticism is warranted here.
January 27, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to frame
On Wednesday, 27 January 2021 at 18:09:39 UTC, frame wrote:
> I have no idea if there are multiple runtimes. I just use the mixin SimpleDllMain. But there must be multiple instances of GCs running because
Another thread is running right now which I think is touching upon these same issues. Adam D. Ruppe explains some of what's going on:
https://forum.dlang.org/post/veeksndchoppftlujrwl@forum.dlang.org
Sadly, it looks like shared D DLLs are just kind of broken on Windows, unless you want to go the betterC route...
January 28, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to tsbockman
On Wednesday, 27 January 2021 at 22:57:11 UTC, tsbockman wrote:
> There is supposed to only be one instance of the D GC running per process. If you have more than one running then either you aren't linking and loading the DLLs correctly, or you have run into a serious bug in the D tooling.
What could I be doing wrong by just using SimpleDllMain and then adding my exports? The build line for the DLL is:
rdmd -shared --build-only -gf -m64
Under Linux everything is shared. Under Windows each DLL seems to run in its own thread, has its own rt_options, and does not see any __gshared variable value. It's completely isolated, so I assume the GC is too.
Also, https://wiki.dlang.org/Win32_DLLs_in_D says: Each EXE and DLL will have their own gc instance.
I also wonder why a statically linked DLL should use a GC proxy while SimpleDllMain does nothing with a proxy. Should loadLibrary() take care of this automatically? It seems it does not.
January 28, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to frame
On Thursday, 28 January 2021 at 07:50:43 UTC, frame wrote:
> Under Linux everything is shared. Under Windows each DLL seems to run in its own thread, has its own rt_options and do not see any __gshared variable value. Its completely isolated and so I assume that also GC is.
This stuff works correctly under Linux, and is quite broken in Windows. This has been known for years, but hasn't been fixed yet. This link from my other reply gives more details:
https://forum.dlang.org/post/veeksndchoppftlujrwl@forum.dlang.org
> Also https://wiki.dlang.org/Win32_DLLs_in_D says: Each EXE and DLL will have their own gc instance.
They each have their own GC instance because no one has fully fixed the problems discussed at my link, above, not because it's actually a good idea for them each to have their own GC instance.
It is possible to get things sort of working on Windows, anyway. But, this requires either:
A) Following all the same rules that you would need to follow if you wanted to share D GCed memory with another thread written in C. (Just adding GC roots is not enough.)
B) Ensuring that the GC proxy connections are properly established before doing anything else. This doesn't actually work correctly or reliably, but it might work well enough for your use case. Maybe.
January 28, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to tsbockman
On Thursday, 28 January 2021 at 19:22:16 UTC, tsbockman wrote:
> It is possible to get things sort of working with on Windows, anyway.
I'm OK with it as long as the memory is not reused by the GC. It seems that this can be prevented successfully with addRoot(). The other problem, with shared slice data, is somewhat logical: the DLL's GC doesn't care about the origin of data that comes from another thread, and the GC that owns the data sees every reference to it as gone once it has been passed to the DLL function. They are isolated, and data which must be kept longer should be copied where necessary.
January 28, 2021 Re: F*cked by memory corruption after assiging value to associative array
Posted in reply to frame
On Thursday, 28 January 2021 at 20:17:09 UTC, frame wrote:
> On Thursday, 28 January 2021 at 19:22:16 UTC, tsbockman wrote:
>> It is possible to get things sort of working with on Windows, anyway.
>
> I'm ok with it as long as the memory is not re-used by the GC. It seems that it can be prevented with addRoot() successfully.
GC.addRoot is not enough by itself. Each GC needs to know about every single thread that may own or mutate any pointer to memory managed by that particular GC. If a GC doesn't know, memory may be prematurely freed, and therefore wrongly re-used.
This is because when it scans memory for pointers to find out which memory is still in use, an untracked thread may be hiding a pointer on its stack or in registers, or it might move a pointer value from a location late in the scanning order to a location early in the scanning order while the GC is scanning the memory in between, such that the pointer value is not in either location *at the time the GC checks it*.
You won't be able to test for this problem easily, because it is non-deterministic and depends upon the precise timing with which each thread is scheduled and memory is synchronized. But, it will probably still bite you later.
If you were just manually creating additional threads unknown to the GC, you could tell the GC about them with core.thread.osthread.thread_attachThis and thread_detachThis. But, I don't know if those work right when there are multiple disconnected copies of D runtime running at the same time like this.
The official solution is to get the GC proxy connected properly from each DLL to the EXE. This is still very broken on Windows in other ways (again, explained at my link), but it should at least prevent the race condition I described above, as well as being more efficient than running multiple GCs in parallel.
Alternatively, you can design your APIs so that no pointer to GC memory is ever owned or mutated by any thread unknown to that GC. (This is the only option when working across language boundaries.)
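The last alternative can be sketched as a C-style API in which the callee copies its result into a buffer owned by the caller, so no pointer to the callee's GC memory ever crosses the boundary. This is only an illustration under assumptions: the function getName, its two-call "query size, then fetch" protocol, and the sample data are hypothetical, and the export attribute and DLL loading are left out to keep the example self-contained.

```d
import core.stdc.string : memcpy;

// Hypothetical exported function: instead of returning a GC-allocated
// slice, it copies the result into a buffer supplied (and owned) by the
// caller. It returns the number of bytes required; the copy happens
// only if the buffer is large enough.
extern (C) size_t getName(char* buffer, size_t capacity) nothrow @nogc
{
    static immutable string name = "object B"; // stand-in for real data
    if (buffer !is null && capacity >= name.length)
        memcpy(buffer, name.ptr, name.length);
    return name.length;
}

void main()
{
    // Caller side: query the size, allocate with its own GC, then fetch.
    immutable needed = getName(null, 0);
    auto buf = new char[](needed);
    immutable written = getName(buf.ptr, buf.length);

    assert(written == needed);
    assert(buf == "object B");
}
```

With this shape, each side only ever holds pointers to memory it allocated itself, at the cost of an extra copy per call.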
Copyright © 1999-2021 by the D Language Foundation