Thread overview | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
June 21, 2010 [Issue 4358] New: Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=4358 Summary: Potential Memory Leaks in std.file.read() ? Product: D Version: D1 Platform: x86 OS/Version: Windows Status: NEW Severity: normal Priority: P2 Component: DMD AssignedTo: nobody@puremagic.com ReportedBy: moi@vermi.fr --- Comment #0 from Vermi <moi@vermi.fr> 2010-06-21 10:13:49 PDT --- Created an attachment (id=668) Source file showing problem in memory managment Hi, it looks like the Garbage Collector don't free the memory used by std.file.read(). Here a simple example : [code] import std.file; import std.gc; void main() { for (int i=0; i<4096; i++) { void[] buffer = std.file.read("C:\\windows\\System32\\imageres.dll"); std.gc.fullCollect(); } } [/code] With this code I get a memory usage of ~200 MB. Now, simply change it with : [code] import std.file; import std.gc; void main() { for (int i=0; i<4096; i++) { void[] buffer = std.file.read("C:\\windows\\System32\\imageres.dll"); std.gc.fullCollect(); delete buffer; } } [/code] The memory usage is now about 20 MB. (the size of the file). It poses me a big problem with a program I'm writing, mermoy usage is rising until ~600 MB... Must I add delete everywhere in my code ? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 nfxjfg@gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |nfxjfg@gmail.com --- Comment #1 from nfxjfg@gmail.com 2010-06-21 21:55:39 PDT --- Congratulations, you just found out that D's GC implementation is fucking shit. There doesn't seem to be a bug in std.file.read. It just uses the GC to allocate a big chunk of memory, and returns it. There are probably false pointers. Note that the latest allocated buffer will not be free'd with fullCollect() (because you still have a reference to the buffer in that variable), but I assume imageres.dll is much smaller than 200 MB. Rule of thumb: never allocate big arrays with the GC. There's a patch in bug 3463 to alleviate the problem, but it seems to be abandoned. Though, there is one potential problem in the D2 version of std.file.read. It contains this line: auto buf = GC.malloc(size, GC.BlkAttr.NO_SCAN)[0 .. size]; buf is returned as void[] to the user. As far as I understand the recently introduced changes to array appending, the memory block for an array slice contains a hidden length field. This is not present in this case, thus random things may happen if you try to append to the returned array. I do not know if this actually is a problem. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 --- Comment #2 from Vermi <moi@vermi.fr> 2010-06-21 23:55:49 PDT --- It's very strange, I never had this kind of problem before. I must first finish my software, so I will add delete() in the code, but as soon as I have free time I will look if I can help with the D :) (I will take a look at GC's implementation). To answer you yes, imareges.dll is about 20 MB. I tried with a ~2MB file, and all worked well, so I guess there a limit size where the GC stop working correctly. I'll make some try this winter to see if I can make something. I take a look in the patch, but I need more than 2 minutes to understand the whole way the memory is managed in D. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 --- Comment #3 from nfxjfg@gmail.com 2010-06-22 00:15:35 PDT --- Most likely, the GC _is_ working correctly. It's a conservative GC, and it treats integers, floats, random binary data the same as actual pointers. The GC doesn't have enough information to know what is pointer and what not. If an integer value looks like a pointer to a memory block, it's called a "false pointer". This may lead to memory leaks. And the larger a memory block is, the higher the probability that a false pointer exists and will prevent the memory block from being free'd. At least that's my theory. Conclusion: use malloc() for really large arrays. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 --- Comment #4 from Vermi <moi@vermi.fr> 2010-06-22 00:23:46 PDT --- (In reply to comment #3) > Most likely, the GC _is_ working correctly. It's a conservative GC, and it treats integers, floats, random binary data the same as actual pointers. The GC doesn't have enough information to know what is pointer and what not. If an integer value looks like a pointer to a memory block, it's called a "false pointer". This may lead to memory leaks. > > And the larger a memory block is, the higher the probability that a false pointer exists and will prevent the memory block from being free'd. At least that's my theory. Conclusion: use malloc() for really large arrays. Maybe, depend of how the GC scan the stack. I will think about it. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 Leandro Lucarella <llucax@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |llucax@gmail.com --- Comment #5 from Leandro Lucarella <llucax@gmail.com> 2010-06-22 08:05:07 PDT --- (In reply to comment #3) > Most likely, the GC _is_ working correctly. It's a conservative GC, and it treats integers, floats, random binary data the same as actual pointers. This is false. The GC only treats void[] as a potential source of pointers, which makes sense, really. int[], float[], char[], byte[] are *NOT* scanned for pointers. For a binary buffer that doesn't have pointers in it, you should probably use ubyte[]. If std.file.read() return void[], I'd say that std.file.read() is broken. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 --- Comment #6 from Leandro Lucarella <llucax@gmail.com> 2010-06-22 08:11:50 PDT --- (In reply to comment #5) > (In reply to comment #3) > > Most likely, the GC _is_ working correctly. It's a conservative GC, and it treats integers, floats, random binary data the same as actual pointers. > > This is false. The GC only treats void[] as a potential source of pointers, which makes sense, really. int[], float[], char[], byte[] are *NOT* scanned for pointers. For a binary buffer that doesn't have pointers in it, you should probably use ubyte[]. If std.file.read() return void[], I'd say that std.file.read() is broken. I should add that, even when ubyte[] is not scanned for pointers, you are right about allocating big chunks of memory in the GC could lead to leaks, as the probabilities of having a false pointer pointing to that chunk of data gets really high. There is another bug report (with a patch too, and from David Simcha too) that helps with this problem: bug 2927. The idea is to add a new property to the GC to mark a chunk as not having interior pointers. If you mark a chunk of memory as not having interior pointers, the chances of a false pointer pointing to the beginning of your memory chunk becomes *very* low. It's a real shame that David's patches didn't get into druntime, as they are very helpful avoiding this kind of issues. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 --- Comment #7 from nfxjfg@gmail.com 2010-06-22 08:23:51 PDT --- std.file.read returns a void[], but it's allocated manually via gc.malloc(), and I think the no-pointers thing was done right. The correct solution to this problem would be to apply the patch in bug 3463 (precise heap scanning), and then probably investigate precise stack scanning. I believe the patch in bug 2927 (no interior pointers attribute) would not roll well with Andrei's safety ideology, so let's forget that. Note that enhancement 2927 could be simulated by using a wrapper object, that behaves like an array, but actually uses C malloc to allocate memory. When the wrapper gets collected, its finalizer can free() the memory. Or using reference counting, like (I believe) Andrei's TightArray (or what was it named) does. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
June 22, 2010 [Issue 4358] Potential Memory Leaks in std.file.read() ? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vermi | http://d.puremagic.com/issues/show_bug.cgi?id=4358 --- Comment #8 from Leandro Lucarella <llucax@gmail.com> 2010-06-22 10:03:22 PDT --- (In reply to comment #7) > std.file.read returns a void[], but it's allocated manually via gc.malloc(), and I think the no-pointers thing was done right. And what is the rationale for not returning ubyte[]? > The correct solution to this problem would be to apply the patch in bug 3463 (precise heap scanning), and then probably investigate precise stack scanning. You can't have full precise scanning in D, because of unions and other low-level constructs, so the problem of false pointers will be still there even with that patch (of course the chances of having a false pointer will be much lower). > I believe the patch in bug 2927 (no interior pointers attribute) would not roll well with Andrei's safety ideology, so let's forget that. That's stupid, it's not unsafer than NO_SCAN or casting a pointer to int, or whatever construct you like. That's obviously a construct that not everybody will use. So let's not forget that (and putting words in somebody else's mouth is not a good argument either =). > Note that enhancement 2927 could be simulated by using a wrapper object, that behaves like an array, but actually uses C malloc to allocate memory. When the wrapper gets collected, its finalizer can free() the memory. Or using reference counting, like (I believe) Andrei's TightArray (or what was it named) does. That's true, is a little more convoluted than having NO_INTERIOR and is not as easy to use by the compiler itself, but could be another option. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- |
Copyright © 1999-2021 by the D Language Foundation