Profiling the memory in D
December 04, 2019
I am used to having cool tools like Valgrind Massif to visualize memory usage in C++, but in D it seems I am blindfolded while looking for the problem.

Until now I tried:

-vgc option, which shows a million things, which makes it not useful
--build=profile-gc
which seems to slow down my program about 200 times, and I have some time-dependent operations, which makes reproducing the real behaviour of the application impossible. The observation itself affects the experiment very badly.

Since these options are not useful for me, I tried to put debug logs in critical places to estimate where I am losing memory. I couldn't find a native D call to get the memory usage of the program. I had to call a shell command, which also slows down the app. This slowdown prevents me from putting this debug log in too many places or in critical loops.

I managed to narrow it down a bit. Then I wanted to check the object instances to find out which objects are getting bigger. I tried
GC.sizeOf(&object), which always prints 0. GC.sizeOf also does not work with associative arrays; I don't see any useful case other than GC.sizeOf(array.ptr).

That is the background of my questions and my questions are:

Is there a better way to visualize memory in D, something like Valgrind Massif?

profile-gc literally makes the program not do anything due to the slowdown. Is there any way to run profile-gc in a faster fashion?

Is there a native function which will return the memory usage of my program?

Is there a function to return the sizes of AAs and even class instances?

Erdemdem



December 04, 2019
On 12/4/19 3:10 AM, Erdem wrote:
> I am used to having cool tools like Valgrind Massif to visualize memory usage in C++, but in D it seems I am blindfolded while looking for the problem.
> 
> Until now I tried:
> 
> -vgc option, which shows a million things, which makes it not useful
> --build=profile-gc
> which seems to slow down my program about 200 times, and I have some time-dependent operations, which makes reproducing the real behaviour of the application impossible. The observation itself affects the experiment very badly.
> 
> Since these options are not useful for me, I tried to put debug logs in critical places to estimate where I am losing memory. I couldn't find a native D call to get the memory usage of the program. I had to call a shell command, which also slows down the app. This slowdown prevents me from putting this debug log in too many places or in critical loops.
> 
> I managed to narrow it down a bit. Then I wanted to check the object instances to find out which objects are getting bigger. I tried
> GC.sizeOf(&object), which always prints 0. GC.sizeOf also does not work with associative arrays; I don't see any useful case other than GC.sizeOf(array.ptr).
> 
> That is the background of my questions and my questions are:
> 
> Is there a better way to visualize memory in D, something like Valgrind Massif?
> 
> profile-gc literally makes the program not do anything due to the slowdown. Is there any way to run profile-gc in a faster fashion?
> 
> Is there a native function which will return the memory usage of my program?

If it's only total GC memory you are interested in, then try the D runtime switch for the GC: --DRT-gcopt=profile:1

This will print out a summary of GC usage at the end of your program, and shouldn't significantly affect runtime.
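For sampling GC usage from inside the program instead of only at exit, core.memory also offers GC.stats. The snippet below is only a minimal sketch and is not from this thread: the helper name gcUsedBytes and the 4 MB buffer are made up for illustration, and the numbers cover the GC heap only, not malloc'd memory or the total process size.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Hypothetical helper: bytes currently in use on the GC heap.
size_t gcUsedBytes()
{
    return GC.stats().usedSize;
}

void main()
{
    immutable before = gcUsedBytes();

    // Allocate something noticeable on the GC heap and keep it alive.
    auto buffer = new ubyte[](4 * 1024 * 1024);

    immutable after = gcUsedBytes();
    writefln("GC used: %s -> %s bytes (buffer holds %s bytes)",
             before, after, buffer.length);
}
```

Because this is a plain function call, it can be logged from inside a loop without shelling out to an external command.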

> 
> Is there a function to return the sizes of AAs and even class instances?
> 

Class instances have a compiler-defined size. Try __traits(classInstanceSize, SomeClass)

This is the compile-time size of that specific type. If you have a class instance and you want the actual class size, as in the case of a derived instance, you can retrieve that using typeid(classInstance).initializer.length.

Note also that this is not necessarily the size consumed from the GC! If you want *that* size, you need to use GC.sizeOf(cast(void*)object) (you were almost correct; what you did was get the GC size of the *class reference*, which is really a pointer and really lives on the stack, hence the 0).
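The three sizes can be compared side by side. This is just an illustrative sketch: Base, Derived, dynamicSize and gcBlockSize are made-up names, and the exact numbers depend on the platform and on the GC's block sizes, so none are shown.

```d
import core.memory : GC;
import std.stdio : writeln;

class Base { int a; }
class Derived : Base { long[8] payload; }

/// Instance size of the object's dynamic (most derived) class.
size_t dynamicSize(Object o)
{
    return typeid(o).initializer.length;
}

/// Size of the GC block that backs the instance.
size_t gcBlockSize(Object o)
{
    return GC.sizeOf(cast(void*) o);
}

void main()
{
    // A Derived instance held through a Base reference.
    Base obj = new Derived;

    writeln("static size of Base:   ", __traits(classInstanceSize, Base));
    writeln("dynamic size of obj:   ", dynamicSize(obj));
    writeln("GC block behind obj:   ", gcBlockSize(obj));

    // &obj is the address of the reference on the stack, not GC memory.
    writeln("GC.sizeOf(&obj):       ", GC.sizeOf(&obj));
}
```

The GC block is at least as large as the dynamic size, because the allocator rounds requests up to one of its block sizes.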

-Steve
December 04, 2019
On Wednesday, 4 December 2019 at 15:38:36 UTC, Steven Schveighoffer wrote:

GC.sizeOf(cast(void*)object) will be super useful. I will use that.

I also already tried --DRT-gcopt=profile:1. It provides so little information.
I need to find out which member AA or array of which object is causing this memory problem of mine. I am ending up at around 2 GB of RAM usage in a single day.

Is there any way to manipulate the profile-gc flag at run time? For example, I would start my program without it and turn it on after my program has initialized.

One last thing: in my program I sometimes get a message from vibe like
"leaking eventcore driver because there are still active handles". I use websockets and do web requests. I wanted to add that because I saw you fixed something with Redis about that in https://github.com/vibe-d/vibe.d/issues/2245.

Thanks for your help and replies Steve.


December 04, 2019
On 12/4/19 5:04 PM, kerdemdemir wrote:

> GC.sizeOf(cast(void*)object) will be super useful. I will use that.
> 
> I also already tried --DRT-gcopt=profile:1. It provides so little information.
> I need to find out which member AA or array of which object is causing this memory problem of mine. I am ending up at around 2 GB of RAM usage in a single day.

These are not easy to discover. Are you using a 32-bit compiler? If so, it has an issue with false pointers: basically, a large piece of memory can get pinned by a non-pointer value that happens to be on the stack. This is not so much an issue in 64-bit land because the chance of an accidental pointer is infinitesimal.

> 
> Is there any way to manipulate the profile-gc flag at run time? For example, I would start my program without it and turn it on after my program has initialized.

I'm not very familiar with the GC profiling features. Lately I have had to measure max GC size, which this does provide. Sorry I can't be more help there.

> 
> One last thing: in my program I sometimes get a message from vibe like
> "leaking eventcore driver because there are still active handles". I use websockets and do web requests. I wanted to add that because I saw you fixed something with Redis about that in https://github.com/vibe-d/vibe.d/issues/2245.

What happens is that the eventcore driver provides a mechanism to allocate an implementation-specific chunk of data for each descriptor. This is done via C malloc. In the past, when the eventcore driver was shut down, these spaces were all deallocated when the driver was deallocated. If you had classes which held descriptors that were cleaned up by the GC, they would then try to use that space and generate a segfault.

So the eventcore driver will leak when it sees descriptors still in use, even when the GC is trying to clean it up, because it doesn't know when the file descriptors will be fully released.

To fix, you need to ensure all file descriptors are destroyed deterministically. For the Redis problem, it was really difficult to solve without altering vibe.d itself, because the Redis session manager did not provide a mechanism to release all resources in the pool deterministically.

Tracking down which file descriptors are still being held, and who allocated them, is not an easy task. I had to instrument a lot of stuff to find it. One easy possibility is that you aren't closing all listening sockets before exiting main.
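As a rough illustration of what "destroyed deterministically" means, here is a small sketch that does not use vibe.d at all. Handle and serve are hypothetical stand-ins for anything that owns a descriptor; the point is only that the release happens through an explicit close() tied to scope(exit), not through a GC finalizer.

```d
import std.stdio : writeln;

/// Hypothetical stand-in for anything that owns an OS descriptor
/// (a listener, a socket, a connection pool entry).
final class Handle
{
    static int open;          // number of handles not yet closed
    private bool closed;

    this() { ++open; }

    /// Explicit, idempotent release; does not rely on the GC finalizer.
    void close()
    {
        if (closed) return;
        closed = true;
        --open;
    }
}

/// Does some work and closes the handle on every exit path.
void serve(bool fail)
{
    auto h = new Handle;
    scope (exit) h.close();   // runs on normal return and on exceptions

    if (fail)
        throw new Exception("simulated failure");
}

void main()
{
    serve(false);
    try { serve(true); } catch (Exception) {}
    writeln("handles still open: ", Handle.open); // prints 0
}
```

With this pattern nothing is left for the GC to close at shutdown, which is the situation the driver's leak message is warning about.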

I will say that I still get these messages if I kill my web server while some keepalive connections are still open. But now, if there is no activity for about 10 seconds, then I get no messages.

Good luck tracking it down!

-Steve
December 07, 2019
On Wednesday, 4 December 2019 at 22:51:45 UTC, Steven Schveighoffer wrote:

I localized the leak: it was actually being caused by websockets. I want to write down my experience because I did some weird stuff which seems to be working, but I want to learn whether it actually makes sense and how it works.

I had a fancy workflow, caused by the bug https://github.com/vibe-d/vibe.d/issues/2169, which prevented opening multiple sockets from my main process.

I solved that by launching multiple processes and using one websocket per process. These processes communicated with my main process via zmqd.

My suggestion is: don't do that. Don't be super creative with the current vibe.d websockets. I had this unidentifiable leak which took too much time to localize.

The bug I filed around one year ago is solved now, so I abandoned the multi-process approach and put the websockets into a list.

The next problem I had, while listening on 300 websockets, was that I got some crashes within the
webSocket.dataAvailableForRead() function. I am not sure if it is a bug or my misuse somehow, so I haven't filed a bug yet. But my question to the vibe forum can be seen at https://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/thread/112309/

I solved that (I hope) by using something like:

if ( webSocket.waitForData(0.msecs) && webSocket.dataAvailableForRead() )

I know it looks so weird to wait for 0.msecs, but things seem to be working :/ .

The last problem I had was with closing the socket, because these sockets get disconnected from time to time and I need to close and reopen them. When I called
webSocket.close() directly, after my program had run for about 1 day it was freezing in a function called epoll while calling webSocket.close().

I also found another weird solution to that problem:

while ( true )
{
    // Start close() asynchronously and give it 100 ms to finish.
    auto future = vibe.core.concurrency.async( { socket.socket.close(); return true; } );
    vibe.core.core.sleep(100.msecs);
    if ( future.ready() )
        break;
    writeln( " Couldn't close the socket, retrying ");
}
// BTW the order of this removal matters: if you remove the socket
// from your list before closing it, you are screwed.
sockets.remove(uniqStreamName);


This seems to be working with 300 websockets for around 2 days without any leak, crash or freeze.

As I pointed out in the beginning, I don't have any questions or problems now, but I am very open to feedback if you guys have any.

Erdemdem