btdu - a sampling disk usage profiler for btrfs (written in D)
November 08
https://blog.cy.md/2020/11/08/btdu-sampling-disk-usage-profiler-for-btrfs/

https://github.com/CyberShadow/btdu

D-related thoughts:

- D programs that build fine on one Linux machine may still fail to build with mysterious linking errors on another, even when using Dub which takes care of dependency management. I saw two instances of this, caused by differences in DMD/LDC and Arch/Debian (one being that, for whatever reason, libz is not pulled in on LDC/Debian despite being a Phobos dependency). Also, LDC is the D compiler that's installed by default when the system wants a D compiler (e.g. if you try to install Dub by itself).

- The garbage collector is still a major hindrance for system programming. In this case it was due to the ioctls used being slow, and when the GC tries to stop the world to do its thing, it just hangs the entire program until ALL ioctls in all threads complete. This means it wasn't possible to have a stutter-free interactive UI, so I had to move processing to subprocesses.

- One user wondered why the program needed so many threads. The answer was that half of them were owned by the GC (it never stops its worker threads, they just sit idle).

- I used the Deimos ncurses bindings package. I'm thankful that it already existed, though I had to push some fixes to fix static linking. The most annoying part was waiting overnight for code.dlang.org to pick up the new tags, because there is no way to get it to update a package unless you're the owner, and no way to otherwise specify a dependency unless using a branch (which is deprecated and prints a big warning when your users build your program).

- Nice D features that came in useful: reflection to generate a lightweight serializer/deserializer for subprocess communication; strings as slices to allow processing them without copying them out of the network buffer; and template mixins to add common behavior to types without runtime polymorphism.
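
To illustrate the reflection and slicing points, here is a minimal sketch, not btdu's actual code. `Sample`, `serialize` and `deserialize` are hypothetical names, and the wire format (raw field bytes, with a 32-bit length prefix for strings) is an assumption made for the example. It walks the fields with `.tupleof` at compile time, and the deserialized string is a slice into the input buffer rather than a copy.

```d
import core.stdc.string : memcpy;

/// Hypothetical message type, invented for this example.
struct Sample
{
    ulong offset;
    uint duration;
    string path;
}

/// Serialize every field of `value`, in declaration order.
ubyte[] serialize(T)(const T value)
{
    ubyte[] buf;
    foreach (field; value.tupleof)   // unrolled at compile time, one step per field
    {
        static if (is(typeof(field) : const(char)[]))
        {
            // Strings: 32-bit length prefix, then the raw bytes.
            uint len = cast(uint) field.length;
            buf ~= (cast(const(ubyte)*) &len)[0 .. len.sizeof];
            buf ~= cast(const(ubyte)[]) field;
        }
        else
        {
            // Scalars: raw bytes in native byte order.
            buf ~= (cast(const(ubyte)*) &field)[0 .. field.sizeof];
        }
    }
    return buf;
}

/// Rebuild a `T` from `buf`. String fields alias `buf`; nothing is copied.
T deserialize(T)(const(ubyte)[] buf)
{
    T value;
    foreach (ref field; value.tupleof)
    {
        alias F = typeof(field);
        static if (is(F == string))
        {
            uint len;
            memcpy(&len, buf.ptr, len.sizeof);
            buf = buf[len.sizeof .. $];
            // Assumption: the caller never mutates `buf` afterwards,
            // so viewing it as immutable text is safe.
            field = cast(string) buf[0 .. len];
            buf = buf[len .. $];
        }
        else
        {
            memcpy(&field, buf.ptr, F.sizeof);
            buf = buf[F.sizeof .. $];
        }
    }
    return value;
}

void main()
{
    auto original = Sample(4096, 12, "/home/user/file");
    auto bytes = serialize(original);
    auto copy = deserialize!Sample(bytes);
    assert(copy == original);
}
```

The cast to `string` promises that the buffer will not be modified while the slice is alive, which is the price of skipping the copy.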

November 09
On Sunday, 8 November 2020 at 17:23:32 UTC, Vladimir Panteleev wrote:
> https://blog.cy.md/2020/11/08/btdu-sampling-disk-usage-profiler-for-btrfs/
>
> https://github.com/CyberShadow/btdu
>
> D-related thoughts:
>
> - D programs that build fine on one Linux machine may still fail to build with mysterious linking errors on another, even when using Dub which takes care of dependency management. I saw two instances of this, caused by differences in DMD/LDC and Arch/Debian (one being that, for whatever reason, libz is not pulled in on LDC/Debian despite being a Phobos dependency). Also, LDC is the D compiler that's installed by default when the system wants a D compiler (e.g. if you try to install Dub by itself).
>
> - The garbage collector is still a major hindrance for system programming. In this case it was due to the ioctls used being slow, and when the GC tries to stop the world to do its thing, it just hangs the entire program until ALL ioctls in all threads complete. This means it wasn't possible to have a stutter-free interactive UI, so I had to move processing to subprocesses.
>
> - One user wondered why the program needed so many threads. The answer was that half of them were owned by the GC (it never stops its worker threads, they just sit idle).
>
> - I used the Deimos ncurses bindings package. I'm thankful that it already existed, though I had to push some fixes to fix static linking. The most annoying part was waiting overnight for code.dlang.org to pick up the new tags, because there is no way to get it to update a package unless you're the owner, and no way to otherwise specify a dependency unless using a branch (which is deprecated and prints a big warning when your users build your program).
>
> - Nice D features that came in useful: reflection to generate a lightweight serializer/deserializer for subprocess communication; strings as slices to allow processing them without copying them out of the network buffer; and template mixins to add common behavior to types without runtime polymorphism.

I like the report about how efficient D was for developing this tool. Otherwise,
what do you use it for? What is the typical usage of such tools?
November 09
On Monday, 9 November 2020 at 12:21:55 UTC, user1234 wrote:
> I like the report about how efficient D was for developing this tool. Otherwise,
> what do you use it for? What is the typical usage of such tools?

Well, the README and linked blog post answer that to some extent, but my personal use cases are actually tangential to D, so I can write more about that here.

I've been using btrfs on my home system ever since switching to Linux full-time, and a few years ago I switched over the server (hosting this forum / the wiki / some other services) to it too. This allowed us to have incremental, atomic, hourly, off-site backups, which actually saved our butts big-time when the hosting provider decided to shut off the server over a clerical issue in the distant year of 2019. Some snapshots are also retained for a while to allow rollbacks or undelete files in case I fat-finger something during maintenance.

One of btrfs's boons is that across subvolumes and clones, deduplication allows reusing the same unique block across many files and snapshots, which saves space and is also what enables atomic snapshots to work (with successive writes being COW). If you add compression on top of that, it can be challenging to understand what is actually using how much space. Since storage costs are not insignificant on a FOSS budget, it does need to be managed, and I was missing a tool that would help do this. Another unique benefit of btdu is that it starts displaying results almost instantly, which is great when the disk is full, everything is on fire, and you need to free up some disk space right now.

November 09
On Monday, 9 November 2020 at 12:52:12 UTC, Vladimir Panteleev wrote:
> On Monday, 9 November 2020 at 12:21:55 UTC, user1234 wrote:
>> I like the report about how efficient D was for developing this tool. Otherwise,
>> what do you use it for? What is the typical usage of such tools?
>
> Well, the README and linked blog post answer that to some extent, but my personal use cases are actually tangential to D, so I can write more about that here.
>
> I've been using btrfs on my home system ever since switching to Linux full-time, and a few years ago I switched over the server (hosting this forum / the wiki / some other services) to it too. This allowed us to have incremental, atomic, hourly, off-site backups, which actually saved our butts big-time when the hosting provider decided to shut off the server over a clerical issue in the distant year of 2019. Some snapshots are also retained for a while to allow rollbacks or undelete files in case I fat-finger something during maintenance.
>
> One of btrfs's boons is that across subvolumes and clones, deduplication allows reusing the same unique block across many files and snapshots, which saves space and is also what enables atomic snapshots to work (with successive writes being COW). If you add compression on top of that, it can be challenging to understand what is actually using how much space. Since storage costs are not insignificant on a FOSS budget, it does need to be managed, and I was missing a tool that would help do this. Another unique benefit of btdu is that it starts displaying results almost instantly, which is great when the disk is full, everything is on fire, and you need to free up some disk space right now.

All right, it's clearer now, thanks for the clarifications ;)
November 09
On Sunday, 8 November 2020 at 17:23:32 UTC, Vladimir Panteleev wrote:
> ...
> - The garbage collector is still a major hindrance for system programming. In this case it was due to the ioctls used being slow, and when the GC tries to stop the world to do its thing, it just hangs the entire program until ALL ioctls in all threads complete. This means it wasn't possible to have a stutter-free interactive UI, so I had to move processing to subprocesses.
> ...

I read about GC issues like this very often and my question is: Can't GC be set just to run without collecting anything, and manually set it to collect after a process is finished?

Matheus.
November 09
On Monday, 9 November 2020 at 13:33:50 UTC, matheus wrote:
> I read about GC issues like this very often and my question is: Can't GC be set just to run without collecting anything, and manually set it to collect after a process is finished?

You can disable the GC and you can run it manually, but this wouldn't help in this case, because the ioctls are run across threads in an overlapping way. It would be possible if the program was designed such that every once in a while, the main thread tells all worker threads "OK, let's do a GC, so nobody start any new ioctls for now", runs the GC when the last ioctl finishes, and then lets the worker threads start ioctls again. But this means that up to all but one of the worker threads would sit idle, waiting for the last ioctl to finish. ioctl duration varies from milliseconds to seconds in this case, so it would noticeably affect throughput.
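
For reference, disabling the GC and running it manually are both done through `core.memory.GC`. A minimal sketch, assuming a hypothetical helper `withDeferredCollection` (not something btdu contains) that suppresses automatic collections around a piece of work and then collects once at a point the caller chooses:

```d
import core.memory : GC;

/// Hypothetical helper: run `work` with automatic collections suppressed,
/// then do one explicit collection afterwards.
int withDeferredCollection(scope int delegate() work)
{
    GC.disable();           // suppress automatic collection cycles
    scope (exit)
    {
        GC.enable();        // every disable() must be paired with an enable()
        GC.collect();       // run a full collection now, at a known point
    }
    return work();          // allocations still succeed while disabled
}

void main()
{
    int result = withDeferredCollection({
        int[] data;
        foreach (i; 0 .. 1000)
            data ~= i;      // GC allocation without an automatic collection
        return cast(int) data.length;
    });
    assert(result == 1000);
}
```

Note that `GC.collect()` is still a stop-the-world collection, so it would wait on threads blocked in ioctls just as an automatic collection does. The runtime may also still collect while disabled if it runs out of memory.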

November 09
On 11/9/20 8:41 AM, Vladimir Panteleev wrote:
> On Monday, 9 November 2020 at 13:33:50 UTC, matheus wrote:
>> I read about GC issues like this very often and my question is: Can't GC be set just to run without collecting anything, and manually set it to collect after a process is finished?
> 
> You can disable the GC and you can run it manually, but this wouldn't help in this case, because the ioctls are run across threads in an overlapping way. It would be possible if the program was designed such that every once in a while, the main thread tells all worker threads "OK, let's do a GC, so nobody start any new ioctls for now", runs the GC when the last ioctl finishes, and then lets the worker threads start ioctls again. But this means that up to all but one of the worker threads would sit idle, waiting for the last ioctl to finish. ioctl duration varies from milliseconds to seconds in this case, so it would noticeably affect throughput.
> 

It would still help, I think, because, for instance, the UI is probably not running ioctls, and so it wouldn't pause while you are waiting for the ioctl-running threads to finish.

-Steve
November 10
On Sunday, 8 November 2020 at 17:23:32 UTC, Vladimir Panteleev wrote:

> - D programs that build fine on one Linux machine may still fail to build with mysterious linking errors on another, even when using Dub which takes care of dependency management. I saw two instances of this, caused by differences in DMD/LDC and Arch/Debian (one being that, for whatever reason, libz is not pulled in on LDC/Debian despite being a Phobos dependency). Also, LDC is the D compiler that's installed by default when the system wants a D compiler (e.g. if you try to install Dub by itself).

I don't think this is specific to D. In the past I've seen problems caused by package maintainers not building the package in the same way as upstream, or splitting up a package into multiple packages.

> - The garbage collector is still a major hindrance for system programming. In this case it was due to the ioctls used being slow, and when the GC tries to stop the world to do its thing, it just hangs the entire program until ALL ioctls in all threads complete.

You should probably never let the GC run on a realtime thread, like audio or video processing (not sure if ioctls fall into this category). These days, modern UIs should probably fall into the realtime category.

> This means it wasn't possible to have a stutter-free interactive UI, so I had to move processing to subprocesses.

I'm not sure if it's possible to ever have a completely stutter-free UI with a stop-the-world GC.

> - One user wondered why the program needed so many threads. The answer was that half of them were owned by the GC (it never stops its worker threads, they just sit idle).

Is that the answer? I mean, the GC doesn't create any threads by itself, does it?

> - I used the Deimos ncurses bindings package. I'm thankful that it already existed, though I had to push some fixes to fix static linking. The most annoying part was waiting overnight for code.dlang.org to pick up the new tags, because there is no way to get it to update a package unless you're the owner, and no way to otherwise specify a dependency unless using a branch (which is deprecated and prints a big warning when your users build your program).

Since 2.094.0, you can specify a Git repository as a dependency [1]. You can also specify a local path as a dependency [2], useful when developing a library and an application at the same time, as two separate Dub packages.

[1] https://dlang.org/changelog/2.094.0.html#git-paths
[2] https://dub.pm/package-format-sdl.html#version-specs
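
As a rough illustration of both options, a `dub.sdl` fragment might look like the following. This is a sketch under assumptions: the package names, the repository URL and the relative path are placeholders, and as far as I understand the 2.094.0 changelog entry, `version` holds a commit hash when `repository` is used.

```sdl
// dub.sdl (fragment); names, URL and path are placeholders

// Dependency taken straight from a Git repository, pinned to one commit.
dependency "somelib" repository="git+https://example.com/someone/somelib.git" version="<commit hash>"

// Dependency taken from a local checkout, e.g. a library developed side by side.
dependency "mylib" path="../mylib"
```

Pinning to a commit means users build exactly the revision you tested, without waiting for the registry to pick up a new tag.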
November 10
On Tuesday, 10 November 2020 at 09:40:33 UTC, Jacob Carlborg wrote:
> On Sunday, 8 November 2020 at 17:23:32 UTC, Vladimir Panteleev wrote:
>
>> - D programs that build fine on one Linux machine may still fail to build with mysterious linking errors on another, even when using Dub which takes care of dependency management. I saw two instances of this, caused by differences in DMD/LDC and Arch/Debian (one being that, for whatever reason, libz is not pulled in on LDC/Debian despite being a Phobos dependency). Also, LDC is the D compiler that's installed by default when the system wants a D compiler (e.g. if you try to install Dub by itself).
>
> I don't think this is specific to D. In the past I've seen problems caused by package maintainers not building the package in the same way as upstream, or splitting up a package into multiple packages.

I think it might be less of a problem in e.g. Go.

>> - The garbage collector is still a major hindrance for system programming. In this case it was due to the ioctls used being slow, and when the GC tries to stop the world to do its thing, it just hangs the entire program until ALL ioctls in all threads complete.
>
> You should probably never let the GC run on a realtime thread, like audio or video processing (not sure if ioctls fall into this category). These days, modern UIs should probably fall into the realtime category.

Doing UI without GC in D would be pretty painful.

But, by itself, the GC doesn't add enough latency to introduce stutter in the UI - a GC scan is generally quick enough that the UI doesn't feel laggy or stuttery. The problem is that the GC is waiting for all threads to finish their ioctls, while the program otherwise is completely suspended. This affects not just UI, but throughput.

>> - One user wondered why the program needed so many threads. The answer was that half of them were owned by the GC (it never stops its worker threads, they just sit idle).
>
> Is that the answer? I mean, the GC doesn't create any threads by itself, does it?

Yes, it does, since the introduction of parallel heap scanning in 2.087:

https://dlang.org/changelog/2.087.0.html#gc_parallel
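
If the extra threads are unwanted, the same changelog entry describes a `parallel` GC option that controls how many additional marking threads are used. A minimal sketch, assuming you want to turn parallel marking off for one binary by embedding the option in the executable:

```d
import core.memory : GC;

// Runtime options embedded in the executable; druntime reads them at startup.
// "parallel:0" asks the GC to use no additional marking threads.
extern (C) __gshared string[] rt_options = ["gcopt=parallel:0"];

void main()
{
    auto data = new int[](1_000_000);
    data[0] = 1;
    GC.collect();   // marking runs without helper threads
}
```

The same option can be passed at startup without rebuilding, as `--DRT-gcopt=parallel:0` on the program's command line.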

>> - I used the Deimos ncurses bindings package. I'm thankful that it already existed, though I had to push some fixes to fix static linking. The most annoying part was waiting overnight for code.dlang.org to pick up the new tags, because there is no way to get it to update a package unless you're the owner, and no way to otherwise specify a dependency unless using a branch (which is deprecated and prints a big warning when your users build your program).
>
> Since 2.094.0, you can specify a Git repository as a dependency [1]. You can also specify a local path as a dependency [2], useful when developing a library and an application at the same time, as two separate Dub packages.
>
> [1] https://dlang.org/changelog/2.094.0.html#git-paths
> [2] https://dub.pm/package-format-sdl.html#version-specs

This is super useful. Thanks.

November 10
On Tuesday, 10 November 2020 at 10:42:09 UTC, Vladimir Panteleev wrote:
> But, by itself, the GC doesn't add enough latency to introduce stutter in the UI - a GC scan is generally quick enough that the UI doesn't feel laggy or stuttery. The problem is that the GC is waiting for all threads to finish their ioctls, while the program otherwise is completely suspended. This affects not just UI, but throughput.

Would a thread local GC with reference counted shared objects work for your use case?
