June 20, 2017
On Tuesday, 20 June 2017 at 11:49:49 UTC, Jacob Carlborg wrote:
> On 2017-06-20 06:54, ketmar wrote:
>
>> [...]
>
> You need to move to 64-bit. Apple is already deprecating support for 32-bit apps, and after the next version of macOS (High Sierra) they're going to remove it entirely.

I highly doubt that ketmar would have any intention of touching macOS regardless ;)
Besides, there are many domains where the x32 ABI is a more worthwhile upgrade from i686 than x86_64.
June 20, 2017
On Tuesday, 20 June 2017 at 11:44:41 UTC, rikki cattermole wrote:
> On 20/06/2017 12:41 PM, Jacob Carlborg wrote:
>> On 2017-06-20 06:37, ketmar wrote:
>> 
>>> it highly depends on undocumented Windows internals, and is not portable between Windows versions. More-or-less working implementations of `fork()` have existed at least since the NT3 era, but nobody considered them more than a PoC, and even the next service pack could break everything.
>> 
>> I'm wondering what Windows 10 uses to implement "fork" for the Windows Subsystem for Linux, whether it's these internal functions or something else.
>
> It wouldn't surprise me to learn that it was a POSIX-layer-specific syscall, meaning we can't use it from a native Windows process.

The Windows Subsystem for Linux is built on a new form of processes called picoprocesses. There's a whole API built specifically to service WSL that's not otherwise available (AFAIR) to normal processes, for security reasons.

I highly recommend watching this talk: https://www.youtube.com/watch?v=36Ykla27FIo and browsing through this repo: https://github.com/ionescu007/lxss which reveals many interesting details about that part of Windows.

I watched that talk a while ago and may have misremembered something, but my understanding is that the WSL infrastructure is off limits to normal Win32 processes, and as such is not suitable for implementing CoW pages for D's GC.
(I watched that talk specifically because I was interested if some of that could be used in druntime.)
June 20, 2017
On Tuesday, 20 June 2017 at 07:11:10 UTC, Dmitry Olshansky wrote:
> On Monday, 19 June 2017 at 22:50:05 UTC, Adam D. Ruppe wrote:
>> What is it about Windows that makes you call it a distant possibility? Is it just that you are unfamiliar with it or is there some specific OS level feature you plan on needing?
>
> This is mostly because I wanted to abuse the lazy commit of POSIX systems. Now that I think of it, Windows is mostly OK, except for the fork trick used in the concurrent GC. As Vladimir pointed out, on Windows there are other ways to do it, but they are more involved.
>
> ---
> Dmitry Olshansky

BTW, Rainer Schuetze has studied this in detail and has written down some of it here: http://rainers.github.io/visuald/druntime/concurrentgc.html
June 20, 2017
> My take on D's GC problem, also spoiler - I'm going to build a new one soonish.
>
> http://olshansky.me/gc/runtime/dlang/2017/06/14/inside-d-gc.html
>
> ---
> Dmitry Olshansky

Many thanks for your efforts Dmitry :)

May I ask if you plan to make a soft real-time GC similar to the one implemented in the Nim language?

https://nim-lang.org/docs/gc.html
https://nim-lang.org/docs/intern.html#debugging-nim-s-memory-management

What is great about it is that it can be called regularly to collect memory a bit at a time, with a maximum delay for the operation.

Being able to manually specify the maximum GC delay is what makes Nim compatible with game development, as collections can be performed incrementally, and on a per-thread basis.

In the worst case, we know that just one of the application threads will be delayed for a few milliseconds between two frame renderings, which is generally acceptable for games and other similar applications.

Moreover, this opens the opportunity to call the GC only in the main menu or the pause menu, for instance, but not during actual gameplay, so that even these few lost milliseconds go unnoticed.

This is probably why Nim's author was once paid to wrap an open-source game engine (Urho3D) and to improve the language's native compatibility with C++ libraries.

https://forum.nim-lang.org/t/870

June 20, 2017
On Tuesday, 20 June 2017 at 15:16:01 UTC, Ecstatic Coder wrote:
>> My take on D's GC problem, also spoiler - I'm going to build a new one soonish.
>>
>> http://olshansky.me/gc/runtime/dlang/2017/06/14/inside-d-gc.html
>>
>> ---
>> Dmitry Olshansky
>
> Many thanks for your efforts Dmitry :)
>
> May I ask if you plan to make a soft real-time GC similar to the one implemented in the Nim language?
>
> https://nim-lang.org/docs/gc.html
> https://nim-lang.org/docs/intern.html#debugging-nim-s-memory-management
>
> What is great about it is that it can be called regularly to collect memory a bit at a time, with a maximum delay for the operation.
>

No incremental GC, sorry. It may grow thread-local collection one day, once the spec is precise about what is allowed and what is not.


June 20, 2017
On Tue, Jun 20, 2017 at 07:47:13AM +0000, Dmitry Olshansky via Digitalmars-d wrote:
> On Monday, 19 June 2017 at 23:52:16 UTC, Vladimir Panteleev wrote:
[...]
> > - Support generational collection using write barriers implemented through memory protection.
> 
> Super slow, sadly. That being said, I believe D is just fine without a generational GC. The generational hypothesis just doesn't hold to the extent it holds in, say, Java. My hypothesis is that most performance-minded applications already allocate temporaries using a region allocator of sorts (or the C heap).
[...]

FWIW, here's a data point to the contrary:

One of my projects involves constructing a (very large) AA that grows over time, and entries are never deleted.  The AA itself is persistent and lasts until the end of the program.  Besides the AA, there are a couple of arrays that also grow (more slowly) but eventually become unreferenced.  Because of the sheer size of the AA, I've observed that GC collection cycles become slower and slower, yet most of this extra work is completely needless, because the only thing that might need collecting is the arrays, yet the GC has to mark the entire AA each time, only to discover it's still live.

After some experimentation I discovered that I could get up to 40-50% performance improvement just by calling GC.disable and scheduling my own GC collection cycles via GC.collect at a slower rate than the current default setting.

From this, it would seem to me that a generational collector would have
helped, since most of the AA will eventually migrate to older generations and most of the time the GC won't bother marking/scanning those parts.  Of course, this is only for this particular program, and I can't say that this is typical usage for D programs in general.  But I think D would still benefit from a generational collector.


T

-- 
What did the alien say to Schubert? "Take me to your lieder."
June 20, 2017
On 2017-06-20 16:03, Petar Kirov [ZombineDev] wrote:

> I highly doubt that ketmar would have any intention of touching macOS
> regardless ;)

I somehow mixed up ketmar and Guillaume Piolat (who used to go by the alias p0nce). My mistake.

-- 
/Jacob Carlborg
June 20, 2017
On Tuesday, 20 June 2017 at 16:49:44 UTC, H. S. Teoh wrote:
> On Tue, Jun 20, 2017 at 07:47:13AM +0000, Dmitry Olshansky via Digitalmars-d wrote:
>> On Monday, 19 June 2017 at 23:52:16 UTC, Vladimir Panteleev wrote:
> [...]
>
>
> FWIW, here's a data point to the contrary:
>
> One of my projects involves constructing a (very large) AA that grows over time, and entries are never deleted.  The AA itself is persistent and lasts until the end of the program.  Besides the AA, there are a couple of arrays that also grow (more slowly) but eventually become unreferenced.  Because of the sheer size of the AA, I've observed that GC collection cycles become slower and slower, yet most of this extra work is completely needless, because the only thing that might need collecting is the arrays, yet the GC has to mark the entire AA each time, only to discover it's still live.
>
> After some experimentation I discovered that I could get up to 40-50% performance improvement just by calling GC.disable and scheduling my own GC collection cycles via GC.collect at a slower rate than the current default setting.
>
> From this, it would seem to me that a generational collector would have
> helped, since most of the AA will eventually migrate to older generations and most of the time the GC won't bother marking/scanning those parts.  Of course, this is only for this particular program, and I can't say that this is typical usage for D programs in general.  But I think D would still benefit from a generational collector.
>

Interestingly, the moment you "reallocate" to expand the AA, it will be considered a new object. Overall I think your case is more about faulty collection heuristics, that is, collecting when there is only a slim chance of getting enough free space after the collection.

>
> T


June 20, 2017
On Tue, Jun 20, 2017 at 07:14:11PM +0000, Dmitry Olshansky via Digitalmars-d wrote:
> On Tuesday, 20 June 2017 at 16:49:44 UTC, H. S. Teoh wrote:
[...]
> Interestingly, the moment you "reallocate" to expand the AA, it will be considered a new object.
[...]

This is not entirely true.  The *table* itself will of course get moved to a new object, but most of the size of the AA comes from its entries, and those are nodes that stay in place. You'll still have to scan references to the table, of course, but that's a lot better than scanning all the entries as well.


T

-- 
The diminished 7th chord is the most flexible and fear-instilling chord. Use it often, use it unsparingly, to subdue your listeners into submission!
June 21, 2017
On 2017-06-20 16:16, Petar Kirov [ZombineDev] wrote:

> I highly recommend watching this talk: https://www.youtube.com/watch?v=36Ykla27FIo and browsing through this repo: https://github.com/ionescu007/lxss which reveals many interesting details about that part of Windows.

Looks interesting.

-- 
/Jacob Carlborg