May 07, 2014
On 07/05/14 05:36, Manu via Digitalmars-d wrote:

> Haha, nice! I didn't realise that all my examples for hypothetical
> consideration came back to just you! :)
>
> So then, your take on an experimental space in the compiler for
> features that are still baking seems especially relevant.
> Am I talking shit, or do you think the idea has any value? Would it be
> valuable to get users testing these (quite large) developments while
> they're still baking?

Not so much for D/Objective-C, but definitely for AST macros and std.serialization. D/Objective-C is pretty straightforward, at least for the user-facing parts. But it wouldn't hurt to test D/Objective-C either.

-- 
/Jacob Carlborg
May 07, 2014
On Wednesday, 7 May 2014 at 07:26:34 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 7 May 2014 at 06:50:34 UTC, Paulo Pinto wrote:
>> A *nix package manager is a brain-dead idea for software
>> development, as it ties the language libraries to the specific OS
>> one is using.
>
> The difference is that you are more vulnerable by getting software from language specific repositories.

This is for software development, not end user software.

--
Paulo
May 07, 2014
On Wednesday, 7 May 2014 at 12:47:05 UTC, Paulo Pinto wrote:
> This is for software development, not end user software.

Which makes avoiding trojans or rogue libraries even more critical. I avoid sources with low volume, no auditing and questionable authentication for commercial development. Basically it means downloading from the original author or using a distribution like Debian... (or creating a separate account, which I have done for D).

May 07, 2014
Le 07/05/2014 05:58, Manu via Digitalmars-d a écrit :
> On 7 May 2014 08:07, Xavier Bigand via Digitalmars-d
> <digitalmars-d@puremagic.com> wrote:
>> Le 06/05/2014 13:39, Paulo Pinto a écrit :
>>>
>> Android works well, I love my Nexus; it proves to me that it's possible to
>> create really smooth applications based almost completely on Java (not 100%
>> of it). But if we compare the Nexus 5 to the iPhone 4:
>> Memory : 2 GB RAM vs 512 MB RAM
>> CPU : Quad-core 2.3 GHz Krait 400 vs 1 GHz Cortex-A8
>> Battery : Li-Po 2300 mAh battery vs Li-Po 1420 mAh battery
>>
>> And compared to an iPhone 5s
>> Memory : 2 GB RAM vs 1 GB RAM
>> CPU : Quad-core 2.3 GHz Krait 400 vs Dual-core 1.3 GHz Cyclone
>> Battery : Li-Po 2300 mAh battery vs Li-Po 1560 mAh battery
>>
>> It's maybe not really significant, but the majority of Android devices that
>> have acceptable performance have a lot of memory, a quad-core CPU and a
>> heavy battery.
>>
>> So, sure, Java can run smoothly, but at what price? I think the margin on
>> Apple products must be unbelievable.
>
> Yeah, these are excellent points that I've tried and perhaps failed to
> articulate properly in the past.
> The amount of 'wasted' resources required to maintain a significant
> surplus on Android devices is ridiculous, and that's why I separated
> phones from other embedded systems. While phones can, and do, do this,
> other embedded systems don't.
> To say we need to leave half of the xbox/ps4 resources idle to soak up
> intermittency is completely unworkable.
>
> It's always important to remember too that the embedded market is by
> far the largest software market in the world.
>
I started working on Pocket PC, and a year later on the Nintendo DS, which has only 4 MB of RAM. I was still at school at the time, and it was hard to deal with so little memory; ROM and video memory were very limited too. The project I worked on was limited mainly by memory, because of its use of precomputed pictures (it was an old adventure game).
On this console, game vendors have to pay for the ROM and backup
memory, so we put some compression and data-packing logic in place.

Since then I have always taken care to avoid unnecessary, absurd allocations.
May 08, 2014
On Wednesday, 7 May 2014 at 20:09:07 UTC, Xavier Bigand wrote:
> Le 07/05/2014 05:58, Manu via Digitalmars-d a écrit :
>> On 7 May 2014 08:07, Xavier Bigand via Digitalmars-d
>> <digitalmars-d@puremagic.com> wrote:
>>> Le 06/05/2014 13:39, Paulo Pinto a écrit :
>>>>
>>> Android works well, I love my Nexus; it proves to me that it's possible to
>>> create really smooth applications based almost completely on Java (not 100%
>>> of it). But if we compare the Nexus 5 to the iPhone 4:
>>> Memory : 2 GB RAM vs 512 MB RAM
>>> CPU : Quad-core 2.3 GHz Krait 400 vs 1 GHz Cortex-A8
>>> Battery : Li-Po 2300 mAh battery vs Li-Po 1420 mAh battery
>>>
>>> And compared to an iPhone 5s
>>> Memory : 2 GB RAM vs 1 GB RAM
>>> CPU : Quad-core 2.3 GHz Krait 400 vs Dual-core 1.3 GHz Cyclone
>>> Battery : Li-Po 2300 mAh battery vs Li-Po 1560 mAh battery
>>>
>>> It's maybe not really significant, but the majority of Android devices that
>>> have acceptable performance have a lot of memory, a quad-core CPU and a
>>> heavy battery.
>>>
>>> So, sure, Java can run smoothly, but at what price? I think the margin on
>>> Apple products must be unbelievable.
>>
>> Yeah, these are excellent points that I've tried and perhaps failed to
>> articulate properly in the past.
>> The amount of 'wasted' resources required to maintain a significant
>> surplus on Android devices is ridiculous, and that's why I separated
>> phones from other embedded systems. While phones can, and do, do this,
>> other embedded systems don't.
>> To say we need to leave half of the xbox/ps4 resources idle to soak up
>> intermittency is completely unworkable.
>>
>> It's always important to remember too that the embedded market is by
>> far the largest software market in the world.
>>
> I started working on Pocket PC, and a year later on the Nintendo DS, which has only 4 MB of RAM. I was still at school at the time, and it was hard to deal with so little memory; ROM and video memory were very limited too. The project I worked on was limited mainly by memory, because of its use of precomputed pictures (it was an old adventure game).
> On this console, game vendors have to pay for the ROM and backup
> memory, so we put some compression and data-packing logic in place.
>
> Since then I have always taken care to avoid unnecessary, absurd allocations.

4MB?! That is a world of pleasure.

Try to cram a Z80 application into 48 KB. :)

The main problem nowadays is not automatic memory management, in
whatever form, be it GC, RC, compiler dataflow, dependent types
or whatever.

The problem is how many developers code as if memory were
infinite, without pausing a second to think about their data
structures and algorithms.

Just yesterday I rewrote a Java application so that, in its hot
path, it does zero allocations in the code under our control.

It requires execution-analysis tooling, and thinking about how to write the code.
That's it.
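
To make that concrete, here is a minimal sketch of the technique in D (the Processor and Sample types are hypothetical, not taken from the application in question): preallocate outside the hot path and reuse the buffer, so the loop itself never touches the allocator.

import std.stdio : writefln;

struct Sample { double value; }

// Hypothetical hot-path worker: everything it needs is allocated once, up front.
struct Processor
{
    Sample[] scratch;                // reused on every call instead of reallocated

    this(size_t maxBatch) { scratch = new Sample[maxBatch]; }

    // Assumes input.length <= scratch.length; slicing does not allocate.
    double process(const(double)[] input)
    {
        auto batch = scratch[0 .. input.length];
        double total = 0;
        foreach (i, v; input)
        {
            batch[i] = Sample(v * 2);
            total += batch[i].value;
        }
        return total;
    }
}

void main()
{
    auto p = Processor(1024);        // the only allocation, outside the hot loop
    double[] input = [1.0, 2.0, 3.0];
    double total = 0;
    foreach (_; 0 .. 1000)
        total += p.process(input);   // zero allocations per iteration
    writefln("%s", total);
}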

--
Paulo
May 09, 2014
On 8 May 2014 16:11, Paulo Pinto via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Wednesday, 7 May 2014 at 20:09:07 UTC, Xavier Bigand wrote:
>
> 4MB?! That is a world of pleasure.
>
> Try to cram a Z80 application into 48 KB. :)

I've never heard of a Z80 program running a tracing GC. (I have used
refcounting on a Z80 though...)

> The main problem nowadays is not automatic memory management, in whatever form, be it GC, RC, compiler dataflow, dependent types or whatever.
>
> The problem is how many developers code as if memory were infinite, without pausing a second to think about their data structures and algorithms.

Okay, please don't be patronising. Let's assume the programmers you're talking to aren't incompetent. This discussion's existence should prove that your theory is incorrect.

Let's consider your argument, though: it sounds EXACTLY like the argument
of the sort of programmer who would invent and rely on a tracing GC!
The technology practically implies a presumption of infinite (or
significant excess) memory. Such an excess suggests either the
software isn't making full use of the system it's running on, or the
programmers writing the software are like you say.

> Just yesterday I rewrote a Java application so that, in its hot path, it does zero allocations in the code under our control.
>
> It requires execution-analysis tooling, and thinking about how to write the code.
> That's it.

Congratulations. But I'm not sure how that helps me?

You know my arguments, how does this example of yours address them?
What were the operating and target hardware requirements of your
project? How much memory did you have? How much memory did you use?
What is the frequency of runtime resource loading/swapping? What was
the pause period and the consequence when you did collect?
Let's also bear in mind that Java's GC is worlds ahead of D's.


I am getting very tired of repeating myself and having my points
basically ignored, or dismissed with something like "my project which
doesn't actually share those requirements works fine" (not that I'm
saying you did that just now; I don't know, you need to tell me more
about your project).
I'd really like to establish as fact or fiction whether tracing GC is
_practically_ incompatible with competitive embedded/realtime
environments where pushing the hardware to the limits is a desire.
(Once upon a time, this was a matter of pride for all software
engineers)
I want people to either prove that I'm making it all up, and that "my
concerns are invalid if I just ...", or to confirm that my points are
actually true.

It doesn't matter what awesome GC research is out there if it's incompatible with D and/or small devices that may or may not have a robust operating system. D is allegedly a systems language, and while there is no strict definition of that, my own take is that it shouldn't, by nature, be incompatible with or discourage certain classes of computers or software; otherwise it becomes a niche language.

Please, prove me wrong. Show me how tracing collection can satisfy the basic requirements I've raised in countless prior posts, or show practical workarounds that you would find reasonable if you had to work within those restrictions yourself, while D still remained compelling enough to adopt in your corporation in the first place (implying a massive risk, and the cost of retraining all the staff and retooling).

I don't know how to reconcile the problem with the existing GC, and I am not happy to sacrifice large parts of the language for it. I've made the argument before that sacrificing large parts of the language as a 'work-around' is, in essence, sacrificing practically all libraries. That is a truly absurd notion; to suggest that anybody should take advice to sacrifice access to libraries is being unrealistic.


I refer again to my example from last weekend. I was helping my mates
try and finish up their PS4 release. It turns out, the GC is very
erratic, causing them massive performance problems, and they're super
stressed about this.
Naturally, the first thing I did was scold them for being stupid
enough to use C# on a game in the first place, but then as I tried to
search for practical options, I started to realise the gravity of the
situation, and I'm really glad I'm not wearing their shoes!
I've made the argument in the past that this is the single most
dangerous class of issue to encounter; one that emerges as a problem
only at the very end of the project and the issues are fundamental and
distributed, beyond practical control. Massive volumes of code are
committed, and no time or budget exists to revisit and repair the work
that was already signed off.
The only practical solution in this emergency situation is to start
cutting assets (read: eliminate your competitive edge against the
competition), and try to get system resource usage down to the
half-ish of the system suggested by the earlier Android vs iPhone
comparison.

It comes to this: how can a GC ever work in a memory-limited
environment? If collection is triggered on almost 100% of
allocations, then GC allocation must practically be banned everywhere
and all the associated language features lost. How can I rely on
library code to never allocate?
As I see it, this problem is basically synonymous with the current
issue of lazy/unreliable destruction. They would both be solved by
addressing this fundamental issue. And the only solution I know is to
use RC instead.
Eager release of memory is required to make sure the next alloc has
free memory to allocate; problem solved, and destructors work too!
:)
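
As a minimal sketch of that eager-release idea, here is std.typecons.RefCounted wrapped around a hypothetical malloc-backed Buffer type (not a proposal for the language, just an illustration of the deterministic destructor):

import core.stdc.stdlib : malloc, free;
import std.typecons : RefCounted, RefCountedAutoInitialize;

// Hypothetical buffer type: memory comes from malloc and is returned in the
// destructor, so release is deterministic and no collector is involved.
struct Buffer
{
    void*  ptr;
    size_t len;

    this(size_t n) { ptr = malloc(n); len = n; }
    ~this() { if (ptr) { free(ptr); ptr = null; } }
}

// Share the buffer by reference count; only the wrapper is ever copied,
// never the Buffer itself.
alias SharedBuffer = RefCounted!(Buffer, RefCountedAutoInitialize.no);

void main()
{
    auto a = SharedBuffer(4096);  // count = 1
    {
        auto b = a;               // count = 2
    }                             // count = 1, memory still live
}                                 // count = 0: free() runs here, immediately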
May 09, 2014
On Friday, 9 May 2014 at 16:12:00 UTC, Manu via Digitalmars-d wrote:
>
> Let's also bear in mind that Java's GC is worlds ahead of D's.
>
Is the Sun/Oracle reference implementation actually any good?

> I am getting very tired of repeating myself and having my points
> basically ignored, or dismissed with something like "my project
> which doesn't actually share those requirements works fine" (not
> that I'm saying you did that just now; I don't know, you need to
> tell me more about your project).  I'd really like to establish
> as fact or fiction whether tracing GC is _practically_
> incompatible with competitive embedded/realtime environments
> where pushing the hardware to the limits is a [requirement].
>
I've actually become very curious about this, too.  I know that our GC isn't good, but I've seen a lot of handwaving.  The pattern lately has involved someone talking about ARC being insufficient, leading somehow to Manu asserting sufficient GC is impossible (why?), and everyone kind of ignoring it after that.

I've been digging into research on the subject while I wait for test scripts to run, and my gut feeling is it's definitely possible to get GC at least into striking distance, but I'm not nearly an expert on this area.

(Some of these are dead clever, though! I just read this one today: https://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/local-gc.pdf)

> I don't know how to reconcile the problem with the existing GC,
> and I am not happy to sacrifice large parts of the language for
> it.  I've made the argument before that sacrificing large parts
> of the language as a 'work-around' is, in essence, sacrificing
> practically all libraries. That is a truly absurd notion; to
> suggest that anybody should take advice to sacrifice access to
> libraries is being unrealistic.
>
This is important, and simply throwing up our collective hands and saying to just not use major language features (I believe I recall slices were in that list?) really doesn't sit well with me either.

But conversely, Manu, something has been bothering me: aren't you restricted from using most libraries anyway, even in C++?  "Decent" or "acceptable" performance isn't anywhere near "maximum", so shouldn't any library code that allocates in any language be equally suspect?  So from that standpoint, isn't any library you use in any language going to _also_ be tuned for performance in the hot path?  Maybe I'm barking up the wrong tree, but I don't recall seeing this point addressed.

More generally, I feel like we're collectively missing some important context:  What are you _doing_ in your 16.6ms timeslice?  I know _I'd_ appreciate a real example of what you're dealing with without any hyperbole.  What actually _must_ be done in that timeframe?  Why must collection run inside that window?  What must be collected when it runs in that situation?  (Serious questions.)

See, in the final-by-default discussions, you clearly explained the issues and related them well to concerns that are felt broadly, but this... yeah, I don't really have any context for this, when D would already be much faster than the thirty years of C navel lint (K&R flavour!) that I grapple with in my day job.

-Wyatt
May 09, 2014
On Friday, 9 May 2014 at 21:05:18 UTC, Wyatt wrote:
> But conversely, Manu, something has been bothering me: aren't you restricted from using most libraries anyway, even in C++?  "Decent" or "acceptable" performance isn't anywhere near "maximum", so shouldn't any library code that allocates in any language be equally suspect?  So from that standpoint, isn't any library you use in any language going to _also_ be tuned for performance in the hot path?  Maybe I'm barking up the wrong tree, but I don't recall seeing this point addressed.
>
> More generally, I feel like we're collectively missing some important context:  What are you _doing_ in your 16.6ms timeslice?  I know _I'd_ appreciate a real example of what you're dealing with without any hyperbole.  What actually _must_ be done in that timeframe?  Why must collection run inside that window?  What must be collected when it runs in that situation?  (Serious questions.)
I'll try to guess: if you want something running at 60 frames per second,
16.6ms is the time you have to do everything between frames. This means
that in that timeframe you have to:
-update your game state.
-possibly process all network I/O.
-prepare the rendering pipeline for the next frame.

Updating the game state can imply computations on lots of stuff: physics, animations, creation and deletion of entities and particles, AI logic... pick your poison. Every frame you will have a handful of objects being destroyed and a few resources that might go forgotten. Any single frame would probably only need very few objects collected, but given some time the amount of junk can easily grow out of control. Your code will end up stuttering at some point (because of random collections at random times), and that can be really bad.
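
To make the budget concrete, here is a minimal sketch of such a frame loop in D; the update/network/render functions are hypothetical placeholders, not from any real engine:

import core.thread : Thread;
import core.time : MonoTime, usecs;

void updateGameState()    { /* physics, animation, AI, entity churn */ }
void processNetworkIO()   { /* drain pending packets */ }
void prepareRenderFrame() { /* build command lists for the GPU */ }

void main()
{
    immutable frameBudget = 16_667.usecs;  // ~60 frames per second

    foreach (frame; 0 .. 600)              // run for roughly ten seconds
    {
        immutable start = MonoTime.currTime;

        updateGameState();
        processNetworkIO();
        prepareRenderFrame();

        // Whatever is left of the 16.6ms is the only slack available; an
        // unscheduled collection pause that overruns it is a dropped frame.
        immutable elapsed = MonoTime.currTime - start;
        if (elapsed < frameBudget)
            Thread.sleep(frameBudget - elapsed);
    }
}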
May 09, 2014
On 6.5.2014. 20:10, Walter Bright wrote:
> On 5/6/2014 10:47 AM, Manu via Digitalmars-d wrote:
>> On 7 May 2014 01:46, Andrei Alexandrescu via Digitalmars-d
>> I'm not even sure what the process is... if I go through and "LGTM" a
>> bunch of pulls, does someone accept my judgement and click the merge
>> button?
>> You can see why I might not feel qualified to do such a thing?
> 
> You don't need to be qualified (although you certainly are) to review PRs. The process is that anyone can review/comment on them. Non-language-changing PRs can be pulled by anyone on "Team DMD". Language-changing PRs need to be approved by Andrei and me.
> 
> "Team DMD" consists of people who have a consistent history of doing solid work reviewing PR's.
> 

Interesting. This really needs to be pointed out on the site.

May 10, 2014
On 10 May 2014 07:05, Wyatt via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Friday, 9 May 2014 at 16:12:00 UTC, Manu via Digitalmars-d wrote:
>
> I've been digging into research on the subject while I wait for test scripts to run, and my gut feeling is it's definitely possible to get GC at least into striking distance, but I'm not nearly an expert on this area.
>
> (Some of these are dead clever, though! I just read this one today: https://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/local-gc.pdf)

Well, that's a nice paper. But I've seen heaps of people paste heaps
of papers, and nobody EVER comes along and says "this, we could do this
in D".
I'm happy to be proven wrong; in fact, there's nothing I'd like more.
I'm not an expert on GC (and I don't really want to be), but I've been
trolling this forum for years and the conversation emerges regularly.
As far as I can tell, there is a de facto agreement that none of those
potentially awesome GCs are actually feasible in the context of D.

This is perhaps my biggest mistake in my commitment to D; I came along
and identified it as a red flag on day 1, but I saw there was lots of
discussion and activity on the matter. I assumed I shouldn't worry
about it, that it would sort itself out...
Years later, still, nobody seems to have any idea what to do, at
least, not such that it would be addressed in a manner acceptable to
my work.
The only option I know that works is Obj-C's solution, as demonstrated
by a very successful embedded RTOS which, compared to the competition,
runs silky smooth. Indeed, iOS makes it a specific design goal that it
should always feel silky smooth, never stuttery; they consider it a
point of quality, and I completely agree. I don't know what other
horse to back?

>> I don't know how to reconcile the problem with the existing GC, and I am not happy to sacrifice large parts of the language for it.  I've made the argument before that sacrificing large parts of the language as a 'work-around' is, in essence, sacrificing practically all libraries. That is a truly absurd notion; to suggest that anybody should take advice to sacrifice access to libraries is being unrealistic.
>>
> This is important, and simply throwing up our collective hands and saying to just not use major language features (I believe I recall slices were in that list?) really doesn't sit well with me either.
>
> But conversely, Manu, something has been bothering me: aren't you restricted from using most libraries anyway, even in C++?

No, and this is where everyone seems to completely lose the point.
High performance/realtime code has time-critical parts, but they tend
not to be much of your code. They are small by volume, and receive
disproportionate attention from coders. That's fine, forget about
that bit, except that it needs to be able to run uninterrupted.
_Most_ of your code is ancillary logic and glue, which typically runs
in response to events, and even though its execution frequency is
super-low, it's still often triggered in the realtime thread (just not
very often).
There are also many background threads you employ to do low priority
tasks where the results aren't an immediate requirement.
Some of these tasks include: resource management, loading and
preparing data, communications/networking, processing low-frequency
work; almost all of these tasks make heavy use of 3rd party libraries,
and allocate.

You can't have an allocation stop the world, because it stops the realtime threads too, at least, under any mythical GC scheme I'm aware of that's been proposed as a potential option for D?

> "Decent" or "acceptable"
> performance isn't anywhere near "maximum", so shouldn't any library code
> that allocates in any language be equally suspect?  So from that standpoint,
> isn't any library you use in any language going to _also_ be tuned for
> performance in the hot path?  Maybe I'm barking up the wrong tree, but I
> don't recall seeing this point addressed.

A library which is a foundation of a realtime system will employ realtime practices. Those are not the libraries I'm worried about. Most useful libraries aren't those libraries; they are tool libs, typically written to be simple and maintainable, usually by a PC developer, with no real consideration towards specific target applications.

> More generally, I feel like we're collectively missing some important context:  What are you _doing_ in your 16.6ms timeslice?  I know _I'd_ appreciate a real example of what you're dealing with without any hyperbole.

It doesn't matter what I'm doing in my 16ms timeslice most of the
time. I'm running background threads, and also triggering occasional
low-frequency events in the realtime thread.
Again, most code by volume is logic and glue; it is not serviced as
intensively as the core realtime systems, and it is most often written
by the junior programmers...
I appreciate that I haven't successfully articulated the function of
this code, but that is because to describe "what I'm doing" would be
to give you a million lines of code to nit-pick through. Almost
anything you can imagine is the answer, as long as it's reasonably
short, such that it's not worth the synchronisation cost of queueing
it with a parallel job manager or whatever.
This logic and glue needs to have access to all the conveniences of
the language for productivity and maintainability reasons, and
typically, if you execute only one or two of these bits of code per
frame, it will have no meaningful impact on performance... unless it
allocates, triggers a collect, and freezes the system. I repeat: the
juniors... D has lots of safety features to save programmers from
themselves, and I don't consider it a workable option, or a goal for
the language, to suggest we should abandon them.

ARC overhead would have no meaningful impact on performance, GC may potentially freeze execution. I am certain I would never notice ARC overhead on a profiler, and if I did, there are very simple methods to shift it elsewhere in the few specific circumstances it emerges.
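
As a sketch of what "shift it elsewhere" can mean: if a hot function only borrows a reference-counted handle (taking it by ref rather than by value), the count is never touched on that path. The Texture type and numbers below are hypothetical, just to illustrate the idea.

import std.typecons : RefCounted;

struct Texture { int id; }
alias SharedTexture = RefCounted!Texture;

int lastBound = -1;

// Called once per sprite, thousands of times per frame.  Taking the handle
// by ref borrows it: no wrapper copy, so the shared count is never touched.
// Taking it by value would cost an increment and a decrement on every call.
void drawSprite(ref SharedTexture tex, int x, int y)
{
    if (tex.id != lastBound)   // touches the payload, not the count
        lastBound = tex.id;
    // ... submit (x, y) with the bound texture to the renderer ...
}

void main()
{
    auto tex = SharedTexture(42);
    foreach (i; 0 .. 10_000)
        drawSprite(tex, i % 640, i % 480);
}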

> What actually _must_ be done in that timeframe?  Why must collection run inside that window?  What must be collected when it runs in that situation? (Serious questions.)

Anything can and does happen in low-frequency event logic. Collection
'must' run in that window in the event an allocation occurs and there
is no free memory, which is the likely scenario.
Strings? Closures? Array initialisations were a problem (I'm not sure
if that was considered a bug and fixed, though?). Even some should-be
stack allocations end up on the heap when the compiler thinks that's a
requirement for safety.
String interaction with C libs is a good source of allocations, but
there are many more.
Or even small transient allocations: temporaries used to do small
amounts of work, which would otherwise be released upon scope exit.
Banning that sort of practice throws a massive spanner into the
conventional software engineering practice that everyone is familiar with.
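
Here is a minimal sketch of a few of the implicit allocations mentioned above (the function names are just for illustration); none of these spell out "new", yet each one can hit the GC heap and so can trigger a collection:

import core.stdc.stdio : puts;
import std.string : toStringz;

int[] makeSquares(int n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= i * i;            // appending allocates from the GC heap
    return result;
}

int delegate() makeCounter()
{
    int count;
    return () => ++count;           // the closure's frame is moved to the GC heap
}

void callC(string name)
{
    puts(name.toStringz);           // may allocate a copy to append the terminating '\0'
}

void main()
{
    auto squares = makeSquares(10);
    auto next    = makeCounter();
    next();
    callC("hello");
}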

> See, in the final-by-default discussions, you clearly explained the issues and related them well to concerns that are felt broadly, but this... yeah, I don't really have any context for this, when D would already be much faster than the thirty years of C navel lint (K&R flavour!) that I grapple with in my day job.

I appreciate your argument. I realise it's why I've had so little effect... I just can't easily articulate specific instances, because they are basically unknowable until they happen, and they're hidden by literally millions of lines of code. Implicit allocations appear all the time, and if you deliberately ban them, you quickly find that all those language features stop working and your code gets a lot more difficult to write and maintain. It's a practicality, risk-aversion, and maintenance issue.

If you had to tag main() with @nogc, and then write a millions-of-lines program in a team of 60, D becomes a much less compelling argument than it otherwise is. D has two major advantages over C++ as I see it: 1) metaprogramming, which is great, but you don't sit and write meta code all the time; 2) productivity and correctness, which I find to be the more compelling case for adopting D. That one affects all programmers all day, every day, and we lose many aspects of the language's offering, libraries included, if we have to tag @nogc.