May 25, 2013
On 25 May 2013 05:05, H. S. Teoh <hsteoh@quickfur.ath.cx> wrote:

> On Fri, May 24, 2013 at 07:55:44PM +0200, deadalnix wrote:
> > On Friday, 24 May 2013 at 15:17:00 UTC, Manu wrote:
> > >One important detail to consider for realtime usage, is that it's very unconventional to allocate at runtime at all...  Perhaps a couple of short lived temp buffers each frame, and the occasional change in resources as you progress through a world (which are probably not allocated in GC memory anyway).  Surely the relatively high temporal consistency of the heap across cycles can be leveraged here somehow to help?
> >
> > That is good because it means not a lot of floating garbage.
>
> Isn't the usual solution here to use a memory pool that gets deallocated in one shot at the end of the cycle? So during a frame, you'd create a pool, allocate all short-lived objects on it, and at the end free the entire pool in one shot (which could just be a no-op if you recycle the pool memory for the temp objects in the next frame). Long-lived objects, of course, will have to live in the heap, and since they usually aren't in GC memory anyway, it wouldn't matter.
>

This totally depends on the task. Almost every task will have its own
solution. I think there are 3 common approaches though:
1. Just don't allocate. Seriously, you don't need dynamic memory anywhere
near as much as you think you do. Get creative!
2. Use a pool like you say.
3. Use a scratch buffer of some sort. Allocate from this buffer linearly,
and wipe it clean each frame. Similar to a pool but supporting irregularly
sized allocations.
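
To make option 3 concrete, here is a minimal sketch of a linear scratch buffer in D. The `ScratchBuffer` type and its method names are hypothetical, not part of druntime or Phobos, and the sketch assumes a single thread and that nothing allocated from the buffer is referenced after `reset()`.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical per-frame scratch buffer: linear ("bump") allocation of
/// irregularly sized blocks, wiped in O(1) at the end of each frame.
struct ScratchBuffer
{
    private ubyte* base;
    private size_t capacity;
    private size_t used;

    @disable this(this); // owning type: a copy would free the memory twice

    this(size_t bytes)
    {
        base = cast(ubyte*) malloc(bytes);
        capacity = base is null ? 0 : bytes;
    }

    ~this()
    {
        free(base); // free(null) is a no-op
    }

    /// Returns `bytes` bytes, or null when the buffer is exhausted.
    /// `alignment` must be a power of two; offsets are aligned relative to
    /// the malloc'ed base address.
    void[] allocate(size_t bytes, size_t alignment = 16)
    {
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        if (start > capacity || bytes > capacity - start)
            return null;
        used = start + bytes;
        return base[start .. start + bytes];
    }

    /// Wipes the buffer. Memory handed out earlier must not be used again.
    void reset() { used = 0; }

    size_t bytesUsed() const { return used; }
}
```

A frame would call `allocate` as often as it likes and `reset` once at the end. Running out of space shows up as a null return, so the failure point is visible to the caller.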

> A naïve, hackish implementation might be a function to reset all GC memory to a clean slate. So basically, you treat the entire GC memory as your pool, and you allocate at will during a single frame; then at the end of the frame, you reset the GC, which is equivalent to collecting every object from GC memory except it can probably be done much faster than a real collection cycle. Anything that needs to live past a single frame will have to be allocated via malloc/free. So this way, you don't need any collection cycle at all.
>

The problem with implementing that pattern in the GC is that it's global now.
You can no longer choose a solution to fit each problem.
How do you allocate something with a long life? malloc?
What do non-realtime threads do?

> Of course, this may interact badly with certain language constructs: if any reference to GC objects lingers past a frame, you may break language guarantees (e.g. immutable array gets reused, violating immutability when you dereference the stale array pointer in the next frame). But if the per-frame code has no escaping GC references, this problem won't occur. Maybe if the per-frame code is marked pure? It doesn't work if you need to malloc/free, though (as those are inherently impure -- the pointers need to survive past the current frame). Can UDAs be used somehow to enforce no escaping GC references but allow non-GC references to persist past the frame?
>
>
> T
>
> --
> People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
>


May 25, 2013
On 25 May 2013 11:26, Manu <turkeyman@gmail.com> wrote:

> On 25 May 2013 03:55, deadalnix <deadalnix@gmail.com> wrote:
>
>> With real time constraint, a memory overhead is better than a pause.
>>
>
> I wouldn't necessarily agree. Depends on the magnitude of each.
> What sort of magnitude are we talking?
> If you had 64mb of ram, and no virtual memory, would you be happy to
> sacrifice 20% of it? 5% of it?
>

Actually, I don't think I've made this point clearly before, but it is of critical importance.

The single biggest threat when considering unexpected memory allocation, a
la that in Phobos, is NOT performance; it is non-determinism.
Granted, this is the biggest problem with using a GC on embedded hardware
in general.

So let's say I need to keep some free memory overhead, so that I don't run
out of memory when a collect hasn't happened recently...
How much overhead do I need? I can't afford much/any, so precisely how
much do I need?
Understand, I have no virtual-memory manager, it won't page, it's not a
performance problem, it will just crash if I mis-calculate this value.
And does the amount of overhead required change throughout development? How
often do I need to re-calibrate?
What about memory fragmentation? Functions that perform many small
short-lived allocations have a tendency to fragment the heap.

This is probably the most critical reason why Phobos functions can't
allocate internally. General realtime code may have some small flexibility,
but embedded use has hard limits.
So we need to know where allocations are coming from for reasons of
determinism. We need to be able to tightly control these factors to make
confident use of a GC.
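
One partial form of that control can be sketched with druntime's `core.memory.GC` API, which is real. The `runFrames` helper, the `update` callback and the collect-every-60-frames policy are assumptions made up for illustration. This only chooses when collections run. It does not bound how much memory is needed in between, and `GC.disable` does not rule out a collection when the runtime is out of memory.

```d
import core.memory : GC;

/// Hypothetical frame loop: suppress automatic collections while frames run
/// and collect only at a point the application chooses.
void runFrames(int frameCount, void delegate(int frame) update)
{
    GC.disable();            // no automatic collections from here on
    scope (exit) GC.enable();

    foreach (frame; 0 .. frameCount)
    {
        update(frame);

        // Assumed policy: collect every 60th frame, when a pause is affordable.
        if ((frame + 1) % 60 == 0)
            GC.collect();
    }
}
```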

The more I think about it, the more I wonder if ref-counting is just better
for strictly embedded use across the board...?
Does D actually have a ref-counted GC? Surely it wouldn't be particularly
hard? Requires compiler support though I suppose.
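
For what it's worth, Phobos does ship a library-level `std.typecons.RefCounted`, which is a wrapper type rather than a ref-counted GC. The hand-rolled sketch below shows the deterministic release such a scheme gives. The `Counted` type is hypothetical, holds a single `int` for brevity, and is not thread-safe.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical reference-counted handle to a malloc'ed int.
struct Counted
{
    private static struct Block { int value; size_t refs; }
    private Block* block;

    this(int value)
    {
        block = cast(Block*) malloc(Block.sizeof);
        assert(block !is null, "out of memory");
        block.value = value;
        block.refs = 1;
    }

    this(this) // postblit: a copy shares the block and bumps the count
    {
        if (block !is null)
            ++block.refs;
    }

    ~this()
    {
        // The last owner frees immediately; no collection cycle is involved.
        if (block !is null && --block.refs == 0)
            free(block);
        block = null;
    }

    size_t refCount() const { return block is null ? 0 : block.refs; }

    ref int get() { return block.value; }
}
```

Every copy and every destruction touches the count. That is the per-reference cost a compiler-supported scheme would also have to pay.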


May 25, 2013
On Friday, 24 May 2013 at 19:44:23 UTC, Jonathan M Davis wrote:
> We already have stuff like format vs formattedWrite where one allocates and the
> other takes an output range. We should adopt that practice in general. Where
> possible, it should probably be done with an overload of the function, but
> where that's not possible, we can simply create a new function with a similar
> name.

Sounds good to me. Should the overloads return the output range or void?
May 25, 2013
On Saturday, 25 May 2013 at 02:41:00 UTC, Brad Anderson wrote:
> On Friday, 24 May 2013 at 19:44:23 UTC, Jonathan M Davis wrote:
>> We already have stuff like format vs formattedWrite where one allocates and the
>> other takes an output range. We should adopt that practice in general. Where
>> possible, it should probably be done with an overload of the function, but
>> where that's not possible, we can simply create a new function with a similar
>> name.
>
> Sounds good to me. Should the overloads return the output range or void?

If it returned the output range it would be possible to make another function which returns a temporary output range and then easily chain together function calls:

CallWindowsApiW(mystr.writeUTF16z(tempBuffer()))

No GC allocation but not an unpleasant syntax either.
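
A minimal sketch of that chaining, assuming the sink is returned by value. `FixedSink` and `writeUpper` are hypothetical names used only for illustration. `isOutputRange` and `put` are the real `std.range` primitives.

```d
import std.ascii : toUpper;
import std.range : isOutputRange, put;

/// Hypothetical fixed-capacity sink over caller-owned storage (no GC use).
struct FixedSink
{
    char[] storage;
    size_t used;

    void put(char c)
    {
        assert(used < storage.length, "FixedSink overflow");
        storage[used++] = c;
    }

    const(char)[] data() const { return storage[0 .. used]; }
}

/// Hypothetical non-allocating overload: writes into `sink` and returns it,
/// so the call can be chained.
Sink writeUpper(Sink)(const(char)[] text, Sink sink)
    if (isOutputRange!(Sink, char))
{
    foreach (char c; text)
        put(sink, cast(char) toUpper(c));
    return sink;
}
```

With a stack buffer `char[16] buf;`, the call `writeUpper("frame", FixedSink(buf[])).data` yields a slice of `buf` without touching the GC heap. Returning the sink by value only works when copies share the same underlying state, as they do here through the slice and the returned counter.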
May 25, 2013
On Saturday, May 25, 2013 04:40:58 Brad Anderson wrote:
> On Friday, 24 May 2013 at 19:44:23 UTC, Jonathan M Davis wrote:
> > We already have stuff like format vs formattedWrite where one
> > allocates and the
> > other takes an output range. We should adopt that practice in
> > general. Where
> > possible, it should probably be done with an overload of the
> > function, but
> > where that's not possible, we can simply create a new function
> > with a similar
> > name.
> 
> Sounds good to me. Should the overloads return the output range or void?

Right now, all of the functions that we have like that don't return the output range, but I don't know that it would be a bad idea if they did.

- Jonathan M Davis
May 25, 2013
On Saturday, 25 May 2013 at 01:26:19 UTC, Manu wrote:
> Freeing is a no-realtime-cost operation, since memory management is usually
> scheduled for between-scenes, or passed to other threads.
> And I've never heard of a major title that uses smart pointers, and assigns
> them around the place at runtime.
> I'm accustomed to memory management having a virtually zero cost at runtime.
> So I don't think it's biased at all (in the sense you say), I think I'm
> being quite reasonable.
>

The same goes for the GC: if you don't allocate, it won't trigger.

>
> How much floating garbage? This might be acceptable... I don't know enough
> about it.
>

It's about how much garbage you produce while the GC is collecting. That garbage won't be collected before the next cycle. You say you don't generate a lot of garbage, so the cost should be pretty low.

>> That is an easy way to export a part of the load in another thread,
>> improving concurrency in the application with little effort.
>>
>
> Are you saying a concurrent GC would operate exclusively in another thread?
> How does it scan the stack of all other threads?
>
>> With real time constraint, a memory overhead is better than a pause.
>

Yes, it implies a pause to scan the stack and registers, but then the thread can live its life while the heap gets scanned and collected. You never need to stop the world.

> I wouldn't necessarily agree. Depends on the magnitude of each.
> What sort of magnitude are we talking?
> If you had 64mb of ram, and no virtual memory, would you be happy to
> sacrifice 20% of it? 5% of it?
>

There are so many different variations here, each with pros and cons. It's hard to give hard numbers. In non-VM code, you basically have 2 choices:
 - Pay a tax on every pointer write: check a flag to know whether extra work is needed. If the flag is set, you mark the old value as a root for the GC.
 - Pay only while collecting, using page protection (this seems like the better option for you, as you won't be collecting that much). The cost is way higher when collecting, but it is free when you aren't.
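
The first option can be sketched as follows. This is an illustration of the idea only, not how druntime works. `Node`, `collectorIsMarking`, `rememberedOld` and `writePointer` are all hypothetical, and the code ignores thread safety. The page-protection variant needs OS support and is not shown.

```d
/// Hypothetical object type with one pointer field.
struct Node { Node* next; int payload; }

__gshared bool collectorIsMarking;  // set by the (hypothetical) collector
__gshared Node*[64] rememberedOld;  // old values to treat as extra roots
__gshared size_t rememberedCount;

/// Every pointer store goes through this function: that is the "tax".
/// While a collection is in progress, the overwritten value is recorded so
/// the concurrent collector still sees it as reachable.
void writePointer(ref Node* slot, Node* newValue)
{
    if (collectorIsMarking && slot !is null)
    {
        assert(rememberedCount < rememberedOld.length, "remembered set full");
        rememberedOld[rememberedCount++] = slot;
    }
    slot = newValue;
}
```

When no collection is running, the cost is one flag test per pointer write. That small cost is paid on every write whether or not any garbage is produced.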

> Right. But what's the overhead of a scan process (that's almost entirely
> redundant work)?

Roughly proportional to the live set of objects you have. It is triggered when your heap grows past a certain limit.
May 25, 2013
On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:
> Understand, I have no virtual-memory manager, it won't page, it's not a
> performance problem, it will just crash if I mis-calculate this value.

So the GC is kind of out.
May 25, 2013
On 25 May 2013 15:00, deadalnix <deadalnix@gmail.com> wrote:

> On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:
>
>> Understand, I have no virtual-memory manager, it won't page, it's not a performance problem, it will just crash if I mis-calculate this value.
>>
>
> So the GC is kind of out.
>

Yeah, I'm wondering if that's just a basic truth for embedded.
Can D implement a ref-counting GC? That would probably still be okay, since
collection is immediate.

Modern consoles and portables have plenty of memory and can use a GC, but simpler/embedded platforms probably just can't. An alternative solution still needs to be offered for that sort of hardware.


May 25, 2013
On Saturday, 25 May 2013 at 05:18:12 UTC, Manu wrote:
> On 25 May 2013 15:00, deadalnix <deadalnix@gmail.com> wrote:
>
>> On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:
>>
>>> Understand, I have no virtual-memory manager, it won't page, it's not a
>>> performance problem, it will just crash if I mis-calculate this value.
>>>
>>
>> So the GC is kind of out.
>>
>
> Yeah, I'm wondering if that's just a basic truth for embedded.
> Can D implement a ref-counting GC? That would probably still be okay, since
> collection is immediate.
>

This is technically possible, but you said you make few allocations. So with the tax on pointer writes or the reference counting, you'll pay a lot to collect very little garbage. I'm not sure the tradeoff is worthwhile.

Paradoxically, when you create little garbage, GCs are really good, as they don't need to trigger often. But if you need to add a tax on each reference write/copy, you'll probably pay more tax than you get out of it.

> Modern consoles and portables have plenty of memory; can use a GC, but
> simpler/embedded platforms probably just can't. An alternative solution
> still needs to be offered for that sort of hardware.

May 25, 2013
On 25 May 2013 15:29, deadalnix <deadalnix@gmail.com> wrote:

> On Saturday, 25 May 2013 at 05:18:12 UTC, Manu wrote:
>
>> On 25 May 2013 15:00, deadalnix <deadalnix@gmail.com> wrote:
>>
>>> On Saturday, 25 May 2013 at 01:56:42 UTC, Manu wrote:
>>>
>>>> Understand, I have no virtual-memory manager, it won't page, it's not a
>>>> performance problem, it will just crash if I mis-calculate this value.
>>>>
>>>
>>> So the GC is kind of out.
>>>
>>
>> Yeah, I'm wondering if that's just a basic truth for embedded.
>> Can D implement a ref-counting GC? That would probably still be okay, since
>> collection is immediate.
>>
>
> This is technically possible, but you said you make few allocations. So with the tax on pointer writes or the reference counting, you'll pay a lot to collect very little garbage. I'm not sure the tradeoff is worthwhile.
>

But it would be deterministic, and if the allocations are few, the cost should be negligible.


> Paradoxically, when you create little garbage, GCs are really good, as they don't need to trigger often. But if you need to add a tax on each reference write/copy, you'll probably pay more tax than you get out of it.


They're still non-deterministic though. And unless (even if?) they're
precise, they might leak.

What does ObjC do? It seems to work okay on embedded hardware (although not
particularly memory-constrained hardware).
Didn't ObjC recently reject GC in favour of refcounting?