Greedy memory handling (page 2) - D Programming Language Discussion Forum

September 11, 2013

Re: Greedy memory handling

Posted by Joseph Rushton Wakeling
in reply to Dmitry Olshansky

On 11/09/13 15:45, Dmitry Olshansky wrote:
> Problem is - said GC-freed memory could be then reused in some way. I can't
> imagine how you'd test that the block that is allocated is *still your old* block.

Ahh, nasty.  I'd assumed that the buffer would have been reset to null in the event that the GC freed its memory.

September 11, 2013

Re: Greedy memory handling

Posted by monarch_dodra
in reply to Joseph Rushton Wakeling

On Wednesday, 11 September 2013 at 13:33:23 UTC, Joseph Rushton Wakeling wrote:
> On 11/09/13 15:13, monarch_dodra wrote:
>> That's somewhat better, as it would allow the GC to collect my buffer, if it
>> wants to, but I wouldn't actually know about it afterwards which leaves me screwed.
>
> Just to clarify, is this buffer meant only for internal use in your function or is it meant to be externally accessed as well?
>  I'd kind of assumed the former.
>
> Either way, isn't it sufficient to have some kind of
>
>     if (buf is null)
>     {
>         // allocate the buffer
>     }
>
> check in place?  The basic model seems right -- at the moment when you need the buffer, you check if it's allocated (and if not, allocate it as needed); you indicate to the GC that it shouldn't collect the memory; you use the buffer; and the moment it's no longer needed, you indicate to the GC that it's collectable again.
>
> It means having to be very careful to check the buffer's allocation status whenever you want to use it, but I think that's an unavoidable consequence of wanting a static variable that can be freed if needed.
>
> The alternative I thought of was something like comparing the size difference between the currently-needed buffer and the last-needed buffer (... or if you want to be over-the-top, compare to a running average:-), and if the current one is sufficiently smaller, free the old one and re-alloc a new one; but that's a bit _too_ greedy in the free-up-memory stakes, I think.

The buffer is meant strictly for internal use. It never escapes the function it is used in, which is not re-entrant either.

Basically, I'm storing the buffer in a "static ubyte[]", and if there isn't enough room for what I'm doing, I simply make it grow. No problems there.

The issue I'm trying to solve is the "and the moment it's no longer needed" part. The function is really just a free function in a library. The user could call it only once, or call it repeatedly; I don't know. In particular, the amount of buffer needed has a 1:1 correlation with the user's input size. The user could repeatedly call me with input of a couple of bytes, or just once or twice with input in the megabytes.

I *could* just allocate and forget about it, but I was curious about having a mechanism where the buffer would just be "potentially collected" between two calls. As a form of "failsafe" if it got too greedy, or if the user just hasn't used the function in a while.
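
For reference, the "allocate and forget" baseline described above can be sketched roughly as follows. The name `scratch` is made up for illustration; the point is only that a function-local `static ubyte[]` persists between calls and grows on demand, and that nothing here ever gives the memory back.

```d
// Sketch only: `scratch` is a hypothetical helper name.
ubyte[] scratch(size_t needed)
{
    // Function-local static: thread-local in D, persists between calls.
    static ubyte[] buf;
    if (buf.length < needed)
        buf.length = needed; // grows when required, never shrinks
    return buf[0 .. needed];
}

void main()
{
    auto a = scratch(16);
    auto b = scratch(8);
    assert(a.ptr is b.ptr);              // the second call reuses the same block
    assert(scratch(1024).length == 1024); // a larger request makes it grow
}
```

Because the static variable is a strong reference, the GC can never reclaim this buffer, which is exactly the greediness the question is about.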

September 12, 2013

Re: Greedy memory handling

Posted by Jacob Carlborg
in reply to monarch_dodra

On 2013-09-11 10:06, monarch_dodra wrote:
> I have a function that will *massively* benefit from having a persistent
> internal buffer it can re-use (and grow) from call to call, instead of
> re-allocating on every call.
>
> What I don't want is either of:
> 1. To set a fixed limitation of size, if the user ends up making
> repeated calls to something larger to my fixed size.
> 2. For a single big call which will allocate a HUGE internal buffer that
> will consume all my memory.
>
> What I need is some sort of lazy buffer. Basically, the allocation
> holds, but I don't want the to prevent the GC from collecting it if it
> deems it has gotten too big, or needs more memory.
>
> Any idea on how to do something like that? Or literature?

How about keeping a stack or static buffer? If that gets too small, use a new buffer. When you're done with the new buffer, set it to null to allow the GC to collect it. Then repeat.
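
A minimal sketch of this two-buffer idea, under assumptions: the function name `work` and the 4096-byte threshold are invented for illustration, and the copy stands in for whatever the real function does with its scratch space.

```d
// Sketch only: `work` and `smallSize` are hypothetical.
enum size_t smallSize = 4096;

size_t work(const(ubyte)[] input)
{
    // Fixed-size static buffer: always available, never grows.
    static ubyte[smallSize] small;

    ubyte[] buf;
    if (input.length <= small.length)
        buf = small[0 .. input.length];
    else
        buf = new ubyte[input.length]; // referenced only from this call

    buf[] = input[]; // stand-in for the real work
    size_t used = buf.length;

    buf = null; // drop the reference so a large buffer becomes collectable
    return used;
}

void main()
{
    assert(work(new ubyte[10]) == 10);         // served from the static buffer
    assert(work(new ubyte[10_000]) == 10_000); // served from a temporary GC buffer
}
```

The trade-off is that large calls pay for a fresh allocation every time, since the dynamic buffer is never kept around for reuse.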

-- 
/Jacob Carlborg

September 12, 2013

Re: Greedy memory handling

Posted by H. S. Teoh
in reply to Jacob Carlborg

On Thu, Sep 12, 2013 at 08:27:59AM +0200, Jacob Carlborg wrote:
> On 2013-09-11 10:06, monarch_dodra wrote:
> >I have a function that will *massively* benefit from having a persistent internal buffer it can re-use (and grow) from call to call, instead of re-allocating on every call.
> >
> >What I don't want is either of:
> >1. To set a fixed limitation of size, if the user ends up making
> >repeated calls to something larger to my fixed size.
> >2. For a single big call which will allocate a HUGE internal buffer
> >that will consume all my memory.
> >
> >What I need is some sort of lazy buffer. Basically, the allocation holds, but I don't want the to prevent the GC from collecting it if it deems it has gotten too big, or needs more memory.
> >
> >Any idea on how to do something like that? Or literature?
> 
> How about keeping a stack or static buffer. If that gets too small use a new buffer. When you're done with the new buffer set it to null to allow the GC to collect it. Then repeat.
[...]

The problem is, he wants to reuse the buffer next time if the GC hasn't collected it yet.

Here's an idea, though. It doesn't completely solve the problem, but it just occurred to me that "weak pointers" (i.e., ignored by the GC for the purposes of marking) can be simulated by XOR'ing the pointer value with some mask so that it's not recognized as a pointer by the GC. This can be encapsulated by a weak pointer struct that automatically does the translation:

	struct WeakPointer(T) {
		enum size_t mask = 0xdeadbeef;
		union Impl {
			T* ptr;
			size_t uintVal;
		}
		Impl impl;
		void set(T* ptr) @system {
			impl.ptr = ptr;
			impl.uintVal ^= mask;
		}
		T* get() @system {
			Impl i = impl;
			i.uintVal ^= mask;
			return i.ptr;
		}
	}

	WeakPointer!Buffer bufferRef;

	void doWork(Args...) {
		T* buffer;
		if (bufferRef.get() is null) {
			// Buffer hasn't been allocated yet
			buffer = allocateNewBuffer();
			bufferRef.set(buffer);
		} else {
			void *p;
			core.memory.GC.getAttr(p);
			if (p is null || p != bufferRef.get()) {
				// GC has collected previous buffer
				buffer = allocateNewBuffer();
				bufferRef.set(buffer);
			}
		}
		useBuffer(buffer);
		...
	}

Note that the inner if block is not 100% safe, because there's no guarantee that even if the base pointer of the block hasn't changed, the GC hasn't reallocated the block to somebody else. So this part is still yet to be solved.


T

-- 
It is widely believed that reinventing the wheel is a waste of time; but I disagree: without wheel reinventers, we would still be stuck with wooden horse-cart wheels.

September 12, 2013

Re: Greedy memory handling

Posted by Dmitry Olshansky
in reply to H. S. Teoh

12-Sep-2013 17:51, H. S. Teoh wrote:
> On Thu, Sep 12, 2013 at 08:27:59AM +0200, Jacob Carlborg wrote:
>> On 2013-09-11 10:06, monarch_dodra wrote:
>>> I have a function that will *massively* benefit from having a
>>> persistent internal buffer it can re-use (and grow) from call to
>>> call, instead of re-allocating on every call.
>>>
>>> What I don't want is either of:
>>> 1. To set a fixed limitation of size, if the user ends up making
>>> repeated calls to something larger to my fixed size.
>>> 2. For a single big call which will allocate a HUGE internal buffer
>>> that will consume all my memory.
>>>
>>> What I need is some sort of lazy buffer. Basically, the allocation
>>> holds, but I don't want the to prevent the GC from collecting it if
>>> it deems it has gotten too big, or needs more memory.
>>>
>>> Any idea on how to do something like that? Or literature?
>>
>> How about keeping a stack or static buffer. If that gets too small
>> use a new buffer. When you're done with the new buffer set it to
>> null to allow the GC to collect it. Then repeat.
> [...]
>
> The problem is, he wants to reuse the buffer next time if the GC hasn't
> collected it yet.
>
> Here's an idea, though. It doesn't completely solve the problem, but it
> just occurred to me that "weak pointers" (i.e., ignored by the GC for
> the purposes of marking) can be simulated by XOR'ing the pointer value
> with some mask so that it's not recognized as a pointer by the GC. This
> can be encapsulated by a weak pointer struct that automatically does the
> translation:
>
> 	struct WeakPointer(T) {
> 		enum size_t mask = 0xdeadbeef;
> 		union Impl {
> 			T* ptr;
> 			size_t uintVal;
> 		}
> 		Impl impl;
> 		void set(T* ptr) @system {
> 			impl.ptr = ptr;
> 			impl.uintVal ^= mask;
> 		}
> 		T* get() @system {
> 			Impl i = impl;
> 			i.uintVal ^= mask;
> 			return i.ptr;
> 		}
> 	}
>
> 	WeakPointer!Buffer bufferRef;
>
> 	void doWork(Args...) {
> 		T* buffer;
> 		if (bufferRef.get() is null) {
> 			// Buffer hasn't been allocated yet
> 			buffer = allocateNewBuffer();
> 			bufferRef.set(buffer);
> 		} else {
> 			void *p;
> 			core.memory.GC.getAttr(p);

The line above is not a 100% good idea... at least with 0xdeadbeef as the mask.

If we know which OS you are compiling for, we could just flip, say, the upper bit and get a pointer into kernel space (which surely isn't in a GC pool). Even then, your last paragraph pretty much destroys it.


A better option is to have a finalizer hooked up to set some flag. Then, _after_ restoring the pointer, we consult that flag variable.

> 			if (p is null || p != bufferRef.get()) {
> 				// GC has collected previous buffer
> 				buffer = allocateNewBuffer();
> 				bufferRef.set(buffer);
> 			}
> 		}
> 		useBuffer(buffer);
> 		...
> 	}
>
> Note that the inner if block is not 100% safe, because there's no
> guarantee that even if the base pointer of the block hasn't changed, the
> GC hasn't reallocated the block to somebody else. So this part is still
> yet to be solved.
>
>
> T
>


-- 
Dmitry Olshansky

September 12, 2013

Re: Greedy memory handling

Posted by Jacob Carlborg
in reply to H. S. Teoh

On 2013-09-12 15:51, H. S. Teoh wrote:

> The problem is, he wants to reuse the buffer next time if the GC hasn't
> collected it yet.

I was thinking he could reuse the stack/static buffer. Basically using two buffers, one static and one dynamic.

-- 
/Jacob Carlborg

September 12, 2013

Re: Greedy memory handling

Posted by H. S. Teoh
in reply to Dmitry Olshansky

On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
> 12-Sep-2013 17:51, H. S. Teoh wrote:
[...]
> >	struct WeakPointer(T) {
> >		enum size_t mask = 0xdeadbeef;
> >		union Impl {
> >			T* ptr;
> >			size_t uintVal;
> >		}
> >		Impl impl;
> >		void set(T* ptr) @system {
> >			impl.ptr = ptr;
> >			impl.uintVal ^= mask;
> >		}
> >		T* get() @system {
> >			Impl i = impl;
> >			i.uintVal ^= mask;
> >			return i.ptr;
> >		}
> >	}
> >
> >	WeakPointer!Buffer bufferRef;
> >
> >	void doWork(Args...) {
> >		T* buffer;
> >		if (bufferRef.get() is null) {
> >			// Buffer hasn't been allocated yet
> >			buffer = allocateNewBuffer();
> >			bufferRef.set(buffer);
> >		} else {
> >			void *p;
> >			core.memory.GC.getAttr(p);
> 
> This line above is not 100% good idea .. at least with deadbeaf as mask.
> 
> If we do know what OS you compile for we may just flip the say upper bit and get a pointer into kernel space (and surely that isn't in GC pool). Even then your last paragraph pretty much destroys it.

Well, that was just an example value. :) If we know which OS it is and how it assigns VM addresses, then we can adjust the mask appropriately.

But yeah, calling GC.getAttr is unreliable since you can't tell whether the block is what you had before, or somebody else's new data.
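
On the mask itself, the upper-bit suggestion might look like the sketch below. The helper names `hide` and `reveal` are made up, and whether the flipped value really lands outside every GC pool depends on how the OS lays out the address space, so treat that part as an assumption rather than a guarantee.

```d
// Sketch only: flip just the most significant bit of the address.
enum size_t topBit = size_t(1) << (size_t.sizeof * 8 - 1);

size_t hide(const void* p)
{
    return cast(size_t) p ^ topBit;
}

void* reveal(size_t v)
{
    return cast(void*) (v ^ topBit);
}

void main()
{
    auto data = new int[4];
    size_t masked = hide(data.ptr);
    assert(masked != cast(size_t) data.ptr);       // stored bits differ from the address
    assert(reveal(masked) is cast(void*) data.ptr); // the round trip is lossless
}
```

Note that `data` is still strongly referenced from the stack here, so this only demonstrates the encoding, not an actual collection.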

[...]
> Better option is to have finalizer hooked up to set some flag. Then _after_ restoring the pointer we consult that flag variable.

Good idea. The problem is, how to set a finalizer on a memory block that can change in size? The OP's original situation was that the buffer can be extended while in use, but I don't know of any D type that can associate a dtor with a ubyte[] array (note that the GC collecting the wrapper struct/class around the ubyte[] is not the same as collecting the actual memory block storing the ubyte[] -- the former can happen without the latter).

T

-- 
People tell me that I'm skeptical, but I don't believe it.

September 12, 2013

Re: Greedy memory handling

Posted by Dmitry Olshansky
in reply to H. S. Teoh

12-Sep-2013 20:51, H. S. Teoh wrote:
> On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
>> 12-Sep-2013 17:51, H. S. Teoh wrote:
> [...]
>>> 	struct WeakPointer(T) {
>>> 		enum size_t mask = 0xdeadbeef;
>>> 		union Impl {
>>> 			T* ptr;
>>> 			size_t uintVal;
>>> 		}
>>> 		Impl impl;
>>> 		void set(T* ptr) @system {
>>> 			impl.ptr = ptr;
>>> 			impl.uintVal ^= mask;
>>> 		}
>>> 		T* get() @system {
>>> 			Impl i = impl;
>>> 			i.uintVal ^= mask;
>>> 			return i.ptr;
>>> 		}
>>> 	}
>>>
>>> 	WeakPointer!Buffer bufferRef;
>>>
>>> 	void doWork(Args...) {
>>> 		T* buffer;
>>> 		if (bufferRef.get() is null) {
>>> 			// Buffer hasn't been allocated yet
>>> 			buffer = allocateNewBuffer();
>>> 			bufferRef.set(buffer);
>>> 		} else {
>>> 			void *p;
>>> 			core.memory.GC.getAttr(p);
>>
>> This line above is not 100% good idea .. at least with deadbeaf as
>> mask.
>>
>> If we do know what OS you compile for we may just flip the say upper
>> bit and get a pointer into kernel space (and surely that isn't in GC
>> pool). Even then your last paragraph pretty much destroys it.
>
> Well, that was just an example value. :) If we know which OS it is and
> how it assigns VM addresses, then we can adjust the mask appropriately.
>
> But yeah, calling GC.getAttr is unreliable since you can't tell whether
> the block is what you had before, or somebody else's new data.
>

It occurred to me that there are modes where the full address space is available to user code, typically an x86 app running on top of an x64 kernel (e.g. WoW64 on Windows can do that; Linux also has the so-called x32 ABI).

>
> [...]
>> Better option is to have finalizer hooked up to set some flag. Then
>> _after_ restoring the pointer we consult that flag variable.
>
> Good idea. The problem is, how to set a finalizer on a memory block that
> can change in size? The OP's original situation was that the buffer can
> be extended while in use, but I don't know of any D type that can
> associate a dtor with a ubyte[] array (note that the GC collecting the
> wrapper struct/class around the ubyte[] is not the same as collecting
> the actual memory block storing the ubyte[] -- the former can happen
> without the latter).
>

Double indirection? Allocate a class instance that has a finalizer and hold it via a weak ref. The wrapper in turn contains a pointer to the buffer. The interesting point is that one may then allocate said buffer via C's realloc.

Then, once the wrapper object is collected, its finalizer is called, and this is where we call free to clean up C's heap.

I'm thinking this is actually going to work.

-- 
Dmitry Olshansky

September 12, 2013

Re: Greedy memory handling

Posted by H. S. Teoh
in reply to Dmitry Olshansky

On Thu, Sep 12, 2013 at 11:13:30PM +0400, Dmitry Olshansky wrote:
> 12-Sep-2013 20:51, H. S. Teoh wrote:
> >On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
[...]
> >>Better option is to have finalizer hooked up to set some flag. Then _after_ restoring the pointer we consult that flag variable.
> >
> >Good idea. The problem is, how to set a finalizer on a memory block that can change in size? The OP's original situation was that the buffer can be extended while in use, but I don't know of any D type that can associate a dtor with a ubyte[] array (note that the GC collecting the wrapper struct/class around the ubyte[] is not the same as collecting the actual memory block storing the ubyte[] -- the former can happen without the latter).
> >
> 
> Double indirection? Allocate a class that has finalizer, hold that via weak-ref. The wrapper in turn contains a pointer to the buffer. The interesting point then is that one may allocate said buffer via C's realloc.
> 
> Then once helper struct is collected the finalizer is called and this is where we call free to cleanup C's heap.
> 
> I'm thinking this actually is going to work.
[...]

Interesting idea, use C's malloc/realloc to hold the actual buffer. Only possible catch is, will that cause the GC to collect when it runs out of memory (which is the whole point of the OP's question)? I.e., does it make a difference in GC behaviour to allocate, say, 10MB from the GC vs. allocating 10MB from malloc/realloc?

Assuming we have that settled, something like this should work:

	import core.stdc.stdlib : free, realloc;

	bool isValid;
	final class BufWrapper {
		void* ptrToMallocedBuf;
		this(void* ptr) {
			// We need this, 'cos otherwise we don't know if
			// our weak ref to BufWrapper is still valid!
			isValid = true;

			ptrToMallocedBuf = ptr;
		}
		~this() {
			// If we're being collected, free the real
			// buffer too.
			free(ptrToMallocedBuf);
			isValid = false;
		}
	}

	// WeakPointer masks the pointer to BufWrapper in some suitable
	// way so that the GC will collect it when needed. A class
	// reference is already a pointer, so it is stored as a void*.
	WeakPointer!void wrappedBufRef;

	void doWork(size_t bufSize) {
		void* buf;
		if (!isValid) {
			buf = realloc(null, bufSize);
			// Hide the wrapper, not the raw buffer, behind the weak ref.
			wrappedBufRef.set(cast(void*) new BufWrapper(buf));
		} else {
			// Unmask the wrapper, then fetch the buffer it owns.
			buf = (cast(BufWrapper) wrappedBufRef.get()).ptrToMallocedBuf;
		}

		// use buf here.
	}
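
Two things the sketch above leaves open are growing the buffer and keeping the wrapper alive while the buffer is actually in use. A rough, self-contained take on both follows. Every name in it (`Holder`, `withBuffer`, `alive`, `hidden`) is hypothetical, it assumes calls are never concurrent, and it hides the wrapper by complementing its address, which is only unlikely, not guaranteed, to look like a non-pointer to the GC.

```d
import core.exception : onOutOfMemoryError;
import core.memory : GC;
import core.stdc.stdlib : free, realloc;

// __gshared so the finalizer sees the same flag on whichever thread it runs.
__gshared bool alive;    // cleared by Holder's finalizer
__gshared size_t hidden; // complemented address of the live Holder

final class Holder
{
    void* mem;  // C-heap block, not managed by the GC
    size_t cap; // current capacity of mem in bytes

    ~this()
    {
        free(mem); // free(null) is a no-op
        alive = false;
    }
}

void withBuffer(size_t needed, scope void delegate(ubyte[]) use)
{
    Holder h;
    if (alive)
        h = cast(Holder) cast(void*) ~hidden; // unmask the old wrapper
    else
    {
        h = new Holder;
        hidden = ~cast(size_t) cast(void*) h;
        alive = true;
    }

    // Pin the wrapper while the buffer is in use, unpin afterwards.
    GC.addRoot(cast(void*) h);
    scope (exit) GC.removeRoot(cast(void*) h);

    if (h.cap < needed)
    {
        void* p = realloc(h.mem, needed);
        if (p is null)
            onOutOfMemoryError();
        h.mem = p;
        h.cap = needed;
    }
    use((cast(ubyte*) h.mem)[0 .. needed]);
}

void main()
{
    size_t seen;
    withBuffer(64, (ubyte[] b) { b[] = 1; seen = b.length; });
    assert(seen == 64);
    withBuffer(1 << 20, (ubyte[] b) { seen = b.length; });
    assert(seen == 1 << 20);
}
```

Passing the buffer to a delegate, rather than returning it, keeps the pinned region explicit. This does not settle the question raised above of whether memory taken from malloc/realloc ever pushes the GC into collecting the wrapper.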


T

-- 
Public parking: euphemism for paid parking. -- Flora

September 12, 2013

Re: Greedy memory handling

Posted by monarch_dodra
in reply to Dmitry Olshansky

On Thursday, 12 September 2013 at 19:13:40 UTC, Dmitry Olshansky
wrote:
> Double indirection? Allocate a class that has finalizer, hold that via weak-ref. The wrapper in turn contains a pointer to the buffer. The interesting point then is that one may allocate said buffer via C's realloc.
>
> Then once helper struct is collected the finalizer is called and this is where we call free to cleanup C's heap.
>
> I'm thinking this actually is going to work.

Yum. I like this.

I was going to say: "At the end of the day, if the GC doesn't
*tell* us the collection happened, then the problem is not
solve-able. We'd need a way that would allow the GC to tell us
the memory was *finalized*". And then I'd go on to say "since our
GC is non-finalizing, there is simply no solution".

But then classes. Derp.

I'd be really interested in having a finalized solution. The
"details" of memory addressing are not my strong suit, so I
wouldn't trust myself with all those union{ptr/size_t} things.

Thanks, I'll start toying around with this :)
