July 25, 2006
Actually, I believe it's just:

import std.gc;

// ...

ubyte[] data = new ubyte[1024 * 1024];
std.gc.removeRange(data);

This tells it, afaik, not to scan the described range for pointers.  It seems to me entirely possible that the compiler could automatically generate this code for new ubyte[] and such calls.
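
A minimal sketch of the "don't scan this buffer" idea, assuming a current D compiler where the GC interface lives in `core.memory` rather than the `std.gc` module used in this thread. The helper name `allocNoScan` is made up for illustration. The block is still owned by the GC and can be collected; it is only excluded from pointer scanning. As far as I know, current runtimes already apply this attribute to arrays of pointer-free element types such as `ubyte[]`, so the explicit call is mainly useful for raw allocations.

```d
import core.memory : GC;

/// Hypothetical helper: allocates `size` bytes from the GC heap, flagged so
/// the collector does not scan the block's contents for pointers.
ubyte[] allocNoScan(size_t size)
{
    // NO_SCAN: the block stays collectable, but its contents are never scanned.
    auto p = cast(ubyte*) GC.malloc(size, GC.BlkAttr.NO_SCAN);
    return p[0 .. size];
}

void main()
{
    auto data = allocNoScan(1024 * 1024);

    // The attribute is recorded on the block itself.
    assert((GC.getAttr(data.ptr) & GC.BlkAttr.NO_SCAN) != 0);
}
```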

-[Unknown]


> "Derek Parnell" <derek@nomail.afraid.org> wrote in message news:dg25ykpt8kxw$.1r2mhu0u851l0.dlg@40tude.net...
>> On Mon, 24 Jul 2006 04:55:17 +0200, Bob W wrote:
>>
>>> /*
>>> The std.file.read() function in dmd causes a performance
>>> issue after reading large files from 100MB upwards.
>>> Reading the file seems to be no problem, but cleanup
>>> afterwards takes forever.
>> It's a GC effect. The GC is scanning through the buffer looking for
>> addresses to clean up.
> 
> Wouldn't it be possible to add some way of telling the GC not to scan something? Perhaps there's already something in std.gc, I didn't check, but I actually think the compiler could be doing this by checking the TypeInfo. I wouldn't go so far as to expect it to only scan the pointer fields of a struct, but at least it could ignore char[] and float[] (and other arrays containing non-pointer types).
> 
> I've made that Universal Machine of the programming contest (see thread below) and am running into memory problems. I have the feeling that a lot of the opcodes in the machine code are considered as pointers. Memory just keeps growing and the GC cycles take longer and longer.
> 
> It was great to write the UM without having to worry about memory, but now I'll have to worry about it and in a totally new way: trying to outsmart the GC. Either that, or malloc/memset/free : (
> 
> L. 
> 
> 
July 25, 2006
On Mon, 24 Jul 2006 22:02:31 -0700, Unknown W. Brackets wrote:

> Actually, I believe it's just:
> 
> import std.gc;
> 
> // ...
> 
> ubyte[] data = new ubyte[1024 * 1024];
> std.gc.removeRange(data);
> 
> This tells it, afaik, not to scan the described range for pointers.  It seems to me entirely possible that the compiler could automatically generate this code for new ubyte[] and such calls.

Yes, but wouldn't that RAM be deallocated only at program end? If you wanted it deallocated earlier you would still have to delete it.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
25/07/2006 3:09:46 PM
July 25, 2006
On Tue, 25 Jul 2006 15:15:24 +1000, Derek Parnell <derek@nomail.afraid.org> wrote:
> On Mon, 24 Jul 2006 22:02:31 -0700, Unknown W. Brackets wrote:
>
>> Actually, I believe it's just:
>>
>> import std.gc;
>>
>> // ...
>>
>> ubyte[] data = new ubyte[1024 * 1024];
>> std.gc.removeRange(data);
>>
>> This tells it, afaik, not to scan the described range for pointers.  It
>> seems to me entirely possible that the compiler could automatically
>> generate this code for new ubyte[] and such calls.
>
> Yes, but wouldn't that RAM be deallocated only at program end? If you
> wanted it deallocated earlier you would still have to delete it.

The range pointed at by the array 'data' shouldn't be scanned, but there is no reason the array reference itself cannot be scanned and therefore collected, right? And if the array reference is collected, the data will be freed, just not scanned for other pointers, right?

Regan

July 25, 2006
Yes, that's what I meant.  You'd remove the range of memory from scanning, but keep the root.

Please correct me if I'm wrong.

Thanks,
-[Unknown]


> On Tue, 25 Jul 2006 15:15:24 +1000, Derek Parnell <derek@nomail.afraid.org> wrote:
>> On Mon, 24 Jul 2006 22:02:31 -0700, Unknown W. Brackets wrote:
>>
>>> Actually, I believe it's just:
>>>
>>> import std.gc;
>>>
>>> // ...
>>>
>>> ubyte[] data = new ubyte[1024 * 1024];
>>> std.gc.removeRange(data);
>>>
>>> This tells it, afaik, not to scan the described range for pointers.  It
>>> seems to me entirely possible that the compiler could automatically
>>> generate this code for new ubyte[] and such calls.
>>
>> Yes, but wouldn't that RAM be deallocated only at program end? If you
>> wanted it deallocated earlier you would still have to delete it.
> 
> The range pointed at by the array 'data' shouldn't be scanned, but there is no reason the array reference itself cannot be scanned and therefore collected, right? And if the array reference is collected, the data will be freed, just not scanned for other pointers, right?
> 
> Regan
> 
July 25, 2006
On Tue, 25 Jul 2006 17:18:24 +1200, Regan Heath wrote:

> On Tue, 25 Jul 2006 15:15:24 +1000, Derek Parnell <derek@nomail.afraid.org> wrote:
>> On Mon, 24 Jul 2006 22:02:31 -0700, Unknown W. Brackets wrote:
>>
>>> Actually, I believe it's just:
>>>
>>> import std.gc;
>>>
>>> // ...
>>>
>>> ubyte[] data = new ubyte[1024 * 1024];
>>> std.gc.removeRange(data);
>>>
>>> This tells it, afaik, not to scan the described range for pointers.  It seems to me entirely possible that the compiler could automatically generate this code for new ubyte[] and such calls.
>>
>> Yes, but wouldn't that RAM be deallocated only at program end? If you wanted it deallocated earlier you would still have to delete it.
> 
> The range pointed at by the array 'data' shouldn't be scanned, but there is no reason the array reference itself cannot be scanned and therefore collected, right? And if the array reference is collected, the data will be freed, just not scanned for other pointers, right?

Yes, that sort of makes sense. So does the parameter stack get scanned after a function returns but before the calling function takes control again? That's where the array reference usually resides. Or is it that when a 'new' is done, the returned address is added to a list that the GC uses to free up RAM?

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
25/07/2006 3:29:21 PM
July 25, 2006
Derek Parnell wrote:
> On Mon, 24 Jul 2006 04:55:17 +0200, Bob W wrote:
> 
>> /*
>> The std.file.read() function in dmd causes a performance
>> issue after reading large files from 100MB upwards.
>> Reading the file seems to be no problem, but cleanup
>> afterwards takes forever.
> 
> It's a GC effect. The GC is scanning through the buffer looking for
> addresses to clean up.

I just read the response to my post and it seems the read() function should do the std.gc.removeRange on the memory containing the read file, no? There's no way that memory could contain pointers.

Of course, somebody could change the memory afterwards, replacing internal file references with memory pointers and it'll get f**** up.

L.
July 25, 2006
Derek,

Your question doesn't make complete sense to me, so I'm going to back up a bit.  Please forgive me if I patronize you, or fail to answer your question.

The garbage collector has "ranges" of memory it scans (as I'm completely sure you already know.)  For example, you could add an arbitrary range.  Consider:

void* p = malloc(100);
std.gc.addRange(p, p + cast(ptrdiff_t) 100);

This will cause it to scan the space between (p) and (p + 100) for pointers (roots).  Removing a range does not mean, as far as I can see, that the memory it points to will never be freed; just that it will not be scanned.

An addRange() happens automatically when you new with the garbage collector.
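
The addRange() example above can be sketched against the current `core.memory` API, which is an assumption here: this thread uses the older `std.gc` module, whose `addRange` takes a bottom and a top pointer, while the modern call takes a base pointer and a length in bytes. The helper names `allocScanned` and `freeScanned` are hypothetical.

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

/// Hypothetical helper: allocates a zeroed C-heap block and asks the GC to
/// scan it for pointers into the GC heap.
void* allocScanned(size_t size)
{
    void* p = calloc(1, size);
    if (p is null)
        return null;
    // Modern addRange takes a base pointer and a length in bytes.
    GC.addRange(p, size);
    return p;
}

/// Unregisters the range first, then returns the block to the C heap.
void freeScanned(void* p)
{
    if (p is null)
        return;
    GC.removeRange(p);
    free(p);
}

void main()
{
    void* p = allocScanned(100);
    assert(p !is null);

    // The block is scanned by the GC but not owned by it, so the GC
    // will never free it; that remains the caller's job.
    assert(GC.addrOf(p) is null);

    freeScanned(p);
}
```

Registering a range only affects scanning. Ownership stays with whoever called the C allocator.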

-[Unknown]


> On Tue, 25 Jul 2006 17:18:24 +1200, Regan Heath wrote:
> 
>> On Tue, 25 Jul 2006 15:15:24 +1000, Derek Parnell  <derek@nomail.afraid.org> wrote:
>>> On Mon, 24 Jul 2006 22:02:31 -0700, Unknown W. Brackets wrote:
>>>
>>>> Actually, I believe it's just:
>>>>
>>>> import std.gc;
>>>>
>>>> // ...
>>>>
>>>> ubyte[] data = new ubyte[1024 * 1024];
>>>> std.gc.removeRange(data);
>>>>
>>>> This tells it, afaik, not to scan the described range for pointers.  It
>>>> seems to me entirely possible that the compiler could automatically
>>>> generate this code for new ubyte[] and such calls.
>>> Yes, but wouldn't that RAM be deallocated only at program end? If you
>>> wanted it deallocated earlier you would still have to delete it.
>> The range pointed at by the array 'data' shouldn't be scanned, but there  is no reason the array reference itself cannot be scanned and therefore  collected, right? And if the array reference is collected, the data will  be freed, just not scanned for other pointers, right?
> 
> Yes, that sort of makes sense. So does the parameter stack get scanned after
> a function returns but before the calling function takes control again? That's
> where the array reference usually resides. Or is it that when a 'new' is done,
> the returned address is added to a list that the GC uses to free up RAM?
> 
July 25, 2006
Yet, someone could also do this:

ubyte[] buffer = std.file.read(filename);
*(buffer.ptr + 9000) = 0;

That would probably be a Bad Thing as well, but that doesn't mean Phobos should worry about it...

I completely agree that Phobos' read() should have a removeRange() call there, unless it is decided to add such a thing to the standard library.

-[Unknown]


> Derek Parnell wrote:
>> On Mon, 24 Jul 2006 04:55:17 +0200, Bob W wrote:
>>
>>> /*
>>> The std.file.read() function in dmd causes a performance
>>> issue after reading large files from 100MB upwards.
>>> Reading the file seems to be no problem, but cleanup
>>> afterwards takes forever.
>>
>> It's a GC effect. The GC is scanning through the buffer looking for
>> addresses to clean up.
> 
> I just read the response to my post and it seems the read() function should do the std.gc.removeRange on the memory containing the read file, no? There's no way that memory could contain pointers.
> 
> Of course, somebody could change the memory afterwards, replacing internal file references with memory pointers and it'll get f**** up.
> 
> L.
July 25, 2006
On Mon, 24 Jul 2006 23:04:09 -0700, Unknown W. Brackets wrote:

> Derek,
> 
> Your question doesn't make complete sense to me, so I'm going to back up a bit.  Please forgive me if I patronize you, or fail to answer your question.

Not a problem.

> The garbage collector has "ranges" of memory it scans (as I'm completely
> sure you already know.)  For example, you could add an arbitrary range.
>   Consider:
> 
> void* p = malloc(100);
> std.gc.addRange(p, p + cast(ptrdiff_t) 100);
> 
> This will cause it to scan the space between (p) and (p + 100) for pointers (roots).  Removing a range does not mean, as far as I can see, that the memory it points to will never be freed; just that it will not be scanned.

So long as the root itself is stored somewhere that the GC can find it. I guess this is done via the addRoot() call and I assume that 'new' automatically calls this.
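
A rough sketch of when an explicit root is needed, assuming the current `core.memory` API (not the `std.gc` module of this thread) and hypothetical helper names. The stack and static data are scanned without any registration; a pointer stored only in C-heap memory is invisible to the collector unless it is added as a root.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: stores a GC-allocated int in a C-heap slot and pins
/// it with addRoot, since the GC does not scan malloc'd memory by default.
int** makePinnedSlot(int v)
{
    auto slot = cast(int**) malloc((int*).sizeof);
    if (slot is null)
        return null;
    *slot = new int(v);
    GC.addRoot(*slot);   // keep the int alive while only the slot refers to it
    return slot;
}

/// Unpins the int and frees the C-heap slot.
void releasePinnedSlot(int** slot)
{
    if (slot is null)
        return;
    GC.removeRoot(*slot);
    free(slot);
}

void main()
{
    int** slot = makePinnedSlot(42);
    assert(slot !is null);

    GC.collect();          // the rooted int must survive a collection
    assert(**slot == 42);

    releasePinnedSlot(slot);
}
```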

> I completely agree that Phobos' read() should have a removeRange() call there, unless it is decided to add such a thing to the standard library.

I just tried something like that out. In file.d I changed

    buf = new byte[size];

to ...

    void *p = malloc(size);
    if (p == null)
        throw new OutOfMemoryException;
    buf = cast(byte[])p[0..size];


I assume that the " = p[x..y]" construct adds the root to the GC.
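
For what it's worth, here is a small check of that assumption, written against the current `core.memory` API (an assumption; the thread predates it) with a hypothetical helper `isGcOwned`. It suggests that slicing a malloc'd pointer only builds a pointer-and-length pair over the same memory and registers nothing with the GC, so such a buffer would have to be freed by hand.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: true when `p` points into a block owned by the GC.
bool isGcOwned(const(void)* p)
{
    return GC.addrOf(p) !is null;
}

void main()
{
    enum size = 1024;

    void* p = malloc(size);
    assert(p !is null);

    // Slicing creates a view over the malloc'd block; no allocation happens.
    byte[] buf = (cast(byte*) p)[0 .. size];
    assert(cast(void*) buf.ptr is p);
    assert(!isGcOwned(buf.ptr));

    // By contrast, an array created with `new` lives in the GC heap.
    auto gcBuf = new byte[size];
    assert(isGcOwned(gcBuf.ptr));

    free(p);   // the GC will never do this for us
}
```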

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
25/07/2006 4:17:40 PM
July 25, 2006
Derek Parnell wrote:
>.....
>     void *p = malloc(size);
>     if (p == null)
>         throw new OutOfMemoryException;
>     buf = cast(byte[])p[0..size];
> 
> I assume that the " = p[x..y]" construct adds the root to the GC.

I doubt that.

L.