July 28, 2006
Karen Lanrap wrote:
> Walter Bright wrote:
> 
>> that would be a terrible design if it did.
> 
> Then have a look at what the code below, compiled with dmd 0.163 on Windows, does under XP SP2.

I think you omitted the code.
July 28, 2006
Walter Bright wrote:

> I think you omitted the code.

Sorry, did not want to send that out with all those mistakes.

Anyway, here is the code:

static this(){
    writefln("This code demonstrates some behaviour of the GC");
    writefln("Author: Karen Lanrap");
    writefln("License: public domain");
}
void usage(char[] name){
    writefln("usage: %s alloc vital ([-lg] | [-l] [-g]) ", getBaseName(name));
    writefln("  alloc: size of all allocated memory");
    writefln("  vital: size of vital memory");
    writefln("  -l: leak memory");
    writefln("  -g: use GC");
}

const uint chunksize = 1_000_000;
const uint chainsize = 4;

class Hold{
    byte[chunksize / chainsize] data;
    Hold[chainsize] next;
}
Hold[] h;
Hold strongConnect(){
    for(int i=0; i< chainsize; i++) h[i] = new Hold;
    for(int i=0; i< chainsize; i++)
        for(int j = 0; j < chainsize; j++)
            h[i].next[j] = h[j];
    return h[rnd(0,chainsize)];
}

void main(char[][] args)
{
    h.length = chainsize;
    if(args.length < 3) usage(args[0]);
    assert(args.length > 2, "needs sizes of all and vital data");
    auto all = atoi(args[1]);
    auto vital = atoi(args[2]);
    assert(all>= vital, "vital data cannot be greater than all data");
    args.length = 5;
    auto leak = args[3]== "-l" || args[4]== "-l"|| args[3]== "-lg";
    auto useGC = args[3]== "-g" || args[4]== "-g"|| args[3]== "-lg";

    if(!useGC)
    {
        std.gc.disable();
    };
    fwritef(stderr, "initializing... ");
    Hold[] arr;
    arr.length = all;
    for(int i = 0; i < all; i++)
        arr[i] = strongConnect();
    fwritefln(stderr, "done.");
    do
    {
        fwritef(stderr, "[");
        for(int v = 0; v< vital; v++){
            uint inx = rnd(all-vital, vital);
            if(!leak)
            { // delete the strongConnect in arr[inx]
                for(int i=0 ; i< chainsize; i++)
                    if(arr[inx] != arr[inx].next[i])
                        delete arr[inx].next[i];
                delete arr[inx];
            }
            arr[inx] = strongConnect();
            fwritef(stderr, ".");
        }
        fwritef(stderr, "]");
    } while(true);
}
uint rnd(uint base, uint range)
{
    return
        cast(uint)(
            base
            + (1.0*range*rand()) / (uint.max+1.0)
        );
}
import std.gc, std.random, gcstats;
import std.stdio, std.string, std.outofmemory, std.conv, std.path;
July 29, 2006
Karen Lanrap wrote:
> Anyway, here is the code:

I'm not sure what this code represents. ("vital" is not any memory allocation term I'm familiar with.)

1) Certainly, allocating huge numbers of megabyte arrays all pointing to each other is not at all normal use.

2) Memory is not going to be recycled if there exist pointers to it from other memory blocks that are in use.

3) D's GC doesn't return memory to the operating system, it keeps the "high water mark" allocated to the process. But this doesn't actually matter, since it is only consuming *virtual* address space if it is unused. Physical memory is only consumed if it is actually and actively referenced.

4) Most C malloc/free and C++ new/delete implementations don't return memory to the operating system after the free, either.

5) When you use -lg, what the code appears to do is allocate new memory blocks. But since the old blocks are *still actively pointed to*, they won't be released by the GC.
July 29, 2006
Walter Bright wrote:

> 5) When you use -lg, what the code appears to do is allocate new memory blocks. But since the old blocks are *still actively pointed to*, they won't be released by the GC.

They are not *actively pointed to* from any location and the GC is releasing them. Otherwise the program would acquire more and more memory as one can see with the "-l" option which enables only the leak but not the GC.

There are strongly connected components (scc) of about 1MB size.
The only pointer to them is assigned to with a new scc, thereby
insulating this scc to garbage, ready to be collected in case they
are not deleted manually.

If they are not deleted manually and the GC is enabled and collects them, then the OS does not swap out the other data anymore. Why?

That is exactly the scheme that was said not to happen.

In case of "<exeName> <mem> 100" there are only 100MB of data touched but the OS holds all <mem> data. That is why I believe the GC touches all of the <mem> data in search for blocks to collect.

If I am wrong, why is the OS disabled to swap out all untouched data, as soon as the GC is enabled?




July 29, 2006
Karen Lanrap wrote:
> Walter Bright wrote:
> 
>> 5) When you use -lg, what the code appears to do is allocate new
>> memory blocks. But since the old blocks are *still actively
>> pointed to*, they won't be released by the GC.
> 
> They are not *actively pointed to* from any location and the GC is releasing them. Otherwise the program would acquire more and more memory as one can see with the "-l" option which enables only the leak but not the GC.
> 
> There are strongly connected components (scc) of about 1MB size.
> The only pointer to them is assigned to with a new scc, thereby insulating this scc to garbage,

"insulating this scc to garbage" ??

> ready to be collected in case they are not deleted manually.
> 
> If they are not deleted manually and the GC is enabled and collects them, then the OS does not swap out the other data anymore. Why?

The statistics you mentioned do not contain any swapping information. Perhaps I'm not understanding what you mean by swapping.

> That is exactly the scheme that was said not to happen.

I'm having a very hard time understanding you.

> In case of "<exeName> <mem> 100" there are only 100MB of data touched but the OS holds all <mem> data. That is why I believe the GC touches all of the <mem> data in search for blocks to collect.

The GC scans the static data, the registers, and the stack for pointers. Any pointers in those to GC allocated data are called the 'root set'. Any GC allocated data that is pointed to by the 'root set' is also scanned for pointers to GC allocated data, recursively, until there are no more memory blocks to scan. Any GC allocated data that is not so pointed to is *not* scanned, and is added to the available pool of memory. (It is *not* returned to the operating system, thus the 'high water mark' I mentioned previously.)

It does not scan all the memory.

This is not a matter of belief, you can check the code yourself (it comes with Phobos), and you can turn on various logging features of it. I suggest doing that, I think you'll find it very interesting.
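
The reachability rule described above can be sketched as a toy mark phase. This is only a minimal model written in present-day D, not the Phobos collector: the Block struct, the mark function and the block numbering are hypothetical names made up for illustration. It models two cycles of blocks, similar in shape to the strongly connected components in the test program, and shows that a cycle is never visited once nothing in the root set leads to it.

```d
import std.stdio;

// Hypothetical model of a heap block: it only records which other
// blocks (by index) it holds pointers to.
struct Block
{
    size_t[] pointsTo;
}

// Toy mark phase: start from the roots and follow pointers.
// Blocks that are never reached are never looked at.
bool[] mark(const Block[] heap, const size_t[] roots)
{
    auto marked = new bool[heap.length];
    size_t[] work = roots.dup;          // explicit work list instead of recursion
    while (work.length > 0)
    {
        size_t id = work[$ - 1];
        work = work[0 .. $ - 1];
        if (marked[id])
            continue;                   // already scanned
        marked[id] = true;
        foreach (next; heap[id].pointsTo)
            if (!marked[next])
                work ~= next;
    }
    return marked;
}

void main()
{
    // Blocks 0 and 1 point to each other and are reachable from the root.
    // Blocks 2 and 3 point to each other, but nothing else points to them.
    Block[] heap = [
        Block([size_t(1)]),
        Block([size_t(0)]),
        Block([size_t(3)]),
        Block([size_t(2)]),
    ];
    size_t[] roots = [0];

    auto marked = mark(heap, roots);
    foreach (i, m; marked)
        writefln("block %s: %s", i,
                 m ? "reachable, scanned" : "unreachable, not scanned, reclaimable");
}
```

In this model the second cycle is garbage even though its blocks still point at each other, and the mark phase never touches it. A real collector works on raw memory and has to treat anything that looks like a pointer conservatively, so this sketch says nothing about which pages the actual implementation touches.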


> If I am wrong, why is the OS disabled to swap out all untouched data, as soon as the GC is enabled?  

I think you have a very different idea of what the word "swapping" means in regards to virtual memory and GC than I do. For one thing, nothing at all in your program disables OS swapping. I don't think it is even possible to disable it - it's a very low level service.

You're obviously very interested in GC - why not pick up the book on it referenced in www.digitalmars.com/bibilography.html? It's a very good read.
July 29, 2006
Walter Bright wrote:

> The GC scans the static data, the registers, and the stack for pointers. Any pointers in those to GC allocated data are called the 'root set'. Any GC allocated data that is pointed to by the 'root set' is also scanned for pointers to GC allocated data, recursively, until there are no more memory blocks to scan.

I believed that the GC is working somehow that way. Therefore I raised the question as a "general problem". I repeat your words:

> until there are no more memory blocks to scan.

And this scanning seems to prevent the OS from paging out the data held in the first <mem> - <vital> locations in my code---I finally understand that whoever told me there is no difference between "swapping" and "paging" was wrong.

> why not pick up the book

No, thanks. I am not interested in theoretical foundations.

I observed a nasty behaviour of the code I wrote for the contest mentioned in <e9p2qg$18j8$1@digitaldaemon.com>: heavy paging.

After detecting and eliminating the memory leak, which lowered the "high water mark" from 2.2GB to 1.8GB (my machine holds 2GB of main memory), that paging was gone.

I was puzzled why this could happen although a GC is used.

Thanks for all the patience.
July 29, 2006
Karen Lanrap wrote:
> Walter Bright wrote:
> 
>> The GC scans the static data, the registers, and the stack for
>> pointers. Any pointers in those to GC allocated data are called
>> the 'root set'. Any GC allocated data that is pointed to by the
>> 'root set' is also scanned for pointers to GC allocated data,
>> recursively, until there are no more memory blocks to scan.
> 
> I believed that the GC is working somehow that way.

GC's do not "work somehow". They work exactly the way they are programmed to.

> Therefore I raised the question as a "general problem". I repeat your words:
>> until there are no more memory blocks to scan.

The only memory blocks scanned are those that have pointers to them. Nothing else.

> 
> And this scanning seems to prevent the OS from paging out the data held in the first <mem> - <vital> locations in my code---I finally understand that whoever told me there is no difference between "swapping" and "paging" was wrong.
> 
>> why not pick up the book
> 
> No, thanks. I am not interested in theoretical foundations.

It's not very fruitful for me to try helping you understand GC if you aren't interested in doing a little homework. The GC book is a lot more than theoretical mumbo-jumbo. It is well worth the effort to pick up and look at. I guarantee you'll be able to write much more effective programs that use GC if you understand it.

At a minimum, you and I will at least be using the same language. We'll have the same understanding of what "virtual" vs "physical" memory is, what "swapping" is, and that "vital" has no meaning for this topic.


> I observed a nasty behaviour of the code I wrote for the contest mentioned in <e9p2qg$18j8$1@digitaldaemon.com>: heavy paging.
> 
> After detecting and eliminating the memory leak, which lowered the "high water mark" from 2.2GB to 1.8GB (my machine holds 2GB of main memory), that paging was gone.
> 
> I was puzzled why this could happen although a GC is used.

I can't help you when you say you're not interested in learning about GC.
July 29, 2006
Karen Lanrap wrote:
> Walter Bright wrote:
> 
>> why not pick up the book
> 
> No, thanks. I am not interested in theoretical foundations.

No offense, but if you intend to criticize GC as a general technique, then it might be useful to do so from an informed perspective.


> I observed a nasty behaviour of the code I wrote for the contest mentioned in <e9p2qg$18j8$1@digitaldaemon.com>: heavy paging.
> 
> After detecting and eliminating the memory leak, which lowered the "high water mark" from 2.2GB to 1.8GB (my machine holds 2GB of main memory), that paging was gone.
> 
> I was puzzled why this could happen although a GC is used.

Perhaps it would help to look at the GC code?


Sean
July 31, 2006
"Karen Lanrap" <karen@digitaldaemon.com> wrote in message news:Xns980F583A91B2Fdigitaldaemoncom@63.105.9.61...
> I observed a nasty behaviour of the code I wrote for the contest mentioned in <e9p2qg$18j8$1@digitaldaemon.com>: heavy paging.
>
> After detecting and eliminating the memory leak, which lowered the "high water mark" from 2.2GB to 1.8GB (my machine holds 2GB of main memory), that paging was gone.
>
> I was puzzled why this could happen although a GC is used.

I'm having similar problems with the UM implementation. It runs slower and slower, memory use keeps growing, and the scan keeps taking longer and longer..

L.


August 02, 2006
Lionello Lunesu wrote:

> the scan keeps taking longer and longer..

At least one person admits that there might be a problem.

That's a funny community here. They seem to dislike theory and prefer coded examples. But when they have coded examples, they pretend not to be able to understand them fully and do not report their results. Instead they start to nitpick about words and about the practical and theoretical background of the contributor, declare the example non-standard---and point to some theory of GC implementations.

The fact remains unexplored that, if there is no general problem with GCs, the implementation of this GC must be at fault.