Thread overview | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
January 22, 2002 Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
As currently implemented Static arrays are on the stack, and you can without warnings pass them to functions or constructors as dynamic arrays, which may cause the dynamic array reference to remain live after the stack frame has gone. more alarming than this is my hypothosis that static arrays within objects are not noticed by the GC and thus if I hold onto a reference to an array within an object but not the object. The array will may be freed from under my feet by the GC. I understand the need for structures and static arrays, and the desire to get down and dirty. but you can do that and remain safe if constrained a little. as point of reference C and C++ have the register keyword, and in C is means behaves LIKE a register (not IS A register) but like one; so it has no address but in C++ it means optimise this value's use, the former is IMHO a much nicer behaviour, as a programmer I can restrict what I can do with the variable allowing the compiler to optimise if it feels in the mood to do so. if the objectives of D are to reduce the pit falls when writing OO programmes then and classes as all dynamic ... either all Arrays and Structures are dynamic too or an automatic or stack attribute for method parameters two rules (I think), references to stack items can not be stored in dynamic items and references to stack items of the current frame can not be put into any items parameters refer to. so a stack parameter can only be put into stack items of the current frame or passed to methods that take stack parameters without manual intervention. The programmer would have to create a dynamic copy of the static item. i.e. dup the array. because D is static, the compiler could do the analysis for you. Mike. |
January 22, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mike Wynn | "Mike Wynn" <mike.wynn@l8night.co.uk> wrote in message news:a2k4gh$cc7$1@digitaldaemon.com... > As currently implemented Static arrays are on the stack, and you can without > warnings pass them to functions or constructors as dynamic arrays, which may > cause the dynamic array reference to remain live after the stack frame has gone. Yes, that is a general problem with anything on the stack you take a pointer to. > more alarming than this is my hypothosis that static arrays within objects are not noticed by the GC and thus if I hold onto a reference to an array within an object but not the object. The array will may be freed from under > my feet by the GC. No, the GC will recognize that case. > I understand the need for structures and static arrays, and the desire to > get down and dirty. but you can do that and remain safe if constrained a > little. > as point of reference C and C++ have the register keyword, and in C is means > behaves LIKE a register (not IS A register) but like one; so it has no > address > but in C++ it means optimise this value's use, the former is IMHO a much > nicer behaviour, as a programmer I can restrict what I can do with the > variable allowing the compiler to optimise if it feels in the mood to do so. > > if the objectives of D are to reduce the pit falls when writing OO > programmes then > and classes as all dynamic ... > either all Arrays and Structures are dynamic too > or an automatic or stack attribute for method parameters > two rules (I think), references to stack items can not be stored in dynamic > items > and references to stack items of the current frame can not be put into any > items parameters refer to. > so a stack parameter can only be put into stack items of the current frame > or passed to methods that take stack parameters without manual intervention. > The programmer would have to create a dynamic copy of the static item. i.e. > dup the array. > because D is static, the compiler could do the analysis for you. Those are good points, and I thought a long time about them in the design of D. I eventually came to the conclusion that needing to make copies of arrays and objects all the time was too much overhead. |
January 23, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | The GC could check reference counts on stack objects as they go out of scope. You could throw a "ReferencesRemainException" if somebody is still pointing to it. This is much like my idea that delete should throw an exception if references remain.... -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ] |
January 23, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Russ Lewis | As I read it the GC is a Mark and sweep, conservative collector. so no ref counts to use, to add them would waste lots of time for all the cases when not required and to only have them for stack arrays would require the same analysis as for automatic promotion to heap arrays. so the only way to see if it's live is to walk the whole heap every time you exit a non leaf method that has passed an array as a parameter. and to put more spanners in the works, a conservative collector can get it wrong so it may not be live, you may just have an int with the same value. Who gets the exception., the thread who's stack has been used or the thread(s) with the array. instead of going to the expence of trying to detect it, why no disallow it, or make sure it never happens if all array where objects except those within objects as the GC keeps the parent alive you never have a problem. the compiler can optimise arrays in leaf methods, and with good memory and one that does not coalesce too argessively then the right size block will be available. Mike. "Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:3C4EE3D3.30B44B51@deming-os.org... > The GC could check reference counts on stack objects as they go out of scope. > You could throw a "ReferencesRemainException" if somebody is still pointing to > it. > > This is much like my idea that delete should throw an exception if references > remain.... > > -- > The Villagers are Online! villagersonline.com > > .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] > .[ (a version.of(English).(precise.more)) is(possible) ] > ?[ you want.to(help(develop(it))) ] > > |
January 23, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mike Wynn | Mike Wynn wrote: > As I read it the GC is a Mark and sweep, conservative collector. > so no ref counts to use, to add them would waste lots of time for all the > cases when not required and to only have them for stack arrays would require > the same analysis as for automatic promotion to heap arrays. You're right, sorry. I forgot about it. I keep thinking about it as a reference-counting collector because I think that it should be one :) NOTE TO WALTER: David Bacon of IBM Research has implemented a reference counting collector that does loop detection without any other algorithms. That means no mark-and-sweep, stack traversal, or anything like that. It's called Recycler and is implemented on the Jalapeno JVM. http://www.research.ibm.com/people/d/dfb is his home page, and http://www.research.ibm.com/people/d/dfb/papers.html#Bacon01Java describes how he does it. > instead of going to the expence of trying to detect it, why no disallow it, > or make sure it never happens > if all array where objects except those within objects as the GC keeps the > parent alive > you never have a problem. > the compiler can optimise arrays in leaf methods, and with good memory and > one that does not coalesce too argessively then the right size block will be > available. I thought about proposing a function pointer type modifier that would allow you to declare pointers (passed as parameters) that couldn't be saved after the function returns...syntax error if you did otherwise. However, that breaks this code: int StartJob(char command[]); // returns an id int WaitForJobToFinish(int ID); void MyFunc() { char command = "do stuff"; int id = StartJob(command); WaitForJobToFinish(id); }; You want to ensure that all pointers to 'command' are gone before MyFunc() exits. But we can't because the thing we're calling has to save a pointer to it between the calls to StartJob() and WaitForJobToFinish()...unless we force StartJob() to copy the buffer, but then we're getting unnecessary overhead again. I'm open to ideas here... -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ] |
January 24, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Russ Lewis | "Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:3C4F034C.BD4D051A@deming-os.org... > NOTE TO WALTER: David Bacon of IBM Research has implemented a reference counting > collector that does loop detection without any other algorithms. That means no > mark-and-sweep, stack traversal, or anything like that. It's called Recycler > and is implemented on the Jalapeno JVM. > > http://www.research.ibm.com/people/d/dfb is his home page, and http://www.research.ibm.com/people/d/dfb/papers.html#Bacon01Java describes how > he does it. Thanks! But I don't think reference counting can work with D, since the only pointer to an object might be an interior pointer. I don't see how that can work with reference counts. |
January 24, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | "Walter" <walter@digitalmars.com> wrote in message news:a2nt3q$2v4f$2@digitaldaemon.com... > Thanks! But I don't think reference counting can work with D, since the only > pointer to an object might be an interior pointer. I don't see how that can > work with reference counts. Well... There's an option that sounds interesting to me: assuming that pointers are somewhat deprecated, as they should be in a higher-level language, they could be implemented more inefficiently as a pointer to an object, and an offet within that object. Then, reference counting would work. This could be even more interesting if you consider the possibility of implementing an extra level on indirection in object pointers (references), where the pointer is an index into an array of active "real object pointers". In that case, you can easily implement a self-compacting heap. Anyway, it's an idea. Oh, and you could still have pointers into real memory by making the object be at position zero. For OS programming (if you still want to have that as an option), you could also have a special type (call it "address") which would be just an untyped raw pointer (just like in assembler), and would be used with casts. Salutaciones, JCAB |
January 24, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | Walter wrote: > Thanks! But I don't think reference counting can work with D, since the only pointer to an object might be an interior pointer. I don't see how that can work with reference counts. I've been thinking about that. I see several possible solutions, none of which is optimal. Be sure to read (5), which I think has the most promise: SIMPLE IDEAS: 1) All pointers are actually double pointers; one element points to the actual thing being referenced, while one points to the garbage-collectable thing which it is inside of; 2) When setting (or unsetting) a pointer, do a lookup into a table of objects to find what the pointer is inside of. If this was a hash table, then you would have a reasonably good chance of getting an immediate hit as long as the pointer is NOT an interior pointer... MORE COMPLEX ONES: 4) Require that all reference-counted objects be on n-byte boundaries (perhaps 32 byte boundaries). Then, you can find the start of the object as follows: a) Put a magic value as the first 32 bits of every reference counted object b) Have a memory table (as in #2), but include in the object a pointer to that table. c) When looking for the "root" of an object, first go back to the 32-byte boundary and check for the magic value. If you don't find it, keep going back 32 bytes at a time until you do. d) When you find a magic value, guess that it's the root, and look at the "pointer into the memory table." Confirm that this pointer is to a valid address in the table and that the table at that entry point points back to this object. If so, then you have the object root. If not, return to (c) and continue. 5) Make all members of an object also be garbage collected. However, you set a flag (or a pointer or offset value) that indicates that the method of "garbage collecting" a member is simply to decrement the reference count of the parent object. a) When using a pointer to an object to get a pointer to a member of it, you check the count on the child first. If this count is 0, then you increment the count on the parent as well as the child, as you are making the child "refer" to the parent. b) Later, when you lose the pointer to the child, you decrement its count. If the child ends up "garbage collected", then you don't delete the memory...instead, you decrement the count on the parent. Then the parent might or might not get collected. c) Dynamic arrays will have to be implemented as 12 bytes instead of 8; each array has a pointer to a "root" (root of the entire array, not this slice of it), and a pointer to the "root of this slice", and a length field. Thus, you can always know where to find the reference count when an array reference is destroyed. d) You can't have any GC overhead inside structs (so while you can GC a struct, you can't GC its members). Also, it's not practical to have a GC record for every 'int' inside a class. Perhaps if you take a pointer to such elements, you have to uses a special non-GC pointer. This isn't actually so bad, if you think about it... e) If you absolutely demand GC-ing members of a class: For small members of classes, you could group them into n-byte chunks (each of which starts with a GC record) and require that all classes be allocated on a similar boundary. Thus, except for pointers into a struct (which have to be non-GC pointers), you can always find a GC record by taking the pointer and rounding down to the n-byte boundary. -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ] |
January 25, 2002 Re: Should all Arrays be more like Objects | ||||
---|---|---|---|---|
| ||||
Posted in reply to Russ Lewis | The solutions you propose I think will work, but there'll be considerable added overhead in both execution time and memory consumption. Also, some of the interoperability with C objects will no longer be possible. I've written some programs using conservative GC and benchmarked them against fast reference counting schemes (written by others). My GC outperformed them, so I'm not sold on the idea. The GC I used is the one that comes with the Phobos library source, and it's a pretty basic one. Upgrading it to a generational collector would produce a big increment in performance. -Walter "Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:3C5039CF.397B4A12@deming-os.org... > Walter wrote: > > > Thanks! But I don't think reference counting can work with D, since the only > > pointer to an object might be an interior pointer. I don't see how that can > > work with reference counts. > > I've been thinking about that. I see several possible solutions, none of which > is optimal. Be sure to read (5), which I think has the most promise: > > SIMPLE IDEAS: > > 1) All pointers are actually double pointers; one element points to the actual > thing being referenced, while one points to the garbage-collectable thing which > it is inside of; > > 2) When setting (or unsetting) a pointer, do a lookup into a table of objects to > find what the pointer is inside of. If this was a hash table, then you would > have a reasonably good chance of getting an immediate hit as long as the pointer > is NOT an interior pointer... > > MORE COMPLEX ONES: > > 4) Require that all reference-counted objects be on n-byte boundaries (perhaps > 32 byte boundaries). Then, you can find the start of the object as follows: > a) Put a magic value as the first 32 bits of every reference counted object > b) Have a memory table (as in #2), but include in the object a pointer to > that table. > c) When looking for the "root" of an object, first go back to the 32-byte > boundary and check for the magic value. If you don't find it, keep going back > 32 bytes at a time until you do. > d) When you find a magic value, guess that it's the root, and look at the > "pointer into the memory table." Confirm that this pointer is to a valid address in the table and that the table at that entry point points back to this > object. If so, then you have the object root. If not, return to (c) and > continue. > > 5) Make all members of an object also be garbage collected. However, you set a > flag (or a pointer or offset value) that indicates that the method of "garbage > collecting" a member is simply to decrement the reference count of the parent > object. > a) When using a pointer to an object to get a pointer to a member of it, you > check the count on the child first. If this count is 0, then you increment the > count on the parent as well as the child, as you are making the child "refer" to > the parent. > b) Later, when you lose the pointer to the child, you decrement its count. > If the child ends up "garbage collected", then you don't delete the memory...instead, you decrement the count on the parent. Then the parent might > or might not get collected. > c) Dynamic arrays will have to be implemented as 12 bytes instead of 8; each > array has a pointer to a "root" (root of the entire array, not this slice of > it), and a pointer to the "root of this slice", and a length field. Thus, you > can always know where to find the reference count when an array reference is > destroyed. > d) You can't have any GC overhead inside structs (so while you can GC a > struct, you can't GC its members). Also, it's not practical to have a GC record > for every 'int' inside a class. Perhaps if you take a pointer to such elements, > you have to uses a special non-GC pointer. This isn't actually so bad, if you > think about it... > e) If you absolutely demand GC-ing members of a class: For small members > of classes, you could group them into n-byte chunks (each of which starts with a > GC record) and require that all classes be allocated on a similar boundary. > Thus, except for pointers into a struct (which have to be non-GC pointers), you > can always find a GC record by taking the pointer and rounding down to the n-byte boundary. > > -- > The Villagers are Online! villagersonline.com > > .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] > .[ (a version.of(English).(precise.more)) is(possible) ] > ?[ you want.to(help(develop(it))) ] > > |
Copyright © 1999-2021 by the D Language Foundation