December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christopher Wright | "Christopher Wright" <dhasenan@gmail.com> wrote in message news:fklr1g$1uat$1@digitalmars.com...
> This is to fix the stuff I botched with my other reply.
>
> Craig Black wrote:
>> a) Disallow overloading new and delete for classes, and make classes strictly for GC, perhaps with an exception for classes instantiated on the stack using scope.
>
> You are just making sure that the garbage collector is handling all memory that is associated with objects. I don't see a point to this. The collector won't try to move memory that it doesn't control.

It has nothing to do with the garbage collector run-time stuff. It is giving the compiler more information so that compile-time checks can be done.

> You could do bad things with overloading new/delete, but those are hardly unique situations.

Granted. There are so many ways to mess things up with pointers. It's hard to make a systems language "safe". I guess my approach would be to make it "safer".

> > Then the compiler could disallow taking the address of a class field,
> > since we know the resulting pointer would point to the GC heap.
> > Note that this would be a compile-time check, and so would not degrade
> > run-time performance.
>
> That's not necessary, since you can map a source range to a destination range. It would be a simplifying assumption that improves performance, by changing two comparisons and an addition for each pointer (plus one subtraction per move) to one comparison and one assignment for each pointer. But you're going through a large amount of memory, so that's not a serious concern, I think.
>
>> a) Preceding a pointer declaration with fixed would allow that pointer to take the address in the GC heap.
>
> It'd be undefined behavior to do otherwise. But safe as long as no collections happen before you use the pointer.

Unless I am missing something, this would require a run-time check for each pointer assignment or pointer arithmetic operation. Personally, I would make every effort to avoid this overhead. Pointers should be lightweight and fast.
| |||
December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | >
> Unless I am missing something, this would require a run-time check for each pointer assignment or pointer arithmetic operation. Personally, I would make every effort to avoid this overhead. Pointers should be lightweight and fast.
I hereby retract this statement. These run-time checks could be optional, like asserts or bounds checking that are removed in release mode. Then all the compile-time stuff I mentioned before would be unnecessary, I think.
| |||
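Editorial sketch of the idea in the post above: a debug-only check that a raw pointer does not address the GC heap, written against current D rather than the 2007 runtime. The names `pointsIntoGCHeap` and `storeRaw` are hypothetical and not part of any proposal in this thread. `GC.addrOf` from `core.memory` is assumed to return null for memory the collector does not own. Because the check is a plain `assert`, it disappears when compiling with `-release`, much like array bounds checking.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

/// Hypothetical helper: true if `p` points into a block owned by the GC.
bool pointsIntoGCHeap(const(void)* p)
{
    // GC.addrOf yields the base address of the containing GC block,
    // or null when the pointer is not inside GC-managed memory.
    return GC.addrOf(cast(void*) p) !is null;
}

/// Hypothetical store into a "raw" pointer slot.
/// The assert is compiled out under -release, so release builds pay nothing.
void storeRaw(T)(ref T* slot, T* p)
{
    assert(!pointsIntoGCHeap(p), "raw pointer addresses the GC heap");
    slot = p;
}

unittest
{
    int* slot;

    // Memory from the C heap passes the check.
    auto c = cast(int*) malloc(int.sizeof);
    scope (exit) free(c);
    storeRaw(slot, c);
    assert(slot is c);
    assert(!pointsIntoGCHeap(c));

    // Memory from `new` is owned by the collector.
    auto g = new int;
    assert(pointsIntoGCHeap(g));
}
```

Building with `dmd -unittest -main` runs the checks; building with `-release` removes the assert inside `storeRaw`.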
December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | Frits van Bommel wrote:
> Christopher Wright wrote:
>> While there is a fixed reference to the GC object, it is pinned. If that reference is rebound to another GC object, the original object is unpinned and the new one is pinned.
>>
>> How to mark these is a difficult problem. On a 64-bit machine, I'd say you just use the most significant bit as a flag; you're not going to use petabytes of address space.
>
> Since "fixedness" as proposed would be a compile-time property, and you already need metadata to find pointers to implement a moving GC, such a flag could be in that metadata instead of in the pointer itself. (The OffsetTypeInfo could say "there's a pointer at offset 8, of type Object, and it's fixed")

Yes, I thought of that. Currently, however, offTi isn't populated. Just like the Interface* that's supposed to be the first element of each interface's vtbl pointer. It would be useful if it existed, but no cheese.

> If run-time pinning is used instead (where whether the GC cell pointed to by a pointer is pinned is not known at compile time), it could be a simple (synchronized) counter that starts out at 0 for each memory cell, that's incremented when pinned and decremented when unpinned. The GC is then only allowed to move cells whose counter is 0.

You would do both. During a collection, you mark each block to see if it's referenced, and mark it again if it's got a fixed reference. Then you collect every section that's not referenced and optionally move the sections that aren't marked as pinned to a more advantageous layout.

If you are proposing a compile-time garbage collector, one that determines when to delete an object using static analysis, I will be quite impressed if you come up with an implementation.
| |||
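Editorial sketch of the run-time pinning counter described above, in current D. Everything here (`Cell`, `pin`, `unpin`, `movable`, `rebind`) is a hypothetical illustration and not the API of any real collector. The counter uses `core.atomic` as one possible reading of "synchronized".

```d
import core.atomic : atomicLoad, atomicOp;

/// Hypothetical per-cell header that a moving collector might keep.
struct Cell
{
    shared int pinCount; // starts at 0, meaning the cell may be moved
}

void pin(ref Cell c)     { atomicOp!"+="(c.pinCount, 1); }
void unpin(ref Cell c)   { atomicOp!"-="(c.pinCount, 1); }

/// The collector may relocate only cells whose counter is 0.
bool movable(ref Cell c) { return atomicLoad(c.pinCount) == 0; }

/// Rebinding a fixed reference pins the new target and unpins the old one.
/// Pinning first keeps the count above 0 when both are the same cell.
void rebind(ref Cell* fixedRef, Cell* target)
{
    if (target !is null) pin(*target);
    if (fixedRef !is null) unpin(*fixedRef);
    fixedRef = target;
}

unittest
{
    Cell a, b;
    Cell* r = null;

    rebind(r, &a);
    assert(!movable(a) && movable(b));

    rebind(r, &b);
    assert(movable(a) && !movable(b));

    rebind(r, null);
    assert(movable(a) && movable(b));
}
```

A compile-time `fixed` flag stored in type metadata would avoid the counter updates, at the cost of needing that metadata to be populated.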
December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | Craig Black wrote:
>
>> It requires you to store a struct by reference. Thus, performance hit.
>
> No it doesn't. Structs will be able to be allocated on the stack, without any referencing. As an OPTION, you will be able to store a struct by reference. C++ does this very same thing and it is very efficient.

Slicing problem:

struct A { int i, j; }
struct B : A { long k; }

A foo (A a) {
    static assert (a.sizeof == 8);
    assert (a.sizeof == 8);
    return a;
}

B b;
b.k = 14;
assert (b.sizeof == 16);
b = foo(b);
assert (b.k == 14); // FAIL

Polymorphic structs *have* to be reference types, unless you determine stack layout at runtime. And not only that, you have to modify stack layout after you've created a stack frame. The only saving grace is that you won't have to do that for a stack frame higher than the current one.

>>> Yes, that will work, but requires a run-time check (and a branch). The run-time overhead for what you propose might end up being trivial, but I think it could be done at compile-time.
>>
>> I'm not so sure. You'd have to make it undefined behavior to assign a non-fixed address to a fixed pointer. The reverse is fine, of course.
>>
>> Since class references are pointers, you'd have to have the fixed storage class apply to them as well. Any reference type, really.
>
> Yes and all class fields would be fixed as well, unless the class object was instantiated using scope. This means that when you take the address of them, it results in a fixed pointer.

You're saying:

class Foo {
    int i;
}

Foo f = new Foo();
int* i_ptr = &f.i;

That would be a compile error? f is not fixed; I don't care if the bits in i_ptr change, or the bits in the reference f. Why should I?

Just because I took the address of f.i and stored it in an unfixed pointer, the garbage collector, which has full authority to change the pointer I just got, can't move *f?

Why?
| |||
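Editorial sketch: D structs cannot inherit, so the `struct B : A` snippet above is hypothetical syntax. The same slicing effect can be reproduced in today's D with `alias this`, which is a swapped-in technique and not what the posters proposed. Passing a `B` where an `A` is expected copies only the `A` part, so `k` never reaches `foo`.

```d
struct A { int i, j; }

struct B
{
    A base;
    alias base this; // lets a B convert implicitly to its A part
    long k;
}

A foo(A a)
{
    static assert(A.sizeof == 8);
    return a;
}

unittest
{
    static assert(B.sizeof == 16);
    static assert(is(B : A));  // B converts to A through alias this
    static assert(!is(A : B)); // there is no way back from A to B

    B b;
    b.i = 1;
    b.k = 14;

    A r = foo(b); // only b.base is copied; k is sliced off
    assert(r.i == 1);
    assert(b.k == 14); // the original object is untouched
}
```

The by-value parameter has a fixed size of `A.sizeof`, which is the reason given above for why a polymorphic value type cannot live in a stack slot sized for its base.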
December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | Craig Black wrote:
>
> "Christopher Wright" <dhasenan@gmail.com> wrote in message news:fklr1g$1uat$1@digitalmars.com...
>> This is to fix the stuff I botched with my other reply.
>>
>> Craig Black wrote:
>>> a) Disallow overloading new and delete for classes, and make classes strictly for GC, perhaps with an exception for classes instantiated on the stack using scope.
>>
>> You are just making sure that the garbage collector is handling all memory that is associated with objects. I don't see a point to this. The collector won't try to move memory that it doesn't control.
>
> It has nothing to do with the garbage collector run-time stuff. It is giving the compiler more information so that compile-time checks can be done.

The point of overloading new and delete is to work around the garbage collector. It's not smart enough, it doesn't have the knowledge about my specific problem, so I'm going to fix the problem myself. The most common situation is, I want to manually allocate the memory for the variable, and I don't want the garbage collector to know about this object.

>> You could do bad things with overloading new/delete, but those are hardly unique situations.
>>
>
> Granted. There are so many ways to mess things up with pointers. It's hard to make a systems language "safe". I guess my approach would be to make it "safer".

I don't see that. I mean, if D didn't have arrays, you couldn't ever get an array bounds error; if it didn't have pointers, you would have trouble segfaulting; but those are too useful.

I've manually created objects without using the new operator or the constructor. It's ugly. It's error-prone. Overloading new is safer, when you just want to control how the memory is allocated. (I couldn't avoid it because I didn't want to use a constructor.)

>> > Then the compiler could disallow taking the address of a class field,
>> > since we know the resulting pointer would point to the GC heap.
>> > Note that this would be a compile-time check, and so would not degrade
>> > run-time performance.
>>
>> That's not necessary, since you can map a source range to a destination range. It would be a simplifying assumption that improves performance, by changing two comparisons and an addition for each pointer (plus one subtraction per move) to one comparison and one assignment for each pointer. But you're going through a large amount of memory, so that's not a serious concern, I think.
>>
>>> a) Preceding a pointer declaration with fixed would allow that pointer to take the address in the GC heap.
>>
>> It'd be undefined behavior to do otherwise. But safe as long as no collections happen before you use the pointer.
>
> Unless I am missing something, this would require a run-time check for each pointer assignment or pointer arithmetic operation.

Undefined behavior means there are no checks preventing it, but bad things can happen if you do it, so be careful, and it isn't Walter's fault if it explodes in your face.

The point is, it might be a useful thing, in which case you wouldn't want to disallow it. But either way, checking it is too expensive, so calling it undefined behavior should suffice.
| |||
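Editorial sketch of "map a source range to a destination range" as quoted above: when a block moves, any pointer into the old range, including an interior pointer such as `&f.i`, is rewritten with two comparisons and one addition, and the offset is computed once per move. `Move`, `makeMove` and `relocate` are hypothetical names, and a real collector would also need to know where every pointer lives.

```d
/// Hypothetical record of one block move made by a compacting collector.
struct Move
{
    size_t src;      // old start address of the block
    size_t len;      // block length in bytes
    ptrdiff_t delta; // destination minus source: the one subtraction per move
}

Move makeMove(const(void)* src, const(void)* dst, size_t len)
{
    auto s = cast(size_t) src;
    auto d = cast(size_t) dst;
    return Move(s, len, cast(ptrdiff_t)(d - s));
}

/// Two comparisons and one addition per pointer.
T* relocate(T)(T* p, Move m)
{
    auto addr = cast(size_t) p;
    if (addr >= m.src && addr < m.src + m.len)
        return cast(T*)(addr + m.delta);
    return p; // pointer was not into the moved block
}

unittest
{
    static struct Foo { int pad; int i; }

    Foo[2] slots;             // stand-in for two heap locations
    slots[0] = Foo(0, 42);
    int* iPtr = &slots[0].i;  // interior pointer, like &f.i

    slots[1] = slots[0];      // "move" the object
    auto m = makeMove(&slots[0], &slots[1], Foo.sizeof);
    iPtr = relocate(iPtr, m);

    assert(iPtr is &slots[1].i);
    assert(*iPtr == 42);

    int other;
    assert(relocate(&other, m) is &other); // unrelated pointers are left alone
}
```

Restricting raw pointers so they never address the GC heap would let the collector skip the range test, which is the simplification being weighed in the quoted paragraph.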
December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christopher Wright | > Polymorphic structs *have* to be reference types, unless you determine stack layout at runtime. And not only that, you have to modify stack layout after you've created a stack frame. The only saving grace is that you won't have to do that for a stack frame higher than the current one.

Right, but that's not a problem if you disallow polymorphism for stack objects. This is what C++ does and it works very well. Rather than generating a run-time assertion, your code would simply not compile. If you want polymorphism, then you would have to instantiate the struct on the heap.

struct A { int i, j; }
struct B : A { long k; }

A foo (A a) { return a; }
B b;
b = foo(b); // compile error: instance of struct B can't be implicitly converted to an instance of struct A

Anyway, this is all moot, because I've thought of an easier solution. Pointers can be checked at run-time to determine if they address the GC heap. This check could be removed when compiling in release mode, so there will be no performance degradation. So there's no need to disallow new and delete for classes and we don't need struct polymorphism.

>You're saying:
>class Foo {
>    int i;
>}
>
>Foo f = new Foo();
>int* i_ptr = &f.i;
>
>That would be a compile error? f is not fixed; I don't care if the bits in i_ptr change, or the bits in the reference f. Why should I?
>
>Just because I took the address of f.i and stored it in an unfixed pointer, the garbage collector, which has full authority to change the pointer I just got, can't move *f?
>
>Why?

I'm not really sure what you are asking. If the GC relocates f, then i_ptr no longer points to the appropriate location. Isn't that obvious?

Are you suggesting that the GC relocate i_ptr as well? No GC I know of relocates raw pointers, so there's probably a good technical reason why they don't. I'm not a GC expert though.

-Craig
| |||
December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christopher Wright | >> Granted. There are so many ways to mess things up with pointers. It's hard to make a systems language "safe". I guess my approach would be to make it "safer".
>
> I don't see that. I mean, if D didn't have arrays, you couldn't ever get an array bounds error; if it didn't have pointers, you would have trouble segfaulting; but those are too useful.
>
> I've manually created objects without using the new operator or the constructor. It's ugly. It's error-prone. Overloading new is safer, when you just want to control how the memory is allocated. (I couldn't avoid it because I didn't want to use a constructor.)

I was not proposing that anyone rely on "manually created objects without using the new operator or the constructor". I was proposing that the capability to allocate an object on the malloc heap would be moved to structs, so that structs behaved like C++ aggregate types. However, I realize now that this is no longer necessary, because a run-time check that enforces pointer restrictions could be removed in release mode.

> Undefined behavior means there are no checks preventing it, but bad things can happen if you do it, so be careful, and it isn't Walter's fault if it explodes in your face.
>
> The point is, it might be a useful thing, in which case you wouldn't want to disallow it. But either way, checking it is too expensive, so calling it undefined behavior should suffice.

Again the run-time check could be removed in release mode, so no harm done.
| |||
December 23, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | Craig Black wrote:
>> Polymorphic structs *have* to be reference types, unless you determine stack layout at runtime. And not only that, you have to modify stack layout after you've created a stack frame. The only saving grace is that you won't have to do that for a stack frame higher than the current one.
>
> Right, but that's not a problem if you disallow polymorphism for stack objects. This is what C++ does and it works very well. Rather than generating a run-time assertion, your code would simply not compile. If you want polymorphism, then you would have to instantiate the struct on the heap.

Ideally you'd determine whether your polymorphic struct has inheritors or base classes and, if so, put it on the heap, else put it on the stack.

This is why it'd be better to keep structs as they are, but have by-value classes.

> struct A { int i, j; }
> struct B : A { long k; }
>
> A foo (A a) { return a; }
> B b;
> b = foo(b); // compile error: instance of struct B can't be implicitly converted to an instance of struct A
>
> Anyway, this is all moot, because I've thought of an easier solution. Pointers can be checked at run-time to determine if they address the GC heap. This check could be removed when compiling in release mode, so there will be no performance degradation.

That's the current system, and it's basically what I've been saying all this time. I guess I was unclear. But you can't remove the check in release mode:

static import std.c.stdlib;
static import std.gc;

void main () {
    // gc memory
    auto o = new Object();
    o = null;

    // not gc memory
    void* ptr = std.c.stdlib.malloc(128);

    // gc memory (and memory leak)
    ptr = null;
    ptr = std.gc.malloc(512);
    // At this point, no reference to o, so it's deleted.
    // The first malloc'd memory still exists and can't ever be
    // collected.

    ptr = std.gc.malloc(8);
    // Now the gc collected the previous 512-byte buffer; of course,
    // the 128-byte buffer still exists.
}

> So there's no need to disallow new and delete for classes and we don't need struct polymorphism.

Well, we don't need any kind of polymorphism, but quite separate from the rest of the requests, struct polymorphism would be useful. Though I wouldn't refer to them as structs if they're polymorphic, since you really can't put them on the stack, and so they have to be reference types with value semantics.

>> You're saying:
>> class Foo {
>>     int i;
>> }
>>
>> Foo f = new Foo();
>> int* i_ptr = &f.i;
>>
>> That would be a compile error? f is not fixed; I don't care if the bits in i_ptr change, or the bits in the reference f. Why should I?
>>
>> Just because I took the address of f.i and stored it in an unfixed pointer, the garbage collector, which has full authority to change the pointer I just got, can't move *f?
>>
>> Why?
>
> I'm not really sure what you are asking. If the GC relocates f, then i_ptr no longer points to the appropriate location. Isn't that obvious?
>
> Are you suggesting that the GC relocate i_ptr as well? No GC I know of relocates raw pointers, so there's probably a good technical reason why they don't. I'm not a GC expert though.

Because no language besides D allows you to take the address of a class member and has native garbage collection, and D doesn't have a moving collector. There's the Boehm collector for C++, but that's not a moving collector. C# has unsafe blocks in which you can use pointers, but in those blocks you don't have a garbage collector running. And...that's it. Maybe we'll see something revolutionary with Objective-C 2.0, but probably not, and it's not here yet.

You're already moving raw pointers, so you may as well move all of them. Otherwise you're eliminating a decently general use case or causing random segfaults or losing a lot of efficiency in memory layout (which will make future collections quicker, too) for pretty much nothing.
| |||
December 24, 2007 Re: My Language Feature Requests | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christopher Wright | "Christopher Wright" <dhasenan@gmail.com> wrote in message news:fkmoj4$k0u$1@digitalmars.com...
> Craig Black wrote:
>>> Polymorphic structs *have* to be reference types, unless you determine stack layout at runtime. And not only that, you have to modify stack layout after you've created a stack frame. The only saving grace is that you won't have to do that for a stack frame higher than the current one.
>>
>> Right, but that's not a problem if you disallow polymorphism for stack objects. This is what C++ does and it works very well. Rather than generating a run-time assertion, your code would simply not compile. If you want polymorphism, then you would have to instantiate the struct on the heap.
>
> Ideally you'd determine whether your polymorphic struct has inheritors or base classes and, if so, put it on the heap, else put it on the stack.

That's the approach for both C++ structs and classes. But this decision is made by the programmer, not the compiler. I don't see how the compiler could do this, since it is impossible to have knowledge of subclasses that exist in an external library. The compiler would have to revert to a worst case, and put everything on the heap.

> This is why it'd be better to keep structs as they are, but have by-value classes.

By-value classes is just another way to do the same thing, but inferior IMO.

>> Anyway, this is all moot, because I've thought of an easier solution. Pointers can be checked at run-time to determine if they address the GC heap. This check could be removed when compiling in release mode, so there will be no performance degradation.
>
> That's the current system, and it's basically what I've been saying all this time. I guess I was unclear. But you can't remove the check in release mode:

Jeez. It's like we've been speaking two different languages the whole time. I'm not talking about turning off the garbage collector in release mode. I'm talking about run-time checks that prohibit raw pointers from pointing to the GC heap. The same thing you said should be "undefined behavior". Thus, it would be undefined behavior in release mode, but in debug mode there would be a check. Like array bounds-checking.

>> So there's no need to disallow new and delete for classes and we don't need struct polymorphism.
>
> Well, we don't need any kind of polymorphism, but quite separate from the rest of the requests, struct polymorphism would be useful. Though I wouldn't refer to them as structs if they're polymorphic, since you really can't put them on the stack, and so they have to be reference types with value semantics.

Sorry, but I don't see the novelty of "reference types with value semantics". What would it be useful for? The reason I am pushing improvements to structs is that I know it will allow for more versatile aggregate types that aren't allocated on the heap. It's important that they are not allocated on the heap because that is more efficient. From my perspective, your proposal does nothing for performance, since there is still a heap allocation.

>>> You're saying:
>>> class Foo {
>>>     int i;
>>> }
>>>
>>> Foo f = new Foo();
>>> int* i_ptr = &f.i;
>>>
>>> That would be a compile error? f is not fixed; I don't care if the bits in i_ptr change, or the bits in the reference f. Why should I?
>>>
>>> Just because I took the address of f.i and stored it in an unfixed pointer, the garbage collector, which has full authority to change the pointer I just got, can't move *f?
>>>
>>> Why?
>>
>> I'm not really sure what you are asking. If the GC relocates f, then i_ptr no longer points to the appropriate location. Isn't that obvious?
>>
>> Are you suggesting that the GC relocate i_ptr as well? No GC I know of relocates raw pointers, so there's probably a good technical reason why they don't. I'm not a GC expert though.
>
> Because no language besides D allows you to take the address of a class member and has native garbage collection, and D doesn't have a moving collector. There's the Boehm collector for C++, but that's not a moving collector. C# has unsafe blocks in which you can use pointers, but in those blocks you don't have a garbage collector running. And...that's it. Maybe we'll see something revolutionary with Objective-C 2.0, but probably not, and it's not here yet.
>
> You're already moving raw pointers, so you may as well move all of them. Otherwise you're eliminating a decently general use case or causing random segfaults or losing a lot of efficiency in memory layout (which will make future collections quicker, too) for pretty much nothing.

Heck, you may be right about this. Like I said I'm no GC expert.
| |||
Copyright © 1999-2021 by the D Language Foundation