Thread overview
Garbage Collector - One Last Question
Jun 09, 2004
Arcane Jill
Jun 09, 2004
Walter
Jun 09, 2004
Arcane Jill
Jun 10, 2004
Walter
Jun 10, 2004
Arcane Jill
Jun 10, 2004
Mike Swieton
Jun 11, 2004
Arcane Jill
Jun 10, 2004
Martin M. Pedersen
Jun 11, 2004
Stewart Gordon
Jun 09, 2004
Matthew
June 09, 2004
The garbage collector is free to move objects about in memory for defragmentation purposes. May I request that when this happens, either

(a) The original location be securely wiped so that no trace of the original
data remains. (A simple memset() will do this, providing it's not optimized
away), OR

(b) A callback mechanism exist, so that the class which owns the data be notified, so that it may perform the move itself.

Of course, you'll probably want to NOT do this for most data. Perhaps a special attribute (might I suggest the keyword "sensitive") could enable this behavior for only that data for which it matters.
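
For what it's worth, the "providing it's not optimized away" caveat in (a) can be sketched in D roughly as follows. This is only an illustration, not something the collector does: `secureWipe` is a made-up name, and the sketch assumes a druntime recent enough to ship `core.volatile`, whose `volatileStore` is documented as a store the compiler will not remove.

```d
import core.volatile : volatileStore;

// Hypothetical helper: overwrite a buffer with zeros, one byte at a time,
// using stores that the optimizer is not supposed to drop as dead writes.
void secureWipe(ubyte[] buf)
{
    foreach (i; 0 .. buf.length)
        volatileStore(&buf[i], cast(ubyte) 0);
}
```

A plain memset() right before the memory is released is exactly the kind of write an optimizer may treat as dead, which is why the sketch avoids it.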

Arcane Jill


June 09, 2004
May I suggest instead that secure data be allocated (using overloaded new and delete) in a separate memory pool. User instantiated class objects would just contain references to this secure pool, and not contain any sensitive data directly. Having a secure pool has several advantages:

1) being a specific area of memory, it can be 'locked' without needing to
lock the whole gc heap
2) on program exit or failure, it can be securely wiped
3) you can completely control the semantics of it
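
A minimal sketch of such a pool in D, under stated assumptions: `SecurePool`, `create`, `allocate` and `wipeAndRelease` are invented names, the pool is a simple bump allocator over one malloc() block, and `core.volatile` is assumed to be available for the wipe. Locking is left out; since the pool is a single block, a platform call such as POSIX mlock() could presumably be applied to it without touching the gc heap.

```d
import core.stdc.stdlib : free, malloc;
import core.volatile : volatileStore;

// Hypothetical pool: one block outside the GC heap, handed out in slices.
struct SecurePool
{
    private ubyte* base;
    private size_t capacity;
    private size_t used;

    // Reserve one contiguous block that the GC never scans or moves.
    static SecurePool create(size_t bytes)
    {
        auto p = cast(ubyte*) malloc(bytes);
        if (p is null)
            throw new Exception("SecurePool: malloc failed");
        return SecurePool(p, bytes, 0);
    }

    // Hand out the next n bytes, or null when the pool is exhausted.
    ubyte[] allocate(size_t n)
    {
        if (n > capacity - used)
            return null;
        auto slice = base[used .. used + n];
        used += n;
        return slice;
    }

    // Zero the whole pool, then give the block back to the C runtime.
    void wipeAndRelease()
    {
        if (base is null)
            return;
        foreach (i; 0 .. capacity)
            volatileStore(base + i, cast(ubyte) 0);
        free(base);
        base = null;
        capacity = used = 0;
    }
}
```

Calling `wipeAndRelease` from an exit or failure handler would cover point 2. Copies of the struct share the same block, so a real version would want to forbid copying.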

"Arcane Jill" <Arcane_member@pathlink.com> wrote in message news:ca7nv5$dhr$1@digitaldaemon.com...
> The garbage collector is free to move objects about in memory for defragmentation purposes. May I request that when this happens, either
>
> (a) The original location be securely wiped so that no trace of the original
> data remains. (A simple memset() will do this, providing it's not optimized
> away), OR
>
> (b) A callback mechanism exist, so that the class which owns the data be notified, so that it may perform the move itself.
>
> Of course, you'll probably want to NOT do this for most data. Perhaps a special
> attribute (might I suggest the keyword "sensitive") could enable this behavior
> for only that data for which it matters.
>
> Arcane Jill
>
>


June 09, 2004
In article <ca7pj8$g0b$1@digitaldaemon.com>, Walter says...
>
>May I suggest instead that secure data be allocated (using overloaded new and delete) in a separate memory pool. User instantiated class objects would just contain references to this secure pool, and not contain any sensitive data directly.

Good idea ... except for one small problem. As we've just recently cleared up, if you put data in an area not managed by the GC, you will never be notified when it becomes no longer reachable, so there is no way to know when to delete it.

So you would be back to requiring an explicit delete(), which, in the circumstances I have in mind, is not really an option. Basically I'd then have to write a secondary GC in order to discover whether or not it was okay to delete it.

There ARE ways around this problem though. I can think of at least two off the top of my head. However, these ideas of mine would not be of practical use unless it were possible to overload operator new in general (instead of on a per-class basis). That is, I can't (currently) say:

>    RegExp re = new(MyAllocator) RegExp(pattern, attributes);

because (currently) I can only add operator new overloads to my own classes, not to existing classes in Phobos. In C++ you can make a custom new that works for EVERYTHING. Of course, I might have got this wrong, in which case please tell me.

It's not just classes either. I'd want my operator new to be able to allocate arrays too. That is, I'd want to be able to replace:

>   char[] r;
>   r.length = 100;

with

>   char[] password;
>   password.length = new(MyAllocator) 100; // or some similar syntax

>Having a secure pool has several advantages:

Yes it does, but unless we can overload new GENERALLY, instead of only for specific classes, that advantage is lost. Any chance we can have a GLOBAL overload for new?

I don't think that my idea (in previous post) would slow down the gc though. After all, if data were NOT marked as sensitive, the gc would behave exactly as it does now. (Er, I mean, exactly as it would do if it did relocation). Only data marked as sensitive would have to be wiped, and, in practice, that is likely to be rare.

Jill



June 09, 2004
I do like the idea of an object/class getting a callback when an instance is about to be moved.

How about it being a classinfo attribute, accessible only to D implementation code and to that class? If the classinfo callback is null, then the GC proceeds as normal. If not, then it gives the callback a tinkle.

Whether or not you'd want the callback to be able to cancel the move, or to be split into pre-move and post-move, etc. is up for debate.

"Walter" <newshound@digitalmars.com> wrote in message news:ca7pj8$g0b$1@digitaldaemon.com...
> May I suggest instead that secure data be allocated (using overloaded new and delete) in a separate memory pool. User instantiated class objects would just contain references to this secure pool, and not contain any sensitive data directly. Having a secure pool has several advantages:
>
> 1) being a specific area of memory, it can be 'locked' without needing to
> lock the whole gc heap
> 2) on program exit or failure, it can be securely wiped
> 3) you can completely control the semantics of it
>
> "Arcane Jill" <Arcane_member@pathlink.com> wrote in message news:ca7nv5$dhr$1@digitaldaemon.com...
> > The garbage collector is free to move objects about in memory for defragmentation purposes. May I request that when this happens, either
> >
> > (a) The original location be securely wiped so that no trace of the original
> > data remains. (A simple memset() will do this, providing it's not optimized
> > away), OR
> >
> > (b) A callback mechanism exist, so that the class which owns the data be notified, so that it may perform the move itself.
> >
> > Of course, you'll probably want to NOT do this for most data. Perhaps a special
> > attribute (might I suggest the keyword "sensitive") could enable this behavior
> > for only that data for which it matters.
> >
> > Arcane Jill
> >
> >
>
>


June 10, 2004
I just can't figure out why you'd need a global operator new. I've used such in C++ now and then, and always wound up backing it out because it screws things up (for example, it prevents linking in an existing library that relies on the default new's semantics).

What the extra layer enables you to do is control the reference counting yourself, so that the user of the class doesn't need to, and you needn't worry about user class references being copied about willy-nilly. Increment the reference count on construction of the user object, and decrement on destruction. When it reaches 0, secure delete the hidden security data. When you reach a 'sync' point, invoke a 'reaper' that secure deletes any remaining secure data items.

If you're doing a web server thing, you can do a secure delete on any items that haven't been collected after a fixed time has elapsed (like 15 minutes).

The secure delete 'reaper' also needn't recycle the memory; any user objects still live should check to see if the data is still 'live' before accessing it, and fail reasonably if it has been reaped.

I actually kind of like the idea of a reaper thread that goes through and expires any secure data that is more than X minutes old. It's a nice backup against bugs where data is inadvertently held on to (you can't guarantee this doesn't happen in C++ either: just store a reference into static data. Voila, it lives forever, copy constructor or not).
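
To make the reference count plus reaper idea a bit more concrete, here is a rough D sketch. Everything in it is an assumption for illustration: `SecretRecord`, `retain`, `release`, `read` and `reap` are invented names, the caller passes the current time in explicitly instead of running a real thread, and the wipe is a plain loop where production code would want stores the optimizer cannot drop.

```d
import core.time : Duration, MonoTime, minutes;

// Hypothetical hidden record that owns the sensitive bytes.
final class SecretRecord
{
    private ubyte[] data;
    private MonoTime created;
    private int refs;
    private bool reaped;

    this(ubyte[] bytes, MonoTime now)
    {
        data = bytes;
        created = now;
    }

    bool live() const { return !reaped; }

    // Called by the user-facing wrapper on construction and destruction.
    void retain() { ++refs; }

    void release()
    {
        if (--refs == 0)
            wipe();
    }

    // Users must cope with the data having been reaped underneath them.
    const(ubyte)[] read() const
    {
        if (reaped)
            throw new Exception("secret has been reaped");
        return data;
    }

    void wipe()
    {
        foreach (ref b; data)
            b = 0;
        reaped = true;   // the memory is deliberately not recycled here
    }
}

// Wipe every live record older than maxAge; returns how many were reaped.
size_t reap(SecretRecord[] records, MonoTime now, Duration maxAge)
{
    size_t count;
    foreach (r; records)
    {
        if (r.live && now - r.created > maxAge)
        {
            r.wipe();
            ++count;
        }
    }
    return count;
}
```

A periodic call along the lines of `reap(records, MonoTime.currTime, 15.minutes)` would give the 15 minute expiry mentioned above, and `read` failing after a reap is the "fail reasonably" part.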


June 10, 2004
"Arcane Jill" <Arcane_member@pathlink.com> wrote in message news:ca7ra3$iji$1@digitaldaemon.com...
> Good idea ... except for one small problem. As we've just recently cleared up,
> if you put data in an area not managed by the GC, you will never be notified
> when it becomes no longer reachable, so there is no way to know when to delete
> it.

How about having an object on the normal heap, which has a destructor that wipes out the sensitive data? Keep a reference to the heap-object and the sensitive data together, or reference the sensitive data through the heap-object. Then the destructor and the GC will take care of it for you.

Regards,
Martin


June 10, 2004
Many, many thanks to everyone who put their ideas into this. I am happy to report that I've completely solved the problem now, and I don't need to ask Walter to implement _anything_ new (no pun intended).

All this has made me realize that, while I could think of one or two tweaks to the language which might make it easier for systems programmers, it would be FAR more appropriate for me to put a concrete proposal together, complete with rationale for everything, and why it would help everyone (instead of just me). So I'll do that in my own time, and I'll take my time over it so I don't waste everyone's time with dumb ideas that I change my mind about a few hours later because I've thought of something else.

In the meantime - all of my problems are solved, and I'm happy. (Now all I've got to do is go and write the code). Thanks to everyone who joined in this discussion - most especially Walter, whose patience is truly amazing.

Jill


I'll just reply to this...


In article <ca89ts$18th$1@digitaldaemon.com>, Walter says...
>
>I just can't figure out why you'd need a global operator new. I've used such in C++ now and then, and always wound up backing it out because it screws things up (for example, it prevents linking in an existing library that relies on the default new's semantics).

I'll spend some time putting a sensible proposal together. I'm sure it will all make sense if I lay down all the arguments reasonably, like Norbert did with multidimensional arrays. In the meantime, you can just forget it, as I don't need it for now.



>What the extra layer enables you to do is control the reference counting yourself, so that the user of the class doesn't need to, and you needn't worry about user class references being copied about willy-nilly. Increment the reference count on construction of the user object, and decrement on destruction. When it reaches 0, secure delete the hidden security data. When you reach a 'sync' point, invoke a 'reaper' that secure deletes any remaining secure data items.

Yes, you are correct.


>If you're doing a web server thing, you can do a secure delete on any items that haven't been collected after a fixed time has elapsed (like 15 minutes).

Now THAT is a possibility I hadn't thought of. Cheers!


>The secure delete 'reaper' also needn't recycle the memory; any user objects still live should check to see if the data is still 'live' before accessing it, and fail reasonably if it has been reaped.

Nice!


>I actually kind of like the idea of a reaper thread that goes through and expires any secure data that is more than X minutes old. It's a nice backup against bugs where data is inadvertently held on to (you can't guarantee this doesn't happen in C++ either: just store a reference into static data. Voila, it lives forever, copy constructor or not).

Walter, you're a genius. (Although I solved my problem another way and don't need this). I will bear that in mind for the future.


June 10, 2004
On Thu, 10 Jun 2004 06:25:23 +0000, Arcane Jill wrote:

> 
> Many, many thanks to everyone who put their ideas into this. I am happy to report that I've completely solved the problem now, and I don't need to ask Walter to implement _anything_ new (no pun intended).

If it's not something terribly domain-specific, why don't you share your solution?

Mike Swieton
__
In case you haven't realized it, building computer systems is hard.
	- Martin Fowler

June 11, 2004
Arcane Jill wrote:

<snip>
> Good idea ... except for one small problem. As we've just recently cleared up, if you put data in an area not managed by the GC, you will never be notified
> when it becomes no longer reachable, so there is no way to know when to delete
> it.
<snip>

My impression was that you were going to have an object wrapper, which would remain in the heap, around the secure data.

After all, IINM your Int class is already an object wrapper around a dynamic array.  Then all you need to do is have Int malloc/free the actual data content.  This would also mean that the explicit memory management remains internal to your class.

To make sure it's wiped on exit, all you then need is to make sure gc_term gets called.

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the unfortunate victim of intensive mail-bombing at the moment.  Please keep replies on the 'group where everyone may benefit.

June 11, 2004
In article <pan.2004.06.10.23.54.20.311894@swieton.net>, Mike Swieton says...
>
>On Thu, 10 Jun 2004 06:25:23 +0000, Arcane Jill wrote:
>
>> 
>> Many, many thanks to everyone who put their ideas into this. I am happy to report that I've completely solved the problem now, and I don't need to ask Walter to implement _anything_ new (no pun intended).
>
>If it's not something terribly domain-specific, why don't you share your solution?

Sure. Here's the simplified "proof of concept" explanation. What you do is you declare a class like this:

>   class A
>   {
>       ubyte* p;
>   }

In the constructor, you call malloc() to get some memory, store its address in p, throwing an exception if malloc() returns null. In the destructor, you call free(p).

Now, because this is a class WITHOUT a custom new(), it will be managed by the garbage collector. This means that its destructor will, eventually, be called. Since that destructor frees the malloc'ed memory, you have no memory leak. And all of the memory that you got through malloc() will never be touched by the garbage collector (except indirectly when it calls the destructor).

A more serious class would allow for resizing, wiping, and so on, and of course if you want, you can replace malloc() and free() with your own custom allocation/deallocation pair.
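
Filled in slightly, the proof of concept might look like the sketch below. The names `SecureBytes`, `bytes` and `wipe` are invented for the example, the wipe before free() is an addition in the spirit of the thread, and `core.volatile` is assumed to be available.

```d
import core.stdc.stdlib : free, malloc;
import core.volatile : volatileStore;

// GC-managed wrapper around storage that the GC itself never touches.
class SecureBytes
{
    private ubyte* p;
    private size_t len;

    this(size_t n)
    {
        p = cast(ubyte*) malloc(n);
        if (p is null)
            throw new Exception("SecureBytes: malloc failed");
        len = n;
    }

    // Run by the collector (or by an explicit destroy): wipe, then free.
    ~this()
    {
        wipe();
        free(p);
        p = null;
        len = 0;
    }

    // View of the malloc'ed storage; valid only while the object is alive.
    ubyte[] bytes() { return p[0 .. len]; }

    // Overwrite the contents with stores the optimizer should not remove.
    void wipe()
    {
        foreach (i; 0 .. len)
            volatileStore(p + i, cast(ubyte) 0);
    }
}
```

One caveat worth keeping in mind: as far as I know the D specification does not promise that the collector runs the destructor of every unreferenced object, so for data that really matters an explicit `destroy(obj)` at a known point is the safer habit, with the GC as a backstop.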

And that's pretty much it really.

Arcane Jill