What's the proper way to discard a doubly linked list? (page 2) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » D » What's the proper way to discard a doubly linked list? (page 2)

January 23, 2004

Re: What's the proper way to discard a doubly linked list?

Posted by Ilya Minkov
in reply to Stewart Gordon

Ilya Minkov

Posted in reply to Stewart Gordon

Stewart Gordon wrote:
> Makes sense, though I'm still a bit puzzled, as it also says:
> "#  Do not store integer values into pointers.
> # Do not store magic values into pointers, other than null."
> 
> What would be the consequence of this? 

Currently, i guess, none. If complex pointer checks get optimised out someday, you could get a segfault in course of collection or even just a diagnostic exception. Besides, it's just a very dangerous strategy in application programming.

> Speaking of which, are Windows API handles really pointers, as the type definitions imply?

In an implementation i know all HANDLEs are defined as ints in D. Doesn't make sense to call them pointers, since (1) they may accept constant values and (2) you cannot dereference them yourself, as even if they are pointers they only are valid within the Windows API implementation and cannot be dereferenced in application software. Besides, noone really knows whether they are pointers, or even just some kinds of unique IDs.

> "# If you must share the same storage location between pointers and non-pointer types, use a union to do it so the garbage collector knows about it."
> 
> How would the GC know about it, unless the union keeps a record of which member is 'active'?

It wouldn't know, it would just scan anyway. The conservative GC has to be safe and thus err on the side of assuming anything which is not known is a pointer. In particular, no information exists over stack and over freestanding structs. Only objects and parts thereof get type information. However, this could possibly be extended to free-floating structs someday...

-eye

January 24, 2004

Re: What's the proper way to discard a doubly linked list?

Posted by Georg Wrede
in reply to Ilya Minkov

Georg Wrede

Posted in reply to Ilya Minkov

In article <burvq5$1eq0$1@digitaldaemon.com>, Ilya Minkov says...
>
>Stewart Gordon wrote:
>> Makes sense, though I'm still a bit puzzled, as it also says:
>> "#  Do not store integer values into pointers.
>> # Do not store magic values into pointers, other than null."
>> 
>> What would be the consequence of this?

Actually, the documentation should be changed here! It should say "do not ever store anything in a pointer". You should only assign it values of other pointers (as in foo = bar), or the null value.

Since many D users are former C(++) users, it is customary to do all kinds of "wizardry" with pointers. For example, if one wants to make his own upCaseString(), then those people would probably use a pointer, and increment it as an int, to make it point at the next character. The spirit of D is to allow such (stupiditiy), but at yor own risk. (C users have a hard time remembering to leave pointers alone because in C you're the only one ever to use these values, and you "own" the entire program. In D there are always at least two users for these values: you and the GC!)

Now, suppose one forgets to use a copy of the original pointer. Later, when GC occurs, then the GC would find no pointer pointing at the string (or actually at the beginning of the area allocated for the string data). The GC would mark this area as unused. Which obviously results in the area being overwritten with other data, sooner or later.

>> "# If you must share the same storage location between pointers and non-pointer types, use a union to do it so the garbage collector knows about it."
>> 
>> How would the GC know about it, unless the union keeps a record of which member is 'active'?

It doesn't. The above quote means that if you _really need_ to
have an int in which you have the pointer value (say, you want to
examine its numeral value, do some language research, study memory
management, whatever -- or you're developing a data moving GC and
you want to study how it changes the location of things), then don't
just have it in an int. If you did, then the GC would not see it as
a pointer to data, and it would leave it alone.

So, to _both_ have the GC know that it is a pointer to data _and_
for you to be able to easily study the values in the pointer, then
make a union. The union is nothing that happens in reality,
it just tells the compiler that you want to have two different
names (and types) for a particular item: to you it's both an int
and a pointer.

January 24, 2004

Re: What's the proper way to discard a doubly linked list?

Posted by Ilya Minkov
in reply to Georg Wrede

Ilya Minkov

Posted in reply to Georg Wrede

Georg Wrede wrote:
> Actually, the documentation should be changed here! It should
> say "do not ever store anything in a pointer". You should only
> assign it values of other pointers (as in foo = bar), or the
> null value.

We have a number of allowed operations which result in new pointer adresses. Such as pointer increments/decrements or pointer adressing. Walking outside allocated memory regions automatically produces invalid values. Such cannot be used further.

> Since many D users are former C(++) users, it is customary to
> do all kinds of "wizardry" with pointers. For example, if one
> wants to make his own upCaseString(), then those people would
> probably use a pointer, and increment it as an int, to make it point at the next character. The spirit of D is to allow
> such (stupiditiy), but at yor own risk. (C users have a hard
> time remembering to leave pointers alone because in C you're
> the only one ever to use these values, and you "own" the
> entire program. In D there are always at least two users for
> these values: you and the GC!)

I'm sorry, i have to disagree. Though the practice may be dangerous a bit too often, it is legal. It is not "incrementing as an int", increment of an int increments it by 1. As opposed to that, increment of a pointer is a completely different thing and increments it by the size of the stored object. Quasi advances by one in an array. This practice by itself would be safe, if the array could complain if run over, and would give a standard way not to run past its end. Unfortunately, in C it wasn't until the C++ and the STL.

The very same idea of walking pointers was adopted by STL - and i find it good so - but with providing means to not run past the end due to end marker notion. And by generalizing the pointer to be not just memory pointer, but any kind of cursor at all.

> Now, suppose one forgets to use a copy of the original pointer.
> Later, when GC occurs, then the GC would find no pointer pointing
> at the string (or actually at the beginning of the area allocated for the string data). The GC would mark this area as unused. Which
> obviously results in the area being overwritten with other data,
> sooner or later.

It's no problem. GC doesn't need a pointer into the beginning of an allocated region. Pointers into the middle of the region should suffice. Even the pointer to one next element after the end of the array should suffice, since it's called the end marker in C++ iterator lingo.

And remember that the calling stack also has another pointer to the region, so it won't die. Only a region which is practically by no means acessible would.

-eye

January 25, 2004

Re: What's the proper way to discard a doubly linked list?

Posted by Georg Wrede
in reply to Ilya Minkov

Georg Wrede

Posted in reply to Ilya Minkov

In article <buu68m$1u0j$1@digitaldaemon.com>, Ilya Minkov says...
>
>Georg Wrede wrote:
>> Actually, the documentation should be changed here! It should say "do not ever store anything in a pointer". You should only assign it values of other pointers (as in foo = bar), or the null value.
>
>We have a number of allowed operations which result in new pointer adresses. Such as pointer increments/decrements or pointer adressing.

What I meant was to not mess with the pointer by bypassing the language's methods. Of course all operations on the pointer that are done with the language, as opposed to directly manipulating the bit pattern, are ok.

>Walking outside allocated memory regions automatically produces invalid values. Such cannot be used further.
>
>> Since many D users are former C(++) users, it is customary to do all kinds of "wizardry" with pointers. For example, if one
>
>I'm sorry, i have to disagree. Though the practice may be dangerous a bit too often, it is legal. It is not "incrementing as an int", increment of an int increments it by 1.

Incrementing a pointer (in C(++) and D) is ok, so we agree on that. The example described a situation where you'd increment it as an int as opposed to incrementing it the proper way. I should have come up with a better example situation.

>> Now, suppose one forgets to use a copy of the original pointer. Later, when GC occurs, then the GC would find no pointer pointing at the string (or actually at the beginning of the area allocated for the string data). The GC would mark this area as unused. Which obviously results in the area being overwritten with other data, sooner or later.
>
>It's no problem. GC doesn't need a pointer into the beginning of an allocated region. Pointers into the middle of the region should suffice.

Yes, they should. This depends on the particular GC, though, so I wouldn't do this in production code, since the customer may use some other GC.

>Even the pointer to one next element after the end of the array should suffice, since it's called the end marker in C++ iterator lingo.

I did my Master's thesis about the STL.

>And remember that the calling stack also has another pointer to the region, so it won't die.

Again, the example could have been of a more thought-out situation.

>Only a region which is practically by no means acessible would.

This was the case here, and I took it for granted everyone knew.

March 12, 2004

Re: What's the proper way to discard a doubly linked list?

Posted by Walter
in reply to Russ Lewis

Walter

Posted in reply to Russ Lewis

"Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:bubrab$2kqb$1@digitaldaemon.com...
> One of the downsides of this type of collector, however, is that it can
> get false positives; if one object in your doubly-linked list is at
> address 0xdeadbeef, and you happen to have an integer somewhere in your
> program that has the value 0xdeadbeef, then the garbage collector will
> assume that it is a pointer.  (The garbage collector doesn't know about
>   the types of the variables it is scanning.)  In that case, the
> doubly-linked list will not get garbage collected immediately; however,
> whenever your integer value changes, then the next gc pass will clean it
up.

This is definitely a potential problem. In practice, it doesn't happen much. But it *can* happen, and the best way to deal with it is to not have islands that are too large. What I tend to do with large data structure islands is that when I know I'm done with them, I'll 'break it up' by nulling internal references in it so that if a chunk does get 'pinned' by a random integer value, it won't in turn pin the whole data structure.

March 12, 2004

Re: What's the proper way to discard a doubly linked list?

Posted by Walter
in reply to Georg Wrede

Walter

Posted in reply to Georg Wrede

"Georg Wrede" <Georg_member@pathlink.com> wrote in message news:budo3m$2krn$1@digitaldaemon.com...
> DMD 0.76 docs say:
>
> "Do not store pointers into int variables using casts and other tricks.
The
> garbage collector does not scan non-pointer types for roots."
>
> This would contradict the 0xdeadbeef example problem. That is, according to the documentation such a problem does not exist!?

What that means is the gc can use type information to reduce the number and effect of false positives. (The current gc is a bit primitive in that it does not yet do this, but the door needs to be left open to that improvement.) This cannot be perfect, though, because of unions and the stack.

March 12, 2004

Re: What's the proper way to discard a doubly linked list?

Posted by Walter
in reply to Georg Wrede

Walter

Posted in reply to Georg Wrede

"Georg Wrede" <Georg_member@pathlink.com> wrote in message news:buu08u$1kqg$1@digitaldaemon.com...
> Since many D users are former C(++) users, it is customary to do all kinds of "wizardry" with pointers. For example, if one wants to make his own upCaseString(), then those people would probably use a pointer, and increment it as an int, to make it point at the next character. The spirit of D is to allow such (stupiditiy), but at yor own risk. (C users have a hard time remembering to leave pointers alone because in C you're the only one ever to use these values, and you "own" the entire program. In D there are always at least two users for these values: you and the GC!)
>
> Now, suppose one forgets to use a copy of the original pointer. Later, when GC occurs, then the GC would find no pointer pointing at the string (or actually at the beginning of the area allocated for the string data). The GC would mark this area as unused. Which obviously results in the area being overwritten with other data, sooner or later.

This is an important issue. D's gc does not require a pointer to the beginning of the object, a pointer to any byte in the object is considered as being a valid reference to the entire object. So, you don't need to keep around a copy of the original pointer, and using pointers to walk a string array will work correctly.

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation