Escaping the Tyranny of the GC: std.rcstring, first blood
September 15, 2014
Walter, Brad, myself, and a couple of others have had a couple of quite exciting ideas regarding code that is configurable to use the GC or alternate resource management strategies. One thing that became obvious to us is we need to have a reference counted string in the standard library. That would be usable with applications that want to benefit from comfortable string manipulation whilst using classic reference counting for memory management. I'll go into more detail later about the mechanisms that would allow the stdlib to provide functionality for both GC strings and RC strings; for now let's say that we hope and aim for swapping between these with ease. We hope that at one point people would be able to change one line of code, rebuild, and get either GC or RC automatically (for Phobos and their own code).

The road there is long, but it starts with the proverbial first step. As it were, I have a rough draft of an almost-drop-in replacement of string (aka immutable(char)[]). Destroy with maximum prejudice:

http://dpaste.dzfl.pl/817283c163f5

For now RCString supports only immutable char as element type. That means you can't modify individual characters in an RCString object but you can take slices, append to it, etc. - just as you can with string. A compact reference counting scheme is complemented with a small buffer optimization, so performance should be fairly decent.
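To make the scheme described above concrete, here is a minimal sketch of a reference-counted string in D. It is not the code behind the dpaste link: `TinyRCString` and all of its members are hypothetical names, the small buffer optimization and the allocation policy are left out, and the count is not thread-safe. It only shows the three operations mentioned above: copying shares the block, slicing shares the block, and appending builds a new one.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

/// Hypothetical, simplified stand-in for RCString (not the dpaste code).
struct TinyRCString
{
    private size_t* count;    // start of the malloc'ed block; null when empty
    private const(char)* ptr; // first character of this slice
    private size_t len;       // number of characters in this slice

    /// Copies `s` into a freshly allocated, reference-counted block.
    this(const(char)[] s) { this(s, null); }

    // Allocates one block laid out as [count][characters of a ~ b].
    private this(const(char)[] a, const(char)[] b)
    {
        immutable total = a.length + b.length;
        if (total == 0) return; // the empty string owns no memory
        count = cast(size_t*) malloc(size_t.sizeof + total);
        assert(count !is null, "out of memory");
        *count = 1;
        auto chars = cast(char*) (count + 1);
        if (a.length) memcpy(chars, a.ptr, a.length);
        if (b.length) memcpy(chars + a.length, b.ptr, b.length);
        ptr = chars;
        len = total;
    }

    this(this) { if (count) ++*count; } // a copy shares the block

    ~this() { if (count && --*count == 0) free(count); } // last owner frees

    size_t length() const { return len; }
    size_t refs() const { return count ? *count : 0; }

    /// Read-only view; valid only while this object is alive.
    const(char)[] view() const { return ptr[0 .. len]; }

    /// Slicing shares the block instead of copying characters.
    TinyRCString opSlice(size_t lo, size_t hi)
    {
        assert(lo <= hi && hi <= len, "slice out of bounds");
        TinyRCString r = this; // postblit bumps the count
        r.ptr = ptr + lo;
        r.len = hi - lo;
        return r;
    }

    /// Appending builds a new block; the operands are left untouched.
    TinyRCString opBinary(string op : "~")(const(char)[] rhs) const
    {
        return TinyRCString(view(), rhs);
    }
}

unittest
{
    auto a = TinyRCString("hello world");
    auto b = a[0 .. 5];   // shares a's block
    auto c = b ~ "!";     // owns a new block
    assert(b.view == "hello" && c.view == "hello!");
}
```

Storing the count in front of the characters keeps each string object at three words and needs a single allocation per block, which is one common way to get a compact layout; whether the draft does the same is not stated here.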

Somewhat surprisingly, pure constructors and inout took good care of qualified semantics (you can convert a mutable to an immutable string and back safely). I'm not sure whether semantics there are a bit too lax, but at least for RCString they turned out to work beautifully and without too much fuss.

The one wrinkle is that you need to wrap string literals "abc" with explicit constructor calls, e.g. RCString("abc"). This puts RCString on a lower footing than built-in strings and makes swapping configurations a tad more difficult.

Currently I've customized RCString with the allocation policy, which I hurriedly reduced to just one function with the semantics of realloc. That will probably change in a future pass; the point for now is that allocation is somewhat modularized away from the string workings.
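As an illustration of what "one function with the semantics of realloc" could look like as a policy, here is a small sketch. Everything in it is assumed for the example: `cHeapPolicy`, `countingPolicy`, and `Buffer` are hypothetical names, and the draft's actual parameter list may differ. The container never calls `malloc` or `free` itself; it only calls the policy passed as an alias parameter.

```d
import core.stdc.stdlib : free, realloc;

/// Policy with realloc semantics: resizes `p` to `n` bytes,
/// allocates when `p` is null, and frees when `n` is 0.
void* cHeapPolicy(void* p, size_t n)
{
    if (n == 0) { free(p); return null; } // realloc(p, 0) is not portable
    return realloc(p, n);
}

size_t policyCalls; // thread-local counter used by the policy below

/// Same memory source plus instrumentation; shows the policy is swappable.
void* countingPolicy(void* p, size_t n)
{
    ++policyCalls;
    return cHeapPolicy(p, n);
}

/// Hypothetical growable character buffer; all memory traffic
/// goes through the `reallocate` policy.
struct Buffer(alias reallocate)
{
    private char* data;
    private size_t len;

    @disable this(this); // single owner keeps the sketch short

    ~this() { data = cast(char*) reallocate(data, 0); }

    void append(const(char)[] s)
    {
        if (s.length == 0) return;
        auto p = cast(char*) reallocate(data, len + s.length);
        assert(p !is null, "out of memory");
        data = p;
        data[len .. len + s.length] = s[];
        len += s.length;
    }

    const(char)[] contents() const { return data[0 .. len]; }
}

unittest
{
    Buffer!cHeapPolicy plain;
    plain.append("abc");
    Buffer!countingPolicy counted; // same code, different policy
    counted.append("abc");
    assert(plain.contents == counted.contents);
}
```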

So, please fire away. I'd appreciate it if you used RCString in lieu of string and note the differences. The closer we get to parity in semantics, the better.


Thanks,

Andrei
September 15, 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
> The road there is long, but it starts with the proverbial first step.

An unrelated question, but how will reference counting work with classes?

I've recently switched my networking library's raw memory wrapper to reference counting (previously it just relied on the GC for cleanup), and it was going well until I hit a brick wall.

The thing with reference counting is it doesn't seem to make sense to do it half-way. Everything across the ownership chain must be reference counted, because otherwise the non-ref-counted link will hold on to its ref-counted child objects forever (until the next GC cycle).

In my case, the classes in my applications were holding on to my reference-counted structs, and wouldn't let go until they were eventually garbage-collected. I can't convert the classes to structs because I need their inheritance/polymorphism, and I don't see an obvious way to refcount classes (RefCounted explicitly does not support classes).

Am I overlooking something?
September 15, 2014
On 9/14/14, 7:52 PM, Vladimir Panteleev wrote:
> On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
>> The road there is long, but it starts with the proverbial first step.
>
> An unrelated question, but how will reference counting work with classes?
>
> I've recently switched my networking library's raw memory wrapper to
> reference counting (previously it just relied on the GC for cleanup),
> and it was going well until I hit a brick wall.
>
> The thing with reference counting is it doesn't seem to make sense to do
> it half-way. Everything across the ownership chain must be reference
> counted, because otherwise the non-ref-counted link will hold on to its
> ref-counted child objects forever (until the next GC cycle).
>
> In my case, the classes in my applications were holding on to my
> reference-counted structs, and wouldn't let go until they were
> eventually garbage-collected. I can't convert the classes to structs
> because I need their inheritance/polymorphism, and I don't see an
> obvious way to refcount classes (RefCounted explicitly does not support
> classes).
>
> Am I overlooking something?

At least for the time being, bona fide class objects with refcounted members will hold on to them until they're manually freed or a GC cycle comes about.
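The behaviour described here can be reproduced with a short sketch using `std.typecons.RefCounted`. `Handle`, `Connection`, and `liveHandles` are hypothetical names made up for the example; the class stands in for any GC-managed object that stores a reference-counted struct.

```d
import std.typecons : RefCounted;

int liveHandles; // thread-local count of open handles

/// Hypothetical resource wrapper; stands in for a raw memory or socket wrapper.
struct Handle
{
    int id;
    this(int id) { this.id = id; ++liveHandles; }
    ~this() { if (id != 0) --liveHandles; } // skip never-opened handles
}

/// A GC-managed class that keeps a reference-counted member alive.
class Connection
{
    RefCounted!Handle handle;
    this(RefCounted!Handle h) { handle = h; }
}

unittest
{
    Connection c;
    {
        auto h = RefCounted!Handle(7);
        c = new Connection(h); // the class now holds a second reference
    }
    // The local reference is gone, but the handle is still open
    // because the class object has not been finalized yet.
    assert(liveHandles == 1);

    destroy(c); // manual release; otherwise this waits for a GC cycle
    assert(liveHandles == 0);
}
```

Calling `destroy` on the class instance runs its field destructors right away, which is the "manually freed" path; without it, the count only drops when the collector finalizes the object.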

We're thinking of a number of schemes for reference counted objects, and we think a bottom-up approach to design would work well here: try a simple design and assess its limitations. In this case, it would be great if you tried to use RefCounted with your class objects and figure out what its limitations are.


Thanks,

Andrei

September 15, 2014
I don't want to be the smart ass that did nothing and complains about what others did, but I'll be it anyway.

It doesn't look very scalable to me to implement various versions of modules with various memory management schemes. Inevitably, these will have subtly different semantics and different sets of bugs, will be twice as much work to maintain, and so on.

Have you tried to explore a solution where an allocator is passed to functions (as far as I can tell, this wasn't very successful in C++, but D's greater metaprogramming capabilities may offer better solutions than C++'s)?

Another option is to use output ranges. This looks like an area that is way underused in D. It looks like it is possible for the allocation policy to be part of the output range, and so we can let users decide without duplicating a bunch of code.

Finally, concepts like isolated allow the compiler to insert free in the generated code in a safe manner. In the same way, it is possible to remove a bunch of GC allocation by sticking some passes in the middle of the optimizer (
September 15, 2014
> The one wrinkle is that you need to wrap string literals "abc" with explicit constructor calls, e.g. RCString("abc"). This puts RCString on a lower footing than built-in strings and makes swapping configurations a tad more difficult.

A few ideas:

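import std.traits : isSomeString;

// Returns a reference-counted wrapper for the given built-in string.
auto refCounted(T)(T value) if (isSomeString!T) {
	static if (is(T == string))
		return RCXString!(immutable char)(value);
	//....
	else
		static assert(0, "unsupported string type: " ~ T.stringof);
}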

static assert("abc".refCounted == "abc");

Wrapper type scenario. May look nicer.

Another, which would require a language change:

struct MyType {
 string value;
 alias value this;

 this(string value) {
  this.value = value;
 }
}

static assert("abc".MyType == "abc");

*shudder* does remind me a little too much of the Jade programming language with its casts like that.

One other thing that I don't think people are taking seriously enough is my idea of using the with statement to swap out e.g. the GC at run time.

with(myAllocator) {
 Foo foo = new Foo; // calls the allocator to assign memory for new instance of Foo
}
// tell allocator to free foo

with(myAllocator) {
 Foo foo = new Foo; // calls the allocator to assign memory for new instance of Foo
 myFunc(foo);
}
// if myFunc modifies foo or if myFunc passes foo to another function then:
//  tell GC it has to free it when able to
// otherwise:
//  tell allocator to free foo

// opWithIn/opWithOut and GC.pushAllocator/popAllocator are part of this proposal; they do not exist today.
class MyAllocator : Allocator {
 void opWithIn(string file = __FILE__, int line = __LINE__, string func = __FUNCTION__) {
  GC.pushAllocator(this);
 }

 void opWithOut(string file = __FILE__, int line = __LINE__, string func = __FUNCTION__) {
  GC.popAllocator();
 }
}

By using the with statement this is possible:
void something() {
 with(new RCAllocator) {
  string value = "Hello World!"; // allocates memory via RCAllocator
 } // frees here
}
September 15, 2014
On 9/14/14, 9:50 PM, deadalnix wrote:
> I don't want to be the smart ass that did nothing and complains about
> what others did, but I'll be it anyway.
>
> It doesn't look very scalable to me to implement various versions of
> modules with various memory management schemes. Inevitably, these will
> have subtly different semantics and different sets of bugs, will be
> twice as much work to maintain, and so on.

I've got to give it to you - it's rare to get a review on a design that hasn't been described yet :o).

There is no code duplication.

> Have you tried to explore a solution where an allocator is passed to
> functions (as far as I can tell, this wasn't very successful in C++, but
> D's greater metaprogramming capabilities may offer better solutions than
> C++'s)?
>
> Another option is to use output ranges. This looks like an area that is
> way underused in D. It looks like it is possible for the allocation
> policy to be part of the output range, and so we can let users decide
> without duplicating a bunch of code.

I've been thinking for a long time about these:

1. Output ranges;

2. Allocator objects;

3. Reference counting and encapsulations thereof.

Each has a certain attractiveness, particularly when thought of in the context of stdlib which tends to use limited, confined allocation patterns.

It took me a while to figure out there's some red herring tracking there. I probably half convinced Walter too.

The issue is that these techniques seem to overlap a lot, but in fact the overlap is rather thin. Output ranges, for one, are rather limited: they only fit the bill when (a) only output needs to be allocated, and (b) output is produced linearly. Outside these applications, there's simply no use.
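For reference, the case where output ranges do fit, with only the output allocated and produced linearly, looks roughly like the sketch below. `writeSnake` and `FixedSink` are hypothetical names invented for the example; `put`, `isOutputRange`, and `appender` are the existing Phobos facilities.

```d
import std.array : appender;
import std.range.primitives : isOutputRange, put;

/// Hypothetical algorithm: writes `text` to `sink`, replacing spaces with '_'.
/// It never allocates itself; where the output lives is the caller's choice.
void writeSnake(Sink)(ref Sink sink, const(char)[] text)
    if (isOutputRange!(Sink, char))
{
    foreach (char c; text)
        put(sink, c == ' ' ? '_' : c);
}

/// Hypothetical sink backed by a fixed in-place buffer, so no heap is used.
struct FixedSink
{
    char[64] buf;
    size_t used;

    void put(char c)
    {
        assert(used < buf.length, "FixedSink is full");
        buf[used++] = c;
    }

    const(char)[] data() const { return buf[0 .. used]; }
}

unittest
{
    auto gcSink = appender!string(); // output lands on the GC heap
    writeSnake(gcSink, "a b c");

    FixedSink stackSink;             // output lands in the struct itself
    writeSnake(stackSink, "a b c");

    assert(gcSink.data == stackSink.data);
}
```

The sketch works because the function only appends. An algorithm that needs scratch memory, or that must revisit what it has already written, has nothing to ask of the sink, which is the limitation being pointed out.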

As soon as thought enters more complex applications, the lure of allocators becomes audible. Pass an allocator into the algorithm, they say, and you've successfully pushed up memory allocation policy from the algorithm into the client.

The reality is allocators are low-level, unstructured devices that allocate memory but are not apt at managing it beyond blindly responding to client calls "allocate this much memory, now take it back". The many subtleties associated with actual _management_ of memory via reference counting (evidence: http://dpaste.dzfl.pl/817283c163f5) are completely lost on allocators.

I am convinced that we need to improve the lot of people who want to use the stdlib without a garbage collector, or with minimal use of it (more on that later). To do so, it is obvious we need good alternative abstractions, and reference counting is an obvious contender.

> Finally, concepts like isolated allow the compiler to insert free in the
> generated code in a safe manner. In the same way, it is possible to
> remove a bunch of GC allocation by sticking some passes in the middle of
> the optimizer (

)


Andrei
September 15, 2014
On 9/14/14, 9:51 PM, Rikki Cattermole wrote:
>
> static assert("abc".refCounted == "abc");

The idea is we want to write things like:

String s = "abc";

and have it be either refcounted or "classic" depending on the definition of String. With a user-defined String, you need:

String s = String("abc");

or

auto s = String("abc");


Andrei

September 15, 2014
On Monday, 15 September 2014 at 05:50:36 UTC, Andrei Alexandrescu wrote:
> and have it be either refcounted or "classic" depending on the definition of String. With a user-defined String, you need:
>
> String s = String("abc");

The following works fine:

RCString s = "abc";

It will call RCString.this with "abc". The problem is passing string literals or slices to functions that receive RCString.
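A small sketch of the distinction, using a hypothetical `Str` struct as a stand-in for RCString (the names `Str` and `byteCount` are made up for the example): initialization from a literal compiles because it is rewritten to a constructor call, while passing the literal as a function argument does not, since D performs no implicit construction of arguments.

```d
/// Hypothetical stand-in for RCString: any struct with a string constructor.
struct Str
{
    string payload;
    this(string s) { payload = s; }
}

/// A function that receives the user-defined string type.
size_t byteCount(Str s) { return s.payload.length; }

unittest
{
    // Initialization syntax calls Str.this("abc").
    Str s = "abc";
    assert(s.payload == "abc");

    // Passing the literal directly is rejected by the compiler.
    static assert(!__traits(compiles, byteCount("abc")));

    // The explicit wrapper is required at every call site.
    assert(byteCount(Str("abc")) == 3);
}
```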
September 15, 2014
On 15/09/2014 5:51 p.m., Andrei Alexandrescu wrote:
> On 9/14/14, 9:51 PM, Rikki Cattermole wrote:
>>
>> static assert("abc".refCounted == "abc");
>
> The idea is we want to write things like:
>
> String s = "abc";
>
> and have it be either refcounted or "classic" depending on the
> definition of String. With a user-defined String, you need:
>
> String s = String("abc");
>
> or
>
> auto s = String("abc");
>
>
> Andrei
>

Yeah I thought so.
Still, I think the whole with statement would be a better direction to go, but whatever. I'll drop it.
September 15, 2014
On 9/14/14, 10:55 PM, Jakob Ovrum wrote:
> On Monday, 15 September 2014 at 05:50:36 UTC, Andrei Alexandrescu wrote:
>> and have it be either refcounted or "classic" depending on the
>> definition of String. With a user-defined String, you need:
>>
>> String s = String("abc");
>
> The following works fine:
>
> RCString s = "abc";
>
> It will call RCString.this with "abc". The problem is passing string
> literals or slices to functions that receive RCString.

Yah, sorry for the confusion. -- Andrei