February 06, 2014
On 2/6/14, 9:19 AM, H. S. Teoh wrote:
> On Thu, Feb 06, 2014 at 04:30:31PM +0000, Dicebot wrote:
>> On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu
>> wrote:
>>> // lib code
>>> struct RCSlice(T) { ... }
>>> alias rcstring = RCSlice!(immutable char);
>>> rcstring rc(string s)() { ... }
>>>
>>> // user code
>>> auto s1 = buildPath!("hello", "world");
>>> auto s2 = buildPath!(rc!"hello", rc!"world");
>>>
>>> In this example s1 will have type string and s2 will have type
>>> rcstring.
>>
>> Looks unnecessarily restrictive. Why can't one build an rc-string from
>> stack buffers, or an Array!char from rc-strings? The type of the output
>> buffer does not have to have anything to do with the input.
>
> Agree. Phobos algorithms that populate a data sink should migrate toward
> using output ranges instead of returning a predetermined type. This will
> not only address ARC needs, but a bunch of other things as well (output
> range support/use in Phobos is still rather scanty at the moment).

I will mention again that output ranges lead to quite a bit more code on the caller site. They do give great control, but I'm hoping for something more convenient.
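
To make the call-site difference concrete, here is a minimal sketch using existing Phobos functions (std.format.format, std.format.formattedWrite and std.array.appender). It only illustrates the extra steps an output range adds; it is not a proposal for how buildPath should look.

```d
import std.array : appender;
import std.format : format, formattedWrite;

void main()
{
    // Returning a predetermined type: one expression at the call site.
    string a = format("%s/%s", "hello", "world");

    // Output range: declare the sink, pass it in, then extract the data.
    auto sink = appender!string();
    sink.formattedWrite("%s/%s", "hello", "world");
    string b = sink.data;

    assert(a == b);
    assert(b == "hello/world");
}
```

The second form is three statements instead of one, but the caller decides what kind of sink receives the characters.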

Andrei

February 06, 2014
On Thu, Feb 06, 2014 at 09:56:14AM -0800, Andrei Alexandrescu wrote:
> On 2/6/14, 9:19 AM, H. S. Teoh wrote:
> >On Thu, Feb 06, 2014 at 04:30:31PM +0000, Dicebot wrote:
> >>On Thursday, 6 February 2014 at 16:25:37 UTC, Andrei Alexandrescu wrote:
> >>>// lib code
> >>>struct RCSlice(T) { ... }
> >>>alias rcstring = RCSlice!(immutable char);
> >>>rcstring rc(string s)() { ... }
> >>>
> >>>// user code
> >>>auto s1 = buildPath!("hello", "world");
> >>>auto s2 = buildPath!(rc!"hello", rc!"world");
> >>>
> >>>In this example s1 will have type string and s2 will have type rcstring.
> >>
> >>Looks unnecessarily restrictive. Why can't one build an rc-string from stack buffers, or an Array!char from rc-strings? The type of the output buffer does not have to have anything to do with the input.
> >
> >Agree. Phobos algorithms that populate a data sink should migrate toward using output ranges instead of returning a predetermined type. This will not only address ARC needs, but a bunch of other things as well (output range support/use in Phobos is still rather scanty at the moment).
> 
> I will mention again that output ranges lead to quite a bit more code on the caller site. They do give great control, but I'm hoping for something more convenient.
[...]

That's only because the current output range API consists of a single .put method. Please see the other thread started by Adam Ruppe: we should spend some time thinking about how to streamline output ranges so that they can be used just as easily as input ranges -- y'know, with UFCS chaining and such -- without the ton of boilerplate the current process requires: declare output range, pass to function, get data from result, pass to next function, etc. This is primarily a syntactical problem, not a logical one, and since we're so good at syntactic bikeshedding, we should be able to solve this relatively easily, right? ;-)
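
A rough sketch of the contrast, using only existing Phobos pieces (std.algorithm's filter, map and copy, plus std.array's appender and array). The names stage1 and stage2 are made up for illustration.

```d
import std.algorithm : copy, filter, map;
import std.array : appender, array;

void main()
{
    // Input ranges: one UFCS chain, no intermediate declarations.
    auto viaInput = [1, 2, 3, 4]
        .filter!(x => x % 2 == 0)
        .map!(x => x * 10)
        .array;

    // Output ranges: declare a sink, fill it, pull the data out,
    // then feed that data to the next step.
    auto stage1 = appender!(int[])();
    [1, 2, 3, 4].filter!(x => x % 2 == 0).copy(stage1);

    auto stage2 = appender!(int[])();
    stage1.data.map!(x => x * 10).copy(stage2);

    assert(stage2.data == viaInput);
    assert(viaInput == [20, 40]);
}
```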


T

-- 
Life is unfair. Ask too much from it, and it may decide you don't deserve what you have now either.
February 06, 2014
On Thursday, 6 February 2014 at 17:56:15 UTC, Andrei Alexandrescu wrote:
> I will mention again that output ranges lead to quite a bit more code on the caller site.

People are asking for control over memory management. You can't then complain that you get control over memory management!

I'd furthermore like to note that there's no reason why we can't have the best of both worlds through default parameters and/or different names.

Suppose our thing is defined as this:

T[] toUpper(T, OR = GCSink!T)(in T[] data, OR output = OR()) {
    output.start();
    foreach(d; data)
       output.put(d & ~0x20);
    return output.finish();
}

struct GCSink(T) {
    // so this is a reference type
    private struct Impl {
        T[] data;
        void put(T t) { data ~= t; }
        T[] finish() { return data; }
    }
    Impl* impl;
    alias impl this;
    void start() {
        impl = new Impl;
    }
}

// an output range into an existing array container
struct StaticSink(T) {
    T[] container;
    this(T[] c) { container = c; }
    size_t size;
    void start() { size = 0; }
    void put(T t) { container[size++] = t; }
    T[] finish() { return container[0 .. size]; }
}
StaticSink!T staticSink(T)(T[] t) {
    return StaticSink!T(t);
}

void main() {
    import std.stdio;
    writeln(toUpper("cool")); // default: GC
    char[10] buffer;
    auto received = toUpper("cool", staticSink(buffer[])); // custom static sink
    assert(buffer.ptr is received.ptr);
    assert(received == "COOL");
}
February 06, 2014
On 2/6/14, 10:15 AM, H. S. Teoh wrote:
> That's only because the current output range API consists of only a
> single .put method. Please see the other thread started by Adam Ruppe:
> we should spend some time to think about how we can streamline output
> ranges so that they can be used just as easily as input ranges --
> y'know, with UFCS chaining and such, that doesn't require a ton of
> boilerplate like the current process of: declare output range, pass to
> function, get data from result, pass to next function, etc.. This is
> primarily a syntactical problem, not a logical one, and since we're so
> good at syntactic bikeshedding, we should be able to solve this
> relatively easily, right? ;-)

I don't think it's that easy. For example the output range must be passed as a ref parameter into the function, which is already introducing friction.
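
A small sketch of why ref matters, assuming a value-type sink whose state lives directly in the struct. CountingSink, fillByValue and fillByRef are hypothetical names used only for this illustration.

```d
// Hypothetical value-type sink: its state lives in the struct itself.
struct CountingSink
{
    size_t count;
    void put(int) { ++count; }
}

// Takes the sink by value: writes go to a copy.
void fillByValue(Sink)(Sink sink)
{
    foreach (i; 0 .. 3)
        sink.put(i);
}

// Takes the sink by ref: writes reach the caller's sink.
void fillByRef(Sink)(ref Sink sink)
{
    foreach (i; 0 .. 3)
        sink.put(i);
}

void main()
{
    CountingSink sink;

    fillByValue(sink);
    assert(sink.count == 0); // the copy was filled, not our sink

    fillByRef(sink);
    assert(sink.count == 3);
}
```

Sinks with reference semantics (a pointer or class inside) do not have this problem, which is part of why the requirement is easy to forget.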

FWIW, things we can add to output ranges are:

~= for convenience
.flush() or .done() to mark the end of several writes
.clear() to clear the range (useful if e.g. it's implemented as a slice with appending)
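
A minimal sketch of what a sink with these additions might look like. SliceSink is a hypothetical type, and the choice of done() returning the written slice is an assumption, not something settled in this thread.

```d
// Hypothetical sink implemented as a slice with appending.
struct SliceSink(T)
{
    private T[] data;

    // The one primitive every output range has today.
    void put(T item) { data ~= item; }

    // `sink ~= item` as a convenience that forwards to put.
    void opOpAssign(string op : "~")(T item) { put(item); }

    // Marks the end of several writes; here it hands back what was written.
    T[] done() { return data; }

    // Empties the sink so its storage can be reused for appending.
    void clear()
    {
        data.length = 0;
        data.assumeSafeAppend();
    }
}

void main()
{
    SliceSink!int sink;
    sink.put(1);
    sink ~= 2;
    assert(sink.done() == [1, 2]);

    sink.clear();
    sink ~= 3;
    assert(sink.done() == [3]);
}
```

Note that clear() reuses the storage, so slices returned by an earlier done() should be treated as invalid afterwards.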


Andrei


February 06, 2014
06-Feb-2014 22:29, Andrei Alexandrescu wrote:
> On 2/6/14, 10:15 AM, H. S. Teoh wrote:
>> That's only because the current output range API consists of only a
>> single .put method. Please see the other thread started by Adam Ruppe:
>> we should spend some time to think about how we can streamline output
>> ranges so that they can be used just as easily as input ranges --
>> y'know, with UFCS chaining and such, that doesn't require a ton of
>> boilerplate like the current process of: declare output range, pass to
>> function, get data from result, pass to next function, etc.. This is
>> primarily a syntactical problem, not a logical one, and since we're so
>> good at syntactic bikeshedding, we should be able to solve this
>> relatively easily, right? ;-)
>
> I don't think it's that easy. For example the output range must be
> passed as a ref parameter into the function, which is already
> introducing friction.
>
> FWIW, things we can add to output ranges are:
>
> ~= for convenience
> .flush() or .done() to mark the end of several writes
> .clear() to clear the range (useful if e.g. it's implemented as a slice
> with appending)
>

.reserve(n) to notify the underlying sink that n items are coming (so it can preallocate, etc.)
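
For reference, std.array.Appender already has a reserve of this kind, so a sketch of the intended usage can be written against it today:

```d
import std.array : appender;

void main()
{
    auto sink = appender!(int[])();

    // Tell the sink up front how many items are coming.
    sink.reserve(100);
    assert(sink.capacity >= 100);

    foreach (i; 0 .. 100)
        sink.put(i);

    assert(sink.data.length == 100);
}
```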


-- 
Dmitry Olshansky
February 06, 2014
On Thursday, 6 February 2014 at 17:54:38 UTC, Andrei Alexandrescu wrote:
> Malloced!T and GC!T suggests parameterization by the type of the allocator.

Not necessarily: different allocators with the same free could return the same type. The key point is that the knowledge of how to free it is encapsulated there in some way.

> RefCounted!T is a whole different thing, because it doesn't encode allocation strategy but instead memory reclamation tactics.

Malloced!T also encodes reclamation tactics: ~this() { free(ptr); }

You could also call it Unique!T(&free): the malloced pointer is unique and must be released with free. That covers the same ground in a more generic way. (Surely refcounted!T needs to know what happens when count==0 too.)
February 06, 2014
On 2014-02-06 15:47:05 +0000, Johannes Pfau said:

> On Thu, 6 Feb 2014 14:37:59 +0300,
> Max Klyga <max.klyga@gmail.com> wrote:
> 
>> 
>> My point is that we should not ruin the language ease of use. We do
>> need to deal with Phobos internal allocations, but we should not
>> switch to ARC as a default memory management scheme.
> 
> snip

I wholeheartedly agree that we should define functions in Phobos that take output buffers/ranges.
One of the reasons the Tango XML parser was the fastest in the world was that almost every method/function in Tango took an output buffer as an argument and never allocated unless specifically asked.

This would allow everyone to choose the method of memory management best suited to their domain.
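
A rough sketch of that style, with a hypothetical function upperAscii (not an actual Tango or Phobos API): the caller may pass a destination buffer, and the function allocates only when none is given or it is too small.

```d
// Hypothetical example of the "optional output buffer" convention.
char[] upperAscii(const(char)[] src, char[] dst = null)
{
    // Allocate only when the caller's buffer cannot hold the result.
    if (dst.length < src.length)
        dst = new char[src.length];

    foreach (i, c; src)
        dst[i] = (c >= 'a' && c <= 'z') ? cast(char)(c - 32) : c;

    return dst[0 .. src.length];
}

void main()
{
    // Caller-supplied stack buffer: no allocation happens.
    char[16] buf;
    auto r = upperAscii("cool", buf[]);
    assert(r == "COOL");
    assert(r.ptr is buf.ptr);

    // No buffer given: falls back to a GC allocation.
    assert(upperAscii("gc") == "GC");
}
```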

February 06, 2014
On 2/6/14, 11:14 AM, Adam D. Ruppe wrote:
> On Thursday, 6 February 2014 at 17:54:38 UTC, Andrei Alexandrescu wrote:
>> Malloced!T and GC!T suggests parameterization by the type of the
>> allocator.
>
> Not necessarily, different allocators with the same free could return
> the same type. The key point is the knowledge of how to free it is
> encapsulated there in some way.
>
>> RefCounted!T is a whole different thing, because it doesn't encode
>> allocation strategy but instead memory reclamation tactics.
>
> Malloced!T also encodes reclamation tactics: ~this() { free(ptr); }

So if T is int[] and you have taken a slice into it...?

> You could also call it Unique!T(&free): the malloced pointer is unique
> and must be released with free. That covers the same ground in a more
> generic way. (Surely refcounted!T needs to know what happens when
> count==0 too.)

I'm not sure I understand what you are talking about.


Andrei
February 06, 2014
On Thursday, 6 February 2014 at 19:33:39 UTC, Andrei Alexandrescu wrote:
> So if T is int[] and you have taken a slice into it...?

If you escape it, congratulations, you have a memory safety bug. Have fun tracking it down.

You could also offer refcounted slicing, of course (wrapping the Unique thing in a refcounter would work), or you could be converted to the church of scope where the compiler will help you catch these bugs without run time cost.

> I'm not sure I understand what you are talking about.

When the reference count reaches zero, what happens? This changes based on the allocation method: you might call GC.free, you might call free(), you might do nothing. The destructor needs to know; otherwise the refcounting achieves exactly nothing! We can encode this in the type or use a function pointer for it.

struct RefCounted(T) {
    private struct Impl {
        // potential double indirection lol
        private T payload;
        private size_t count;
        private void function(T) free;
        T getPayload() { return payload; } // returned by value, so it is an rvalue
        alias getPayload this;
    }
    Impl* impl;
    alias impl this;
    this(T t, void function(T) free) {
       impl = new Impl; // some kind of allocation at startup lol
                        // naturally, this could also be malloc
                        // or take a generic allocator from the user
                        // but refcounted definitely needs some pointer love
       impl.payload = t;
       impl.count = 1;
       impl.free = free; // gotta store this so we can free later
    }
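    this(this) {
       if(impl !is null) // a default-initialized RefCounted has no impl
           impl.count++;
    }
    ~this() {
       if(impl is null)
           return; // nothing was ever counted
       impl.count--;
       if(impl.count == 0) {
           // how do we know how to free it?
           impl.free(impl.payload);

           // delete impl; GC.free(impl) core.stdc.stdlib.free(impl);
           // whatever
           impl = null;
       }
    }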
}


In this example, we take the reference we're counting in the constructor... which means it is already allocated. So logically, the user code should tell it how to deallocate it too. We can't just call a global free, so we take a function pointer instead.


So this would work kinda like this:

import core.stdc.stdlib;
int[] stuff = (cast(int*) malloc(int.sizeof * 5))[0 .. 5]; // malloc returns void*, so cast before slicing
auto counted = RefCounted!(int[])(stuff, (int[] s) { free(s.ptr); });



The allocator is not encoded in the type, but ref counted does need to know what happens when the final reference is gone. It takes a function pointer from the user for that.


This is a generic refcounting type. It isn't maximally efficient, but it works with arbitrary inputs allocated by any means.

Unique!T could do something similar, but unique would disable its postblit instead of incrementing a refcount.
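
A minimal sketch of that Unique variant, under the same assumptions as the RefCounted example (the release function pointer comes from the user). This Unique is written for illustration and is not std.typecons.Unique.

```d
import core.stdc.stdlib : free, malloc;

// Illustrative only: a single-owner wrapper that knows how to release its payload.
struct Unique(T)
{
    private T payload;
    private void function(T) release;

    this(T t, void function(T) release)
    {
        payload = t;
        this.release = release;
    }

    // No copies: there is exactly one owner, so there is no count to keep.
    @disable this(this);

    ~this()
    {
        // A default-initialized Unique has nothing to release.
        if (release !is null)
            release(payload);
    }

    T get() { return payload; }
}

void main()
{
    int[] stuff = (cast(int*) malloc(int.sizeof * 5))[0 .. 5];
    auto owner = Unique!(int[])(stuff, (int[] s) { free(s.ptr); });

    owner.get()[0] = 42;
    assert(owner.get()[0] == 42);

    // Copying is rejected at compile time.
    static assert(!__traits(compiles, { auto other = owner; }));
}
```

As with the malloced slice discussed earlier, nothing here stops a slice obtained from get() from outliving the owner.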