October 14, 2010
On Thu, Oct 14, 2010 at 10:45 AM, Andrei Alexandrescu <andrei at erdani.com>wrote:
>
>
> 3. Same discussion about decode. This is actually more important because you might want to decode streams of dchar. This is how many streams will come through, even though they are technically Ascii.
>
> Accepting string and wstring causes problems as they don't define length,
and of course may contain non-ascii characters.  Is the phobos-standard way to assume they contain only ascii characters and take the array length as the string length?  (with checks in non-release mode)


@property ubyte[] front();
>


Regarding Daniel's approach with char/byte level ranges through and through,
> in an ideal world I'd agree. But I fear that the implementation would not be as efficient. (I suggest you benchmark it against Masahiro's.) Also, practically, more often than not I'll want to work one chunk at a time, not one byte/char at a time.
>

I could be wrong, but I don't think I've seen any ranges in phobos that return more than one logical unit from front.  I understand this is common with streams, but I'm unsure what the phobos policy is on doing this with ranges.

A set of ranges that convert char[4] <=> ubyte[3] could be an option, but the output would need to be flattened in order to use it with many algorithms.  You may want to convert by chunks, but you usually want to work with a range of ubyte/char, not a range of ubyte[]/char[].

I will do some benchmarks to see how much worse the range-based solution performs.  I assume doesn't compare very well for the buffer -> buffer case, but still fares better some applications. (eg. searching and other things that don't need to convert the entire range).

Daniel.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101014/2b44f21a/attachment.html>
October 14, 2010
On Thu, Oct 14, 2010 at 12:12 PM, Masahiro Nakagawa <repeatedly at gmail.com>wrote:

> 1. The template parameters '!' and '/' are not justified. They should be
>> runtime parameters. Rule of thumb: use generic code when you stand to profit.
>>
>
> I don't understand this point.
> Please tell me the merit of runtime parameters.
> I can't imagine such situations.
>
>
I actually really like the Base64Impl.  It allows you to build DecodeMap statically, and provides a very nice api for using custom base64 variations. I'm not sure how you'd do anything this nice with runtime parameters.

alias Base64('^', '&', Base64.NoPadding) CustomBase64;
CustomBase64.encode(data)
CustomBase64.decode(encodeddata)

vs

encode(data, '^', '&', Base64.NoPadding);
decode(encodeddata, '^', '&', Base64.NoPadding);

I didn't really see this until I started using it. :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101014/d8753749/attachment-0001.html>
October 14, 2010
The decode length doesn't have to be exact.  Just make it a reasonably close upper bound.  That's generally only used for preallocating a destination buffer anyway.

On Oct 14, 2010, at 1:21 AM, Daniel Murphy wrote:

> On Thu, Oct 14, 2010 at 4:59 AM, Masahiro Nakagawa <repeatedly at gmail.com> wrote:
> I wait no-caching implementation. I think Range should not have useless state.
> In addition, current implementation seems to be still more complex than necessary.
> 
> 
> 
> I thought you might say that.
> http://yebblies.com/rangebase64nocache.d
> These ranges only read the bits that they actually need off the input, only converting one output item per popFront.  I don't consider this a better solution.  I haven't adapted length to work with this version, but I will if this implementation is actually going to be used.
> 
>  I'd welcome any ideas/suggestions on how to simplify the implementation of either version, especially the Decoder length, which is considerably ugly.
> 
> 
> I don't think so. Range is a good concept, but not all.
> I think usability is a most important factor for module API.
> 
> 
> 
> Base64's API is following:
> 
>  encodeLength(length);
>  encode(src, dst);
>  encode(src);
>  encoder(returns Encoder!(char[]) or Encoder!(char))
>  decodeLength(length);
>  decode(src, dst);
>  decode(src);
>  decoder(returns Decoder!(ubyte[]) or Decoder!(ubyte))
> 
> Do you see any problems?
> 
> Looks fine to me.
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos

October 15, 2010
Daniel Murphy <yebblies at gmail.com> wrote:
> Ranges typically model one iteration of an algorithm, and this works best if each iteration reads N items and outputs 1.  Is this along the right lines? I understand that pushing converted data is part of the conversion process, but that is not the only way to do it.
> 
> The conversion function approach works when you want to convert the data,
> then use it.  I'd assume this is the common case.
> useConvertedData( convert( originalData ) );
> or
> convert(originalData, buffer);
> 
> The range approach works better when you want to do something with your data as if it was the converted data.
> 
> find( converter(originalData), something );
> auto n = calculateChecksum( converter(originalData) );
> 
> Both are forms of conversion and each is most useful in different situations.

Agreed, I can see the usefulness and correctness in both usages.  Now Masahiro said that his API would support both functions and ranges, so let's wait for the new base64 module.

Oh, and the nocache decorator range: I was surprised that was possible!  Though the range still has a state, I think it's a clever solution.


Shin
October 21, 2010
On Thu, 14 Oct 2010 17:21:38 +0900, Daniel Murphy <yebblies at gmail.com> wrote:

> On Thu, Oct 14, 2010 at 4:59 AM, Masahiro Nakagawa <repeatedly at gmail.com>wrote:
>
>> I wait no-caching implementation. I think Range should not have useless
>> state.
>> In addition, current implementation seems to be still more complex than
>> necessary.
>>
>>
>>
> I thought you might say that.
> http://yebblies.com/rangebase64nocache.d
> These ranges only read the bits that they actually need off the input,
> only
> converting one output item per popFront.  I don't consider this a better
> solution.  I haven't adapted length to work with this version, but I
> will if
> this implementation is actually going to be used.

I merged Daniel's code and encode(decode) can take InputRange that has length property.

http://bitbucket.org/repeatedly/scrap/src/tip/base64.d

Please check the code.


Masahiro
October 25, 2010
I replace std.base64 with this module in two days.


Masahiro

2010?10?21?6:36 Masahiro Nakagawa <repeatedly at gmail.com>:

> On Thu, 14 Oct 2010 17:21:38 +0900, Daniel Murphy <yebblies at gmail.com> wrote:
>
>  On Thu, Oct 14, 2010 at 4:59 AM, Masahiro Nakagawa <repeatedly at gmail.com
>> >wrote:
>>
>>  I wait no-caching implementation. I think Range should not have useless
>>> state.
>>> In addition, current implementation seems to be still more complex than
>>> necessary.
>>>
>>>
>>>
>>>  I thought you might say that.
>> http://yebblies.com/rangebase64nocache.d
>> These ranges only read the bits that they actually need off the input,
>> only
>> converting one output item per popFront.  I don't consider this a better
>> solution.  I haven't adapted length to work with this version, but I will
>> if
>> this implementation is actually going to be used.
>>
>
> I merged Daniel's code and encode(decode) can take InputRange that has
> length property.
>
>
> http://bitbucket.org/repeatedly/scrap/src/tip/base64.d
>
> Please check the code.
>
>
> Masahiro
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101025/75f43ca6/attachment-0001.html>
October 27, 2010
Looks awesome!

A couple of things:
> The output range code should use put(range, data) instead of
range.put(data)
> The whole static if at 1136 is dead code and refers to a variable that is
never declared
> The unittest block at 1471 doesn't compile for me, but this might be
because I'm still using 2.047
> The import comments should probably be replaced with true selective
imports (unless this is to work around the protection level bugs or
something)

Hope I haven't been too annoying,
Daniel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101027/35895d6b/attachment.html>
October 27, 2010
On Tue, 26 Oct 2010 23:41:52 +0900, Daniel Murphy <yebblies at gmail.com> wrote:

> Looks awesome!
>
> A couple of things:
>> The output range code should use put(range, data) instead of
> range.put(data)
>> The whole static if at 1136 is dead code and refers to a variable that is
> never declared

Applied!

http://bitbucket.org/repeatedly/scrap/changeset/faebc7c954d2

>> The unittest block at 1471 doesn't compile for me, but this might be
> because I'm still using 2.047
>> The import comments should probably be replaced with true selective
> imports (unless this is to work around the protection level bugs or
> something)

I want to use selective import, but... http://d.puremagic.com/issues/show_bug.cgi?id=314


Masahiro
October 27, 2010
On Wed, Oct 27, 2010 at 2:20 AM, Masahiro Nakagawa <repeatedly at gmail.com>wrote:

> Applied!
>

 Great!

The only thing that still bugs me a little is the error messages:
"Cannot call popFront on empty range" should probably say something like
"Cannot call popFront on Encoder with no data remaining", but it's really
superficial.
The static asserts could also be improved (I know I did them in the first
place)

static assert(Map62th < 'A' || Map62th > 'Z', "Character '" ~ Map62th ~ "'
cannot be used twice");
static assert(Map62th != NoPadding, "'\\0' is not a valid base64
character");

Decoder currently advances partially on each call to front!  This code
should be moved into popFront, or at the very least the result should be
cached.
Encoder doesn't advance, but currently re-encodes the data on each call to
front.  This should also be moved into popFront.

Decoder.front also currently uses array appending to build the input data.
 It might be worth only reading several bytes off the next element of the
input in order to avoid unnecessary allocations.

If this is only supposed to work with foreach, an opApply based design might be a better fit.

Daniel.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101027/3462fc7d/attachment.html>
October 29, 2010
On Wed, 27 Oct 2010 17:10:16 +0900, Daniel Murphy <yebblies at gmail.com> wrote:

> On Wed, Oct 27, 2010 at 2:20 AM, Masahiro Nakagawa <repeatedly at gmail.com>wrote:
>
>> Applied!
>>
>
>  Great!
>
> The only thing that still bugs me a little is the error messages:
> "Cannot call popFront on empty range" should probably say something like
> "Cannot call popFront on Encoder with no data remaining", but it's really
> superficial.
> The static asserts could also be improved (I know I did them in the first
> place)
>
> static assert(Map62th < 'A' || Map62th > 'Z', "Character '" ~ Map62th ~
> "'
> cannot be used twice");
> static assert(Map62th != NoPadding, "'\\0' is not a valid base64
> character");

You are right. Thanks!

http://bitbucket.org/repeatedly/scrap/changeset/70f0d0f16ad1

> Decoder currently advances partially on each call to front!  This code
> should be moved into popFront, or at the very least the result should be
> cached.
> Encoder doesn't advance, but currently re-encodes the data on each call
> to
> front.  This should also be moved into popFront.

Your suggestion increases struct size and decreases performance. But, I agree.

http://bitbucket.org/repeatedly/scrap/changeset/75ee04cdd87b


Masahiro