June 05, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 6/5/2013 2:19 PM, Jonathan M Davis wrote:
> On Wednesday, June 05, 2013 13:48:40 Walter Bright wrote:
>> On 6/5/2013 11:36 AM, David Nadlinger wrote:
>>> It also doesn't utilize template constraints, reinvents
>>> isRandomAccessRange && hasSlicing under a poor name,
>>
>> Didn't want to include all of Phobos, which happens if you import std.range.
>
> Which is why Dmitry was suggesting that we have separate modules for the
> traits rather than sticking them in the same modules as the functions.
I know we have to fix the granularity issue in Phobos. But it isn't fixed at the moment, and it isn't std.compress' mission to fix it.
|
June 05, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dmitry Olshansky | On 6/5/2013 2:42 PM, Dmitry Olshansky wrote:
> if you limit their hunger for std.stdio/std.format to unittests where appropriate.
I agree that those two modules in particular need to be excised from being imported willy-nilly.
|
June 05, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 6/5/2013 2:23 PM, Jonathan M Davis wrote:
> That was pretty much the point of DIP 37. If it was only about being able to
> import packages, then you could just do foo/all.d and have all.d publicly
> import all of the foo package - many people do this now. But DIP 37 will allow
> us to do this in a way which makes it possible to split up a module into a
> package in place without breaking code. So, we'll be able to split larger
> modules such as std.algorithm and std.datetime into packages without forcing
> everyone to change their code.
Voila! https://github.com/D-Programming-Language/dmd/pull/2139
|
June 06, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Wednesday, June 05, 2013 15:16:56 Walter Bright wrote:
> On 6/5/2013 2:19 PM, Jonathan M Davis wrote:
> > On Wednesday, June 05, 2013 13:48:40 Walter Bright wrote:
> >> On 6/5/2013 11:36 AM, David Nadlinger wrote:
> >>> It also doesn't utilize template constraints, reinvents isRandomAccessRange && hasSlicing under a poor name,
> >>
> >> Didn't want to include all of Phobos, which happens if you import std.range.
> >
> > Which is why Dmitry was suggesting that we have separate modules for the traits rather than sticking them in the same modules as the functions.
>
> I know we have to fix the granularity issue in Phobos. But it isn't fixed at the moment, and it isn't std.compress' mission to fix it.
That may be, but I'd argue that it's better to use the standard traits rather than rolling your own (especially if the user is going to be seeing them when they give incorrect arguments to a template), and then that dependence on std.range can be fixed later when we fix it for everything else. But clearly, you rate module independence far higher than I do.
- Jonathan M Davis
|
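To make the point about standard traits concrete, here is a minimal sketch of a template constraint built from the Phobos traits named above. The helper `secondHalf` is hypothetical and is not taken from std.compress. The import of `std.range.primitives` assumes a current Phobos; at the time of this thread the traits lived in std.range itself, which is the dependency being argued about.

```d
// Sketch only: `secondHalf` is a hypothetical helper, not part of std.compress.
// Assumption: a Phobos version that provides std.range.primitives.
import std.range.primitives : hasLength, hasSlicing, isRandomAccessRange;

/// Returns the second half of any sliceable random-access range with a length.
auto secondHalf(R)(R r)
if (isRandomAccessRange!R && hasSlicing!R && hasLength!R)
{
    return r[r.length / 2 .. r.length];
}

void main()
{
    int[] data = [1, 2, 3, 4];
    assert(secondHalf(data) == [3, 4]);

    // An int is not a range, so the constraint rejects the call at compile time.
    static assert(!__traits(compiles, secondHalf(5)));
}
```

When a caller passes the wrong kind of argument, the compiler reports the failed constraint using the standard trait names, which is the user-facing benefit mentioned in the post above.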
June 06, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe | On Thu, 06 Jun 2013 00:00:19 +0200, "Adam D. Ruppe" <destructionator@gmail.com> wrote:
> $ time make
> dmd -version=withoaut_custom_runtime_reflection
> -debug=allocations -m64 -debug -c object.d minimal.d invariant.d
> memory.d -gc -defaultlib= -debuglib=
> gcc minimal.o object.o invariant.o memory.o -o minimal -g -m64
> -nostdlib
Typo: withoaut_custom_runtime_reflection
-> without_custom_runtime_reflection
-- 
Marco
|
June 06, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Wed, 05 Jun 2013 13:36:06 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> I think it's worth discussing the interface much more than the approach to modularization.
>
> Walter's algo traffics in InputRange!ubyte and offers an InputRange!ubyte. That makes sense in some situations, but often trafficking one ubyte at a time may be not only inefficient but also the wrong granularity. Consider:
>
> auto data = File("input.txt").byChunk().compress();
I realized that yesterday. Now that my circular buffer implementation appears to be thread safe enough *cough* to run some tests, I realized that when I produce 64 KiB input blocks and mark them as read byte-by-byte, I get a massive overhead.
> That won't work because byChunk deals in ubyte[], not ubyte. How do we fix this while keeping everybody efficient?
My first attempt will keep a byte-wise interface but use larger buffer chunks inside the circular buffer, so that some of the checks (most notably: consumer marks a byte of buffer as read, can the producer continue?) can be delayed until e.g. 4 KiB, or whatever is available, has been processed.
> I talked to Walter and during this work he figured a lot of things about how ranges work and how they generate code. Turns out that the range equivalent of a tight loop is slightly less efficient with dmd because a range must keep its state together, which is harder to enregister than a bunch of automatic variables.
Then again, for performance critical applications dmd has long lost touch with GCC and LLVM, so while this is of big interest for its author and sometimes in "D is slow" threads, there are good alternatives. (I think LLVM is a real enabler in the current explosion of programming languages.)
> Right now we have joiner(), which given several ranges of T, offers a range of T. Developing along that idea, we should have two opposite functions: itemize() and collect().
>
> itemize() takes a range of ranges of T and offers a range of T. For example, given a range of T[], offers a range of T.
>
> collect() takes a range of T and offers a range of T[]. The number of items in each chunk can be a parameter.
>
> With that in tow, we can set things up such that compress() and expand() traffic in ranges of ranges of ubyte (or simply ranges of ubyte[]), which ensures work at maximum speed. Then the matter of adapting to and fro ranges of ubyte is a simple matter of chaining a call to itemize() or collect().
>
> Destroy.
That's neat from a usability point of view. Does this mean collect() will always return a range of built-in arrays, since it uses arrays as the internal buffer?
> I salute this code. It is concise, well engineered, well written, just as general as it needs, uses the right features in the right places, and does real work. A perfect example to follow. The only thing I'd add is this convenience function:
>
> void putBits(bool[] bits...) {
>     foreach (b; bits) putBit(b);
> }
>
> That reduces a bunch of code, and leaves room for future optimizations inside putBits.
>
>
> Andrei
Putting single bits into the buffer is important functionality, but I'm not convinced of this solution. Real code has the bits in integer variables or could use:
buffer.putBits!3(0b111);
- or -
buffer.putBits(3, 0b111);
It is always more efficient than using a variadic number of function parameters or even a bool[]. In many cases the bits can be ORed on the current write position directly.
One other feature worth adding might be to reverse the bit order before putting them into or after pulling them out of the buffer. It happens once in gzip code I've seen.
-- 
Marco
|
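As a rough illustration of the itemize()/collect() idea quoted above, the sketch below approximates both with ranges that already exist in Phobos: `joiner` for flattening and `chunks` for regrouping. The names `itemize` and `collect` are the proposal's; the wrappers here are hypothetical and only one possible reading of it. The module paths assume a current Phobos.

```d
// Sketch only: `itemize` and `collect` are hypothetical wrappers, not Phobos functions.
import std.algorithm.comparison : equal;
import std.algorithm.iteration : joiner;
import std.range : chunks;

/// Range of ranges of T -> range of T (here simply forwarded to joiner).
auto itemize(RoR)(RoR rangeOfRanges)
{
    return rangeOfRanges.joiner;
}

/// Range of T -> range of chunks holding at most chunkSize items each.
auto collect(R)(R range, size_t chunkSize)
{
    return range.chunks(chunkSize);
}

void main()
{
    ubyte[] a = [1, 2, 3];
    ubyte[] b = [4, 5];
    ubyte[][] blocks = [a, b];

    // range of ubyte[] -> range of ubyte
    assert(blocks.itemize.equal([1, 2, 3, 4, 5]));

    // range of ubyte -> range of ubyte[] with at most 2 items per chunk
    ubyte[] flat = [1, 2, 3, 4, 5];
    ubyte[] c1 = [1, 2], c2 = [3, 4], c3 = [5];
    assert(flat.collect(2).equal([c1, c2, c3]));
}
```

In this sketch `chunks` over an array yields slices of that array, so each chunk is a built-in array without copying. Whether a real collect() over a non-array input would have to buffer into arrays is the open question raised in the post.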
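The `buffer.putBits(3, 0b111)` form suggested above could look roughly like the following. This is a sketch under assumptions: the `BitBuffer` type, its fields, and the least-significant-bit-first packing order are all hypothetical and are not taken from the code under review. It shows the bits being ORed onto the current write position in groups rather than one bool at a time.

```d
// Sketch only: BitBuffer is a hypothetical type; LSB-first packing is an assumption.
struct BitBuffer
{
    ubyte[] data; // bytes written so far; the last one may be partially filled
    uint used;    // bits already occupied in the last byte (0 means start a new byte)

    /// Appends the low `count` bits of `value`, least significant bit first.
    void putBits(uint count, uint value)
    {
        assert(count <= 32);
        while (count > 0)
        {
            if (used == 0)
                data ~= cast(ubyte) 0;

            immutable uint room = 8 - used;                 // free bits in the last byte
            immutable uint n = count < room ? count : room; // bits written this round
            immutable uint mask = (1u << n) - 1;

            // OR the next n bits onto the current write position.
            data[$ - 1] |= cast(ubyte)((value & mask) << used);

            value >>= n;
            count -= n;
            used = (used + n) % 8;
        }
    }
}

void main()
{
    BitBuffer buffer;
    buffer.putBits(3, 0b111);
    buffer.putBits(2, 0b01);
    buffer.putBits(8, 0xFF);

    ubyte[] expected = [0xEF, 0x1F];
    assert(buffer.data == expected);
}
```

A compile-time variant such as `putBits!3(0b111)` would take the count as a template parameter instead; it is left out here to keep the sketch short.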
June 06, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe | On Wednesday, 5 June 2013 at 22:06:09 UTC, Adam D. Ruppe wrote:
> you know it might be a decent idea to change std.stdio to use scoped imports and have the writeln that specializes on string to not import anything else and see what happens on the hello world case.
I think using scoped imports should be a requirement in the standard library.
|
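For readers unfamiliar with the term, a scoped import places the import inside the function (or unittest block) that needs it instead of at module level. The sketch below is illustrative only; `describe` is a hypothetical function and does not come from std.stdio or std.compress.

```d
// Sketch only: `describe` is a hypothetical function used to show import placement.
string describe(int n)
{
    // Scoped import: `to` is visible only inside this function body.
    import std.conv : to;
    return "n = " ~ n.to!string;
}

unittest
{
    // This import only takes effect in builds compiled with -unittest.
    import std.stdio : writeln;
    writeln(describe(1));
}

void main()
{
    assert(describe(42) == "n = 42");
}
```

The module-level namespace stays free of std.conv and std.stdio here. Inside a template, a scoped import is, as far as I know, only processed when the template is instantiated, which is what would help the hello-world case discussed above.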
June 06, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marco Leise | On Thursday, 6 June 2013 at 02:20:45 UTC, Marco Leise wrote:
> Typo: withoaut_custom_runtime_reflection
> -> without_custom_runtime_reflection
That was actually intentional: I wanted to turn that version off so it would compile more code for the test, and adding a random letter was quicker than actually deleting the whole thing (especially since I'll put it back later).
|
June 06, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Wednesday, 5 June 2013 at 20:49:19 UTC, Walter Bright wrote:
>> uses C printf (!) in the examples,
>
> Again, trying to make it lightweight.
… and wrong, in more than one way.
David
(What happens if the input is larger than 'uint.max / 2 + 1'? Than 'uint.max' on a 'size_t.sizeof == 8' target?)
|
June 07, 2013 Re: std.compress | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Thursday, 6 June 2013 at 20:50:08 UTC, David Nadlinger wrote:
> On Wednesday, 5 June 2013 at 20:49:19 UTC, Walter Bright wrote:
>>> uses C printf (!) in the examples,
>>
>> Again, trying to make it lightweight.
>
> … and wrong, in more than one way.
>
> David
>
>
> (What happens if the input is larger than 'uint.max / 2 + 1'? Than 'uint.max' on a 'size_t.sizeof == 8' target?)
Not sure what you're getting at. I can only see two uncommented calls to printf, and neither does any formatting, so I'm not sure how they'd be wrong.
|