June 05, 2013
On 6/4/2013 5:13 PM, Marco Leise wrote:
> Probably seek time if the files are scattered and not in cache.
> That's hardly a show stopper unless you have 17,156 files like
> the Java Runtime. But they 'solved' it by zipping them up.


Actually, I've often thought of making dmd able to read everything it needs out of a zip file.
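
For illustration only, here is a minimal sketch of what "reading sources out of a zip" could look like with Phobos' existing std.zip. Nothing here reflects how dmd works or would work; `packSources` and `readSource` are hypothetical helper names, and the archive is built in memory just so the example is self-contained.

```d
import std.zip : ArchiveMember, CompressionMethod, ZipArchive;

// Hypothetical helper: pack named source texts into an in-memory zip image.
void[] packSources(string[string] sources)
{
    auto archive = new ZipArchive();
    foreach (name, text; sources)
    {
        auto member = new ArchiveMember();
        member.name = name;
        member.expandedData = cast(ubyte[]) text.dup;
        member.compressionMethod = CompressionMethod.deflate;
        archive.addMember(member);
    }
    return archive.build();
}

// Hypothetical helper: look up one file by name; returns null if it is absent.
string readSource(void[] zipImage, string name)
{
    auto archive = new ZipArchive(zipImage);
    if (auto member = name in archive.directory)
        return cast(string) archive.expand(*member).idup;
    return null;
}

void main()
{
    auto image = packSources(["hello.d": "void main() {}"]);
    assert(readSource(image, "hello.d") == "void main() {}");
    assert(readSource(image, "missing.d") is null);
}
```

The directory lookup is by member name, so a compiler or tool would still need its own mapping from import paths to archive entries.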
June 05, 2013
On Mon, 03 Jun 2013 20:44:04 -0700,
Walter Bright <newshound2@digitalmars.com> wrote:

> Comments welcome.

LZW is a nice and fast general purpose algorithm and I welcome its addition to Phobos to build file format readers from it (MS-DOS compress, GIF, TIFF) or even just to compress data on the fly in RAM. Most people seem to have moved on to zlib though for pretty much anything else.

Actually, I just happened to attempt something similar. Influenced by your talk about modularity and bioinfornatic's micro-benchmarking of reading FASTA files, I am trying to wrap up the concepts of bit streams and the algorithms that process them. But some of my design goals are different:

a) Not-Invented-Here must take precedence. :D
b) There is no other measure than bytes/second.
c) Every algorithm must run in its own thread for maximal
   parallelism. (like Unix process piping)

So it is not about parallel algorithms, but building processing pipelines that work like Unix where only circular buffers need to be shared from one algorithm to the next.
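
To make the "circular buffer between two stages" idea concrete, here is a minimal single-threaded sketch. It is an assumption-laden illustration, not the design described above and not the CircularBuffer from std.compress: `RingBuffer` and its members are hypothetical names, and the synchronization a real producer/consumer pair of threads would need is deliberately left out.

```d
// Fixed-capacity byte ring buffer (hypothetical; no thread safety).
struct RingBuffer(size_t capacity)
{
    private ubyte[capacity] data;
    private size_t head;   // index of the next byte to read
    private size_t count;  // number of bytes currently stored

    bool empty() const { return count == 0; }
    bool full() const { return count == capacity; }

    // Producer side: returns false when the buffer is full.
    bool put(ubyte b)
    {
        if (full) return false;
        data[(head + count) % capacity] = b;
        ++count;
        return true;
    }

    // Consumer side: returns false when the buffer is empty.
    bool get(out ubyte b)
    {
        if (empty) return false;
        b = data[head];
        head = (head + 1) % capacity;
        --count;
        return true;
    }
}

void main()
{
    RingBuffer!4 rb;
    ubyte[] input = [10, 20, 30, 40];
    foreach (b; input)
        assert(rb.put(b));
    assert(!rb.put(50));      // full: the producer would have to wait

    ubyte b;
    assert(rb.get(b) && b == 10);
    assert(rb.put(50));       // the write index wraps around to slot 0
}
```

In a threaded pipeline the two `false` returns are the points where the producer or the consumer would block instead of failing.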

On Mon, 3 Jun 2013 23:40:06 -0700,
Timothee Cour <thelastmammoth@gmail.com> wrote:

> A)
> there already is std.zlib; why not have:
> std.compress.zlib: public import std.zlib
> std.compress.lzw: put this new module there instead of in std.compress
> std.compress.image.png
> std.compress.image.jpg

Yes and no. Compression algorithms should be in std.compress and share the same API, but image file formats in std.image.* or std.fileformat.*. You don't look into std.compress when you want to open *.bmps and *.jpgs.

On Tue, 04 Jun 2013 01:00:03 -0700,
Walter Bright <newshound2@digitalmars.com> wrote:

> On 6/3/2013 11:40 PM, Timothee Cour wrote:
> > D)
> > CircularBuffer belongs somewhere else; maybe std.range or std.container
> 
> I have mixed feelings about that. If you'll notice, std.compress doesn't have any imports! I wanted to make at least one module that doesn't pull in 100% of everything in Phobos (one of my pet peeves).

I have nothing to add to the discussion on THAT matter, but
a compromise should be found between a few massive imports (D)
and hundreds of tiny imports (Java). :)

-- 
Marco

June 05, 2013
On Tue, 04 Jun 2013 17:58:01 -0700,
Walter Bright <newshound2@digitalmars.com> wrote:

> On 6/4/2013 5:13 PM, Marco Leise wrote:
> > Probably seek time if the files are scattered and not in cache. That's hardly a show stopper unless you have 17,156 files like the Java Runtime. But they 'solved' it by zipping them up.
> 
> 
> Actually, I've often thought of making dmd able to read everything it needs out of a zip file.

That would have been difficult for editors and IDEs that can look up file names from include paths only when they are not zipped up. It is good the way it is.

-- 
Marco

June 05, 2013
On 06/04/2013 11:52 AM, Peter Alexander wrote:
> Well, the fix is currently in an unapproved DIP. I have no idea whether
> Walter intends to accept it or reject it. The discussion thread just
> seems to have died off.
>
> http://wiki.dlang.org/DIP22

I should really submit some ideas from my implementation to the DIP.
https://github.com/D-Programming-Language/dmd/pull/739

June 05, 2013
On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
> On 6/4/2013 11:43 AM, Timothee Cour wrote:
>> writing generic code.
>> same reason as why we prefer:
>> auto y=to!double(x) over auto y=to_double(x);
>
> The situations aren't comparable. The to!double case is parameterizing with a type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY NOTHING but turn around and call lzw. It adds nothing.

That "absolutely" based on limited personal experience is D's biggest problem.
June 05, 2013
On 6/5/13 12:44 AM, Max Samukha wrote:
> On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
>> On 6/4/2013 11:43 AM, Timothee Cour wrote:
>>> writing generic code.
>>> same reason as why we prefer:
>>> auto y=to!double(x) over auto y=to_double(x);
>>
>> The situations aren't comparable. The to!double case is parameterizing
>> with a type, the compress one is not. Secondly, compress(lzw) does
>> ABSOLUTELY NOTHING but turn around and call lzw. It adds nothing.
>
> That "absolutely" based on limited personal experience is D's biggest
> problem.

It's a point, but "biggest" is also kind of too much and based on limited personal experience :o).

Andrei
June 05, 2013
On Wednesday, 5 June 2013 at 04:54:46 UTC, Andrei Alexandrescu wrote:
> On 6/5/13 12:44 AM, Max Samukha wrote:
>> On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
>>> On 6/4/2013 11:43 AM, Timothee Cour wrote:
>>>> writing generic code.
>>>> same reason as why we prefer:
>>>> auto y=to!double(x) over auto y=to_double(x);
>>>
>>> The situations aren't comparable. The to!double case is parameterizing
>>> with a type, the compress one is not. Secondly, compress(lzw) does
>>> ABSOLUTELY NOTHING but turn around and call lzw. It adds nothing.
>>
>> That "absolutely" based on limited personal experience is D's biggest
>> problem.
>
> It's a point, but "biggest" is also kind of too much and based on limited personal experience :o).
>
> Andrei

Yeah, I noticed that.

June 05, 2013
On 6/4/2013 9:44 PM, Max Samukha wrote:
> On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
>> On 6/4/2013 11:43 AM, Timothee Cour wrote:
>>> writing generic code.
>>> same reason as why we prefer:
>>> auto y=to!double(x) over auto y=to_double(x);
>>
>> The situations aren't comparable. The to!double case is parameterizing with a
>> type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY NOTHING
>> but turn around and call lzw. It adds nothing.
>
> That "absolutely" based on limited personal experience is D's biggest problem.

I've seen an awful lot of abstractions over the years that provided zero value.

You need to provide a compelling use case to justify another layer of complexity. "generic code" is not a compelling use case. It's already generic.

Note how these components are to be used:

    src.lzwCompress.copy(dst);

Your proposal is:

    src.compress(lzw).copy(dst);

I.e. zero value, as so far all compress() does is call lzw().

The whole point of range-based pipeline programming is you can just plug in different components. There is no demonstrated use case for adding another layer.

I am actually wrong in saying it has zero value. It has negative value :-)
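
For readers who want to see the two call shapes side by side, here is a small runnable sketch. The proposed std.compress is not used: `toyCompress` is a hypothetical stand-in stage (it only XORs bytes), and `compress` is written as a template taking the stage by alias, which is one assumed way such a forwarding layer could look rather than anything actually proposed in this thread.

```d
import std.algorithm : copy, map;
import std.range : isInputRange;

// Hypothetical stand-in for a compressor stage (NOT the proposed lzwCompress):
// it just XORs every byte so that the pipeline has something to do.
auto toyCompress(R)(R src) if (isInputRange!R)
{
    return src.map!(b => cast(ubyte)(b ^ 0x5A));
}

// The extra layer under discussion: it only forwards to the given stage.
auto compress(alias stage, R)(R src)
{
    return stage(src);
}

void main()
{
    ubyte[] src = [1, 2, 3, 4];
    auto direct = new ubyte[](src.length);
    auto wrapped = new ubyte[](src.length);

    src.toyCompress.copy(direct);            // component plugged in directly
    src.compress!toyCompress.copy(wrapped);  // same component behind a forwarder

    assert(direct == wrapped);
}
```

Both pipelines produce identical output, which is the point being argued: the forwarder changes the spelling of the call, not what the pipeline does.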
June 05, 2013
On Wednesday, 5 June 2013 at 06:18:54 UTC, Walter Bright wrote:
> On 6/4/2013 9:44 PM, Max Samukha wrote:
>> On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
>>> On 6/4/2013 11:43 AM, Timothee Cour wrote:
>>>> writing generic code.
>>>> same reason as why we prefer:
>>>> auto y=to!double(x) over auto y=to_double(x);
>>>
>>> The situations aren't comparable. The to!double case is parameterizing with a
>>> type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY NOTHING
>>> but turn around and call lzw. It adds nothing.
>>
>> That "absolutely" based on limited personal experience is D's biggest problem.
>
> I've seen an awful lot of abstractions over the years that provided zero value.

I understand. But I've also seen a lot of abstractions over the years that seemed useless initially but were discovered to be extremely useful later (Bayes' theorem is an example - it took 300 years to find a concrete use for it). So "a compelling use case" is not a sufficient criterion for evaluating the usefulness of abstractions.

>
> You need to provide a compelling use case to justify another layer of complexity. "generic code" is not a compelling use case. It's already generic.
>
> Note how these components are to be used:
>
>     src.lzwCompress.copy(dst);
>
> Your proposal is:
>
>     src.compress(lzw).copy(dst);
>
> I.e. zero value, as so far all compress() does is call lzw().

That's not my proposal. Honestly, I didn't even take a close look at it. I just felt like it was time to attack you - you did give explicit permission for casual trolling.

>
> The whole point of range-based pipeline programming is you can just plug in different components. There is no demonstrated use case for adding another layer.

Ok.

>
> I am actually wrong in saying it has zero value. It has negative value :-)

In this particular case, maybe.
June 05, 2013
What I suggested in my original post didn't involve any indirection/abstraction; simply a renaming to be consistent with existing zlib (see my points A+B in my 1st post on this thread):

std.compress.zlib.compress
std.compress.zlib.uncompress
std.compress.lzw.compress
std.compress.lzw.uncompress

It's the same reason we have std.file.write, std.stdio.write, etc., and not std.fileWrite, std.stdioWrite.
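
A short sketch of how same-named functions in different modules stay unambiguous at the call site, using the std.zlib that exists today (std.compress.zlib and std.compress.lzw are only proposed names and are not used here). `roundTrip` is a hypothetical helper for the example.

```d
static import std.zlib;   // names must be written fully qualified
import zlib = std.zlib;   // or reached through a short renamed import

// Hypothetical helper: compress, then uncompress, and return the result.
ubyte[] roundTrip(ubyte[] data)
{
    auto packed = std.zlib.compress(data);        // fully qualified call
    return cast(ubyte[]) zlib.uncompress(packed); // renamed-import call
}

void main()
{
    auto data = new ubyte[](256);
    data[] = 0x61;  // 256 copies of 'a'
    assert(roundTrip(data) == data);
}
```

With this style, a second module exposing its own compress/uncompress pair could be imported next to std.zlib, and the module prefix would say which one is meant.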

On Tue, Jun 4, 2013 at 11:18 PM, Walter Bright <newshound2@digitalmars.com> wrote:

> On 6/4/2013 9:44 PM, Max Samukha wrote:
>
>> On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
>>
>>> On 6/4/2013 11:43 AM, Timothee Cour wrote:
>>>
>>>> writing generic code.
>>>> same reason as why we prefer:
>>>> auto y=to!double(x) over auto y=to_double(x);
>>>>
>>>
>>> The situations aren't comparable. The to!double case is parameterizing with a
>>> type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY NOTHING
>>> but turn around and call lzw. It adds nothing.
>>>
>>
>> That "absolutely" based on limited personal experience is D's biggest problem.
>>
>
> I've seen an awful lot of abstractions over the years that provided zero value.
>
> You need to provide a compelling use case to justify another layer of complexity. "generic code" is not a compelling use case. It's already generic.
>
> Note how these components are to be used:
>
>     src.lzwCompress.copy(dst);
>
> Your proposal is:
>
>     src.compress(lzw).copy(dst);
>
> I.e. zero value, as so far all compress() does is call lzw().
>
> The whole point of range-based pipeline programming is you can just plug in different components. There is no demonstrated use case for adding another layer.
>
> I am actually wrong in saying it has zero value. It has negative value :-)
>