June 05, 2013
On Wednesday, 5 June 2013 at 12:55:50 UTC, Andrei Alexandrescu wrote:
> On 6/5/13 2:55 AM, Timothee Cour wrote:
>> What I suggested in my original post didn't involve any
>> indirection/abstraction; simply a renaming to be consistent with
>> existing zlib (see my points A+B in my 1st post on this thread):
>>
>> std.compress.zlib.compress
>> std.compress.zlib.uncompress
>> std.compress.lzw.compress
>> std.compress.lzw.uncompress
>
> I think that's nice.

+1. D has many powerful features for handling module namespacing (e.g. "import lzw = std.compress.lzw"); let's enable people to make use of them.

David
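
For anyone unfamiliar with renamed imports, here is a minimal sketch of the idea. The std.compress.* modules named above are only a proposal, so this assumes the existing std.zlib instead; roundTrip is a made-up helper name used purely for illustration.

```d
// Renamed import: std.zlib's symbols are reachable only through the
// local name `zlib`, so calls read as zlib.compress / zlib.uncompress.
import zlib = std.zlib;

// Hypothetical helper: compress a buffer and expand it again.
const(ubyte)[] roundTrip(const(ubyte)[] data)
{
    ubyte[] packed = zlib.compress(data);              // std.zlib.compress
    return cast(const(ubyte)[]) zlib.uncompress(packed); // std.zlib.uncompress
}

void main()
{
    auto text = cast(const(ubyte)[]) "to be or not to be, that is the question";
    assert(roundTrip(text) == text);
}
```

With the naming proposed above, a second algorithm would presumably be used the same way (lzw.compress, lzw.uncompress) without the two pairs of function names colliding.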
June 05, 2013
On Wednesday, 5 June 2013 at 14:17:43 UTC, David Nadlinger wrote:
> On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
>> Although I think you're right about having smaller modules, I generally find it easier to browse through a larger file than many smaller files.
>>
>> Multiple files is ok if you know what you're looking for (grep) but when you're just trying to scan across a system to get a feel for how it's working, juggling many files is a real pita.
>
> Use an editor with a file tree sidebar? Quite the contrary, I find many files to be much preferable, because you automatically have "bookmarks" in the source to come back to, and having the functionality already grouped in manageable logical units saves you from inferring that structure again, as is the case when scrolling through a huge file.
>
> On a lighter note, if it's really a problem for you that module files are too small, what about just concatenating all the files in a given directory using a little shell magic? ;)
>
> David

Agreed.

To be honest, it's a trivial matter easily solved by a variety of tools, but I'm often just lazy and end up reading code with gedit or similar.
June 05, 2013
On Wednesday, 5 June 2013 at 04:54:46 UTC, Andrei Alexandrescu wrote:
>> That "absolutely" based on limited personal experience is D's
>> biggest problem.
>
> It's a point, but "biggest" is also kind of too much and based on limited personal experience :o).
>
> Andrei

Hey, if you ever need someone who can reliably answer with limited personal experience, I'm available. :-)
June 05, 2013
On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
> We have a standard library in disagreement with the language's encapsulation mechanics. The module/package system in D is almost ignored in Phobos (and that's probably why the package system still has all these little things needing ironing out). It seems to owe influence to typical C and C++ library structure, which is simply suboptimal in D's module system.

I honestly don't see how Phobos is in disagreement with the module system. No, it doesn't use hierarchy as much as it should, and there are a few modules that are overly large (like std.algorithm or std.datetime), but for the most part, I don't see any problem with its level of encapsulation. It's mainly just its organization which could have been better. My primary objection here is that it seems ridiculous to me to create lots of tiny modules. I hate how Java does that sort of thing, but there you're _forced_ to in many cases, whereas we have the opportunity to actually group things together in a single module where appropriate. And having whole modules with only one or two functions is way too small IMHO, and that seems to be what we're proposing here.

- Jonathan M Davis
June 05, 2013
On Wednesday, 5 June 2013 at 17:21:01 UTC, Jonathan M Davis wrote:
> On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
>> We have a standard library in disagreement with the language's
>> encapsulation mechanics. The module/package system in D is almost
>> ignored in Phobos (and that's probably why the package system
>> still has all these little things needing ironing out). It seems
>> to owe influence to typical C and C++ library structure, which is
>> simply suboptimal in D's module system.
>
> I honestly don't see how Phobos is in disagreement with the module system. No,
> it doesn't use hierarchy as much as it should, and there are a few modules
> that are overly large (like std.algorithm or std.datetime), but for the most
> part, I don't see any problem with its level of encapsulation. It's mainly
> just its organization which could have been better. My primary objection here
> is that it seems ridiculous to me to create lots of tiny modules. I hate how Java
> does that sort of thing, but there you're _forced_ to in many cases, whereas
> we have the opportunity to actually group things together in a single module
> where appropriate. And having whole modules with only one or two functions is
> way too small IMHO, and that seems to be what we're proposing here.
>
> - Jonathan M Davis

I agree that one or two functions per module is far too small, but I'm in favour of having only one or two top-level classes/structs per module (there will be exceptions, but as a general rule).

For example:
std.regex - I think it would be better if each implementation had its own module, plus a separate module for the parts common to all of them. Importing std.regex would publicly import the lot using the new package system.

std.range - one module for the tests (i.e. isXXX and hasXXX), one for the algorithms (i.e. retro, take, etc.), and one for the class wrappers

std.datetime - split each class/struct into its own module; SysTime alone is ~8000 lines
June 05, 2013
05-Jun-2013 16:16, Tiago Martinez wrote:
> On Tuesday, 4 June 2013 at 03:44:05 UTC, Walter Bright wrote:
>> https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d
>>
>> I wrote this to add components to compress and expand ranges.
>>
>> Highlights:
>>
>> 1. doesn't do any memory allocation
>> 2. can handle arbitrarily large sets of data
>> 3. it's lazy
>> 4. takes an InputRange, and outputs an InputRange
>>
>> Comments welcome.
>
> I may have misunderstood something, but the code does not
> implement LZW (a variant of LZ78), but a variant of LZ77 (i.e.
> deflate/ZIP).

+1
I thought to chime in with this too, keywords are:
sliding window ===> LZ77
dictionary ===> LZW

> See https://en.wikipedia.org/wiki/LZ77_and_LZ78


-- 
Dmitry Olshansky
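
To make the keyword distinction concrete, here is a toy dictionary-based (LZW-style) encoder. It is only an illustrative sketch and not the code under review in this thread; lzwEncode is a hypothetical name, and it emits plain integer codes rather than packed bits.

```d
import std.stdio : writeln;

// Toy LZW encoder: the "dictionary" maps strings seen so far to codes.
uint[] lzwEncode(string input)
{
    uint[string] dict;
    foreach (i; 0 .. 256)
        dict[[cast(char) i].idup] = cast(uint) i; // seed with every single byte

    uint nextCode = 256;
    string w;        // longest dictionary match found so far
    uint[] output;

    foreach (char c; input)
    {
        string wc = w ~ c;
        if (wc in dict)
        {
            w = wc;                 // keep extending the current match
        }
        else
        {
            output ~= dict[w];      // emit the code of the longest known prefix
            dict[wc] = nextCode++;  // learn the new, longer string
            w = [c].idup;           // restart matching from the current byte
        }
    }
    if (w.length)
        output ~= dict[w];          // flush the final match
    return output;
}

void main()
{
    writeln(lzwEncode("ABABABA")); // prints [65, 66, 256, 258]
}
```

An LZ77-style coder has no such explicit dictionary: it emits (offset, length) references into a sliding window over the most recent input.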
June 05, 2013
On 6/5/2013 10:46 AM, Dmitry Olshansky wrote:
> 05-Jun-2013 16:16, Tiago Martinez wrote:
>> I may have misunderstood something, but the code does not
>> implement LZW (a variant of LZ78), but a variant of LZ77 (i.e.
>> deflate/ZIP).
>
> +1
> I thought to chime in with this too, keywords are:
> sliding window ===> LZ77
> dictionary ===> LZW
>
>> See https://en.wikipedia.org/wiki/LZ77_and_LZ78

Thanks, you're both right.

June 05, 2013
On Wed, Jun 05, 2013 at 01:20:48PM -0400, Jonathan M Davis wrote:
> On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
> > We have a standard library in disagreement with the language's encapsulation mechanics. The module/package system in D is almost ignored in Phobos (and that's probably why the package system still has all these little things needing ironing out). It seems to owe influence to typical C and C++ library structure, which is simply suboptimal in D's module system.
> 
> I honestly don't see how Phobos is in disagreement with the module system. No, it doesn't use hierarchy as much as it should, and there are a few modules that are overly large (like std.algorithm or std.datetime), but for the most part, I don't see any problem with its level of encapsulation. It's mainly just its organization which could have been better. My primary objection here is that it seems ridiculous to me to create lots of tiny modules. I hate how Java does that sort of thing, but there you're _forced_ to in many cases, whereas we have the opportunity to actually group things together in a single module where appropriate. And having whole modules with only one or two functions is way too small IMHO, and that seems to be what we're proposing here.
[...]

As Andrei pointed out, I think we need to look at this not from a size perspective (number of lines, number of functions, etc.), but from an API perspective: do these functions/structs belong together, or are they only marginally related? More precisely, if some user code uses function X, is that code equally likely to also use Y? Are there common use cases in which only Y is used, not X?

If the use of function X almost always implies the use of function Y (and vice versa), then they belong in the same module. Otherwise, I'd say they are candidates for splitting up.

If function X uses function Z, and function Y also uses function Z, but the use of X does not necessarily imply the use of Y (and vice versa), then I'd argue that X, Y, and Z should be in separate modules to maximize reuse and reduce the amount of code you have to pull in (you shouldn't be forced to pull in Y just because you use X, which calls Z, which Y happens to also call).

This may be a bit heavy-handed for user code, but for Phobos, the standard library, I think the bar should be set higher. After all, one of the stated goals of Phobos is that you shouldn't need to pull in a whole ton of code just because you call a single function. Right now I think we're a bit short of that goal.


T

-- 
All men are mortal. Socrates is mortal. Therefore all men are Socrates.
June 05, 2013
On Wednesday, 5 June 2013 at 07:00:14 UTC, Jonathan M Davis wrote:
> So, you want to create whole modules for each compression algorithm? That
> seems like overkill to me. What Walter currently has isn't even 1000 lines
> long (and that's including the CircularBuffer helper struct). Splitting it up
> like that seems like over-modularization to me.
>
> - Jonathan M Davis

Well, as the author of a 15,000-line datetime module, I think your opinion is a little biased.

*I* think 1,000 lines is a perfect size for a module.
June 05, 2013
On 04.06.2013 22:20, Walter Bright wrote:
> On 6/4/2013 12:41 PM, Peter Alexander wrote:
>> I think this is over-engineering. It's unlikely that an application will need to support multiple compression algorithms in the same piece of code, and even if it did, it would be trivial to implement this on top of the simple interface that Walter is using.
>
> Yup. My experience with abstractions that have no use cases is all the
> wrong things get abstracted. And by my experience, I include every one
> I've seen other people write as well as my own.
>
> My favorite is windows.h. It was originally written for 16 bit Windows,
> and had all kinds of abstractions to make it portable for a future 32
> bit Windows. Unfortunately, apparently nobody working on windows.h had
> any experience with 32 bit code, and the abstractions turned out to be
> all wrong.

Yep, it brings back some memories.