std.compress (page 8)

On Wed, 05 Jun 2013 11:17:06 +0100, Jacob Carlborg <doob@me.com> wrote:
> On 2013-06-05 02:58, Walter Bright wrote:
>
>> Actually, I've often thought of making dmd able to read everything it
>> needs out of a zip file.
>
> I think it's better to have a proper package manager.

I think it's better to have both :)

R

On 2013-06-05 09:38, Jonathan M Davis wrote:
> Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we
> start making modules that small, we're going to end up with tons of them to
> wade through to find anything.

I completely agree with Walter, and he made my point a lot better than I could.

--
/Jacob Carlborg

On Wednesday, 5 June 2013 at 07:00:14 UTC, Jonathan M Davis wrote:
> So, you want to create whole modules for each compression algorithm? That
> seems like overkill to me. What Walter currently has isn't even 1000 lines
> long (and that's including the CircularBuffer helper struct). Splitting it up
> like that seems like over-modularization to me.

Modules are the unit of encapsulation in D (private), so they should always be as small as possible.

As Andrei would say: Destroyed?

David
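To make the encapsulation point concrete, here is a minimal sketch (not taken from the thread; `Counter` and `peek` are hypothetical names) showing that `private` in D restricts access per module rather than per type. Any code in the same file can reach a private field, which is why a large module offers weaker encapsulation than several small ones.

```d
module counter_demo; // hypothetical module name, for illustration only

struct Counter
{
    private int count; // private to the *module*, not to the struct

    void increment() { ++count; }
}

// A free function in the same module can still read the private field.
int peek(ref const Counter c)
{
    return c.count;
}

void main()
{
    Counter c;
    c.increment();
    c.count += 10; // compiles: main lives in the same module as Counter
    assert(peek(c) == 11);
}
```

If `Counter` were moved into its own module, both `peek` and the direct write in `main` would be rejected by the compiler unless they moved with it.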

On Wednesday, 5 June 2013 at 08:11:14 UTC, Walter Bright wrote:
> On 6/5/2013 12:38 AM, Jonathan M Davis wrote:
>> Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we
>> start making modules that small, we're going to end up with tons of them to
>> wade through to find anything.
>
> 1. It isn't any harder to find things in multiple files than in one file.

Although I think you're right about having smaller modules, I generally find it easier to browse through a larger file than many smaller files.

Multiple files is ok if you know what you're looking for (grep) but when you're just trying to scan across a system to get a feel for how it's working, juggling many files is a real pita.

On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
> On Wednesday, 5 June 2013 at 08:11:14 UTC, Walter Bright wrote:
>> On 6/5/2013 12:38 AM, Jonathan M Davis wrote:
>>> Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we
>>> start making modules that small, we're going to end up with tons of them to
>>> wade through to find anything.
>>
>> 1. It isn't any harder to find things in multiple files than in one file.
>
> Although I think you're right about having smaller modules, I generally find it easier to browse through a larger file than many smaller files.
>
> Multiple files is ok if you know what you're looking for (grep) but when you're just trying to scan across a system to get a feel for how it's working, juggling many files is a real pita.

Surely you would know which compression algorithm you wanted to change? If it's a general renaming or something not specific to a particular use then a file search is necessary anyway.

June 05, 2013

Re: std.compress

Posted by Jakob Ovrum
in reply to Jonathan M Davis

On Wednesday, 5 June 2013 at 07:39:12 UTC, Jonathan M Davis wrote:
> Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we
> start making modules that small, we're going to end up with tons of them to
> wade through to find anything.
>
> - Jonathan M Davis

We have a standard library in disagreement with the language's encapsulation mechanics. The module/package system in D is almost ignored in Phobos (and that's probably why the package system still has all these little things needing ironing out). It seems to owe influence to typical C and C++ library structure, which is simply suboptimal in D's module system.

Third-party libraries tend to do a much better job at this. For example, Tango goes all out and embraces the package and module system, and the result is an extremely organized tree of modules with appropriate granularity. Code isn't hard to find because everything isn't just dumped into (bloated) blobs in a flat structure like in Phobos; it's organized into a tree. It seems like a no-brainer with the D language, and Phobos is the only D library I know that doesn't embrace this style of organization. The result is awful coupling throughout; with Phobos, we can't even write Hello World without pulling in half of the standard library.

It's not just about the actual dependencies a module has, but the perceived dependencies; important from a readability perspective. I know a lot of D programmers embrace selective imports when working with Phobos, because just seeing a plain import statement such as "import std.datetime;" tells you very little about what the importing module actually does, and it's harder to figure out exactly where unqualified symbols come from when reading the module's code.
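As a small illustration of the selective imports mentioned above (a sketch, not code from the original post; the `squares` helper is a hypothetical name), each import line names exactly the symbols the module uses, so a reader can tell at a glance where `map`, `array` and `writeln` come from:

```d
import std.algorithm : map;   // only `map` is pulled into scope
import std.array : array;     // only `array`
import std.stdio : writeln;   // only `writeln`

/// Returns a new array holding the square of each element of `xs`.
int[] squares(int[] xs)
{
    return xs.map!(x => x * x).array;
}

void main()
{
    auto result = squares([1, 2, 3]);
    assert(result == [1, 4, 9]);
    writeln(result);
}
```

Compare this with a plain `import std.algorithm;`, which compiles the same program but says nothing about which of the module's many symbols are actually in use.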

I think the programmer should have a choice of convenience versus readability/fine dependency management when importing. The current module system does a decent job at enabling this already, and it's bound to get better with improvements like DIP37. Scripts and certain application code may want to prioritize productivity over finely managed dependencies, while library code - especially the *standard* library! - should definitely aim for lean coupling that makes sense.

To that end, I think a lot of improvements can be made without breaking user code, but I'd be very much willing to see all kinds of breakage if it means we can get rid of the present standard library of substandard quality. The language may have been declared stable, but Phobos is in no laudable state.

On Tuesday, 4 June 2013 at 03:44:05 UTC, Walter Bright wrote:
> https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d
>
> I wrote this to add components to compress and expand ranges.
>
> Highlights:
>
> 1. doesn't do any memory allocation
> 2. can handle arbitrarily large sets of data
> 3. it's lazy
> 4. takes an InputRange, and outputs an InputRange
>
> Comments welcome.

I may have misunderstood something, but the code does not implement LZW (a variant of LZ78); it implements a variant of LZ77 (the scheme underlying deflate/ZIP). See https://en.wikipedia.org/wiki/LZ77_and_LZ78
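For readers unfamiliar with the range style described in the highlights, the following is a minimal sketch of a lazy adapter that takes an InputRange and yields an InputRange without allocating. It is not Walter's code and it is neither LZW nor LZ77: it uses simple run-length encoding as a stand-in, and the names `RunLength` and `runLength` are hypothetical.

```d
import std.range : isInputRange, ElementType, empty, front, popFront;
import std.typecons : Tuple, tuple;

/// Lazily yields (value, repeat count) pairs from any input range.
/// Only the current run is held in the struct; nothing is heap-allocated.
struct RunLength(R) if (isInputRange!R)
{
    private R source;
    private Tuple!(ElementType!R, size_t) current;
    private bool done;

    this(R source)
    {
        this.source = source;
        advance(); // prime the first run
    }

    @property bool empty() { return done; }
    @property auto front() { return current; }
    void popFront() { advance(); }

    // Consumes one run of equal elements from the source.
    private void advance()
    {
        if (source.empty)
        {
            done = true;
            return;
        }
        auto value = source.front;
        size_t count = 0;
        while (!source.empty && source.front == value)
        {
            ++count;
            source.popFront();
        }
        current = tuple(value, count);
    }
}

/// Convenience constructor so the element type is inferred.
auto runLength(R)(R source) if (isInputRange!R)
{
    return RunLength!R(source);
}

void main()
{
    import std.algorithm : equal;

    auto runs = runLength([1, 1, 1, 2, 2, 3]);
    assert(runs.equal([tuple(1, size_t(3)), tuple(2, size_t(2)), tuple(3, size_t(1))]));
}
```

Because each run is computed only when `popFront` is called, the adapter can sit in front of an arbitrarily large source, which is the property the highlights list claims for the real compressor.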

On 6/5/13 2:55 AM, Timothee Cour wrote:
> What I suggested in my original post didn't involve any
> indirection/abstraction; simply a renaming to be consistent with
> existing zlib (see my points A+B in my 1st post on this thread):
>
> std.compress.zlib.compress
> std.compress.zlib.uncompress
> std.compress.lzw.compress
> std.compress.lzw.uncompress

I think that's nice.

Andrei
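To show how such per-algorithm naming reads at the call site, here is a minimal sketch using the existing `std.zlib` module with a renamed import. The proposed `std.compress.zlib` and `std.compress.lzw` modules are only suggestions in this thread and are not assumed to exist; `roundTrip` is a hypothetical helper.

```d
import zlib = std.zlib; // renamed import: call sites read zlib.compress / zlib.uncompress

/// Compresses `data` and expands it again, returning the restored bytes.
ubyte[] roundTrip(const(ubyte)[] data)
{
    auto packed = zlib.compress(data);
    return cast(ubyte[]) zlib.uncompress(packed);
}

void main()
{
    auto original = cast(const(ubyte)[]) "hello hello hello hello";
    assert(roundTrip(original) == original);
}
```

With a second algorithm exposing the same `compress`/`uncompress` pair under its own module name, switching algorithms would be a matter of changing the import line.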

On Wednesday, 5 June 2013 at 11:57:19 UTC, Diggory wrote:
> On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
>> On Wednesday, 5 June 2013 at 08:11:14 UTC, Walter Bright wrote:
>>> On 6/5/2013 12:38 AM, Jonathan M Davis wrote:
>>>> Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we
>>>> start making modules that small, we're going to end up with tons of them to
>>>> wade through to find anything.
>>>
>>> 1. It isn't any harder to find things in multiple files than in one file.
>>
>> Although I think you're right about having smaller modules, I generally find it easier to browse through a larger file than many smaller files.
>>
>> Multiple files is ok if you know what you're looking for (grep) but when you're just trying to scan across a system to get a feel for how it's working, juggling many files is a real pita.
>
> Surely you would know which compression algorithm you wanted to change? If it's a general renaming or something not specific to a particular use then a file search is necessary anyway.

I was speaking more generally, about Phobos as a whole.

On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
> Although I think you're right about having smaller modules, I generally find it easier to browse through a larger file than many smaller files.
>
> Multiple files is ok if you know what you're looking for (grep) but when you're just trying to scan across a system to get a feel for how it's working, juggling many files is a real pita.

Use an editor with a file tree sidebar?

Quite the contrary, I find many files to be much preferable, because you automatically have "bookmarks" in the source to come back to, and having the functionality already grouped in manageable logical units saves you from inferring that structure again, as is the case when scrolling through a huge file.

On a lighter note, if it's really a problem for you that module files are too small, what about just concatenating all the files in a given directory using a little shell magic? ;)

David
