February 19, 2014 Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Hi there, I'm new to D and have a lot of learning ahead of me. It would be extremely helpful to me if someone with D experience could show me some code examples.

I'd like to neatly read and write gzipped files for my work. I have read several threads on these forums on the topic of std.zlib or std.zip and I haven't been able to figure it out.

Here's a Python script that does what I want. Can you please show me an example in D that does the same thing?

<code>
#!/usr/bin/env python

import gzip

# Read a gzipped file and print the contents line by line.
with gzip.open("input.gz") as stream:
    for line in stream:
        print line

# Write some text to a gzipped file.
with gzip.open("output.gz", "w") as stream:
    stream.write("some output goes here\n")
</code>

I have a second request. I would like to start using D more in my work, and in particular I would like to use and extend the BioD library. Artem Tarasov made a nice module to handle BGZF, and I would like to see an example like my Python code above using Artem's module.

Read more about BGZF:
http://blastedbio.blogspot.com/2011/11/bgzf-blocked-bigger-better-gzip.html

BioD:
https://github.com/biod/BioD/blob/d2bea0a0da63eb820fcf11ae367456b2c367ec04/bio/core/bgzf/compress.d
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kamil Slowikowski
| Wow, that's unexpected :)
Unfortunately, there's no standard module for processing gzip/bz2. The former can be handled with etc.c.zlib, but there's no convenient interface for working with a file as a stream. Thus, the easiest way that I know of is as follows:
import std.stdio, std.process;
// replace gunzip with pigz if you wish
auto pipe = pipeShell("gunzip -c " ~ filename);
File input = pipe.stdout;
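For readers who want both directions of the pipe approach, here is a minimal sketch. It assumes `gunzip` and `gzip` are installed and on the PATH; the function names `readGzipLines` and `writeGzipViaPipe` are made up for illustration and are not part of Phobos. It passes the arguments as an array rather than a shell string, so file names with spaces need no quoting.

```d
import std.exception : enforce;
import std.process : pipe, pipeProcess, spawnProcess, wait, Redirect;
import std.stdio : File;

/// Reads a gzipped text file line by line through an external `gunzip`.
/// (Hypothetical helper; assumes `gunzip` is on the PATH.)
string[] readGzipLines(string gzPath)
{
    // Only the child's stdout is redirected; stdin and stderr are inherited.
    auto pipes = pipeProcess(["gunzip", "-c", gzPath], Redirect.stdout);
    string[] lines;
    foreach (line; pipes.stdout.byLine)
        lines ~= line.idup; // byLine reuses its buffer, so copy each line
    enforce(wait(pipes.pid) == 0, "gunzip failed on " ~ gzPath);
    return lines;
}

/// Writes `text` to a gzipped file through an external `gzip`.
/// (Hypothetical helper; assumes `gzip` is on the PATH.)
void writeGzipViaPipe(string gzPath, string text)
{
    auto p = pipe();
    // gzip reads from the pipe and writes compressed bytes straight to the file.
    auto pid = spawnProcess(["gzip", "-c"], p.readEnd, File(gzPath, "wb"));
    p.writeEnd.write(text);
    p.writeEnd.close(); // signals end of input to gzip
    enforce(wait(pid) == 0, "gzip failed on " ~ gzPath);
}
```

Calling `writeGzipViaPipe("output.gz", "some output goes here\n")` and then looping over `readGzipLines("output.gz")` mirrors the two halves of the Python script. Reading everything into an array keeps the sketch short; for large files, iterating over `pipes.stdout.byLine` directly avoids holding the whole file in memory.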
Regarding your second request, this forum is not an appropriate place to provide usage examples for a library, so that will go into a private e-mail.
On Wed, Feb 19, 2014 at 7:51 PM, Kamil Slowikowski <kslowikowski@gmail.com> wrote:
>
> I have a second request. I would like to start using D more in my work, and in particular I would like to use and extend the BioD library. Artem Tarasov made a nice module to handle BGZF, and I would like to see an example like my Python code above using Artem's module.
>
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kamil Slowikowski | On Wednesday, 19 February 2014 at 15:51:53 UTC, Kamil Slowikowski wrote:
> I'd like to neatly read and write gzipped files for my work. I have read
> several threads on these forums on the topic of std.zlib or std.zip and I haven't been able to figure it out.
It is not part of the standard library, but you may want to have a look at the GzipInputStream in vibeD.
http://vibed.org/api/vibe.stream.zlib/GzipInputStream
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Craig Dillabaugh | On Wednesday, 19 February 2014 at 16:32:54 UTC, Craig Dillabaugh wrote:
> On Wednesday, 19 February 2014 at 15:51:53 UTC, Kamil Slowikowski wrote:
>
> It is not part of the standard library, but you may want to have a look at the GzipInputStream in vibeD.
>
> http://vibed.org/api/vibe.stream.zlib/GzipInputStream
Also meant to add, this thread belongs in the D.learn forum rather than here.
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Artem Tarasov | On Wednesday, 19 February 2014 at 16:27:32 UTC, Artem Tarasov wrote:
> Unfortunately, there's no standard module for processing gzip/bz2.
std.zlib handles gzip but it doesn't present a file nor range interface over it.
This will work though:
void main() {
    import std.zlib;
    import std.stdio;
    auto uc = new UnCompress();

    foreach(chunk; File("testd.gz").byChunk(1024)) {
        auto uncompressed = uc.uncompress(chunk);
        writeln(cast(string) uncompressed);
    }

    // also look at anything left in the buffer
    writeln(cast(string) uc.flush());
}
And if you are writing, use new Compress(HeaderFormat.gzip), then call the compress method and write what it returns to the file.
|
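The writing side described above could look roughly like the sketch below. The helper name `writeGzip` is hypothetical, and the sketch assumes the whole text fits in memory. Note that `flush()` has to be called at the end so the compressor emits the remaining data and the gzip trailer.

```d
import std.stdio : File;
import std.zlib : Compress, HeaderFormat;

/// Writes `text` to `path` as a gzip stream using std.zlib.
/// (Hypothetical helper, not part of Phobos.)
void writeGzip(string path, const(char)[] text)
{
    auto compressor = new Compress(HeaderFormat.gzip);
    auto output = File(path, "wb");
    // compress() may buffer internally and return only part of the stream.
    output.rawWrite(cast(const(ubyte)[]) compressor.compress(text));
    // flush() returns whatever is left, including the gzip trailer.
    output.rawWrite(cast(const(ubyte)[]) compressor.flush());
}
```

With this in place, `writeGzip("output.gz", "some output goes here\n")` corresponds to the second half of the Python script. To write in several pieces, call `compress` once per piece and `flush` once at the very end.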
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe
| Ah, indeed. I dismissed it because it allocates on each call, and heavy GC usage in a multithreaded app is a performance killer.
On Wed, Feb 19, 2014 at 8:36 PM, Adam D. Ruppe <destructionator@gmail.com> wrote:
>
> std.zlib handles gzip but it doesn't present a file nor range interface over it.
>
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Artem Tarasov | On Wednesday, 19 February 2014 at 16:27:32 UTC, Artem Tarasov wrote:
> the easiest way that I know of is as follows:
>
> import std.stdio, std.process;
> auto pipe = pipeShell("gunzip -c " ~ filename); // replace with pigz if you wish
> File input = pipe.stdout;
Artem, thank you! I've used a similar trick in the past with Python because calling the system's gzip or pigz through a subprocess pipe is faster than using the Python gzip module. I'm very glad to see how easy it is in D.
> Regarding your second request, this forum is not an appropriate place to
> provide usage examples for a library, so that will go into a private e-mail.
Thanks, again! I'm looking forward to hearing from you :)
@Adam D. Ruppe
Thanks for your example! I couldn't find such an example anywhere on the web.
@Craig Dillabaugh
Please feel free to move the thread, sorry for posting in the wrong place.
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kamil Slowikowski | On Wednesday, 19 February 2014 at 15:51:53 UTC, Kamil Slowikowski wrote:
> Hi there, I'm new to D and have a lot of learning ahead of me. It would
> be extremely helpful to me if someone with D experience could show me
> some code examples.
Welcome, Kamil :)
Feel free to also visit the #d channel on the freenode IRC network.
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kamil Slowikowski | > @Craig Dillabaugh
> Please feel free to move the thread, sorry for posting in the wrong place.
Actually, I believe the thread can't be moved; it is here forever.
Not a big deal though, lots of people new to D post questions here and miss the D.learn forum, so you are not alone. Since I didn't have a good answer to your original question I decided I should let you know about D.learn.
|
February 19, 2014 Re: Read and write gzip files easily. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe | On Wednesday, 19 February 2014 at 16:36:29 UTC, Adam D. Ruppe wrote:
> On Wednesday, 19 February 2014 at 16:27:32 UTC, Artem Tarasov wrote:
>> Unfortunately, there's no standard module for processing gzip/bz2.
>
> std.zlib handles gzip but it doesn't present a file nor range interface over it.
>
> This will work though:
>
> void main() {
> import std.zlib;
> import std.stdio;
> auto uc = new UnCompress();
>
> foreach(chunk; File("testd.gz").byChunk(1024)) {
> auto uncompressed = uc.uncompress(chunk);
> writeln(cast(string) uncompressed);
> }
>
> // also look at anything left in the buffer
> writeln(cast(string) uc.flush());
> }
>
Regrettably, the above code has a bug. Currently, std.zlib stores a reference to the buffer passed to it, and since byChunk reuses the buffer, the code will fail when uncompressing multiple chunks.
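One possible workaround, sketched under the assumption that the buffer reuse described above is the only problem, is to hand std.zlib a private copy of each chunk with `.dup`. The helper name `readGzipLines` and the default chunk size are made up for illustration. The copy costs an allocation per chunk, which matters if GC pressure is a concern.

```d
import std.stdio : File;
import std.string : splitLines;
import std.zlib : UnCompress;

/// Decompresses a gzip file with std.zlib and returns its lines.
/// (Hypothetical helper, not part of Phobos; holds the whole file in memory.)
string[] readGzipLines(string path, size_t chunkSize = 4096)
{
    auto decompressor = new UnCompress();
    string text;
    foreach (chunk; File(path, "rb").byChunk(chunkSize))
    {
        // byChunk reuses its buffer between iterations, so pass a copy.
        text ~= cast(string) decompressor.uncompress(chunk.dup);
    }
    // Pick up anything still held inside the decompressor.
    text ~= cast(string) decompressor.flush();
    // Chunk boundaries do not align with line boundaries, so split afterwards.
    return text.splitLines;
}
```

Looping over `readGzipLines("input.gz")` and calling `writeln` on each element then behaves like the first half of the Python script. Splitting only after all chunks are joined avoids printing a line in two halves when it straddles a chunk boundary.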
|