April 13, 2016
On Wednesday, 13 April 2016 at 16:34:16 UTC, Jon D wrote:
> Thanks Rory, Puming. I'll look into this and see how best to make it fit. I'm realizing also there's one additional capability it'd be nice to have in dub for tools like this, which is an option to install the executables somewhere that can easily be put on the path. Still, even without this there'd be benefit to having them fetched via dub.

You don't need to put anything on the path to run utils from dub packages. `dub run` will take care of setting up the necessary environment (without messing with the system):

dub fetch package_with_apps
dub run package_with_apps:app1 --flags args
April 13, 2016
On Wednesday, 13 April 2016 at 12:36:56 UTC, Dejan Lekic wrote:
> On Tuesday, 12 April 2016 at 00:50:24 UTC, Jon D wrote:
>>
>> I've open sourced a set of command line utilities for manipulating tab-separated value files.
>
> I rarely need TSV files, but I deal with CSV files every day.
> - It would be nice to test your implementation against std.csv (it can use TAB as separator). Did you try to compare the two?

No, I didn't try using the std.csv library utilities. The utilities all take a delimiter, so comma can be specified, but that won't handle CSV escaping.

For myself, I'd be more inclined to add TSV-CSV converters rather than adding native CSV support to each tool, but if you're working with CSV all the time that'd be a nuisance.

If you want, you can try rewriting the inner loop of one of the tools to use csvNextToken rather than algorithm.splitter. tsv-select would be the easiest of the tools to try. It'd also be necessary to replace the writeln for the output to properly add CSV escapes.
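The escaping problem mentioned above can be seen without any of the tsv tools. The following is a minimal sketch using standard `cut` and `awk` as stand-ins for any splitter that treats every delimiter character as a field boundary; the sample record and variable names are made up for illustration.

```shell
# One CSV record with three logical fields. The second field contains a
# comma, so it is quoted (sample data, invented for this illustration).
record='a,"b,c",d'

# Naive delimiter splitting: every comma is treated as a field boundary,
# so the quoted field is cut in half.
naive_second=$(printf '%s\n' "$record" | cut -d, -f2)

# Counting fields the same naive way gives one more field than intended.
naive_count=$(printf '%s\n' "$record" | awk -F, '{ print NF }')

printf 'second field seen by cut: %s\n' "$naive_second"
printf 'fields counted by awk -F,: %s\n' "$naive_count"
```

A CSV-aware reader would report three fields and return `b,c` as the second one, which is why either a converter or a CSV-aware tokenizer is needed for such input.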

--Jon
April 13, 2016
On Wednesday, 13 April 2016 at 17:01:33 UTC, Dicebot wrote:
> On Wednesday, 13 April 2016 at 16:34:16 UTC, Jon D wrote:
>> Thanks Rory, Puming. I'll look into this and see how best to make it fit. I'm realizing also there's one additional capability it'd be nice to have in dub for tools like this, which is an option to install the executables somewhere that can easily be put on the path. Still, even without this there'd be benefit to having them fetched via dub.
>
> You don't need to put anything on the path to run utils from dub packages. `dub run` will take care of setting up the necessary environment (without messing with the system):
>
> dub fetch package_with_apps
> dub run package_with_apps:app1 --flags args

These are command line utilities, along the lines of unix 'cut', 'grep', etc., intended to be used as part of a unix pipeline. It'd be less convenient to invoke them via dub. They really should be on the path themselves.
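To make the pipeline point concrete, here is a small sketch in which each stage is a bare command name found on the path. It uses the standard `grep` and `cut` rather than the tsv utilities themselves, and the sample data and the `sample_tsv` helper are hypothetical.

```shell
# Sample tab-separated data: a header line plus three records
# (invented for this illustration).
sample_tsv() {
    printf 'name\tcolor\tcount\n'
    printf 'apple\tred\t3\n'
    printf 'plum\tpurple\t7\n'
    printf 'cherry\tred\t12\n'
}

# Keep the rows that mention "red", then project columns 1 and 3.
# cut splits on tabs by default, so no delimiter option is needed.
red_counts=$(sample_tsv | grep 'red' | cut -f1,3)

printf '%s\n' "$red_counts"
```

Each stage here is a single short word; wrapping every stage in a `dub run` invocation would make the same pipeline noticeably harder to type and read.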

--Jon
April 13, 2016
On 04/11/2016 08:50 PM, Jon D wrote:
> Hi all,
>
> I've open sourced a set of command line utilities for manipulating
> tab-separated value files. They are complementary to traditional unix
> tools like cut, grep, etc. They're useful for manipulating large data
> files. I use them when prepping files for R and similar tools. These
> tools were part of my 'explore D' programming exercises.
>
> The tools are here: https://github.com/eBay/tsv-utils-dlang
>
> They are likely of interest primarily to people regularly working with
> large files, though others might find the performance benchmarks of
> interest as well (included in the README).
>
> I'd welcome any feedback, either on the apps or the code. Intention is
> that the code be reasonable example programs. And, I may write a blog
> post about my D explorations at some point, they'd be referenced in such
> an article.
>
> --Jon

Looking great. Thanks!

https://www.facebook.com/dlang.org/posts/1275477382465940

https://twitter.com/D_Programming/status/720310640531808261

https://www.reddit.com/r/programming/comments/4ems6a/commandline_utilities_for_large_tabseparated/


Andrei

April 13, 2016
On Wednesday, 13 April 2016 at 17:21:58 UTC, Jon D wrote:
>> You don't need to put anything on the path to run utils from dub packages. `dub run` will take care of setting up the necessary environment (without messing with the system):
>>
>> dub fetch package_with_apps
>> dub run package_with_apps:app1 --flags args
>
> These are command line utilities, along the lines of unix 'cut', 'grep', etc., intended to be used as part of a unix pipeline. It'd be less convenient to invoke them via dub. They really should be on the path themselves.

Sure, that would be beyond dub's scope though. Making binary packages is independent of the build system or source layout (and is highly platform-specific). The `dub run` feature is mostly helpful when you need to use one such tool as part of a build process for another dub package.
April 13, 2016
On Wednesday, 13 April 2016 at 18:22:21 UTC, Dicebot wrote:
> On Wednesday, 13 April 2016 at 17:21:58 UTC, Jon D wrote:
>>> You don't need to put anything on the path to run utils from dub packages. `dub run` will take care of setting up the necessary environment (without messing with the system):
>>>
>>> dub fetch package_with_apps
>>> dub run package_with_apps:app1 --flags args
>>
>> These are command line utilities, along the lines of unix 'cut', 'grep', etc., intended to be used as part of a unix pipeline. It'd be less convenient to invoke them via dub. They really should be on the path themselves.
>
> Sure, that would be beyond dub's scope though. Making binary packages is independent of the build system or source layout (and is highly platform-specific). The `dub run` feature is mostly helpful when you need to use one such tool as part of a build process for another dub package.

Right. So, partly what I'm wondering is whether, during the normal dub fetch/run cycle, there might be an opportunity to print a message to the user with some info to help them add the tools to their path. I haven't used dub much, so I'll have to look into it more. But there should be some way to make it reasonably easy and clear. It'll probably be a few days before I can get to this, but I would like to get them in the package registry.

--Jon
April 13, 2016
On 04/13/2016 09:48 PM, Jon D wrote:
> Right. So, partly what I'm wondering is whether, during the normal dub fetch/run cycle, there might be an opportunity to print a message to the user with some info to help them add the tools to their path. I haven't used dub much, so I'll have to look into it more. But there should be some way to make it reasonably easy and clear. It'll probably be a few days before I can get to this, but I would like to get them in the package registry.

This is the wrong direction. Users of those tools should never need to have dub installed or even know about its existence; dub is strictly a developer tool. Instead, whoever distributes the utils should use dub to build them and use the generated artifacts to prepare a distribution package.
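The build-then-distribute flow described above might look roughly like the following sketch. The `install_tool` helper is hypothetical, and the artifact path and destination directory in the usage comment are assumptions rather than details taken from the project; a real distribution package would be platform-specific.

```shell
# Hypothetical helper: copy an already-built executable into a directory
# that is (or will be) on the user's PATH.
install_tool() {
    src=$1        # path to the built executable
    dest_dir=$2   # directory to install into

    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    # Make sure the installed copy is executable.
    chmod +x "$dest_dir/$(basename "$src")"
}

# Example usage by whoever prepares the distribution (the artifact path
# and destination below are assumptions, not taken from the project):
#   dub build --build=release
#   install_tool ./bin/tsv-select "$HOME/bin"
#   export PATH="$HOME/bin:$PATH"
```

Only the person building the tools needs dub in this arrangement; end users just receive the executables.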
April 13, 2016
On 4/11/2016 5:50 PM, Jon D wrote:
> I'd welcome any feedback, either on the apps or the code. Intention is that the
> code be reasonable example programs. And, I may write a blog post about my D
> explorations at some point, they'd be referenced in such an article.


You've got questions on:


https://www.reddit.com/r/programming/comments/4ems6a/commandline_utilities_for_large_tabseparated/

!! As the author, it'd be nice to do an AMA there.
April 13, 2016
On Wednesday, 13 April 2016 at 19:52:30 UTC, Walter Bright wrote:
> On 4/11/2016 5:50 PM, Jon D wrote:
>> I'd welcome any feedback, either on the apps or the code. Intention is that the
>> code be reasonable example programs. And, I may write a blog post about my D
>> explorations at some point, they'd be referenced in such an article.
>
>
> You've got questions on:
>
>
> https://www.reddit.com/r/programming/comments/4ems6a/commandline_utilities_for_large_tabseparated/
>
> !! As the author, it'd be nice to do an AMA there.

Thanks for posting there and letting me know. I responded and will watch the thread.

What do you mean by an "AMA"?
April 13, 2016
On 04/13/2016 01:40 PM, Jon D wrote:

> What do you mean by an "AMA"?

It means "(I'm the author), Ask Me Anything".

Ali