March 15, 2011 [improve-it] Parsing NG archive and sorting by post-count | ||||
---|---|---|---|---|
| ||||
I thought about making a kind of code-golf contest (stackoverflow usually has these contests). Only I would focus on improving each other's code. So here's my idea of the day:

Parse the newsgroup archive files from http://www.digitalmars.com/NewsGroup.html, and for each .html file output another .html file which has a list of topics sorted in post count order.

Sure, there is NG software which does this automatically. But this is about doing it in D.

Here's my implementation: https://gist.github.com/871631

Download a few .html files and save them in their own folder. Then copy my script into a .d file in the same folder, and just run it with RDMD. It will output the files in an `output` subfolder. It works on Windows; that's all I've tested it with.

There are a few things I've noticed:

Using just a simple hash with the post count as the Key type wouldn't work. There are many topics which have the same post count number, and AAs can't hold duplicates. So I worked around this by making a wrapper which hides all the details of storing duplicates and traversal. I've called it `CommonAA`.

I've also implemented an `allSatisfy` function which works on runtime arguments. There's a similar function in std.typetuple, but it's only useful for compile-time arguments. There's probably a similar method someplace in std.algorithm, but I was too lazy to check. I thought it would be nice to have.

I can see some ways to improve this. For one, I could have used Regex instead of indexOf. I could have also tried to avoid using a wrapper, however I haven't figured out a way to do this while having duplicate key types and having to sort them while keeping the Key types linked to the Values.

Anywho, let's see you improve my code! It's just for fun and maybe we'll learn some tricks from one another. Have fun!
|
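One possible way around the wrapper, sketched here under assumptions: keep each topic as a small struct and sort an array of them with std.algorithm, so duplicate post counts need no special handling and the title stays attached to its count. The `Topic` struct, its field names, and `byPostCount` are hypothetical and not taken from the gist; the lambda syntax assumes a newer compiler than the one available when this thread was written.

```d
import std.algorithm : sort;
import std.stdio : writefln;

// Hypothetical record type: one entry per topic parsed from an archive page.
struct Topic
{
    string title;
    size_t postCount;
}

// Sorts in place by descending post count and returns the same slice.
// The count is an ordinary field rather than an AA key, so ties are fine.
// The relative order of topics with equal counts is not specified.
Topic[] byPostCount(Topic[] topics)
{
    topics.sort!((a, b) => a.postCount > b.postCount)();
    return topics;
}

void main()
{
    auto topics = [
        Topic("Ranges", 12),
        Topic("GC tuning", 40),
        Topic("Build tools", 12),
    ];

    foreach (t; byPostCount(topics))
        writefln("%s posts: %s", t.postCount, t.title);
}
```

Whether this is nicer than `CommonAA` probably depends on whether lookups by post count are needed elsewhere in the script.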
March 15, 2011 Re: [improve-it] Parsing NG archive and sorting by post-count | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrej Mitrovic |
Andrej Mitrovic:

> I've also implemented an `allSatisfy` function which works on runtime arguments. There's a similar function in std.typetuple, but it's only useful for compile-time arguments. There's probably a similar method someplace in std.algorithm, but I was too lazy to check. I thought it would be nice to have.

http://d.puremagic.com/issues/show_bug.cgi?id=4405

> Anywho, let's see you improve my code! It's just for fun and maybe we'll learn some tricks from one another. Have fun!

I suggest you to add unit tests and Contracts to your CommonAA() and allSatisfy() :-)

Have you tried to replace this:

    if (key in payload)
    {
        payload[key] ~= val;
    }
    else
    {
        payload[key] = [val];
    }

With just:

    payload[key] ~= val;

I suggest to replace this:

    sortedKeys.sort;

With:

    sortedKeys.sort();

Bye,
bearophile
|
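A minimal sketch of the append suggestion above, assuming an AA from post count to a list of titles. The function name `groupByCount` and the sample data are made up for illustration; the point is only that `~=` on a missing key creates the entry before appending.

```d
import std.stdio : writeln;

// Groups titles by post count. Appending through a key that is not yet
// present inserts an empty array first, so no `in` check is needed.
string[][size_t] groupByCount(size_t[] counts, string[] titles)
{
    assert(counts.length == titles.length);

    string[][size_t] payload;
    foreach (i, count; counts)
        payload[count] ~= titles[i];
    return payload;
}

void main()
{
    size_t[] counts = [12, 40, 12];
    string[] titles = ["Ranges", "GC tuning", "Build tools"];

    auto payload = groupByCount(counts, titles);
    writeln(payload[12].length); // two topics share the count 12
}
```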
March 15, 2011 Re: [improve-it] Parsing NG archive and sorting by post-count | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile |
On 3/15/11, bearophile <bearophileHUGS@lycos.com> wrote:
> Andrej Mitrovic:
>
>> I've also implemented an `allSatisfy` function which works on runtime arguments. There's a similar function in std.typetuple, but it's only useful for compile-time arguments. There's probably a similar method someplace in std.algorithm, but I was too lazy to check. I thought it would be nice to have.
>
> http://d.puremagic.com/issues/show_bug.cgi?id=4405

Cool, I was afraid I was reinventing the wheel.

> I suggest you to add unit tests and Contracts to your CommonAA() and
> allSatisfy() :-)

allSatisfy definitely doesn't work for a bunch of cases, like passing a delegate instead of a literal. And CommonAA doesn't take into account things like removing elements, etc. It's definitely a half-ass implementation. :p

> Have you tried to replace this:
>
> if (key in payload)
> {
>     payload[key] ~= val;
> }
> else
> {
>     payload[key] = [val];
> }
>
> With just:
>
> payload[key] ~= val;

Good catch. Since the value type is an array I can simply append to it. No array existed for the key yet, so I figured I had to assign something to the empty spot in the AA first. Oh well..

> I suggest to replace this:
> sortedKeys.sort;
>
> With:
> sortedKeys.sort();

Yes, I prefer it that way too. Since DMD doesn't complain about it (is sort even a property?), I missed it.

Thanks for the input.
|
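For the runtime "all elements satisfy a predicate" check discussed above, current Phobos has `all` in std.algorithm, which to my knowledge was added after this thread. The sketch below assumes that function is available; `allAtLeast` and `allPass` are hypothetical helpers, with the second one showing a delegate used as the predicate, the case the hand-written `allSatisfy` is said to struggle with.

```d
import std.algorithm : all;
import std.stdio : writeln;

// Hypothetical helper: true when every topic has at least `minimum` posts.
// The lambda captures the `minimum` parameter.
bool allAtLeast(int[] postCounts, int minimum)
{
    return postCounts.all!(c => c >= minimum);
}

// Hypothetical helper: the predicate is a delegate chosen at run time.
bool allPass(int[] postCounts, bool delegate(int) pred)
{
    return postCounts.all!(c => pred(c));
}

void main()
{
    int[] counts = [3, 7, 12];
    int threshold = 5;

    writeln(allAtLeast(counts, 3));                // every count is >= 3
    writeln(allPass(counts, c => c > threshold));  // 3 is not > 5
}
```

Note that `all` returns true for an empty range, which may or may not match what the original `allSatisfy` does.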
March 15, 2011 Re: [improve-it] Parsing NG archive and sorting by post-count | ||||
---|---|---|---|---|
| ||||
On 3/15/11, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
>
>>
>> I suggest to replace this:
>> sortedKeys.sort;
>>
>> With:
>> sortedKeys.sort();
>>
>
> Yes, I prefer it that way too.
Correction: DMD complains about having parentheses, in fact it's an error:
ngparser.d(28): Error: undefined identifier module ngparser.sort
So I've had to remove them. And again that's that uninformative error message which I don't like.
|
March 15, 2011 Re: [improve-it] Parsing NG archive and sorting by post-count | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrej Mitrovic |
Andrej Mitrovic:

> Correction: DMD complains about having parentheses, in fact it's an error: ngparser.d(28): Error: undefined identifier module ngparser.sort
>
> So I've had to remove them. And again that's that uninformative error message which I don't like.

Sorry, this time the uninformative text was mine :-)

When I have suggested you to add the () after the sort, I meant to suggest you to use the std.algorithm sort instead of the deprecated built-in one, because the built-in one is slow and it has bad bugs, like this one I've found:
http://d.puremagic.com/issues/show_bug.cgi?id=2819

Bye,
bearophile
|
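A short sketch of the std.algorithm version, for reference. The likely reason for the error quoted above is that `sort()` with parentheses is looked up as a function, and none is visible unless std.algorithm is imported; that reading is an inference from the thread, not something checked against the gist. `sortedKeysDescending` is a hypothetical name.

```d
import std.algorithm : sort;
import std.stdio : writeln;

// Returns the AA's keys sorted from highest to lowest post count.
// `.keys` gives a freshly allocated array, so sorting it in place
// does not touch the AA itself.
size_t[] sortedKeysDescending(string[][size_t] payload)
{
    auto keys = payload.keys;
    keys.sort!"a > b"();
    return keys;
}

void main()
{
    string[][size_t] payload;
    payload[12] ~= "Ranges";
    payload[40] ~= "GC tuning";
    payload[3] ~= "Typos";

    foreach (count; sortedKeysDescending(payload))
        writeln(count, ": ", payload[count]);
}
```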
March 16, 2011 Re: [improve-it] Parsing NG archive and sorting by post-count | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On 3/16/11, bearophile <bearophileHUGS@lycos.com> wrote:
> I meant to suggest you to use the
> std.algorithm sort instead of the deprecated built-in one, because the
> built-in one is slow and it has bad bugs, like this one I've found:
> http://d.puremagic.com/issues/show_bug.cgi?id=2819
Thanks, I didn't know about the bugs.
|