sorting associative array's keys by values (page 2)

June 16, 2012

Re: sorting associative array's keys by values

Posted by Jonathan M Davis

Permalink

Jonathan M Davis

Permalink

On Sunday, June 17, 2012 00:28:07 maarten van damme wrote:
> I wanted to catch it because I could not for the life of me understand how downloading the exact same page twice could alter it's contents in such a way that it causes the program to crash.
> 
> There's something really strange going on (or maybe I'm just too tired
> to see the obvious)
> My code literally reads
> 
> 				if(debugtemp){
> 					writeln(tradeDocument[0..100]);
> 					writeln(tradeDocument.countUntil("<div class=\""));
> 					stdout.flush();
> 				}
> 
> And the output I get is
> 
> P-BODY SHRUG</h1>
>                             <span class="level">Level 1</span><br
> />(Uncraftable )
> 
> 150
> <div class="item unique" id="7088388" data="620,20,6"
> search="level:1;craftable: false;">
> 
> 0
> 
> Now either I'm going crazy or std.algorithm sees ghosts apearing.
> 
> That isn't the worst part, that is
> 
> tradeDocument=tradeDocument[1..$];
> 
> tradeDocument=tradeDocument[tradeDocument.countUntil("<div
> class=\"")..$];//go to the next element
> 
> crashing immediatly after the previous piece of code. So in reality std.algorithm sees an ellement that isn't there, says it's at index 0 and a second later my program crashes on the statement x=x[0..$] What on earth is going on?
> 
> I use methods from
> import std.algorithm;
> import std.array;
> import std.stdio;
> import std.string;
> and no pointers, aren't they supposed to be a subset of safeD so we
> can rule out memory corruption?
> (I've confirmed this behaviour on two different machines)

If anything in there is calling @trusted code - either directly or indirectly - then it's possible that the @trusted code which is being called is wrong (since in that case, it's the programmer guaranteeing that it's @safe, not the compiler).

It's also possible that you've run into a code generation bug.

What's the shortest possible code that you have which reproduces the problem that you can give? It needs to be narrowed down before it can be determined what exactly is going wrong, and I don't think that you've given enough code in your posts for anyone else to compile it and see the failure on their machine.

- Jonathan M Davis

Ah, wait a second. After playing a bit with the try catches and actually writing some proper debug output I found out the problem was with https://github.com/Bystroushaak/DHTTPClient/blob/master/dhttpclient.d. It doesn't allways download the complete webpage. I should've written better tests I guess... Thank you for your time.

On 06/16/2012 03:28 PM, maarten van damme wrote: > And the output I get is It is possible that some part of the code is reusing a string buffer. For example, File.byLine does that. > tradeDocument=tradeDocument[1..$]; For that to work, you must ensure that tradeDocument has at least 2 elements. > tradeDocument=tradeDocument[tradeDocument.countUntil("<div > class=\"")..$];//go to the next element For that to work, you must have ensured that tradeDocument contains "<div class=\"". Otherwise countUntil() returns -1, an invalid index. If those have not been ensured, I see two bugs in the lines above. Ali -- D Programming Language Tutorial: http://ddili.org/ders/d.en/index.html

It should allways contain </html> so it has more then 2 elements and there is a note section that starts with "<div class" where I stop parsing and break out of the loop so those two should've been statisfied. The problem was (I think) the downloader. Now I get waaay less frequent crashes (yes, there is still a little bug somewhere but I'll catch it)

On Sunday, June 17, 2012 01:35:56 maarten van damme wrote: > It should allways contain </html> so it has more then 2 elements and there is a note section that starts with "<div class" where I stop parsing and break out of the loop so those two should've been statisfied. The problem was (I think) the downloader. Now I get waaay less frequent crashes (yes, there is still a little bug somewhere but I'll catch it) It _should_ always have </html>, but that string is input from outside the program under which you have no control, so you need to check it or at least check that any find or similar function that you use which can return an invalid index when </html> isn't there doesn't return an invalid index. Otherwise, you're going to get RangeErrors when it isn't there. Array slicing assumes that you've checked beforehand that the indices are valid, and it's a bug in your program if they're not. - Jonathan M Davis

well, the page I parse is automatically generated and thus allways contains </html>. (if the page completely downloaded which it didn't). The second error I found (my mistake) is that newlines get scrambled up severely when downloading the page causing the markers I try to find to sometimes break down on two different lines. replacechars from std.string crashed too with the error "invalid utf sequence". Now I use std.array's replace and everything (appears) to be working. If only I'd learned regex, I think that would've saved me quite some time.

On Sunday, June 17, 2012 13:07:24 maarten van damme wrote: > well, the page I parse is automatically generated and thus allways contains </html>. That may be true, but if your code assumes that </html> is there and it ever isn't for any reason, then you're going to get a RangeError thrown in non- release and undefined behavior in -release. It's generally a bad idea to assume that something is correct like that when you have zero control over it within your program, even if it's almost certainly correct. Something could go wrong, and it's better to throw an exception or do some other sort of error handling than it is to get an Error, or worse, undefined behavior. - Jonathan M Davis

Ok, everything worked (nearly perfect). Sometimes the webpage gets completely messed up ("<" changes to d134 or something like that) but the tests handle it rather well. That's why I decided to improve it a bit and use treads. I've always been afraid of d treads because I never really got the grasp of that message passing system but hey, there is a first time for everything. Maybe I was a bit optimistic as I now already run into troubles. I have two arrays of the form int[string] To pass them from the spawned function back to the spawner I wanted to use tid.send( array1,array2); This was illegal, they have to be either immutable or shared. I've read somwhere that D tries to avoid shared where possible (with that message passing system) so I went for immutable and changed it over to tid.send(cast(immutable(int[string]))array1,...) (you get the idea) Not only does this look ugly, it's looks like what I'm doing isn't really correct. But anyway, onto the next part, receiving it. I used auto msg=receiveOnly!(int[string],int[string]); of course this fails because I send immutable items and want to receive mutables I then changed it to auto msg=receiveOnly!(immutable int[string],immutable int[string]); Compilation now fails with the very informative C:\D\dmd2\windows\bin\..\..\src\phobos\std\concurrency.d(612): Error: can only i nitialize const member _field_field_0 inside constructor C:\D\dmd2\windows\bin\..\..\src\phobos\std\concurrency.d(612): Error: can only i nitialize const member _field_field_1 inside constructor Why is treading made so damn hard? what have I done wrong now?

another problem, when I do use shared. My code is int amountTreads = 20; if(upperLimit-lowerLimit<amountTreads) amountTreads=upperLimit-lowerLimit; int perTrade = (upperLimit - lowerLimit) / amountTreads; for (int x = 0; x < amountTreads; x++) { //we're at the last trade, parse all the rest if (amountTreads - x == 1) { spawn(&parse, lowerLimit + x * perTrade, upperLimit - x * perTrade, thisTid); } else { spawn(&parse, lowerLimit + x * perTrade, lowerLimit + (x + 1) * perTrade - 1, thisTid); } } //wait for all those treads and process the response for (int x = 0; x < amountTreads; x++) { auto msg=receiveOnly!(shared int[string],shared int[string]); int[string] tempDemands = cast(int[string])msg[0]; int[string] tempOffers = cast(int[string])msg[1]; foreach (string key; tempDemands.keys){ demands[key] += tempDemands[key]; } foreach (string key; tempOffers.keys){ offers[key] += tempOffers[key]; } writeln("ok, we're at ",x); } but when every tread has stopped working , that for-loop has only been executed 17 times.

Forums