February 07, 2014
On Thursday, 6 February 2014 at 22:20:38 UTC, Dicebot wrote:
> On Thursday, 6 February 2014 at 22:18:10 UTC, Brad Anderson wrote:
>> You should probably validate utf from all foreign sources. Catch a problem with it as it comes in rather than in some arbitrary part of your program.
>>
>> http://dlang.org/phobos/std_utf.html#.validate
>
> pure @safe void validate(S)(in S str) if (isSomeString!S);
>
> Throws:
> UTFException if str is not well-formed.

And somewhere in the world, darkness fell forever on a bright and beautiful countryside.  The monsters poured forth and devoured everything in sight, given strength by that unbelievable abomination of a function design.
February 07, 2014
On Thursday, 6 February 2014 at 22:56:45 UTC, Adam D. Ruppe wrote:
> On Thursday, 6 February 2014 at 21:38:03 UTC, Dicebot wrote:
>> Any application that operates on some external user input will be subject to DoS attack vector if it uses Phobos directly.
>
> Hmm, I hadn't considered that. Maybe exceptions could be handled automatically though due to the facts that there are rarely more than one in flight at any time and they typically don't live for long:
> [snipped lengthy example]

I really like vibe.d.  A lot.  But the way HTTP parse errors are handled is a disaster.  Do you know what happened when I was testing vibe.d recently and I sent it a bad request?  It sent a stack trace as a response.  A stack trace!  To a client!  I was speechless.  Needless to say, I don't support the idea of further enabling this design, regardless of whether it can be made a pinnacle of elegance.
February 07, 2014
On Friday, 7 February 2014 at 03:19:32 UTC, Sean Kelly wrote:
> It sent a stack trace as a response.  A stack trace!  To a client!  I was speechless.

lol, my cgi.d will do that too if you compile with -debug.... I find it convenient at times. (It also sends it to stderr but when doing cgi apps, that means digging into the apache log which is a pain compared to just looking at the browser)
February 07, 2014
On Friday, 7 February 2014 at 03:14:45 UTC, Sean Kelly wrote:
> On Thursday, 6 February 2014 at 22:20:38 UTC, Dicebot wrote:
>> UTFException if str is not well-formed.
>
> unbelievable abomination of a function design.

Yeah, that is absurd. It is a bad, bad sign when almost every time you use a function, you write

bool ok = true;
try validate(s); catch(UTFException) ok = false;
if(!ok) {}

yet that's how i use validate...
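The try/catch dance above can at least be written once and reused. A minimal sketch, assuming a helper named `isWellFormed` (hypothetical, not part of Phobos) that wraps `std.utf.validate`:

```d
import std.utf : validate, UTFException;

/// Hypothetical helper, not a Phobos function:
/// returns true if `s` is well-formed UTF, false otherwise.
bool isWellFormed(S)(in S s)
{
    try
    {
        validate(s);   // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    assert(isWellFormed("hello"));

    // 0xFF can never appear in well-formed UTF-8.
    immutable(ubyte)[] raw = [0x68, 0x69, 0xFF];
    assert(!isWellFormed(cast(string) raw));
}
```

This only tidies the call site. The exception is still allocated and thrown on every bad input, so it does nothing for the allocation cost discussed elsewhere in the thread.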

fun fact, my little toy scripting language supports
var a = try foo(); // if foo throws, a == the exception object

but it's a toy scripting language, ugly crap is allowed there :)
February 07, 2014
On Friday, 7 February 2014 at 01:31:17 UTC, Ola Fosheim Grøstad wrote:
> On Friday, 7 February 2014 at 01:23:44 UTC, Walter Bright wrote:
>> Right. If you're:
>>
>> 1. using throws as control flow logic
> [...]
>> you're doing it wrong.
>
> I disagree.
>
> REST-based web services tend to use throws all the time. It is an effective and clean way to break all transactions that are in progress throughout the call chain when you cannot carry through a request, or if the request returns nothing.

But let this be up to the programmer working on the service, not imposed on them by the API.  Then if they run into something like this DoS issue they can fix it.  My experience with these services is that performance is critical and bad input is common, because people are always trying to hack your shit.

Where I work, people are serious about performance, our daily volume is ridiculous, and our goal is five nines of uptime across the board.  At the same time, really good asynchronous programmers are about as rare as water on the moon.  So something like vibe.d, where mid-level programmers could write correct code that still performs well thanks to the underlying event model, would be a godsend.  But only if I really can get what I pay for.

The thing I think a lot of people don't realize these days is that performance per watt is just about the most important thing there is.  Data centers are expensive, slow to build, and rack space is limited.  If you can find a way to increase the concurrent load per box by, say, an order of magnitude by choosing a different language or programming model or whatever, there's a real economic motivation to do so.

Java gets by by having a really good GC and a low barrier of entry, but its scalability is really pretty poor all things considered.  On the other hand, C/C++ scales tremendously but then you're stuck with the burden those languages impose in terms of semantic complexity, bug frequency, and so on.  D seems really promising here but can't rely on having a fantastic incremental GC like Java, and so I think it's a mistake to use Java as a model for how to manage memory.  And maybe Java just got it wrong anyway.  I know some people who had to go to ridiculous lengths to avoid GC collection cycles in Java because a collection in the app took _20_seconds_ to complete.  Now maybe the application was poorly designed or they should have been using an aftermarket GC, but even so.

Finally, library programming is the one place where premature optimization really is a good idea, because you can never be sure how people will be using your code.  That allocation may not be a big deal to you or 98% of your users, but for the one big client who calls that routine in a tight inner loop or operates at volumes you never conceived of it's a deal breaker.  I really don't want Phobos to be the deal breaker :-)
February 07, 2014
On Thursday, 6 February 2014 at 21:38:03 UTC, Dicebot wrote:
> Hardly so. Any exception allocation can trigger GC collection cycle and Phobos does not provide any other way to handle data errors. Any application that operates on some external user input will be subject to DoS attack vector if it uses Phobos directly.

Thinking about this more it'd probably be a good idea to use the
type system to segregate non-validated user input from the rest
of your program. UnvalidatedString or something.
UnvalidatedString.validate() returns a string you can then use in
the regular fashion. That way unvalidated data can't weasel its
way into the trusted portion of your program without getting
checked first. Anyway, that's just an idea (and getting further
and further off topic).
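A rough sketch of that idea, assuming a wrapper struct named `UnvalidatedString` whose only way out is `validate()`. The field name, constructor, and use of `std.utf.validate` underneath are illustrative assumptions, not an existing Phobos type:

```d
static import std.utf;
import std.string : representation;

/// Hypothetical wrapper: holds bytes from an untrusted source.
struct UnvalidatedString
{
    // Note: `private` in D is module-level, so this type should live in
    // its own module for the barrier to mean anything.
    private const(ubyte)[] raw;

    this(const(ubyte)[] bytes) { raw = bytes; }

    /// Returns the payload as a string if it is well-formed UTF-8.
    /// Throws std.utf.UTFException otherwise, because that is how
    /// std.utf.validate reports failure.
    string validate() const
    {
        auto s = cast(const(char)[]) raw;
        std.utf.validate(s);
        return s.idup;
    }
}

void main()
{
    import std.exception : assertThrown;

    auto good = UnvalidatedString("hello".representation);
    string s = good.validate();
    assert(s == "hello");

    const(ubyte)[] junk = [0x68, 0xFF];
    assertThrown!(std.utf.UTFException)(UnvalidatedString(junk).validate());
}
```

Code that only accepts `string` cannot be handed an `UnvalidatedString` by accident, so the check is forced to happen at the boundary.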

February 07, 2014
On Thursday, February 06, 2014 22:20:37 Dicebot wrote:
> On Thursday, 6 February 2014 at 22:18:10 UTC, Brad Anderson wrote:
> > You should probably validate utf from all foreign sources. Catch a problem with it as it comes in rather than in some arbitrary part of your program.
> > 
> > http://dlang.org/phobos/std_utf.html#.validate
> 
> pure @safe void validate(S)(in S str) if (isSomeString!S);
> 
> Throws:
> UTFException if str is not well-formed.
> 
> ;)

In general, I think that throwing on malformed Unicode is a good thing, because it results in code that's less error-prone (as the alternative is to not validate Unicode and try and continue somehow regardless of bad input when decoding Unicode, which would be very bad IMHO). That being said, validating strings when they enter the program is a good way to localize any failures - which is where validate would come in - and I have to agree that the fact that validate throws is horrific. It's a classic example of a function that should return a bool rather than throw. You're asking it whether the string is valid, not asking to report errors when your normal control flow encounters an error that prevents it from functioning normally (which is where exceptions should normally be used).

As such, I think that it's clear that we need a new function to replace it (e.g. isValidUnicode). I'll have to take a look at it. If I'm lucky, it won't even take all that long to implement.
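For illustration only, here is one way a bool-returning check could look for UTF-8. This is a hand-rolled sketch under the assumed name `isValidUtf8`, not the function that was later proposed for Phobos. It neither throws nor allocates:

```d
/// Hypothetical sketch, not a Phobos API: reports whether `s` is
/// well-formed UTF-8 without throwing or allocating.
bool isValidUtf8(const(char)[] s) pure nothrow @nogc @safe
{
    size_t i = 0;
    while (i < s.length)
    {
        immutable uint b0 = s[i];
        if (b0 < 0x80) { i++; continue; }   // ASCII

        size_t len;   // total bytes in this sequence
        uint cp;      // code point being decoded
        uint minCp;   // smallest code point allowed for this length

        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; minCp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; minCp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; minCp = 0x10000; }
        else return false;            // stray continuation or invalid lead byte

        if (s.length - i < len) return false;   // truncated sequence

        foreach (k; 1 .. len)
        {
            immutable uint b = s[i + k];
            if ((b & 0xC0) != 0x80) return false;   // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < minCp) return false;                     // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range
        i += len;
    }
    return true;
}

void main()
{
    assert(isValidUtf8("plain ascii"));
    assert(isValidUtf8("\u20AC"));            // euro sign, three bytes

    char[] overlong  = ['\xC0', '\x80'];
    char[] truncated = ['\xE2', '\x82'];
    char[] surrogate = ['\xED', '\xA0', '\x80'];
    assert(!isValidUtf8(overlong));
    assert(!isValidUtf8(truncated));
    assert(!isValidUtf8(surrogate));
}
```

Because the function is `nothrow @nogc`, bad input costs a few comparisons rather than an exception allocation.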

- Jonathan M Davis
February 07, 2014
On 2/6/2014 7:08 PM, bearophile wrote:
> Walter Bright:
>
>> It's not a matter of taste. If your input is subject to a DoS attack, don't
>> put exceptions in the control flow.
>
> Perhaps in today's world malicious attacks on the software you write should be
> assumed as the default situation, and then the language+library has to offer
> something less paranoid on request.
>
> That's why some languages have changed their sorting and hashing routines to
> make them a little slower but safer on default.

DoS attack resistance requires faster code, not slower code.

February 07, 2014
On 2/6/2014 6:19 PM, Andrei Alexandrescu wrote:
> On 2/6/14, 5:23 PM, Walter Bright wrote:
>> I'm tempted to say that the throw expression can call 'new' even if the
>> function is marked as @nogc.
>
> That's extreme. A better possibility is to allocate exceptions from a different
> heap and proclaim that the heap is cleaned once all catch blocks are left. (I'm
> sure we can find something better, but now is not the time to worry about it.)

That doesn't work, as nothing prevents code from squirreling away the caught exception object handle.
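A small sketch of the escape in question, with hypothetical names (`stashed`, `handle`). The caught reference is copied to a variable that outlives the catch block, so a heap that is wiped when the catch block exits would leave it dangling:

```d
// Module-level (thread-local) handle that outlives any catch block.
Exception stashed;

void handle()
{
    try
    {
        throw new Exception("bad input");
    }
    catch (Exception e)
    {
        stashed = e;   // the reference escapes the catch block here
    }
}

void main()
{
    handle();
    // With today's GC-allocated exceptions this is still valid.
    assert(stashed !is null);
    assert(stashed.msg == "bad input");
}
```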
February 07, 2014
Dicebot:

> Now if Phobos would have only thrown exceptions in really _exceptional_ situations and handled broken input gracefully...

I wrote two small ideas to reduce throwing exceptions in Phobos:

http://d.puremagic.com/issues/show_bug.cgi?id=6840
http://d.puremagic.com/issues/show_bug.cgi?id=11913

Bye,
bearophile