February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Marco Leise

On 09-Feb-2014 09:35, Marco Leise wrote:
> On Sat, 08 Feb 2014 15:21:26 +0400
> Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:
>
>> On 08-Feb-2014 02:57, Jonathan M Davis wrote:
>>> On Friday, February 07, 2014 20:43:38 Dmitry Olshansky wrote:
>>>> On 07-Feb-2014 20:29, Andrej Mitrovic wrote:
>>>>> On Friday, 7 February 2014 at 16:27:35 UTC, Andrei Alexandrescu wrote:
>>>>>> Add a bugzilla and let's define isValid that returns bool!
>>>>>
>>>>> Add std.utf.decode() to that as well. IOW, it should have an overload
>>>>> which returns a status code
>>>>
>>>> Much simpler - it returns a special dchar to designate bad encoding. And
>>>> there is one defined by Unicode spec.
>>>
>>> Isn't that actually worse?
>>
>> No, it's better and more flexible for those who care to repair broken
>> text in case it's broken. We currently have ZERO facilities to work with
>> partly broken UTF and it's not that rare a thing to have it.
>
> Your argument is unsubstantiated, since we have this already:
> http://dlang.org/phobos/std_encoding.html#.sanitize

Working with ranges of dchar? Nobody is taking eager validation from your hands anyway.

>
>>> Unless you're suggesting that we stop throwing on
>>> decode errors,
>>
>> That is exactly what I suggest.
>>
>>> then functions like std.array.front will have to check the
>>> result on every call to see whether it was valid or not and thus whether they
>>> should throw, which would mean extra overhead over simply having decode throw
>>> on decode errors.
>>
>> Why the heck? It will not throw either. In the very end bad encoding is
>> handled by displaying the 'substituted' (typically '?') character in
>> places where it broke not by throwing up hands in the air and spitting
>> "UTF Exception: offset 4302 bad UTF sequence". This is not good enough
>> (in case somebody thought that it is).
>>
>> Those who care about throwing add a trivial map!(x => x != '\uFFFD' ||
>> die()) over a string, where die function throws an exception.
>
> That's neither an improvement over calling "validate" nor does
> that deal with distinguishing between invalid UTF and

Means text is broken but wasn't ever read...

> \uFFFD
> in the input.

...means text was broken sometime before. Hardly makes any difference to most applications. Normal text doesn't contain \uFFFD.

And you can test a string with proper 'validate', it's just that while decoding the default is to substitute.

>>> validate has no business throwing, and we definitely should
>>> add isValidUnicode (or isValid or whatever you want to call it) for validation
>>> purposes. Code can then call that to validate that a string is valid and not
>>> worry about any UTFExceptions being thrown as long as it doesn't manipulate
>>> the string in a way that could result in its Unicode becoming invalid.
>>
>> Yet later down the road decode will triple check that anyway. Just
>> saying. BTW if the string was checked beforehand there is no difference
>> between 2 approaches at all (don't have to check).
>>
>>> However, I would argue that assuming that everyone is going to validate their
>>> strings and that pretty much all string-related functions shouldn't ever have
>>> to worry about invalid Unicode is just begging for subtle bugs all over the
>>> place IMHO. You're essentially dealing with error codes at that point, and I
>>> think that experience has shown quite clearly that error codes are generally a
>>> bad way to go. Almost no one checks them unless they have to. I think that
>>> having decode throw on invalid Unicode is exactly what it should be doing. The
>>> problem is that validate shouldn't.
>>
>> Every single text editor out there seems to disagree with you: they do
>> show you partially substituted text, not a dialog box "My bad, it's
>> broken UTF-8, I'm giving up!".
>
> Editors do different things. They often try to detect the
> encoding with a fall back to Latin1. If you open a file
> explicitly as UTF-8 they may display a substitution char or
> detect the error and use the fall back, as is the case with
> Geany and

Throwing an exception here is not something useful in 90% of cases.
Requiring everybody to call sanitize on every string from the outside smells like a wrong default to me.

> gedit does in fact throw an error message at you
> saying "My bad, it's broken UTF-8, I'm giving up!".

I know and it's a piece of junk :)
Seriously it doesn't even have regular expressions for search and replace!

--
Dmitry Olshansky
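
The two policies argued over in this post (substitute U+FFFD while decoding versus throw on bad input) can be sketched roughly as below. This is only an illustration, assuming a Phobos version in which `std.utf.decode` accepts the `useReplacementDchar` flag; `decodeLossy` is a hypothetical helper name, not something proposed in the thread.

```d
import std.algorithm.searching : canFind;
import std.exception : assertThrown;
import std.typecons : Yes;
import std.utf : decode, validate, UTFException;

enum dchar replacement = '\uFFFD';

// Hypothetical helper: decodes a whole UTF-8 string and yields U+FFFD
// wherever decode() meets an invalid sequence, instead of throwing.
dstring decodeLossy(string s)
{
    dstring result;
    size_t i = 0;
    while (i < s.length)
        result ~= decode!(Yes.useReplacementDchar)(s, i); // advances i
    return result;
}

void main()
{
    // "ab" followed by 0xFF, a byte that can never occur in valid UTF-8.
    immutable ubyte[] raw = [0x61, 0x62, 0xFF];
    string broken = cast(string) raw;

    // Lenient policy: decoding succeeds and marks the damage.
    dstring text = decodeLossy(broken);
    assert(text.canFind(replacement));

    // Strict policy: callers who want an exception validate eagerly.
    assertThrown!UTFException(validate(broken));
}
```

Callers who prefer the throwing behaviour could equally scan the decoded text for `replacement`, at the cost already noted in the thread: a genuine U+FFFD in the input cannot be told apart from a substituted one.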

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Walter Bright

On 09-Feb-2014 02:17, Walter Bright wrote:
> On 2/7/2014 12:51 PM, Dmitry Olshansky wrote:
>> It's deh.d or rather deh_win32.d / deh_win64_posix.d and it doesn't look
>> like _all_ that lot especially if you have no finally blocks and the only
>> catch is the top-most catch-all.
>
> It's a heluva lot slower than "jmp".
>

If you can show me how a single unconditional jump propagates error code 4 calls up the stack I'm sold.

I do understand it's slow, it's not that slow to make a difference in the discussed case. It's all about jumping to the wrong conclusions.

To put it in one pitch: it should be possible to throw/catch in excess of 100k exceptions per second no problem at all (assuming a single core of some run of the mill modern CPU). Nobody is asking to optimize it better than the normal flow.

--
Dmitry Olshansky
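
The throw/catch rate mentioned above is easy to measure on one's own machine. The following is a rough sketch, not a rigorous benchmark: `failAt` and `throwAndCatch` are hypothetical names, the depth of 4 mirrors the "4 calls up the stack" remark, and the result will vary with compiler, flags and platform.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical helper: throws from `depth` calls below its caller.
void failAt(int depth)
{
    if (depth == 0)
        throw new Exception("error");
    failAt(depth - 1);
}

// Throws and catches `count` exceptions; returns how many were caught.
size_t throwAndCatch(size_t count, int depth)
{
    size_t caught = 0;
    foreach (_; 0 .. count)
    {
        try
        {
            failAt(depth);
        }
        catch (Exception e)
        {
            ++caught;
        }
    }
    return caught;
}

void main()
{
    enum count = 100_000;
    auto sw = StopWatch(AutoStart.yes);
    immutable caught = throwAndCatch(count, 4);
    sw.stop();
    // Prints the elapsed time for `count` throw/catch round trips.
    writefln("%s exceptions caught in %s", caught, sw.peek);
}
```

Note that each iteration also allocates a new exception object, so the figure includes allocation cost, which is the other topic of this thread.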

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Dmitry Olshansky

"Dmitry Olshansky" wrote in message news:ld7dla$pdg$1@digitalmars.com...
> > gedit does in fact throw an error message at you
> > saying "My bad, it's broken UTF-8, I'm giving up!".
>
> I know and it's a piece of junk :)
> Seriously it doesn't even have regular expressions for search and replace!

That would be a luxury, gedit doesn't even have auto-indent.

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Walter Bright

On Sunday, 9 February 2014 at 05:29:25 UTC, Walter Bright wrote:
> Ola, I've done it both ways, I actually do know what I'm talking about.

Please note that "you" and "they" were meant as "one" or "the C++ community", not personal. It was not ad hominem. So no reason to be defensive about it. I am grateful if you can point out where my reasoning fails, then I learn something new.

Maybe you could explain why a single occasional Branch Always over the unwind-pointer would be slow. Clearly the offset should be empirically based (so that you usually can avoid the goto), maybe even set to a separate cache line for some CPUs, and you could fill out the gaps with other data you need there. It's not like I have run i7 on VTune, so I could be wrong, but I don't see why…

And I also think that if you have a CPU with a sufficient number of callee-save registers you can carry along a pointer to the last try-block stack frame with not much penalty. After all you only have to restore it if the function ruined it and before calling new functions that are not inlined and not nothrow, and you could stick it into a thread-local global too where it matters. On 32 bit x86 it probably is quite expensive though.

In code where I write try blocks they tend to stay in the "main logic function"; this code is so heavy that adding the stack frame to a linked list (of stack frames) is a negligible cost.

One really needs to be careful when doing performance tests of exception handling, because it is easy to construct "theoretical" code. Programmers should write exception handlers with the implementation in mind, so using existing programs as a baseline is not a good solution either.

> I've sometimes been proven wrong here, so you're welcome to do a pull request proving so.

You know very well that I am not going to rewrite codegen for DMD. Adding this feature will complicate codegen and you need to understand the code generator well to do the modification.

Besides, I am not sure if a system level language should have exceptions at all or that I would use them when doing the kind of stuff I like to use D for. :-P ;-)

I like to use exception handling in application-level code, but not in code for audio/simulations/buffer-streaming/low-level-stuff.

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Marco Leise

On Sunday, 9 February 2014 at 04:38:23 UTC, Marco Leise wrote:
> Yes, it doesn't seem feasible otherwise. Since you can call
> functions recursively you could potentially chain exceptions
> from the same line of code several times.
>
> catch (Exception e)
> {
>     staticException.line = __LINE__;
>     staticException.file = __FILE__;
>     staticException.next = e; // e.next is staticException
>     throw staticException;
> }
>
> You'd have to flag staticException as "in use" and spawn a new
> instance every time you need another one of the same type.
> Since there is no way to reset that flag automatically when
> the last user goes out of scope (i.e. ref counting), that's
> not even an option.
>
> Preallocated exceptions only work if you are confident your
> exception won't be recursively thrown and thereby chained to
> itself. Granted, the majority of code, but really too much
> cognitive load when writing exception handling code.
While writes directly to line and file and such can't be prevented, `next` could be implemented as a property that does the conditional .dup when assigned to itself (or throw an Error).
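
A minimal sketch of the "throw an Error" variant of that idea, written as free functions rather than as a change to `Throwable.next` itself. The names `linksTo` and `setNextChecked` are hypothetical, and the sketch assumes `next` can be read and assigned on a `Throwable`, as in druntime.

```d
// Hypothetical helper: true if `target` is `head` itself or is reachable
// from `head` by following `next` links.
bool linksTo(Throwable head, Throwable target)
{
    for (Throwable t = head; t !is null; t = t.next)
        if (t is target)
            return true;
    return false;
}

// Hypothetical checked assignment: performs head.next = tail unless that
// would make the chain loop back onto `head`.
void setNextChecked(Throwable head, Throwable tail)
{
    if (linksTo(tail, head))
        throw new Error("exception would be chained to itself");
    head.next = tail;
}

void main()
{
    auto first = new Exception("first");
    auto second = new Exception("second");

    setNextChecked(second, first); // fine: second -> first
    assert(second.next is first);

    bool rejected = false;
    try
    {
        setNextChecked(first, second); // would close the loop
    }
    catch (Error e)
    {
        rejected = true;
    }
    assert(rejected);
}
```

The check walks the whole chain, so it costs time proportional to the chain length on every assignment; a `.dup`-based variant would replace the `throw` with a fresh allocation.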

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Marco Leise

On Sunday, 9 February 2014 at 05:00:15 UTC, Marco Leise wrote:
> And static allocation isn't an exactly appealing option...
>
> throw staticException ? staticException : (staticException =
> new SomethingException("Don't do this at home kids!"));
>
> and practically out of question when you need to chain
> exceptions and your call stack could contain this line of code
> more than once, resulting in infinite loops in exception
> chains as a new bug type in D, that is fixed by writing:
>
> catch (Exception e) {
>     throw (staticException
>         ? (e.linksTo(staticException) ? staticException.dupThenWrap(e) : staticException)
>         : (staticException = new SomethingException("Don't do this at home kids!")));
> }
This doesn't seem like a valid concern. Nothing stops you from
using a (standard) function to do that ugly boilerplate.
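
Such a function might look roughly like the sketch below. Everything here is an assumption for illustration: `cachedOrFresh` is a hypothetical name, `SomethingException` is borrowed from the quoted post, and the fallback simply allocates a fresh instance when the cached one is already part of the chain being wrapped.

```d
class SomethingException : Exception
{
    this(string msg, Throwable next = null)
    {
        super(msg, __FILE__, __LINE__, next);
    }
}

// Hypothetical helper: hands out one cached instance per exception type
// and per thread, unless that instance already appears in `cause`'s
// chain, in which case a new one is allocated.
E cachedOrFresh(E : Exception)(string msg, Throwable cause = null)
{
    static E cached; // static variables are thread-local by default in D
    if (cached is null)
        cached = new E(msg);

    E result = cached;
    for (Throwable t = cause; t !is null; t = t.next)
    {
        if (t is cached)
        {
            result = new E(msg); // avoid chaining the instance to itself
            break;
        }
    }
    result.msg = msg;
    result.next = cause;
    return result;
}

void main()
{
    auto a = cachedOrFresh!SomethingException("first");
    auto b = cachedOrFresh!SomethingException("second", a);
    assert(a !is b);     // a was in the chain, so b is a fresh instance
    assert(b.next is a);

    auto c = cachedOrFresh!SomethingException("third");
    assert(c is a);      // no chain involved, the cached instance is reused
}
```

The sketch only guards against self-chaining. Reusing the cached instance still overwrites the message of an earlier throw if someone kept a reference to it, which is the other half of the concern raised in the thread.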

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Jakob Ovrum

On Saturday, 8 February 2014 at 11:17:26 UTC, Jakob Ovrum wrote:
> On Saturday, 8 February 2014 at 11:05:38 UTC, Dmitry Olshansky wrote:
>> If both are thread-local and cached I see no problem whatsoever.
>> The thing is the current "default" of creating exception is AWFUL.
>> And D stands for sane defaults and the simple path being good last time I checked.
>
> How is it not a problem? XException's fields (message, location etc) would be overwritten by the latest throw site, and its `next` field would point to itself.
It's supposedly one exception instance per place where it can be thrown, not per exception type. Then the problem would be restricted to recursive calls, where in the exception handler for XException, another XException is thrown.

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Walter Bright

On Saturday, 8 February 2014 at 21:59:24 UTC, Walter Bright wrote:
> On 2/7/2014 6:50 AM, "Marc Schütz" <schuetzm@gmx.net>" wrote:
>> The specific problem was that it was possible to provoke hash collisions by
>> sending carefully crafted input, causing the hash-tables to degrade to linked
>> lists. The small performance penalty of using collision-resistant hashes is
>> certainly worth it in this case.
>
> That has nothing to do with needing exceptions in the control flow path (and the performance penalty for using exceptions in this manner is certainly not small).

Huh? I responded to this discussion:

On Friday, 7 February 2014 at 08:30:35 UTC, Walter Bright wrote:
> On 2/6/2014 7:08 PM, bearophile wrote:
>> That's why some languages have changed their sorting and hashing routines to
>> make them a little slower but safer on default.
>
> DoS attack resistance requires faster code, not slower code.

I was merely clarifying why in this specific case making the average code path slower _did_ help DoS attack resistance.

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Sean Kelly

On 2014-02-07 21:56, Sean Kelly wrote:
> I was mostly surprised that the stack trace was written back to
> the client. I'd expect something like that in a log on the
> server side. I do see how it would be convenient to have a stack
> trace included in a bug report, but if this feature is disabled
> in release mode then you can't rely on it anyway. I'd just
> always be checking the logs (where I'd hope the stack trace would
> always be written).

Ruby on Rails always writes the stack trace to the log. In development mode it will also render it to the client. In production mode we use a plugin that sends an email when an exception occurs. The email will contain the full stack trace, environment variables and some other data about the request that failed.

BTW, you can do a lot more with HTML than plain text (log files).

--
/Jacob Carlborg

February 09, 2014 Re: List of Phobos functions that allocate memory?

Posted in reply to Ola Fosheim Grøstad

This is a pretty nice description of the i7 pipeline by Hennessy and Patterson:

https://www.inkling.com/read/computer-architecture-hennessy-5th/chapter-3/section-3-13#0113e87a6dc141d7abda84b497128d61

Notice the 28 micro-op buffer before execution. I'd expect a short predicted branch to not cause a big bubble, but I don't know for sure.