November 20, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Wednesday, 20 November 2013 at 20:06:47 UTC, Andrei Alexandrescu wrote:
> That wouldn't help much - people have access to the underlying range anyway.
>
> Andrei
You're right, I forgot about that. However, people generally won't be modifying a SortedRange in place, will they? Even if they do, it'll probably be using one of the mutating functions in std.algorithm. Also, somewhat related, couldn't std.algorithm.sort simply return the passed-in range if that range is already wrapped with SortedRange?
|
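The question above (could `sort` hand back its argument when that argument is already a `SortedRange`?) can be sketched in user code. This is a minimal sketch under assumptions: `sortOnce` is a hypothetical helper, not a Phobos function, and it ignores the predicate the range was originally sorted with, which a real implementation would have to compare.

```d
import std.algorithm : sort;
import std.range : SortedRange;
import std.traits : isInstanceOf;

// Hypothetical helper, not part of Phobos: sorts only when the argument
// is not already wrapped in SortedRange.
// Assumption: a range that is already a SortedRange was sorted with the
// ordering the caller wants; the predicate is not checked here.
auto sortOnce(R)(R r)
{
    static if (isInstanceOf!(SortedRange, R))
        return r;        // already carries the "sorted" guarantee
    else
        return sort(r);  // sorts in place and returns a SortedRange
}
```

Because the check is a `static if` on the type, the already-sorted path costs nothing at run time.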
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dmitry Olshansky | On 20.11.2013 19:30, Dmitry Olshansky wrote:
> 20-Nov-2013 22:01, Simen Kjærås wrote:
>> On 20.11.2013 18:45, Simen Kjærås wrote:
> [snip]
>>> May I suggest:
>>>
>>> struct Validated(alias fn, T) {
>>>     private T value;
>>>     @property inout
>>>     T get() {
>>>         return value;
>>>     }
>>
>> Uh-hm. Add this:
>>     alias get this;
>>
>
> And it decays to the naked type in a blink of an eye. And some function
> down the road will do the validation again...
And guess what? That's (often) ok. It's better to do the validation once too many than to miss it once.
The point (at least in the cases I've used it) is to enforce that only validated values are passed to functions that require validated strings, not that validated values never be passed to functions that don't really care.
Doing it like this also lets you call functions that take the unadorned type, because that might be just as important.
The result of re-validating is performance loss. The result of missed validation is a bug.
Also, in just a few lines, you can make a version that will *not* decay to the original type:
    struct Validated(alias fn, T) {
        private T _value;
        @property inout
        T value() {
            return _value;
        }
    }
    // validated() is identical to before.
Sure, using it is a bit more verbose than using the unadorned type, which is why I chose to make the original version automatically decay. This is a judgment where sensible people may disagree, even with themselves on a case-by-case basis.
--
Simen
|
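A minimal, compilable sketch of the non-decaying wrapper described in this post. The `validated()` helper is referred to in the thread but not shown here, so the version below is an assumption, as are the predicate `isNonEmpty` and the consumer `process`, which exist only for illustration.

```d
import std.exception : enforce;

// Hypothetical predicate: stands in for whatever check your program needs.
bool isNonEmpty(string s) { return s.length > 0; }

struct Validated(alias fn, T)
{
    private T _value;

    // Read-only access; no alias this, so the wrapper does not decay to T.
    @property inout(T) value() inout { return _value; }
}

// Assumed shape of validated(): runs the check once, then wraps the value.
Validated!(fn, T) validated(alias fn, T)(T value)
{
    enforce(fn(value), "validation failed");
    return Validated!(fn, T)(value);
}

// Hypothetical consumer that only accepts validated input.
size_t process(Validated!(isNonEmpty, string) s)
{
    return s.value.length;
}
```

Calling `process("abc")` with a plain string does not compile, which is the enforcement the post is after. Note that `private` in D is module-level, so code in the same module can still construct the struct without going through `validated()`.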
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob Carlborg | On 11/20/2013 2:49 AM, Jacob Carlborg wrote:
> How should we accomplish this? We can't replace:
>
> void main (string[] args)
>
> With
>
> void main (UnsafeString[] args)
>
> And break every application out there.
Use a different type for the validated string. "Validated" means your program has guaranteed that the string has a certain form defined by that program.
|
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 11/20/2013 3:16 AM, Jonathan M Davis wrote:
> ValidatedString would then avoid any extra validation when iterating over the
> characters, though I don't know how much of an efficiency gain that would
> actually be given that much of the validation occurs naturally when decoding
> or using stride. It would have the downside that any function which
> specializes on strings would likely have to then specialize on ValidatedString
> as well. So, while I agree with the idea in concept, I'd propose that we
> benchmark the difference in decoding and striding without the checks and see if
> there actually is much difference. Because if there isn't, then I don't think
> that it's worth going to the trouble of adding something like ValidatedString.
UTF validation isn't the only form of validation for strings. You could, for example, validate that the string doesn't contain SQL injection code, or contains a correctly formatted date, or has a name that is guaranteed to be in your employee database, or is a valid phone number, or is a correct email address, etc.
Again, validation is not defined by D, it is defined by the constraints YOUR PROGRAM puts on it.
|
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Wednesday, November 20, 2013 16:26:59 Walter Bright wrote:
> On 11/20/2013 3:16 AM, Jonathan M Davis wrote:
> > ValidatedString would then avoid any extra validation when iterating over the characters, though I don't know how much of an efficiency gain that would actually be given that much of the validation occurs naturally when decoding or using stride. It would have the downside that any function which specializes on strings would likely have to then specialize on ValidatedString as well. So, while I agree with the idea in concept, I'd propose that we benchmark the difference in decoding and striding without the checks and see if there actually is much difference. Because if there isn't, then I don't think that it's worth going to the trouble of adding something like ValidatedString.
> UTF validation isn't the only form of validation for strings. You could, for example, validate that the string doesn't contain SQL injection code, or contains a correctly formatted date, or has a name that is guaranteed to be in your employee database, or is a valid phone number, or is a correct email address, etc.
>
> Again, validation is not defined by D, it is defined by the constraints YOUR PROGRAM puts on it.
Yes, but we seemed to be discussing the possibility of having some kind of type in Phobos which indicated that the string had been validated for UTF correctness. I wouldn't expect other types of string validation to end up in Phobos.
And without the type for UTF validation being in Phobos and specialized on in Phobos functions, I don't think that I would ever want to use it, because in such a case, you lose out on all of the specialization that Phobos does for strings and are stuck with a range of dchar, which will force a lot of extra decoding even if some of the validation can be skipped, since it was already validated, whereas a number of Phobos functions are able to specialize on narrow strings and avoid decoding altogether. That performance boost would be lost if a string was wrapped in a UTFValidatedString without Phobos specializing on UTFValidatedString, and based on how decode and stride work, it looks to me like the decoding costs way more than the little bit of extra validation that is currently done as part of that such that avoiding the decoding is likely to be a much greater performance boost than avoiding those checks. And if that is indeed the case, I don't see much point to something like UTFValidatedString unless Phobos specializes for it like it specializes for narrow strings.
Other types of string validation might very well be worth doing without Phobos knowing about them, but having the wrapper type which indicates that that validation has been done still needs to be worth more than the performance hit of not being able to use naked strings anymore and losing any performance gains that come from the functions which specialize for narrow strings. And that's probably true for strings that just get passed around but probably isn't true for strings that end up being processed by range-based functions a lot.
- Jonathan M Davis
|
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Meta | On 2013-11-20 19:53, Meta wrote:
> Yes. It is very important not to allow direct access to the underlying
> value. This is important for ensuring that it is not put in an invalid
> state. This is a mistake that was made with std.typecons.Nullable,
> making it useless for anything other than giving a non-nullable type a
> null state (which, in fairness, is probably all that it was originally
> intended for).
In that case all string functionality needs to be provided inside the Validated struct. In addition to that we lose the beauty of UFCS, at least for functions expecting plain "string".
--
/Jacob Carlborg
|
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Simen Kjærås | On 2013-11-21 01:16, Simen Kjærås wrote:
> The result of re-validating is performance loss. The result of missed
> validation is a bug. Also, in just a few lines, you can make a version
> that will *not* decay to the original type:
>
> struct Validated(alias fn, T) {
>     private T _value;
>     @property inout
>     T value() {
>         return _value;
>     }
> }
>
> // validated() is identical to before.
>
> Sure, using it is a bit more verbose than using the unadorned type,
> which is why I chose to make the original version automatically decay.
> This is a judgment where sensible people may disagree, even with
> themselves on a case-by-case basis.
It's still accessible via "value".
--
/Jacob Carlborg
|
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob Carlborg | On Thursday, November 21, 2013 08:36:37 Jacob Carlborg wrote:
> On 2013-11-20 19:53, Meta wrote:
> > Yes. It is very important not to allow direct access to the underlying value. This is important for ensuring that it is not put in an invalid state. This is a mistake that was made with std.typecons.Nullable, making it useless for anything other than giving a non-nullable type a null state (which, in fairness, is probably all that it was originally intended for).
>
> In that case all string functionality needs to be provided inside the Validated struct. In addition to that we lose the beauty of UFCS, at least for functions expecting plain "string".
You could use alias this and alias the Validated struct to the underlying string, but if you did that, you'd probably end up having it escape the struct and used as a naked string the vast majority of the time, which would essentially defeat the purpose of the Validated struct.
- Jonathan M Davis
|
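The escape that `alias this` allows can be shown in a few lines. This is a sketch only: `Decaying` and `plainLength` are hypothetical names, and the wrapper leaves out validation entirely so that only the implicit conversion is on display.

```d
struct Decaying(T)
{
    private T _value;

    @property inout(T) get() inout { return _value; }

    // alias this makes the wrapper implicitly convertible to T.
    alias get this;
}

// Hypothetical function that knows nothing about the wrapper.
size_t plainLength(string s) { return s.length; }
```

With this definition, `string naked = Decaying!string("hello");` compiles, and so does `plainLength` called on the wrapper. That is convenient for UFCS and existing APIs, but from that point on the type system no longer records that the value was ever wrapped.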
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob Carlborg | On Thursday, 21 November 2013 at 07:36:38 UTC, Jacob Carlborg wrote:
> On 2013-11-20 19:53, Meta wrote:
>
>> Yes. It is very important not to allow direct access to the underlying
>> value. This is important for ensuring that it is not put in an invalid
>> state. This is a mistake that was made with std.typecons.Nullable,
>> making it useless for anything other than giving a non-nullable type a
>> null state (which, in fairness, is probably all that it was originally
>> intended for).
>
> In that case all string functionality needs to be provided inside the Validated struct. In addition to that we lose the beauty of UFCS, at least for functions expecting plain "string".
This is tricky business. Unfortunately, having the wrapper be able to degrade to its base type is at odds with providing compiler-enforced guarantees. We can't allow direct access to the underlying string, because the user could purposely or inadvertently put it in an invalid state. On the other hand, these opaque wrapper types can no longer be transparently substituted into existing code. One solution is copying the validated string to do arbitrary operations on, leaving the original validated string unchanged.
    auto validatedString = validate!isValidUTF(someString);
    //Doesn't work; Validated!string does not expose the string interface
    //auto invalidString = validatedString.map!(c => c - cast(char)int.max);
    //Also doesn't work
    //validatedString ~= cast(char)0xFFFF
    auto validatedCopy = validatedString.duplicate();
    //Do bad things with validatedCopy. validatedString remains unchanged and valid
|
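One way the copy-out approach might look as a compilable sketch. The post names only `duplicate()`, so the rest is assumed: the struct name `Opaque`, validating in the constructor, and the predicate `isAsciiOnly` are all hypothetical and chosen for illustration.

```d
import std.exception : enforce;

// Hypothetical predicate used only for illustration.
bool isAsciiOnly(const(char)[] s)
{
    foreach (c; s)
        if (c > 0x7F) return false;
    return true;
}

struct Opaque(alias fn, T)
{
    private T _value;

    // Assumption: validation happens once, at construction.
    this(T value)
    {
        enforce(fn(value), "validation failed");
        _value = value;
    }

    // No alias this and no accessor: the payload never leaves the wrapper.
    // Callers receive an independent, mutable copy instead.
    auto duplicate() const { return _value.dup; }
}
```

Mutating the result of `duplicate()` cannot affect the wrapped value, at the price of an allocation per copy. As with any D struct, `Opaque!(...).init` still bypasses the constructor, so this is not a complete guarantee.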
November 21, 2013 Re: Checking function parameters in Phobos | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 2013-11-21 08:46, Jonathan M Davis wrote:
> You could use alias this and alias the Validated struct to the underlying
> string, but if you did that, you'd probably end up having it escape the struct
> and used as a naked string the vast majority of the time, which would
> essentially defeat the purpose of the Validated struct.
Yeah, that's what needs to be avoided and is the reason "alias this" or a property returning the raw string cannot be used.
--
/Jacob Carlborg
|