August 20, 2013 Add duration parsing to core.time?
While working on a configuration file parser, I found myself trying to decide which units to use for various time variables (e.g. `expireInterval`), which is silly because we have an excellent Duration structure in core.time. I was pleased to discover that Duration has a toString method which prints a nice, human-readable description. Unfortunately, there appears to be no corresponding parse method. It turns out that one is surprisingly easy to write thanks to the existing functionality in std.conv: http://dpaste.dzfl.pl/1500b834

It appears that DPaste stumbles over the Unicode 'μs' in the units enum, so here's a test invocation and output:

    $ dmd -unittest test_duration.d && ./test_duration '12 hours, 30 minutes' '1w2d20m12h5m2s'
    12 hours and 30 minutes
    1 week, 2 days, 12 hours, 25 minutes, and 2 secs

I've made the implementation more flexible than simply parsing the very standard output of Duration.toString by adding more unit synonyms and making whitespace, commas, and 'and' optional. All this really requires is a sequence of digits followed by a unit name, possibly repeating, which allows the very compact form used in '1w2d20m12h5m2s'. All validation is performed by the two calls to std.conv.parse, so invalid strings should fail (e.g. 'four madeupunits').

One possible improvement is to support written-out numbers such as "seven" and "forty-two", but I suspect this would entail a much more involved implementation.

Thoughts on including something like this in core.time? My thought is that Duration could have a `this(string)` with a non-consuming version of this function for automatic to! support, in addition to providing parse.

Justin
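For readers who cannot reach the DPaste link, here is a minimal sketch of the "digits followed by a unit name, possibly repeating" idea. It is not the code from the paste: the names `parseDuration`, `toDuration`, and `skipSeparators` are hypothetical, the unit table is an assumed subset of synonyms, and unit names are matched with a plain `switch` instead of a second `std.conv.parse` call on an enum. Only the numeric part relies on `std.conv.parse`.

```d
import core.time : Duration, dur;
import std.algorithm.searching : skipOver;
import std.ascii : isAlpha;
import std.conv : ConvException, parse;
import std.string : stripLeft;

/// Hypothetical helper: drops optional whitespace, commas, and "and".
string skipSeparators(string s)
{
    for (;;)
    {
        s = s.stripLeft();
        if (s.skipOver(",") || s.skipOver("and"))
            continue;
        return s;
    }
}

/// Hypothetical unit table (an assumed subset of synonyms, ASCII only).
Duration toDuration(long count, string unit)
{
    switch (unit)
    {
        case "w", "week", "weeks":                    return dur!"weeks"(count);
        case "d", "day", "days":                      return dur!"days"(count);
        case "h", "hour", "hours":                    return dur!"hours"(count);
        case "m", "min", "mins", "minute", "minutes": return dur!"minutes"(count);
        case "s", "sec", "secs", "second", "seconds": return dur!"seconds"(count);
        case "ms", "msec", "msecs":                   return dur!"msecs"(count);
        default:
            throw new ConvException("unknown duration unit: '" ~ unit ~ "'");
    }
}

/// Hypothetical parser, not part of core.time: sums every count/unit pair.
Duration parseDuration(string s)
{
    s = skipSeparators(s);
    if (s.length == 0)
        throw new ConvException("empty duration string");

    Duration total; // Duration.init is zero
    while (s.length)
    {
        // std.conv.parse consumes the digits and throws ConvException otherwise.
        immutable count = parse!long(s);
        s = s.stripLeft();

        // The unit name is the following run of ASCII letters.
        size_t n = 0;
        while (n < s.length && isAlpha(s[n]))
            ++n;
        string unit = s[0 .. n];
        s = s[n .. $];

        total += toDuration(count, unit);
        s = skipSeparators(s);
    }
    return total;
}

unittest
{
    import std.exception : assertThrown;

    assert(parseDuration("12 hours, 30 minutes")
        == dur!"hours"(12) + dur!"minutes"(30));
    assert(parseDuration("1w2d20m12h5m2s")
        == dur!"weeks"(1) + dur!"days"(2) + dur!"hours"(12)
         + dur!"minutes"(25) + dur!"seconds"(2));
    assertThrown!ConvException(parseDuration("four madeupunits"));
}
```

Compiling with `dmd -unittest -main` runs the checks. Because the unit is read as a run of letters, this sketch needs a space or comma before 'and' (so '2dand3h' is rejected), and it leaves out 'μs' entirely.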
August 20, 2013 Re: Add duration parsing to core.time?
Posted in reply to Justin Whear
If such a function were added, it would be fromString on Duration, and it would accept the exact format that toString uses (and only that format). Anything more complicated would have to be part of a functionality relating to user-defined format strings, which I haven't finished yet. That'll probably end up in std.datetime.format at some point after I've finished splitting std.datetime.
- Jonathan M Davis
August 21, 2013 Re: Add duration parsing to core.time?
On Tuesday, August 20, 2013 15:35:20 Jonathan M Davis wrote:
> If such a function were added, it would be fromString on Duration, and it would accept the exact format that toString uses (and only that format). Anything more complicated would have to be part of a functionality relating to user-defined format strings, which I haven't finished yet. That'll probably end up in std.datetime.format at some point after I've finished splitting std.datetime.
And actually, I really don't like the idea of adding a function for parsing the result of Duration's toString. Duration's toString was intended for human legibility, not for being written out and then read in again. std.datetime has several to*String functions with corresponding from*String functions, but they're all in standard formats, whereas Duration's toString is not. So, if any kind of from*String is going to be added to Duration, then a standard format needs to be used and a corresponding to*String function created. There are several standard formats for dates and times, so I assume that there's one for durations as well, but I'd have to look into it. Preferably something from ISO 8601 would be used if it has a standard string format for durations, since that's the main ISO standard for time-related stuff.
In general, I'm very much opposed to functions which try and parse arbitrary strings as they're incredibly error-prone and have to guess at what you mean. In pretty much any case where the string was emitted by a computer in the first place rather than a human, that's just plain sloppy, and ideally, a human would be required to put a string in a standard format when inputting it (or input the values separately rather than as a string) in order to avoid interpretation errors.
- Jonathan M Davis
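ISO 8601 does define a string format for durations, of the form `PnYnMnDTnHnMnS` (for example `PT12H30M`). As a rough sketch of what a standard-format to*String/from*String pair could look like, assuming the hypothetical names `toISODuration` and `fromISODuration` (neither exists in core.time or std.datetime), the code below handles only the day, hour, minute, and second designators. Years and months are rejected because a Duration is a fixed length of time, negative durations are rejected, and fractional seconds are truncated.

```d
import core.time : Duration, dur;
import std.algorithm.searching : skipOver;
import std.conv : ConvException, parse, text;

/// Hypothetical: formats a non-negative Duration as P[n]DT[n]H[n]M[n]S.
string toISODuration(Duration d)
{
    long secs = d.total!"seconds"; // sub-second precision is dropped
    if (secs < 0)
        throw new ConvException("negative durations are not handled in this sketch");

    immutable days    = secs / 86_400; secs %= 86_400;
    immutable hours   = secs / 3_600;  secs %= 3_600;
    immutable minutes = secs / 60;     secs %= 60;

    string date = days ? text(days, "D") : "";
    string time;
    if (hours)   time ~= text(hours, "H");
    if (minutes) time ~= text(minutes, "M");
    if (secs)    time ~= text(secs, "S");

    if (date.length == 0 && time.length == 0)
        return "PT0S";
    return "P" ~ date ~ (time.length ? "T" ~ time : "");
}

/// Hypothetical: parses the subset of ISO 8601 durations emitted above.
Duration fromISODuration(string s)
{
    if (!s.skipOver("P") || s.length == 0)
        throw new ConvException("expected 'P' followed by at least one component");

    Duration total;
    bool inTime = false; // true once the 'T' separator has been seen
    while (s.length)
    {
        if (!inTime && s.skipOver("T"))
        {
            if (s.length == 0)
                throw new ConvException("'T' must be followed by a time component");
            inTime = true;
            continue;
        }

        immutable n = parse!long(s); // throws ConvException on non-digits
        if (s.length == 0)
            throw new ConvException("number without a designator");
        immutable c = s[0];
        s = s[1 .. $];

        if (!inTime && c == 'D')     total += dur!"days"(n);
        else if (inTime && c == 'H') total += dur!"hours"(n);
        else if (inTime && c == 'M') total += dur!"minutes"(n);
        else if (inTime && c == 'S') total += dur!"seconds"(n);
        else
            throw new ConvException("unsupported designator: '" ~ c ~ "'");
    }
    return total;
}

unittest
{
    immutable d = dur!"hours"(12) + dur!"minutes"(30);
    assert(toISODuration(d) == "PT12H30M");
    assert(fromISODuration("PT12H30M") == d);
    assert(fromISODuration(toISODuration(dur!"days"(9))) == dur!"days"(9));
}
```

This sketch does not enforce the order of components and does not accept the week form (`PnW`); a real implementation would need to decide both against the standard's text.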
August 21, 2013 Re: Add duration parsing to core.time?
Posted in reply to Jonathan M Davis | On Wednesday, 21 August 2013 at 06:46:49 UTC, Jonathan M Davis wrote:
> In general, I'm very much opposed to functions which try and parse arbitrary
> strings as they're incredibly error-prone and have to guess at what you mean.
> In pretty much any case where the string was emitted by a computer in the first
> place rather than a human, that's just plain sloppy, and ideally, a human
> would be required to put a string in a standard format when inputting it (or
> input the values separately rather than as a string) in order to avoid
> interpretation errors.
>
> - Jonathan M Davis
I agree completely and can speak from experience. We used wxWidgets' wxDateTime class for years at work, along with its ParseDateTime method, which accepts free-format strings. It was a source of never-ending problems for us until we finally stopped using it. The implementation was fine; it's just that dates are not amenable to unstructured reading. Date strings with locale information embedded in them may be doable, but they are basically nonexistent.
Date strings are a lot like string encodings. They are unsafe to use without knowing a definitive format/encoding.
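A small illustration of the point, using Phobos' `Date.fromISOExtString` from std.datetime: a string in a fixed format has exactly one reading, while a string such as "01/02/2013" (an example chosen here, not taken from the thread) could mean 2 January or 1 February depending on locale, and the strict parser rejects it instead of guessing.

```d
import std.datetime.date : Date;
import std.exception : collectException;

unittest
{
    // ISO 8601 extended format: year, month, day, with no locale guesswork.
    assert(Date.fromISOExtString("2013-08-21") == Date(2013, 8, 21));

    // Not in the expected format, so an exception is thrown rather than
    // one of the two plausible dates being silently picked.
    assert(collectException(Date.fromISOExtString("01/02/2013")) !is null);
}
```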
Copyright © 1999-2021 by the D Language Foundation