October 09, 2010
On 10/9/10 20:40 CDT, Denis wrote:
> I believe in D one should be able to write a RangedInt(int lowerBound,
> int upperBound) type

More precisely: BoundedInt(long lowerBound, ulong upperBound) so it can express any range of any number.

Andrei
October 09, 2010
On 10/9/10 20:13 CDT, Jonathan M Davis wrote:
> On Saturday 09 October 2010 17:15:06 Andrei Alexandrescu wrote:
>> I suggest you generate html documentation and attach it. I can put it on my website.
>
> I included it in the compressed tar file that I linked to. It has a Src folder and a Doc folder. Src holds the .d files and Doc holds the generated html files.

Great. All, refer to:

http://erdani.com/d/datetime/all.html
http://erdani.com/d/datetime/core.html
http://erdani.com/d/datetime/duration.html
http://erdani.com/d/datetime/interval.html
http://erdani.com/d/datetime/other.html
http://erdani.com/d/datetime/timepoint.html
http://erdani.com/d/datetime/timezone.html
http://erdani.com/d/datetime/unittests.html


Andrei
October 10, 2010
On Saturday 09 October 2010 12:27:26 Andrei Alexandrescu wrote:
> * I suggest even experimenting with strings instead of TUnit. Really TUnit is a vestige from C++ and Java, but we do accept string template parameters so how about
> 
> assert(convert!("years", "months")(2) == 24);
> 
> etc. Opinions?

One downside to this over TUnit is that it makes comparing time units harder (e.g. units < TUnit.year), but that could be gotten around easily enough with a function that does the comparison for you, and not much code is going to care. It's primarily of interest in template constraints.
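A comparison helper of the kind mentioned above could look roughly like the following. This is only a sketch: the names `unitRank`, `unitLess`, and `onlySubMonth`, and the exact list of unit strings, are assumptions for illustration and are not taken from the library under review.

```d
/// Position of a unit name in a smallest-to-largest ordering, or -1 if unknown.
/// The list of names is hypothetical.
int unitRank(string unit)
{
    static immutable string[] order = [
        "hnsecs", "usecs", "msecs", "seconds", "minutes",
        "hours", "days", "weeks", "months", "years"
    ];
    foreach (i, name; order)
        if (name == unit)
            return cast(int) i;
    return -1;
}

/// True if unit `a` is strictly smaller than unit `b`.
bool unitLess(string a, string b)
{
    immutable ra = unitRank(a);
    immutable rb = unitRank(b);
    assert(ra >= 0 && rb >= 0, "unknown time unit");
    return ra < rb;
}

/// Example of use in a template constraint: the helper is evaluated at compile time.
void onlySubMonth(string units)() if (unitLess(units, "months"))
{
}
```

With this, `units < TUnit.year` becomes `unitLess(units, "years")`, which is slightly wordier but works the same way inside a constraint.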

One issue that would arise from this is one which I'm beginning to think we need to address in the language somehow, and that is template constraint failures. When a template constraint fails, all the compiler tells you is that no version of the template matches the arguments that you're trying to instantiate with. That's highly uninformative. Ideally, it would tell you which portion of the constraint failed. For instance, by using strings instead of TUnit, you have the issue of getting the string exactly right. If the string had to be "year", and someone used "years" or "Year", then the template constraint would fail, and they would have no idea why. At least with the enum, the compiler will complain about the parameter that you're giving the template (though it complains that the mistyped value is not a property of int (e.g. that years is not a property of int if you used TUnit.years instead of TUnit.year), which isn't a very good message either - it's just that it's specific enough that it's easy to figure out). Using a string will result in worse error messages.

Still, I'm beginning to think that it's a good idea. If I do it, I'd likely change duration creation too. Right now, you'd do Dur.years() or Dur.months() or whatnot to create a duration, but it could become Dur!"year"() and Dur!"month"() (or something similar, since normally you wouldn't start a function with a capital letter - though it is specifically constructing a type in this case, so it's not completely off to use a capital letter) and you'd end up with fewer functions for creating durations, and could get rid of the Dur class (though in spite of your misgivings about it, I do quite like the idea of using classes for namespaces, at least in cases where a namespace would be useful). With TUnitConv removed, that would leave only Clock and IRange as namespace classes (and I really do think that it improves code readability to leave them that way).

- Jonathan M Davis
October 10, 2010
On Sun, Oct 10, 2010 at 03:40, Denis <2korden at gmail.com> wrote:
> I believe in D one should be able to write a RangedInt(int lowerBound, int upperBound) type that would make sure it contains a value in the range specified (that would require runtime checks, but I think that's acceptable, those checks can be disabled in release). Could be suitable for DayOfWeek, Month, etc.

What about posting that as a [challenge] on the main D newsgroup?

It's also interesting for floating point values between 0 and 1, for
probabilities or whatever field that's normalized.
Heck, even for defining letters, digits, etc.

alias Ranged!(char, 'a','z') letters;
alias Ranged!(char, 'A', 'Z') Letters;
alias Ranged!(char, '0', '9') digits;
October 10, 2010
On 10/10/10 7:56 CDT, Philippe Sigaud wrote:
> On Sun, Oct 10, 2010 at 03:40, Denis <2korden at gmail.com> wrote:
>> I believe in D one should be able to write a RangedInt(int lowerBound, int upperBound) type that would make sure it contains a value in the range specified (that would require runtime checks, but I think that's acceptable, those checks can be disabled in release). Could be suitable for DayOfWeek, Month, etc.
>
> What about posting that as a [challenge] on the main D newsgroup?

Good idea.

> It's also interesting for floating point values between 0 and 1, for
> probabilities or whatever field that's normalized.
> Heck, even for defining letters, digits, etc.
>
> alias Ranged!(char, 'a','z') letters;
> alias Ranged!(char, 'A', 'Z') Letters;
> alias Ranged!(char, '0', '9') digits;

Great. I still suggest using "Bounded" instead of "Ranged" to avoid confusion with anything range. In particular, ranges are open to the right but bounded are closed.
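To make the closed-bounds idea concrete, here is a minimal sketch of such a type. It is an illustration under assumptions, not the design anyone in this thread committed to: the member names (`get`, `checked`) and the choice of `assert` for the checks (so that they disappear in release builds, as Denis suggested) are mine.

```d
/// A value of type T constrained to the closed interval [lower, upper].
struct Bounded(T, T lower, T upper)
{
    static assert(lower <= upper, "lower bound must not exceed upper bound");

    private T value_ = lower; // default-initialized to a legal value

    this(T v)
    {
        value_ = checked(v);
    }

    void opAssign(T v)
    {
        value_ = checked(v);
    }

    /// Read access; through alias this the wrapper can be used where a T is expected.
    @property T get() const
    {
        return value_;
    }

    alias get this;

    // The check is an assert, so it is compiled out with -release.
    private static T checked(T v)
    {
        assert(lower <= v && v <= upper, "value outside [lower, upper]");
        return v;
    }
}

alias Bounded!(int, 1, 12) MonthOfYear;
alias Bounded!(char, 'a', 'z') LowerLetter;
```

Both bounds are legal values here (`MonthOfYear(12)` is fine), which is the difference from a half-open range that the name "Bounded" is meant to signal.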


Andrei
October 10, 2010
On 10/10/10 5:59 CDT, Jonathan M Davis wrote:
> On Saturday 09 October 2010 12:27:26 Andrei Alexandrescu wrote:
>> * I suggest even experimenting with strings instead of TUnit. Really TUnit is a vestige from C++ and Java, but we do accept string template parameters so how about
>>
>> assert(convert!("years", "months")(2) == 24);
>>
>> etc. Opinions?
>
> One downside to this over TUnit is that it makes comparing time units harder (e.g. units < TUnit.year), but that could be gotten around easily enough with a function that does the comparison for you, and not much code is going to care. It's primarily of interest in template constraints.

OK. Just mentioning the possibility.

> One issue that would arise from this is one which I'm beginning to think we need to address in the language somehow, and that is template constraint failures. When a template constraint fails, all the compiler tells you is that no version of the template matches the arguments that you're trying to instantiate with. That's highly uninformative. Ideally, it would tell you which portion of the constraint failed. For instance, by using strings instead of TUnit, you have the issue of getting the string exactly right. If the string had to be "year", and someone used "years" or "Year", then the template constraint would fail, and they would have no idea why. At least with the enum, the compiler will complain about the parameter that you're giving the template (though it complains that the mistyped value is not a property of int (e.g. that years is not a property of int if you used TUnit.years instead of TUnit.year), which isn't a very good message either - it's just that it's specific enough that it's easy to figure out). Using a string will result in worse error messages.

In fact that's not an issue. Template constraints were meant to avoid a template biting off more than it can chew. In this case, however, you want to cover the total set, so you can simply write:

long convert(string from, string to)(long) {
     static if (from == "seconds" && to == "minutes") {
     } else static if (...) {
     ...
     } else {
         static assert(0, "Invalid duration specifiers. ...");
     }
}
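For reference, a complete version of this pattern might look like the sketch below. It covers only the regular units (no months or years), and the helper template `hnsecsPer` and the set of accepted strings are assumptions made for the example. Note that the first argument of `static assert` is the condition, so the failing branch has to be written as `static assert(0, message)`.

```d
/// Number of hecto-nanoseconds (100 ns) in one unit. Hypothetical helper.
template hnsecsPer(string units)
{
    static if (units == "hnsecs")
        enum long hnsecsPer = 1;
    else static if (units == "usecs")
        enum long hnsecsPer = 10;
    else static if (units == "msecs")
        enum long hnsecsPer = 10_000;
    else static if (units == "seconds")
        enum long hnsecsPer = 10_000_000;
    else static if (units == "minutes")
        enum long hnsecsPer = 60L * 10_000_000;
    else static if (units == "hours")
        enum long hnsecsPer = 3_600L * 10_000_000;
    else static if (units == "days")
        enum long hnsecsPer = 86_400L * 10_000_000;
    else static if (units == "weeks")
        enum long hnsecsPer = 604_800L * 10_000_000;
    else
        static assert(0, "Invalid duration specifier: \"" ~ units ~ "\"");
}

/// Converts between regular units; converting to a larger unit truncates.
long convert(string from, string to)(long value)
{
    enum long f = hnsecsPer!from;
    enum long t = hnsecsPer!to;
    static if (f >= t)
        return value * (f / t);
    else
        return value / (t / f);
}
```

A misspelled unit such as `convert!("second", "minutes")` then fails with the custom message rather than a generic "no template matches" error.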

> Still, I'm beginning to think that it's a good idea. If I do it, I'd likely change duration creation too. Right now, you'd do Dur.years() or Dur.months() or whatnot to create a duration, but it could become Dur!"year"() and Dur!"month"() (or something similar, since normally you wouldn't start a function with a capital letter - though it is specifically constructing a type in this case, so it's not completely off to use a capital letter) and you'd end up with fewer functions for creating durations, and could get rid of the Dur class (though in spite of your misgivings about it, I do quite like the idea of using classes for namespaces, at least in cases where a namespace would be useful). With TUnitConv removed, that would leave only Clock and IRange as namespace classes (and I really do think that it improves code readability to leave them that way).

Here's an unrelated suggestion that might simplify things a fair amount: drop the "months" and the "years" durations. They are the only ones that are irregular. Then you have only one duration type that's always precise and you can drop the interval type. If someone wants to figure out e.g. the number of months between two dates, we can provide specific functions.


Andrei
October 10, 2010
On Sunday 10 October 2010 06:43:59 Andrei Alexandrescu wrote:
> Here's an unrelated suggestion that might simplify things a fair amount: drop the "months" and the "years" durations. They are the only ones that are irregular. Then you have only one duration type that's always precise and you can drop the interval type. If someone wants to figure out e.g. the number of months between two dates, we can provide specific functions.

It's not an altogether bad suggestion, but I think that it limits the expressiveness of the library. For instance, if you wanted to iterate over a range of Christmases, you would need to be able to indicate that you wanted a year between each date that you were iterating over. That is, each date in the range would be 1 year later than the previous one. You can't indicate that with a duration of days, because once you hit a leap year, you'll get December 24th instead of December 25th. You need to be able to indicate years.
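The leap-year drift is easy to reproduce. The sketch below uses `Date`, `dur`, and `add!"years"` from std.datetime as it later shipped in Phobos, which is an assumption relative to this thread: the draft API under discussion here (Dur.years() and friends) differs. The two function names are made up for the example.

```d
import core.time : dur;
import std.datetime : Date;

/// Steps forward by a fixed 365-day duration.
Date plus365Days(Date d)
{
    return d + dur!"days"(365);
}

/// Steps forward by one calendar year.
Date plusOneYear(Date d)
{
    d.add!"years"(1); // mutates the local copy
    return d;
}
```

Starting from 25 December 2011, the fixed duration lands on 24 December 2012 because 2012 contains a 29 February, while the calendar-year step stays on the 25th.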

Now, we _could_ have addXXX functions for months and years and use durations for everything else. It's a little bit awkward, but it would solve the range issue I just mentioned and still allow for the adding of months and years. The one problem that remains is that that makes it impossible to do something like Dur.months(5) + Dur.days(3) to indicate a duration of 5 months and 3 days. And creating a range function which iterates over successive dates which are 5 months and 3 days apart would be much harder because you would have to use both the add function and a duration to do it.

Eliminating MonthDuration and JointDuration in favor of simply having HNSecDuration simplifies a fair bit of code, but it makes it very hard to mix months and years and anything smaller anywhere. So, it's tempting, but I'm not entirely sure if it's a good idea or not.

By having duration creation be entirely by functions without requiring the programmer to really have to care about the types of the durations, for the most part, you manage to be able to mix months and smaller units without causing problems for the programmer. It _does_ complicate the library itself, and it is certainly possible for it to cause problems for the user (like if they don't use auto or if they try and pass a duration to a function that they write), but it manages to avoid most of them.

So, I'm a bit torn on the issue. The extra complication is annoying, but if we go for a single duration type (or two if you count TickDuration), there's definitely code which gets simplified. On the other hand, other code becomes harder to write. In terms of the code that I've written which uses durations (which is primarily the unit tests), I think that the current solution generally works quite well, but then again, I understand the issues of MonthDuration vs HNSecDuration and am not confused by the multiple duration types. It's likely to be more confusing for someone who hasn't dealt with it before. On the other hand, if you're typically using durations in time code and you suddenly need to deal with months or years, the lack of ability to create durations of months or years would also be confusing and potentially frustrating.

So, I'm not really sure what the best solution is. For the most part, I like it how it is, but it certainly isn't ideal (then again, "ideal" isn't really possible due to the variable number of days in months and years). I'll have to think about it, and I really don't know what the average programmer would prefer, or how limiting it will be to the not-so average programmer who is really trying to leverage the library if they can't have durations of months or years.

- Jonathan M Davis
October 10, 2010
On 10/9/10 19:57 CDT, Jonathan M Davis wrote:
> I ended up making it fairly easy to _use_ in generic code, I believe, but it doesn't do much to generate code, which does indeed result in a lot of generic and repetitive code.

Libraries that contain type names in function names have trouble with generic code.

> They were meant primarily to better document what a function took (like Year, MonthOfYear, DayOfMonth rather than int, int, int) and they restrict the range of what you can feed a function (like short instead of int), but they are rather ugly and using anything smaller than int gets a bit annoying to use as I found in the unit tests. It does have the advantage of making sure that the programmer restricts their values to the smaller size, but since the smaller size still isn't the exact range required for the given value (e.g. 1 - 12 for months or 1 - 31 for days), it doesn't really help all that much anyway. It also creates greater confusion when various other functions (like addXXX()) take longs, though if I did it correctly, that at least manages to indicate when you use a value in a specific range rather than x amount of a time unit (e.g. the exact month vs the number of months to add). It certainly isn't the greatest as is though, and I'm more than open to fixing it.
>
> The main one that I'm leery of fixing is StdTime (and AdjustedTime) since it has a specific meaning beyond long and it helps with the documentation. But ideally, programmers would use such functions very sparingly anyway, since SysTime wraps and deals with most of that for you.

My stance is simple: any global alias that defines a structured type in terms of an unstructured type is undesirable. This is because the alias lures the programmer into a false sense of security. So you'd have difficulty pushing through "alias Anything long;". If SysTime is always long, we may as well use long. If SysTime is system-dependent (a la size_t), then the case is stronger.

Getting back to Bounded, I'm thinking of allowing an optional string as the last argument:

alias Bounded!(long, long.min, long.max, "StdTime") StdTime;

This creates a type that's pretty much like long except it's distinct from any other type, even other types that have long.min and long.max as their bounds.

The advantage is that people can't pass an unadorned long to a function expecting an StdTime. They'd have to write foo(StdTime(100)). Talk about library-defined typedef! :o)
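A rough sketch of how the optional tag would produce distinct types follows. Everything beyond the `Bounded!(long, long.min, long.max, "StdTime")` alias itself is an assumption for illustration: the struct body, the second alias, and the function `toHnsecs` are hypothetical.

```d
/// Bounded value with an optional tag; different tags yield different types.
struct Bounded(T, T lower, T upper, string tag = "")
{
    private T value_ = lower;

    this(T v)
    {
        assert(lower <= v && v <= upper, "value outside [lower, upper]");
        value_ = v;
    }

    @property T get() const
    {
        return value_;
    }

    alias get this; // one-way conversion: Bounded -> T, never T -> Bounded
}

alias Bounded!(long, long.min, long.max, "StdTime") StdTime;
alias Bounded!(long, long.min, long.max, "AdjustedTime") AdjustedTime;

/// Hypothetical function that insists on an StdTime argument.
long toHnsecs(StdTime t)
{
    return t; // implicit conversion to long through alias this
}
```

Because the tag is part of the template arguments, `StdTime` and `AdjustedTime` are unrelated instantiations, and a plain `long` does not convert to either without an explicit `StdTime(100)`.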

>> The names are self-explanatory, no need to repeat them!
>
> I added the individual enum comments at the last minute as soon as I realized that for some reason putting ddoc comments on the enum itself didn't result in the enum values being shown in the documentation (highly negative in my opinion).

I think it ain't that bad. Put a comment on the first value and then "/// ditto" for the others. They will be nicely grouped.
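For example, the layout would be something like this (a sketch; the enum and its member names are placeholders rather than the ones in the submitted code):

```d
/// Represents the seven days of the week.
enum DayOfWeek
{
    /// The days of the week, Sunday first.
    sun,
    /// ditto
    mon,
    /// ditto
    tue,
    /// ditto
    wed,
    /// ditto
    thu,
    /// ditto
    fri,
    /// ditto
    sat
}
```

Ddoc then lists all seven members together under the single description instead of repeating a near-identical comment for each one.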

> For Clock, Dur, and IRange, however (which are similar namespaces), I really do
> think that the added namespacing is well worth it. It makes the code much
> clearer. Dur in particular would be bad to have its functions separated out,
> because then you have functions like years(), months(), and days() which are
> going to conflict all over the place with variable names.

I'm just wondering how come dates and times are the only API that's hairy enough to require carving namespaces within the whole Phobos. I'm just saying a solid argument should be brought up. Otherwise anyone could use this as a precedent to justify unnecessarily baroque APIs.

>> Suggestions: (a) throw TUnitConv away; (b) define this and friends at
>> global scope:
>>
>> long convert(TUnit from, TUnit to)(long);
>
> I'm not sure if convert is descriptive enough (TUnitConv.to!() has the distinct advantage of being quite explicit even if it isn't very pretty), but given how little is in TUnitConv, that's probably a good idea.

For one thing, "to" is not a good name when you need to specify the type of "from" as well.

auto a = to!int("123");

makes sense because you get to see the subject of "to". But

auto a = to!(TUnit.weeks, TUnit.seconds)(123);

does not parse remotely like English, whereas

auto a = convert!(TUnit.weeks, TUnit.seconds)(123);

can be reasonably evoked and remembered as "convert from weeks to seconds".

Finally, you need to put faith in overloading. Even if someone else defines some "convert" template, only yours takes two TUnit parameters.

>> That being said, convert() does go against the grain of elaborate typing for durations. I'd accept it as a low-level function, but really the library currently fosters elaborate types for durations... so there's a disconnect here.
>
> The library itself needs it regardless. As for user code, well sometimes they may very well need to deal with converting seconds to minutes and the like, so I think that it's worthwhile to have in the API, but it's not something that I'd expect to be heavily used.

Fine.

>> * As another random example:
>>
>> DayOfWeek getDayOfWeek(DayOfGregorianCal day) pure nothrow
>>
>> could/should be (in client use):
>>
>> get!(TUnit.dayOfWeek, TUnit.dayOfGregorianCal)(day);
>
> Since those functions would all have to have specialized versions of such templates, I'm not sure how much you gain with that approach - particularly when there aren't very many of those functions (and IIRC they're all package or private rather than in the API). But it may be worth doing something like that. I'm more skeptical of this suggestion than the others though.

What you gain is that the client only needs to remember "get" and the types, instead of the cross product of types. Not to mention advantages in generic code, although I don't have an example handy.

Fewer functions for a given functionality == better library.

>> * On to duration.d. It's simple: there's no need for three durations. Rename JointDuration to Duration and throw everything else away. I'd find myself hard pressed to produce situations in which efficiency considerations make it impossible to cope with a 128-bit representation.
>>
>> * If you do want to go fine-grained durations without the aggravation, use templates. Then you can have Duration!HNSeconds, Duration!Seconds, Duration!Months, whatever you want, without inflicting pain on yourself and others. If you play your cards right, such durations would match 100% the templates you define in core.d and elsewhere so you don't need to make a lot of implementation effort. BUT again I think one duration is enough. Maximum two - short duration and duration.
>
> I really don't think that this is going to work, and the problem is months and years (they frequently seem to be the problem) due to the fact that they don't have a uniform number of days. Originally, I had a Duration struct templated on TUnit, but it resulted in several problems - the biggest having to do with addition/subtraction.

Maybe we can throw month durations and year durations away and dedicate a few specialized functions to them? For example, given a date and a uniform duration, we can add the duration to the year to figure the end year etc.

> You can't convert months or years to units smaller than months, and you can't convert anything smaller than a month to months or years without a specific date.

Yah, so that fuels my argument that month durations don't make sense. They are exceptional so maybe we should just give up integrating them into a uniform concept.

> So, you can't add a Duration!month and Duration!day no matter how it may seem like you should be able to. Many durations _could_ be added - such as Duration!day and Duration!hour - but the variable sizes of months and years pose a big problem for that. To fix the problem, you need a duration which joins the two - hence JointDuration. That way, you can write user code like Dur.years(7) + Dur.days(2) and have it work. The way I'd originally done that resulted in a combinatorial explosion of joint duration types (it was effectively templated on the types of duration that you'd added together to create it), so I just combined the months+ and sub-months durations into two types. It cuts down on the amount of generated code at least.

I see. How much would it hurt us to drop months and years durations, as well as methods such as years() in any duration?


Andrei
October 10, 2010
On 10/10/10 9:23 CDT, Jonathan M Davis wrote:
> On Sunday 10 October 2010 06:43:59 Andrei Alexandrescu wrote:
>> Here's an unrelated suggestion that might simplify things a fair amount: drop the "months" and the "years" durations. They are the only ones that are irregular. Then you have only one duration type that's always precise and you can drop the interval type. If someone wants to figure out e.g. the number of months between two dates, we can provide specific functions.
>
> It's not an altogether bad suggestion, but I think that it limits the expressiveness of the library. For instance, if you wanted to iterate over a range of Christmases, you would need to be able to indicate that you wanted a year between each date that you were iterating over. That is, each date in the range would be 1 year later than the previous one. You can't indicate that with a duration of days, because once you hit a leap year, you'll get December 24th instead of December 25th. You need to be able to indicate years.

In this example I'd iterate over years and construct Date(year, Month.dec, 25) on the fly. So that's not a compelling example, but I'm not saying one doesn't exist.
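Sketched out, that approach could read as follows. It relies on `Date` and `Month` from std.datetime as later released in Phobos plus `iota` and `map`, all of which are assumptions relative to the draft being reviewed; the function name `christmases` is invented for the example.

```d
import std.algorithm.iteration : map;
import std.datetime : Date, Month;
import std.range : iota;

/// Lazily yields 25 December for every year in [firstYear, lastYear].
auto christmases(int firstYear, int lastYear)
{
    return iota(firstYear, lastYear + 1)
        .map!(year => Date(year, Month.dec, 25));
}
```

No duration is involved at all: the year is the iteration variable, and each date is constructed directly, so leap years cannot shift the day.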

> Now, we _could_ have addXXX functions for months and years and use durations for everything else. It's a little bit awkward, but it would solve the range issue I just mentioned and still allow for the adding of months and years. The one problem that remains is that that makes it impossible to do something like Dur.months(5) + Dur.days(3) to indicate a duration of 5 months and 3 days. And creating a range function which iterates over successive dates which are 5 months and 3 days apart would be much harder because you would have to use both the add function and a duration to do it.

I think we're safe to drop things like Dur.months(5) + Dur.days(3). I can hardly think of a use case for such, and I think it's much better to accommodate them in other ways - e.g. allow addMonths to dates. (There would be no addDays because days are regular durations so + and += should suffice. Yay!)

> Eliminating MonthDuration and JointDuration in favor of simply having HNSecDuration simplifies a fair bit of code, but it makes it very hard to mix months and years and anything smaller anywhere. So, it's tempting, but I'm not entirely sure if it's a good idea or not.

Let's enumerate what we stand to lose.

> By having duration creation be entirely by functions without requiring the programmer to really have to care about the types of the durations, for the most part, you manage to be able to mix months and smaller units without causing problems for the programmer. It _does_ complicate the library itself, and it is certainly possible for it to cause problems for the user (like if they don't use auto or if they try and pass a duration to a function that they write), but it manages to avoid most of them.

It also complicates users' lives. One thing I've learned by working on Phobos is that people _do_ look at and judge library code. And clearly API sizes are an issue. Again, enumerating what we stand to lose and possibly serving that through specialized functions might be the easiest way to go.

> So, I'm a bit torn on the issue. The extra complication is annoying, but if we go for a single duration type (or two if you count TickDuration), there's definitely code which gets simplified. On the other hand, other code becomes harder to write.

Let's see which, and let's see how we can improve that without aggravating anyone.

> In terms of the code that I've written which uses durations
> (which is primarily the unit tests), I think that the current solution generally
> works quite well, but then again, I understand the issues of MonthDuration vs
> HNSecDuration and am not confused by the multiple duration types. It's likely to
> be more confusing for someone who hasn't dealt with it before. On the other
> hand, if you're typically using durations in time code and you suddenly need to
> deal with months or years, the lack of ability to create durations of months or
> years would also be confusing and potentially frustrating.

My suspicion is that there are very few users who are sophisticated enough to manipulate months/years durations (with all the weirdnesses), yet too simple-minded to code their way out of a paper bag.

> So, I'm not really sure what the best solution is. For the most part, I like it how it is, but it certainly isn't ideal (then again, "ideal" isn't really possible due to the variable number of days in months and years). I'll have to think about it, and I really don't know what the average programmer would prefer, or how limiting it will be to the not-so average programmer who is really trying to leverage the library if they can't have durations of months or years.

Again, the action item is to find realistic use cases.


Andrei
October 10, 2010
On Sunday 10 October 2010 07:45:14 Andrei Alexandrescu wrote:
> Again, the action item is to find realistic use cases.

Okay. If all you're doing is adding to a time point, then addYears() and addMonths() work just fine. It's a bit odd to have addYears() and addMonths() but use durations for all of the other units, but it's likely less confusing than having the separate types MonthDuration and HNSecDuration.

Now, it doesn't work as well for generic code to have add functions for months and years but durations for everything else. You have to special case years and months whereas with MonthDuration and HNSecDuration you don't have to as long as you templatize your duration types. With MonthDuration and HNSecDuration, most of the special casing is in the time point and duration types, so the programmer doesn't have to worry about special casing stuff.

The really big gain of being able to do something like Dur.months(5) + Dur.days(2) and have a duration with both months and days is that it makes it a lot easier to do fancy stuff with ranges. You can specify any combination of time units to specify very exact durations between time points. Without durations with months or years, that becomes much more awkward. I'm not sure how likely it is that someone would be interested in iterating over successive dates which are odd durations apart like 2 months and 7 days though. It may make it easier to do a proper implementation of date recurrence patterns (RFC 2445, I believe), but since that's already going to require a potentially extensive library to make that work, it could likely abstract away any of the difficulties in having a range which iterates over time points which are apart by more eccentric durations.

It wouldn't be all that hard to create a function which took the number of years, months, and a duration to generate a range function, and that would be far more awkward than Dur.months(5) + Dur.days(3), but it wouldn't be all that bad, and it would seriously simplify a lot of the rest of std.datetime.
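Such a function might be sketched like this. It is written against `Date.add` and `Duration` from std.datetime and core.time as they later shipped in Phobos, which is an assumption here, and the name `stepBy` and the order in which the parts are applied (years, then months, then the regular duration) are choices made for the example.

```d
import core.time : Duration, dur;
import std.datetime : Date;

/// Returns a step function that advances a date by the given number of
/// calendar years and months, and then by a regular duration.
Date delegate(Date) stepBy(int years, int months, Duration rest)
{
    return (Date d) {
        d.add!"years"(years);   // calendar arithmetic first
        d.add!"months"(months);
        return d + rest;        // then the fixed-length part
    };
}
```

The call `stepBy(0, 5, dur!"days"(3))` plays the role of Dur.months(5) + Dur.days(3): clumsier to write, but it needs only the one regular duration type.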

Really, the only major case that I can think of is that being able to generate arbitrary durations is useful when dealing with ranges. If you're simply trying to add a duration to a time point, it's not much worse to have to explicitly use addYears and/or addMonths in addition to adding a duration to the time point (a bit annoying perhaps, but not a big deal). Generic code also does better with durations (though we could always keep addTUnit() and that would solve the generic code issue just fine), but I question that that's all that big a deal.

Part of me would definitely like to keep MonthDuration, HNSecDuration, and JointDuration as is, but the more I think about it, the more it looks like it wouldn't be all that bad to have to work around their lack, and while using the durations as they are really isn't all that hard, it's going to confuse a fair number of people when they first encounter them. So, I'm beginning to lean towards just simplifying it to HNSecDuration (though I'd rename it as Duration in that case). It's annoying in some cases, but it'll definitely cut down on the learning curve for using the library.

Much as auto pretty much saves the day, it's pretty much a guarantee that there are programmers who will want to (or will think that they have to) actually worry about the specific duration types, no matter how good a job I do at making it so that they don't have to worry about it. I know that there are programmers who have worried about that sort of thing with std.algorithm and been scared off by it, when they don't really need to care thanks to auto.

- Jonathan M Davis