Another day in the ordeal of cartesianProduct (page 2) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Another day in the ordeal of cartesianProduct (page 2)

October 28, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by monarch_dodra
in reply to Peter Alexander

monarch_dodra

Posted in reply to Peter Alexander

On Sunday, 28 October 2012 at 12:28:23 UTC, Peter Alexander wrote:
> Before pull request 880, walkLength accepted infinite ranges and just returned size_t.max.

Actually, before 880, you had an infinite loop, as "size_t.max" was a magic number that meant "walk to end"...

----

I've found recently trying to do coverage for std.container and others, that an "efficient" method of unittesting is to just try to instanciate the template in as many forms as possible. While you aren't actually verifying any run-time behavior, you ARE finding a lot of compile errors.

I haven't commited any of them, but I have been able to detect and fix quite some errors. They are [keyboar died]

October 28, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by monarch_dodra
in reply to monarch_dodra

monarch_dodra

Posted in reply to monarch_dodra

On Sunday, 28 October 2012 at 14:11:41 UTC, monarch_dodra wrote:
> ...[keyboar died]

Yeah, sorry. I was saying that by just doing unit tests that do
nothing more than declare the templates, you can get some very
good compile coverage.

It is not optimal, but it gives a good "bang for the buck", in
terms of invested effort (very little, just typing the types, but not testing anything).

October 28, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by monarch_dodra
in reply to H. S. Teoh

monarch_dodra

Posted in reply to H. S. Teoh

On Friday, 26 October 2012 at 22:43:44 UTC, H. S. Teoh wrote:
>
> This is inadequate.
>

The best we can get are users like you that take the time to report ;)

Thanks.

October 29, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by xenon325
in reply to H. S. Teoh

xenon325

Posted in reply to H. S. Teoh

On Saturday, 27 October 2012 at 16:19:43 UTC, H. S. Teoh wrote:
> [snip]
> I think there is some merit to being able to declare concepts; for
> example:
>
> 	// This concept matches any type that has fields that satisfy
> 	// what's specified inside the body.
> 	concept InputRange(T) {
> 		bool empty;	// this matches @property bool empty() too
> 		T front;
> 		void popFront();
> 	}
>
> 	auto myRangeBasedFunc(InputRange r) {
> 		r.popBack();	// compile-time error: InputRange does
> 				// not define .popBack
> 	}
>
> This way, *both* the user of the template and the template writer are
> kept "honest" (i.e., they both have to conform to the requirements of an
> input range). The compiler can thus statically check the template for
> correctness *before* it ever gets instantiated.
>
> Not only so, the compiler will be able to generate a meaningful error
> message when the templates fail to match -- it can tell the user "the
> template didn't match 'cos struct X that you tried to pass to it isn't
> an InputRange, nor a ForwardRange, ... etc.".

We've had a short discussion with Jacob Carlborg about almost exactly this syntax (http://forum.dlang.org/post/jukabm$1btd$1@digitalmars.com) in the context of better compiler messages.

I will quote my concerns:

=====

> void foo (InputRange range);

1. How would you signify `foo` is generic ?
   For complier it's probably possible - by type of `InputRange`,
        but that will be one more special case
   What about user ?

2. How would you put 2 constraints on `range` ?

3. Also you forgot to address:

>> template isInputRange(R)
>> {
>>      enum bool isInputRange = is(typeof(
>>      (inout int _dummy=0)
>>      {
>>          R r = void;  /// how would you express this
>>                       /// b.t.w. ?

   range.d comment on this line is "can define a range object".

===== end quote

However I do agree it would be nice if compiler verifies template body against constraints. IMNA compiler-writer, but I wonder if it's possible with current syntax.

October 29, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by Don Clugston
in reply to H. S. Teoh

Don Clugston

Posted in reply to H. S. Teoh

On 27/10/12 00:45, H. S. Teoh wrote:
> http://d.puremagic.com/issues/show_bug.cgi?id=8900
>
> :-(
>
> (The code there is called cartesianProd but it's the reduced code, so it
> doesn't really compute the cartesian product. But that's where it's
> from.)
>
> So far, the outstanding blockers for cartesianProduct are:
> 1) Compiler bug which causes unittest failure:
>
> 	std/range.d(4629): Error: variable lower used before set
> 	std/range.d(4630): Error: variable upper used before set
>
> (Jonathan had a pull request with a Phobos workaround for this, which I
> _think_ is already merged, but the autotester is still failing at this
> point. :-/)
>
> 2) Issue 8542 (crosstalk between template instantiations)
>
> 3) And now, issue 8900 (zip fails to compile with repeat(char[]))
>
> So there's still no joy for cartesianProduct. :-(
>
> I'm getting a bit frustrated with the Phobos bugs related to ranges and
> std.algorithm. I think we need to increase the number of unittests. And
> by that I mean, GREATLY increase the number of unittests. Most of the
> current tests are merely sanity tests for the most common usage
> patterns, most basic types, or tests added as a result of fixed bugs.
>
> This is inadequate.
>
> We need to actively unittest corner cases, rare combinations, unusual
> usages, etc.. Torture test various combinations of range constructs.
> Algorithms. Nested range constructs. Nested algorithms. Deliberately
> cook up nasty tests that try their best to break the code by using
> unusual parameters, unusual range-like objects, strange data, etc.. Go
> beyond the simple cases to test non-trivial things.  We need unittests
> that pass unusual structs and objects into the range constructs and
> algorithms, and make sure they actually work as we have been _assuming_
> they should.
>
> I have a feeling there are a LOT of bugs lurking in there behind
> overlooked corner cases, off by 1 errors, and other such careless slips,
> as well as code that only works for basic types like arrays, which
> starts breaking when you hand it something non-trivial.  All these
> issues must be weeded out and prevented from slipping back in.
>
> Here's a start:
>
> - Create a set of structs/classes (inside a version(unittest) block)
>    that are input, forward, bidirectional, output, etc., ranges, that are
>    NOT merely arrays.
>
> - There should be some easy way, perhaps using std.random, of creating
>    non-trivial instances of these things.  These should be put in a
>    separate place, perhaps outside the std/ subdirectory, where they can
>    be imported into unittest blocks by std.range, std.algorithm, whatever
>    else that needs extensive testing.
>
> - Use these ranges as input for testing range constructs and algorithms.
>
> - For best results, use a compile-time loop to loop over a given
>    combination of these range types, and run them through the same set of
>    tests. This will improve the currently spotty test coverage.  Perhaps
>    provide some templated functions that, given a set of range types
>    (from the above structs/classes) and a set of functions, run through
>    all combinations of them to make sure they all work. (We run unittests
>    separately anyway, we aren't afraid of long-running tests.)
>
>
> T


I think that unit tests aren't very effective without code coverage.

One fairly non-disruptive thing we could do: implement code coverage for templates. Currently, templates get no code coverage numbers.
We could do a code-coverage equivalent for templates: which lines actually got instantiated?
I bet this would show _huge_ gaps in the existing test suite.

October 29, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by Andrei Alexandrescu
in reply to Peter Alexander

Andrei Alexandrescu

Posted in reply to Peter Alexander

On 10/28/12 8:28 AM, Peter Alexander wrote:
> On Saturday, 27 October 2012 at 13:06:09 UTC, Andrei Alexandrescu wrote:
>> On 10/27/12 8:23 AM, Peter Alexander wrote:
>>> Retrofitting some sort of structure to templates will be a Herculean
>>> task, but I think it has to happen. It is clear to me that the
>>> development process we use now (write the template, try a few
>>> instantiations, pray) is unsustainable beyond simple templates.
>>
>> It's not clear to me at all. The mechanism works very well and is more
>> expressive than alternatives used by other languages.
>
> I'm not sure I can agree it works well.
>
> For example, here's what happened with bug 8900 mentioned in the OP:
>
> std.range.zip creates a Zip object, which has a Tuple member. Tuple has
> a toString function, which calls formatElement, which calls formatValue,
> which calls formatRange, which (when there's a range of characters) has
> a code path for right-aligning the range. To right-align the range it
> needs to call walkLength.
>
> The problem arises when you zip an infinite range of characters e.g.
> repeat('a').

This proves nothing at all. So this has to do with invoking walkLength against an infinite range. At the time I wrote walkLength, infinite ranges were an experimental notion that I was ready to remove if there wasn't enough practical support for it. So I didn't even think of the connection, which means the restriction wouldn't have likely made it into the definition of walkLength regardless of the formalism used.

> Before pull request 880, walkLength accepted infinite ranges and just
> returned size_t.max. Pull request 880 added a constraint to walkLength
> to stop it accepting infinite ranges. Suddenly, you cannot zip a range
> of infinite chars because of a seemingly unrelated change.

The connection is obvious and is independent qualitatively of other cases of "if you change A and B uses it, B may change in behavior too". It's a pattern old as dust in programming.

Anyway, I'm not sure whether this is clear as day: expressing constraints as Booleans or "C++ concepts" style or Gangnam style doesn't influence this case in the least.

> This would have been caught if there was a unit test, but there wasn't,
> and as Dijkstra says, "testing can be a very effective way to show the
> presence of bugs, but it is hopelessly inadequate for showing their
> absence." There are probably other places that are broken, and many
> changes in the future will just introduce more bugs without tests.
>
> Maybe we have different standards for "working well", but to me at
> least, this isn't what "working well" looks like.
>
> Working well in this case would look like this:
>
> - The person that put together pull request 880 would add the template
> constraint to walkLength.
> - On the next compile he would get this error: "formatRange potentially
> calls walkLength with an infinite range." (or something along those lines).
> - The person fixes formatRange, and all is well.
>
> No need for unit tests, it's all caught as soon as possible without need
> for instantiation.

But this works today and has nothing to do with "retrofitting structure to templates". Nothing. Nothing.


Andrei

October 29, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by Peter Alexander
in reply to Don Clugston

Peter Alexander

Posted in reply to Don Clugston

On Monday, 29 October 2012 at 14:47:37 UTC, Don Clugston wrote:
> One fairly non-disruptive thing we could do: implement code coverage for templates. Currently, templates get no code coverage numbers.
> We could do a code-coverage equivalent for templates: which lines actually got instantiated?
> I bet this would show _huge_ gaps in the existing test suite.

That's a good step forward, but I don't think it solves (what appears to me to be) the most common issue: incorrect/missing template constraints.

auto average(R)(R r) if (isForwardRange!R)
{
    return reduce!("a + b")(r) / walkLength(r);
}

(This code can have full coverage for some ranges, but also needs to also check !isInfinite!R, and probably other things).

These are difficult because every time a constraint changes, you need to go round and update the constraints of all calling functions. It's like when you change the types of a function argument, or return type; except that with template constraints, nothing is checked until instantiation.

October 29, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by Peter Alexander
in reply to Andrei Alexandrescu

Peter Alexander

Posted in reply to Andrei Alexandrescu

On Monday, 29 October 2012 at 15:48:11 UTC, Andrei Alexandrescu wrote:
> On 10/28/12 8:28 AM, Peter Alexander wrote:
>> For example, here's what happened with bug 8900 mentioned in the OP:
>>
>> std.range.zip creates a Zip object, which has a Tuple member. Tuple has
>> a toString function, which calls formatElement, which calls formatValue,
>> which calls formatRange, which (when there's a range of characters) has
>> a code path for right-aligning the range. To right-align the range it
>> needs to call walkLength.
>>
>> The problem arises when you zip an infinite range of characters e.g.
>> repeat('a').
>
> This proves nothing at all. So this has to do with invoking walkLength against an infinite range. At the time I wrote walkLength, infinite ranges were an experimental notion that I was ready to remove if there wasn't enough practical support for it. So I didn't even think of the connection, which means the restriction wouldn't have likely made it into the definition of walkLength regardless of the formalism used.

You're misunderstanding. walkLength used to allow infinite ranges. Recently, a commit added a constraint to walkLength to disallow infinite ranges. After this commit, all the unit tests still passed, but at least one bug was introduced (bug 8900).

That's the problem: a change occurred that introduced a bug, but the type system failed to catch it before the change was committed. Something like typeclasses would have caught the bug before commit and without unit tests.

> The connection is obvious and is independent qualitatively of other cases of "if you change A and B uses it, B may change in behavior too". It's a pattern old as dust in programming.
>
> Anyway, I'm not sure whether this is clear as day: expressing constraints as Booleans or "C++ concepts" style or Gangnam style doesn't influence this case in the least.

If I change A and B uses it, I expect B to give an error or at least a warning at compile time where possible. This doesn't happen. With template constraints, you don't get an error until you try to instantiate the template. This is too late in my opinion.

I would like this to give an error:

void foo(R)(R r) if (isForwardRange!R) { r.popBack(); }

It doesn't, not until you try to use it at least, and even then it only gives you an error if you try it with a non-bidirectional forward range. If this did give an error, bug 8900 (any many others) would never have happened.

The problem with constraints vs. something like typeclasses or C++ concepts is that constraint predicates are not possible to enforce pre-instantiation. They have too much freedom of expression.

>> Working well in this case would look like this:
>>
>> - The person that put together pull request 880 would add the template
>> constraint to walkLength.
>> - On the next compile he would get this error: "formatRange potentially
>> calls walkLength with an infinite range." (or something along those lines).
>> - The person fixes formatRange, and all is well.
>>
>> No need for unit tests, it's all caught as soon as possible without need
>> for instantiation.
>
> But this works today and has nothing to do with "retrofitting structure to templates". Nothing. Nothing.

It doesn't work today.

This isn't a fabricated example. This happened. walkLength changed its constraint, everything still compiled, and all the unit tests passed. There was no error, no hint that things were broken, nothing. Problems only started to arise when the poor OP tried to implement cartesianProduct.

This should never have happened. Typeclasses or C++ concepts wouldn't have allowed it to happen. This is the kind of structure that templates need.

October 29, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by Andrei Alexandrescu
in reply to Peter Alexander

Andrei Alexandrescu

Posted in reply to Peter Alexander

On 10/29/12 12:21 PM, Peter Alexander wrote:
> On Monday, 29 October 2012 at 15:48:11 UTC, Andrei Alexandrescu wrote:
>> On 10/28/12 8:28 AM, Peter Alexander wrote:
>>> For example, here's what happened with bug 8900 mentioned in the OP:
>>>
>>> std.range.zip creates a Zip object, which has a Tuple member. Tuple has
>>> a toString function, which calls formatElement, which calls formatValue,
>>> which calls formatRange, which (when there's a range of characters) has
>>> a code path for right-aligning the range. To right-align the range it
>>> needs to call walkLength.
>>>
>>> The problem arises when you zip an infinite range of characters e.g.
>>> repeat('a').
>>
>> This proves nothing at all. So this has to do with invoking walkLength
>> against an infinite range. At the time I wrote walkLength, infinite
>> ranges were an experimental notion that I was ready to remove if there
>> wasn't enough practical support for it. So I didn't even think of the
>> connection, which means the restriction wouldn't have likely made it
>> into the definition of walkLength regardless of the formalism used.
>
> You're misunderstanding. walkLength used to allow infinite ranges.
> Recently, a commit added a constraint to walkLength to disallow infinite
> ranges. After this commit, all the unit tests still passed, but at least
> one bug was introduced (bug 8900).

I thought I understood the matter rather well.

> That's the problem: a change occurred that introduced a bug, but the
> type system failed to catch it before the change was committed.
> Something like typeclasses would have caught the bug before commit and
> without unit tests.

Yes, but what gets ignored here is that typeclasses have a large cognitive cost to everyone involved. I think typeclasses generally don't pull their weight. Besides I think template constraints are more powerful because they operate on arbitrary Boolean expressions instead of types.

>> The connection is obvious and is independent qualitatively of other
>> cases of "if you change A and B uses it, B may change in behavior
>> too". It's a pattern old as dust in programming.
>>
>> Anyway, I'm not sure whether this is clear as day: expressing
>> constraints as Booleans or "C++ concepts" style or Gangnam style
>> doesn't influence this case in the least.
>
> If I change A and B uses it, I expect B to give an error or at least a
> warning at compile time where possible. This doesn't happen. With
> template constraints, you don't get an error until you try to
> instantiate the template.

That's also at compile time, just a tad later.

> This is too late in my opinion.

I think there's a marked difference between compile-time and run-time. Instantiation time does not make a big enough difference to bring a big gun in the mix.

> I would like this to give an error:
>
> void foo(R)(R r) if (isForwardRange!R) { r.popBack(); }
>
> It doesn't, not until you try to use it at least, and even then it only
> gives you an error if you try it with a non-bidirectional forward range.

So them a unittest with a minimal mock forward range should be created. I understand your concern, but please understand that typeclasses are too big a weight for what they do.

> If this did give an error, bug 8900 (any many others) would never have
> happened.

How many others? (Honest question.)

> The problem with constraints vs. something like typeclasses or C++
> concepts is that constraint predicates are not possible to enforce
> pre-instantiation. They have too much freedom of expression.

Freedom of expression is also a strength. (The problem with C++ concepts is that they almost sunk C++.)

>>> Working well in this case would look like this:
>>>
>>> - The person that put together pull request 880 would add the template
>>> constraint to walkLength.
>>> - On the next compile he would get this error: "formatRange potentially
>>> calls walkLength with an infinite range." (or something along those
>>> lines).
>>> - The person fixes formatRange, and all is well.
>>>
>>> No need for unit tests, it's all caught as soon as possible without need
>>> for instantiation.
>>
>> But this works today and has nothing to do with "retrofitting
>> structure to templates". Nothing. Nothing.
>
> It doesn't work today.
>
> This isn't a fabricated example.

It's just blown out of proportion.

> This happened. walkLength changed its
> constraint, everything still compiled, and all the unit tests passed.
> There was no error, no hint that things were broken, nothing. Problems
> only started to arise when the poor OP tried to implement cartesianProduct.
>
> This should never have happened. Typeclasses or C++ concepts wouldn't
> have allowed it to happen. This is the kind of structure that templates
> need.

We will not add C++ concepts or typeclasses to D.


Andrei

October 29, 2012

Re: Another day in the ordeal of cartesianProduct

Posted by bearophile
in reply to Andrei Alexandrescu

bearophile

Posted in reply to Andrei Alexandrescu

Andrei Alexandrescu:

> Yes, but what gets ignored here is that typeclasses have a large cognitive cost to everyone involved.

Such costs are an interesting discussion topic :-)

To put people up to speed a bit:
http://en.wikipedia.org/wiki/Type_class

A bit of explanations regarding Rust ones:
https://air.mozilla.org/rust-typeclasses/

Deeper info from one of the original designers:
http://homepages.inf.ed.ac.uk/wadler/topics/type-classes.html


>I think typeclasses generally don't pull their weight.<

Rust and Haskell designers think otherwise, it seems.


> We will not add C++ concepts or typeclasses to D.

I agree that maybe now it's too much late to add them to D2. But we are free to discuss about the topic. I didn't know much about typeclasses years ago when people were designing D2 :-( I have studied them only recently while learning Haskell.

Bye,
bearophile

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation