November 16, 2012
On Thursday, November 15, 2012 22:11:33 Timon Gehr wrote:
> On 11/14/2012 08:32 PM, Jonathan M Davis wrote:
> > On Wednesday, November 14, 2012 20:18:26 Timon Gehr wrote:
> >> That is a very imprecise approximation. I think it does not cover any ground: The day eg. 'array' will require this kind of non-transient element range is the day where I will write my own.
> > 
> > std.array.array _cannot_ work with a transient front. ...
> 
> It can work if 'transient' is over-approximated like suggested in the parent post.

How so? std.array.array _needs_ copies when it creates its array. If front is transient, it will end up with all elements being the same. So, either transient fronts need to be illegal, or std.array.array needs to be able to put a check for them in its template constraint to prevent them from being used with it. In either case, no range with a transient front would work with std.array.array.
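To make the failure concrete, here is a minimal sketch. `TransientLines` is a hypothetical stand-in for a range like ByLine (it is not Phobos code): its front always returns a slice of one reused buffer, and the 64-character limit is an assumption of the sketch.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.array : array;

// Hypothetical transient range: every front is a slice of the same buffer.
struct TransientLines
{
    private string[] source; // stand-in for the lines of a file
    private char[] buffer;   // shared storage, overwritten by popFront
    private size_t len;      // length of the current line inside buffer

    this(string[] source)
    {
        this.source = source;
        buffer = new char[](64); // assumption: no line exceeds 64 chars
        load();
    }

    // Copy the current line into the reused buffer.
    private void load()
    {
        if (source.length == 0) return;
        len = source[0].length;
        buffer[0 .. len] = source[0][];
    }

    @property bool empty() const { return source.length == 0; }
    @property char[] front() { return buffer[0 .. len]; }
    void popFront() { source = source[1 .. $]; load(); }
}

unittest
{
    // array stores the slices it is handed, so every element aliases the
    // one buffer and shows whatever was loaded last.
    auto broken = TransientLines(["one", "two", "six"]).array;
    assert(equal(broken, ["six", "six", "six"]));

    // Copying each front before storing it gives the expected result.
    auto fixed = TransientLines(["one", "two", "six"]).map!(l => l.idup).array;
    assert(fixed == ["one", "two", "six"]);
}
```

The block can be tried with `dmd -unittest -main -run`. The copy in the second half is exactly what a transient front is meant to avoid, which is why the two goals conflict.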

All that inferring transience from the type of front will do is make it so that std.array.array can prevent ranges with a transient front from compiling with it, and it would be forced to reject a bunch of ranges which were _not_ transient, because any mutable reference type would have to be treated as transient even if it wasn't. So, that doesn't make std.array.array work with ByLine or its ilk at all.

- Jonathan M Davis
November 16, 2012
On 11/14/2012 11:33 PM, Vladimir Panteleev wrote:
> But if the D community is interested in
> just a MediaWiki setup on the same server as the D forum (with me taking care of
> maintenance), I could look into that.

I don't know how much everyone else is interested in that, but I am.
November 16, 2012
On Thu, 2012-11-15 at 10:56 -0800, H. S. Teoh wrote:
> I don't like duplicating a whole bunch of algorithms in transalgorithm.

If it's true what you say, that usually there is no difference in efficiency, then there is no need for any duplication. But it is certainly better to offer a standard implementation (if needed) than to have every user do the duplication.

I know that coding by convention is discouraged, but some conventions work pretty well. I would suggest for any new ranges that provide a transient front to have them called something like byLineTransient, byChunkTransient(). So it is obvious from reading the code what you deal with. (I am not suggesting renaming the already existing methods, I just picked the first examples I could think of)

I personally would be very happy if all algorithms that can accept transient fronts did so, and if the only algorithms that reject them were mostly those for which it is obvious that they cannot support them. In conjunction with documentation and maybe a naming convention, I think this is already pretty good and, at least in my very humble opinion, good enough.



November 16, 2012
On 11/16/2012 01:58 AM, Jonathan M Davis wrote:
> On Thursday, November 15, 2012 22:11:33 Timon Gehr wrote:
>> On 11/14/2012 08:32 PM, Jonathan M Davis wrote:
>>> On Wednesday, November 14, 2012 20:18:26 Timon Gehr wrote:
>>>> That is a very imprecise approximation. I think it does not cover any
>>>> ground: The day eg. 'array' will require this kind of non-transient
>>>> element range is the day where I will write my own.
>>>
>>> std.array.array _cannot_ work with a transient front. ...
>>
>> It can work if 'transient' is over-approximated like suggested in the
>> parent post.
>
> How so? ...

The suggestion is to ban all potentially mutable indirections from 'non-transient' ranges. (this redefines what 'transient' means.) It cannot be derived that popFront will invalidate stuff just from the fact that the type system does not explicitly exclude that any code ever modifies it.

November 16, 2012
On Friday, November 16, 2012 19:06:05 Timon Gehr wrote:
> The suggestion is to ban all potentially mutable indirections from 'non-transient' ranges. (this redefines what 'transient' means.) It cannot be derived that popFront will invalidate stuff just from the fact that the type system does not explicitly exclude that any code ever modifies it.

The idea behind the suggestion is to attempt to determine transience based on the type and to assume that a front is transient if it's not definitively non-transient. It's not redefining the term transience. It's just that it cannot definitively determine transience based on the type alone. But I guess that I could see how you could think of that as redefining transience.

Regardless, std.array.array _cannot_ work on ranges with transient fronts. So, we have a few options:

1. Make transient fronts illegal, then std.array.array and its ilk never have to worry about it.

2. Make it so that all transient ranges must mark themselves as transient.

3. Introduce primitives like peekFront to enable ranges with transient fronts to work in general while allowing code to assume that front is non-transient.

4. Attempt to infer transience from the type of front.

#2 and #3 are too disruptive IMHO and complicate everything. #4 is quite feasible, but it's going to get a lot of false positives, meaning that functions like std.array.array won't work with ranges that they should be able to work with just fine. #1 obviously solves all of the problems caused by transient fronts, but it means that ranges such as ByLine and ByChunk are illegal (as they currently stand anyway).
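As a rough illustration of inferring transience from the type of front, and of the false positives it produces, here is a sketch. The names `mayHaveTransientFront` and `collect` are hypothetical and not part of Phobos; the check assumes that "could be transient" is approximated by "the element type has mutable indirections", using `std.traits.hasAliasing`.

```d
import std.array : array;
import std.range.primitives : ElementType, isInputRange;
import std.traits : hasAliasing;

// Hypothetical trait: treat a range as possibly transient whenever its
// element type contains mutable indirections.
enum bool mayHaveTransientFront(R) = hasAliasing!(ElementType!R);

// Hypothetical stand-in for a function like std.array.array that rejects
// anything the trait flags.
auto collect(R)(R r)
if (isInputRange!R && !mayHaveTransientFront!R)
{
    return r.array;
}

unittest
{
    // Value elements and immutable data are accepted.
    static assert(!mayHaveTransientFront!(int[]));
    static assert(!mayHaveTransientFront!(string[]));

    // False positive: char[][] is a perfectly stable array of arrays,
    // but its elements are mutable references, so it is rejected too.
    static assert( mayHaveTransientFront!(char[][]));
    static assert(!__traits(compiles, collect((char[][]).init)));

    assert(collect([1, 2, 3]) == [1, 2, 3]);
}
```

The check cannot tell a reused buffer apart from ordinary mutable data, so it rejects both.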

Personally, I favor making it so that ByLine and ByChunk overload opApply and reuse their buffers with it and then make their fronts non-transient. It pretty much avoids this whole issue while allowing the extra efficiency provided by a transient front in pretty much the only common use case that we have for such a range.
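A minimal sketch of what that could look like follows. `Lines` is a hypothetical type, not the actual ByLine implementation, and it iterates over an in-memory array instead of a file. The range primitives hand out a fresh copy on every call to front, while opApply reuses one buffer for the plain foreach case.

```d
import std.algorithm.comparison : equal;

// Hypothetical sketch: non-transient front plus a buffer-reusing opApply.
struct Lines
{
    private string[] source; // stand-in for the lines of a file

    // Range primitives: front allocates a fresh copy on every call.
    @property bool empty() const { return source.length == 0; }
    @property char[] front() const { return source[0].dup; }
    void popFront() { source = source[1 .. $]; }

    // foreach prefers opApply over the range primitives when both exist,
    // so a plain loop gets the reused buffer and avoids the allocations.
    int opApply(scope int delegate(char[]) dg)
    {
        char[] buffer;
        foreach (line; source)
        {
            if (buffer.length < line.length)
                buffer.length = line.length; // grow only when needed
            buffer[0 .. line.length] = line[];
            if (auto rc = dg(buffer[0 .. line.length]))
                return rc;
        }
        return 0;
    }
}

unittest
{
    auto lines = Lines(["one", "two", "six"]);

    // Going through front/popFront yields independent copies.
    char[][] kept;
    for (auto r = lines; !r.empty; r.popFront())
        kept ~= r.front;
    assert(equal(kept, ["one", "two", "six"]));

    // Going through foreach uses opApply, so stored slices alias the buffer.
    char[][] seen;
    foreach (line; lines)
        seen ~= line;
    assert(equal(seen, ["six", "six", "six"]));
}
```

One caveat with this sketch: generic code that itself iterates with `foreach`, rather than calling `front` and `popFront` directly, would also go through `opApply` and see the reused buffer.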

- Jonathan M Davis
November 16, 2012
On Fri, Nov 16, 2012 at 10:32:22PM +0100, Jonathan M Davis wrote: [...]
> Regardless, std.array.array _cannot_ work on ranges with transient fronts. So, we have a few options:
> 
> 1. Make transient fronts illegal, then std.array.array and its ilk never have to worry about it.
> 
> 2. Make it so that all transient ranges must mark themselves as transient.
> 
> 3. Introduce primitives like peekFront to enable ranges with transient fronts to work in general while allowing code to assume that front is non-transient.
> 
> 4. Attempt to infer transience from the type of front.
> 
> #2 and #3 are too disruptive IMHO and complicate everything. #4 is quite feasible, but it's going to get a lot of false positives, meaning that functions like std.array.array won't work with ranges that they should be able to work with just fine. #1 obviously solves all of the problems caused by transient fronts, but it means that ranges such as ByLine and ByChunk are illegal (as they currently stand anyway).

Regardless of how we solve it, ByLine and ByChunk need to be changed anyway. Except for #4, I suppose, but it's very leaky. I wouldn't go for #4 except as a last resort.


> Personally, I favor making it so that ByLine and ByChunk overload opApply and reuse their buffers with it and then make their fronts non-transient. It pretty much avoids this whole issue while allowing the extra efficiency provided by a transient front in pretty much the only common use case that we have for such a range.
[...]

#5 is to document transience prominently and indicate clearly which range functions are safe/unsafe to use with it. This option requires no code changes, only documentation changes. It *is* mere coding by convention, which is generally not desirable, but it does seem to be the simplest way to allow people to use transient ranges if they know what they're doing.

#5.1 is to document transience *and* make ByLine and ByChunk non-transient by default, and provide new transient versions of them for those people who need to squeeze out every last bit of performance (who also should know what they're doing so they can't complain if things break). This way, things are safe by default, yet people can consciously choose to work with transient ranges.

This may be another approach to balance the need for flexibility and yet keep things on the safe side by default. I don't like the prospect of having to duplicate parts of std.algorithm just because I have some code that produces transient ranges -- I *know* I'm never going to need to use algorithms on them that can't handle transience anyway. It's Not Nice to be forced to duplicate code just because an arbitrary decision was made to ban transience, or just because some leaky type inference decided my range was transient when it's not, etc.


T

-- 
You have to expect the unexpected. -- RL
November 17, 2012
On Friday, November 16, 2012 13:55:31 H. S. Teoh wrote:
> I don't like the prospect of
> having to duplicate parts of std.algorithm just because I have some code
> that produces transient ranges -- I *know* I'm never going to need to
> use algorithms on them that can't handle transience anyway. It's Not
> Nice to be forced to duplicate code just because an arbitrary decision
> was made to ban transience, or just because some leaky type inference
> decided my range was transient when it's not, etc..

The problem is that supporting transience complicates ranges even further, and they're already too complicated. And supporting every possible use case is likely to require yet further modifications to ranges, complicating them even further. We have to draw the line somewhere.

I have found reference type ranges to be very useful upon occasion to the point that not having them would be a major problem for some code, but if we were to start from scratch, I'd probably still argue against allowing them and just get rid of save entirely, because it's caused us a lot of problems. I really don't want to see ranges complicated even further.

- Jonathan M Davis
November 17, 2012
On Fri, Nov 16, 2012 at 08:52:39PM -0500, Jonathan M Davis wrote:
> On Friday, November 16, 2012 13:55:31 H. S. Teoh wrote:
> > I don't like the prospect of having to duplicate parts of std.algorithm just because I have some code that produces transient ranges -- I *know* I'm never going to need to use algorithms on them that can't handle transience anyway. It's Not Nice to be forced to duplicate code just because an arbitrary decision was made to ban transience, or just because some leaky type inference decided my range was transient when it's not, etc..
> 
> The problem is that supporting transience complicates ranges even further, and they're already too complicated. And supporting every possible use case is likely to require yet further modifications to ranges, complicating them even further. We have to draw the line somewhere.

Since that is the case, that really only leaves us with two choices: (1)
declare transient ranges outright illegal, or (2) make all default
ranges non-transient (e.g. ByLine, ByChunk), and let documentation warn
the user that transient ranges may not work with every algorithm.

I'm leaning towards (2), because every other option brought up so far sucks one way or another. I know coding by convention is frowned upon here, but clearly, transience is an issue that requires human insight to solve on a case-by-case basis, and no simple enforceable solution exists. Thus, the only choice seems to be to leave it up to the programmer to do the right thing. The redeeming point is that we will make byLine and byChunk non-transient by default, so that users who don't want to care, don't need to care -- the code will just do the right thing by default. We can then provide byLineFast and byChunkFast for people who want the extra performance, and know how to deal with transience correctly.

This solution requires no further code changes beyond making byLine and byChunk non-transient by default, which is what you have been pushing for anyway. And it doesn't have any of the drawbacks of the other approaches.


> I have found reference type ranges to be very useful upon occasion to the point that not having them would be a major problem for some code, but if we were to start from scratch, I'd probably still argue against allowing them and just get rid of save entirely, because it's caused us a lot of problems. I really don't want to see ranges complicated even further.
[...]

If we agree on (2), then ranges will be no more complicated than they are today. The two Phobos offenders, byLine and byChunk, will be non-transient by default. The documentation will warn the user that transient ranges are to be "used at your own risk", just like casting void pointers and other risky language features that are nevertheless sometimes necessary.


T

-- 
IBM = I'll Buy Microsoft!
November 21, 2012
On 11/16/12, Walter Bright <newshound2@digitalmars.com> wrote:
> On 11/14/2012 11:33 PM, Vladimir Panteleev wrote:
>> But if the D community is interested in
>> just a MediaWiki setup on the same server as the D forum (with me taking
>> care of
>> maintenance), I could look into that.
>
> I don't know how much everyone else is interested in that, but I am.
>

We should give it a try and see if it works out.
November 21, 2012
On Friday, 16 November 2012 at 03:01:46 UTC, Walter Bright wrote:
> On 11/14/2012 11:33 PM, Vladimir Panteleev wrote:
>> But if the D community is interested in
>> just a MediaWiki setup on the same server as the D forum (with me taking care of
>> maintenance), I could look into that.
>
> I don't know how much everyone else is interested in that, but I am.

Oops, saw your reply only today.

Here's something to start with:

http://dwiki.kimsufi.thecybershadow.net/

Spam protection is currently minimal, but I'm subscribed to the edit feed and will apply more as needed (so as not to hamper initial contributors who will be moving most of the data).

If you point wiki.dlang.org to it, it should become accessible from that address as well.