Policy for exposing range structs
March 25, 2016
If I understand it correctly, the current policy in Phobos is that range methods should use static nested structs to avoid the name clutter and document the capabilities of the returned ranges in the documentation.
However there are a lot of old functions that still use public structs, which leads to the rather confusing documentation output in std.range:

Chunks · chunks · Cycle · cycle ... evenChunks · EvenChunks · FrontTransversal · frontTransversal · ... · indexed · Indexed · ... · lockstep · Lockstep · .. · Recurrence · recurrence · RefRange · refRange · repeat · Repeat · ... · sequence · Sequence · SortedRange · ... · Take · take · ... Transversal · transversal · TransverseOptions · zip · Zip

(off-topic: it's quite interesting to see that sometimes the structs are before the method and sometimes after it)

Two arguments to keep exposing the structs are that (1) an API user can explicitly specify the return type (in contrast to auto) and (2) one can see the capabilities of the struct in the documentation.
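To make the two styles concrete, here is a minimal sketch. The names `Countdown`, `countdown` and `countdownNested` are made up for illustration and are not Phobos symbols; the point is only that the module-level struct can be named by the caller, while the nested one can only be held via `auto` or `typeof`.

```d
import std.algorithm.comparison : equal;
import std.range.primitives : isInputRange;

// Style A: a static struct nested in the function (a "Voldemort" type).
// Callers cannot spell its name.
auto countdownNested(int start)
{
    static struct Result
    {
        int n;
        @property bool empty() const { return n <= 0; }
        @property int front() const { return n; }
        void popFront() { --n; }
    }
    return Result(start);
}

// Style B: a public module-level struct plus a lowercase convenience
// function, as in the older std.range pairs (Take/take, Cycle/cycle, ...).
struct Countdown
{
    int n;
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

Countdown countdown(int start) { return Countdown(start); }

void main()
{
    Countdown a = countdown(3);   // style B: the type can be written out
    auto b = countdownNested(3);  // style A: only auto (or typeof) works

    static assert(isInputRange!Countdown);
    static assert(isInputRange!(typeof(b)));
    assert(equal(a, [3, 2, 1]));
    assert(equal(b, [3, 2, 1]));
}
```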

There are many cases where methods in these structs are optional and depend on the capabilities of the input range (e.g. backward, length, random access, ...). I could imagine that

1) We rework ddoc, s.t. it doesn't list these structs in the overview and adds the struct methods to the function (could get ugly)
2) We deprecate those exposed structs and make them private/nested (does anyone know whether it's common to depend on those structs?)

What are your thoughts on this matter?

(This is a follow-up from a short discussion with Jakob Ovrum on github [1].)

[1] https://github.com/D-Programming-Language/phobos/pull/4027
March 25, 2016
On 3/25/16 7:37 AM, Seb wrote:
> If I understand it correctly, the current policy in Phobos is that range
> methods should use static nested structs to avoid the name clutter and
> document the capabilities of the returned ranges in the documentation.
> However there are a lot of old functions that still use public structs,
> which leads to the rather confusing documentation output in std.range:
>
> Chunks · chunks · Cycle · cycle ... evenChunks · EvenChunks ·
> FrontTransversal · frontTransversal · ... · indexed · Indexed · .... ·
> lockstep · Lockstep · .. · Recurrence · recurrence · RefRange · refRange
> · repeat · Repeat · ... · sequence · Sequence · SortedRange · ... · Take
> · take · ... Transversal · transversal · TransverseOptions · zip · Zip
>
> (off-topic: it's quite interesting to see that sometimes the structs are
> before the method and sometimes after it)
>
> Two arguments to keep exposing the structs are that (1) an API user can
> explicitly specify the return type (in contrast to auto) and (2) one can
> see the capabilities of the struct in the documentation.
>
> There are many cases where methods in these structs are optional and
> depend on the capabilities of the input range (e.g. backward, length,
> random access, ...). I could imagine that
>
> 1) We rework ddoc, s.t. it doesn't list these structs in the overview
> and adds the struct methods to the function (could get ugly)
> 2) We deprecate those exposed structs and make them private/nested (does
> anyone know whether it's common to depend on those structs?)
>
> What are your thoughts on this matter?

We should actually be moving *away from* voldemort types:

https://forum.dlang.org/post/n96k3g$ka5$1@digitalmars.com

-Steve
March 25, 2016
On Fri, Mar 25, 2016 at 11:37:24AM +0000, Seb via Digitalmars-d wrote:
> If I understand it correctly, the current policy in Phobos is that range methods should use static nested structs to avoid the name clutter and document the capabilities of the returned ranges in the documentation.  However there are a lot of old functions that still use public structs, which leads to the rather confusing documentation output in std.range:

Actually, certain types of range structs *have* to be in module space rather than inside the function, because of compiler limitations / language restrictions. An example that comes to mind is a range with an alias parameter -- I forget the details, but basically there are some cases where this will not work correctly with certain uses if it's declared inside the function, so it has to be in module space.

This should probably be investigated more, though. It seems to be a grey area of semantics that could use a DIP.


[...]
> (off-topic: it's quite interesting to see that sometimes the structs are before the method and sometimes after it)

I think it's a recent change in convention that recommends declaring such structs after the function. In any case, declaration order in D is (generally -- mostly in module space) not important, so it's not a big deal.


> Two arguments to keep exposing the structs are that (1) an API user
> can explicitly specify the return type (in contrast to auto) and (2)
> one can see the capabilities of the struct in the documentation.

I think (1) is moot, because there's always typeof and ReturnType.  In fact, I'd argue that it's better to use typeof / ReturnType, because sometimes the user *shouldn't* need to know exactly what the template arguments are. For example, if some of the template arguments come from IFTI, where the exact types may not be immediately obvious from the user's POV, or if there are default template parameters that are a pain to spell out every single time.
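A small sketch of what I mean, assuming a made-up helper `squares` (not a Phobos function). Its return type involves the internal result type of `iota`, which user code cannot name directly, yet `ReturnType` and `typeof` are enough to declare variables and struct fields of that type.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.range : iota;
import std.traits : ReturnType;

int square(int x) { return x * x; }

// Hypothetical helper: the concrete return type is an instantiation of an
// internal map result over iota's internal result type.
auto squares(int n)
{
    return iota(n).map!square;
}

// Name the type without knowing any of the template arguments.
alias Squares = ReturnType!squares;

struct Holder
{
    Squares r; // usable as a field type
}

void main()
{
    typeof(squares(0)) viaTypeof = squares(4);
    static assert(is(typeof(viaTypeof) == Squares));

    auto h = Holder(squares(3));
    assert(equal(h.r, [0, 1, 4]));
    assert(equal(viaTypeof, [0, 1, 4, 9]));
}
```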

I argue that (2) is a bad idea, because a range ought to be opaque. The whole point of the range API is that it should be possible for the implementation to change drastically or be replaced by a different implementation, yet user code should still continue to Just Work(tm) without any modifications.  As such, user code should not depend on implementation details of the range, and really shouldn't know anything else about the range other than what is specified in the range API.  All the user ought to know is whether it's an input range, forward range, bidirectional range, etc. Anything more than that leads to breakage when the range implementation is updated/replaced, which breaks the whole premise of using ranges in the first place.


> There are many cases where methods in these structs are optional and depend on the capabilities of the input range (e.g. backward, length, random access, ...). I could imagine that
> 
> 1) We rework ddoc, s.t. it doesn't list these structs in the overview
> and adds the struct methods to the function (could get ugly)

No. The docs of the function should simply state what kind of range it returns -- input, forward, bidirectional, or random access. If it depends on what the function is given, the docs should explain under what conditions the function will return which kind of range.  Listing individual range methods for every range function in Phobos is a lot of needless repetition, and is error-prone.  If anything, we should write a page that explains exactly what methods are available to each kind of range, and just link to that from the function docs. That's what hyperlinks are for.
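As a rough illustration of "which kind of range, under what conditions", the checks below use the traits from std.range.primitives. `sumAll` is a made-up function that relies on nothing beyond the input range API; the lambdas and the sample array are arbitrary.

```d
import std.algorithm.iteration : filter, map;
import std.range.primitives;

// Hypothetical user code: it only requires an input range, so it keeps
// working whatever concrete type the range function returns.
long sumAll(R)(R r) if (isInputRange!R)
{
    long total;
    foreach (x; r)
        total += x;
    return total;
}

void main()
{
    auto arr = [1, 2, 3, 4];

    // map over an array passes random access and length through.
    auto m = arr.map!(x => x * 2);
    static assert(isRandomAccessRange!(typeof(m)));
    static assert(hasLength!(typeof(m)));

    // filter cannot know its length up front, so it offers less.
    auto f = arr.filter!(x => x % 2 == 0);
    static assert(isForwardRange!(typeof(f)));
    static assert(!hasLength!(typeof(f)));
    static assert(!isRandomAccessRange!(typeof(f)));

    assert(sumAll(m) == 20);
    assert(sumAll(f) == 6);
}
```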


> 2) We deprecate those exposed structs and make them private/nested
> (does anyone know whether it's common to depend on those structs?)
[...]

I don't know if it's common, but I *have* seen people spell out the type explicitly.  So changing this now will probably break existing code, and people will likely be resistant to that.  (Arguably, though, it would be for the better -- users really shouldn't need to know the exact range type.)


T
March 25, 2016
On Friday, 25 March 2016 at 14:07:41 UTC, Steven Schveighoffer wrote:
> We should actually be moving *away from* voldemort types:

Indeed, they basically suck.

Really, I think the structs ought to be where the bulk of the documentation is, and the function is just listed as a convenience method for creating them. That's really how it works anyway.
March 25, 2016
On 3/25/16 10:07 AM, Steven Schveighoffer wrote:
>
> We should actually be moving *away from* voldemort types:
>
> https://forum.dlang.org/post/n96k3g$ka5$1@digitalmars.com
>
> -Steve

Has this bug been submitted? -- Andrei
March 25, 2016
On 3/25/16 11:07 AM, Andrei Alexandrescu wrote:
> On 3/25/16 10:07 AM, Steven Schveighoffer wrote:
>>
>> We should actually be moving *away from* voldemort types:
>>
>> https://forum.dlang.org/post/n96k3g$ka5$1@digitalmars.com
>>
>
> Has this bug been submitted? -- Andrei

I'm not sure it's a bug that can be fixed. It's caused by the design of the way template name mangling is included.

I can submit a general "enhancement", but I don't know what it would say? Make template mangling more efficient? :)

I suppose having a bug report with a demonstration of why we should change it is a good thing. I'll add that.

-Steve
March 25, 2016
On Fri, Mar 25, 2016 at 11:40:11AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> On 3/25/16 11:07 AM, Andrei Alexandrescu wrote:
> >On 3/25/16 10:07 AM, Steven Schveighoffer wrote:
> >>
> >>We should actually be moving *away from* voldemort types:
> >>
> >>https://forum.dlang.org/post/n96k3g$ka5$1@digitalmars.com
> >>
> >
> >Has this bug been submitted? -- Andrei
> 
> I'm not sure it's a bug that can be fixed. It's caused by the design of the way template name mangling is included.
> 
> I can submit a general "enhancement", but I don't know what it would say?  Make template mangling more efficient? :)
> 
> I suppose having a bug report with a demonstration of why we should change it is a good thing. I'll add that.
[...]

We've been talking about compressing template symbols a lot recently, but there's a very simple symbol size reduction that we can do right now: most of the templates in Phobos are eponymous templates, and under the current mangling scheme that means repetition of the template name and the eponymous member in the symbol.  My guess is that most of the 4k symbol bloats come from eponymous templates. In theory, a single character (or something in that vicinity) ought to be enough to indicate an eponymous template. That should cut down symbol size significantly (I'm guessing about 30-40% reduction at a minimum, probably more in practice) without requiring a major overhaul of the mangling scheme.


T
March 25, 2016
On 3/25/16 12:18 PM, H. S. Teoh via Digitalmars-d wrote:
> On Fri, Mar 25, 2016 at 11:40:11AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>> On 3/25/16 11:07 AM, Andrei Alexandrescu wrote:
>>> On 3/25/16 10:07 AM, Steven Schveighoffer wrote:
>>>>
>>>> We should actually be moving *away from* voldemort types:
>>>>
>>>> https://forum.dlang.org/post/n96k3g$ka5$1@digitalmars.com
>>>>
>>>
>>> Has this bug been submitted? -- Andrei
>>
>> I'm not sure it's a bug that can be fixed. It's caused by the design
>> of the way template name mangling is included.
>>
>> I can submit a general "enhancement", but I don't know what it would
>> say?  Make template mangling more efficient? :)
>>
>> I suppose having a bug report with a demonstration of why we should
>> change it is a good thing. I'll add that.
> [...]
>
> We've been talking about compressing template symbols a lot recently,
> but there's a very simple symbol size reduction that we can do right
> now: most of the templates in Phobos are eponymous templates, and under
> the current mangling scheme that means repetition of the template name
> and the eponymous member in the symbol.  My guess is that most of the 4k
> symbol bloats come from eponymous templates. In theory, a single
> character (or something in that vicinity) ought to be enough to indicate
> an eponymous template. That should cut down symbol size significantly
> (I'm guessing about 30-40% reduction at a minimum, probably more in
> practice) without requiring a major overhaul of the mangling scheme.

I don't think it's that simple. For example:

auto foo(T)(T t)

Needs to repeat T (whatever it happens to be) twice -- once for the template foo, and once for the function parameter. If foo returns an internally defined type that can be passed to foo:

x.foo.foo.foo.foo

Each nesting multiplies the size of the symbol by 2 (at least, maybe even 3). So it's exponential growth. Even if you compress it to one character, having a chain of, say, 16 calls brings you to 65k characters for the symbol. We need to reduce the number of times the symbol is repeated, via some sort of substitution.
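For reference, doubling per level gives 2^16 = 65,536 for a chain of 16, which is where the 65k figure comes from. A quick way to watch the growth is to print the mangled-name length at each nesting depth. This is only a sketch: `Result` is a placeholder for the internally defined type, and the actual numbers depend on the compiler version and its mangling scheme, so none are quoted here.

```d
import std.stdio : writeln;

// foo returns an internally defined type that can be passed back to foo.
auto foo(T)(T t)
{
    static struct Result
    {
        T t;
    }
    return Result(t);
}

void main()
{
    int x = 1;
    auto a = x.foo;
    auto b = x.foo.foo;
    auto c = x.foo.foo.foo;
    auto d = x.foo.foo.foo.foo;

    // Length of the mangled type name at each nesting depth.
    writeln(typeof(a).mangleof.length);
    writeln(typeof(b).mangleof.length);
    writeln(typeof(c).mangleof.length);
    writeln(typeof(d).mangleof.length);

    // Whatever the scheme, each extra level makes the name longer.
    assert(typeof(a).mangleof.length < typeof(b).mangleof.length);
    assert(typeof(c).mangleof.length < typeof(d).mangleof.length);
}
```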

Added the bug report. Take a look and see what you think.

https://issues.dlang.org/show_bug.cgi?id=15831

-Steve
March 25, 2016
On Friday, 25 March 2016 at 18:20:12 UTC, Steven Schveighoffer wrote:
> On 3/25/16 12:18 PM, H. S. Teoh via Digitalmars-d wrote:
>> On Fri, Mar 25, 2016 at 11:40:11AM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>>> On 3/25/16 11:07 AM, Andrei Alexandrescu wrote:
>>>> On 3/25/16 10:07 AM, Steven Schveighoffer wrote:
>>>>>
>>>>> We should actually be moving *away from* voldemort types:
>>>>>
>>>>> https://forum.dlang.org/post/n96k3g$ka5$1@digitalmars.com
>>>>>
>>>>
>>>> Has this bug been submitted? -- Andrei
>>>
>>> I'm not sure it's a bug that can be fixed. It's caused by the design
>>> of the way template name mangling is included.
>>>
>>> I can submit a general "enhancement", but I don't know what it would
>>> say?  Make template mangling more efficient? :)
>>>
>>> I suppose having a bug report with a demonstration of why we should
>>> change it is a good thing. I'll add that.
>> [...]
>>
>> We've been talking about compressing template symbols a lot recently,
>> but there's a very simple symbol size reduction that we can do right
>> now: most of the templates in Phobos are eponymous templates, and under
>> the current mangling scheme that means repetition of the template name
>> and the eponymous member in the symbol.  My guess is that most of the 4k
>> symbol bloats come from eponymous templates. In theory, a single
>> character (or something in that vicinity) ought to be enough to indicate
>> an eponymous template. That should cut down symbol size significantly
>> (I'm guessing about 30-40% reduction at a minimum, probably more in
>> practice) without requiring a major overhaul of the mangling scheme.
>
> I don't think it's that simple. For example:
>
> auto foo(T)(T t)
>
> Needs to repeat T (whatever it happens to be) twice -- once for the template foo, and once for the function parameter. If foo returns an internally defined type that can be passed to foo:
>
> x.foo.foo.foo.foo
>
> Each nesting multiplies the size of the symbol by 2 (at least, maybe even 3). So it's exponential growth. Even if you compress it to one character, having a chain of, say, 16 calls brings you to 65k characters for the symbol. We need to remove the number of times the symbol is repeated, via some sort of substitution.
>
> Added the bug report. Take a look and see what you think.
>
> https://issues.dlang.org/show_bug.cgi?id=15831
>
> -Steve

These repetitions could be eliminated relatively easily (from a user's perspective, anyway; things might be more difficult in the actual implementation).

Two changes to the mangling:

1) `LName`s of length 0 (which currently cannot exist) mean to repeat the previous `LName` of the current symbol.

2) N `Number` is added as a valid `Type`, meaning "Type Back Reference". Basically, all instances of a struct/class/interface/enum type in a symbol's mangling get counted (starting from zero), and subsequent instances of that type can be referred to by N0, N1, N2, etc.

So given:

```
module mod;
struct Foo;
Foo* func(Foo* a, Foo* b);
```

`func` currently mangles as:
_D3mod4funcFPS3mod3FooPS3mod3FooZPS3mod3Foo

It would instead be mangled as:
_D3mod4funcFPS3mod3FooPN0ZPN0
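
The second rule can be sketched with a toy mangler. This is not the compiler's implementation: it only handles functions whose parameters and return type are pointers to structs, and the helper names (`lname`, `qualified`, `mangleCurrent`, `mangleProposed`) are invented for this example.

```d
import std.conv : to;

// LName: a length-prefixed identifier, e.g. "mod" -> "3mod".
string lname(string id) { return id.length.to!string ~ id; }

// Qualified name, e.g. ["mod", "Foo"] -> "3mod3Foo".
string qualified(string[] parts)
{
    string s;
    foreach (p; parts)
        s ~= lname(p);
    return s;
}

// Current scheme: every pointer-to-struct is spelled out in full.
string mangleCurrent(string[] fn, string[][] params, string[] ret)
{
    string s = "_D" ~ qualified(fn) ~ "F";
    foreach (p; params)
        s ~= "PS" ~ qualified(p);
    return s ~ "Z" ~ "PS" ~ qualified(ret);
}

// Proposed scheme: the first use of a type is spelled out and numbered
// from zero; later uses become N<index>.
string mangleProposed(string[] fn, string[][] params, string[] ret)
{
    size_t[string] seen;

    string typeRef(string[] name)
    {
        auto key = qualified(name);
        if (auto idx = key in seen)
            return "PN" ~ to!string(*idx);
        immutable n = seen.length;
        seen[key] = n;
        return "PS" ~ key;
    }

    string s = "_D" ~ qualified(fn) ~ "F";
    foreach (p; params)
        s ~= typeRef(p);
    return s ~ "Z" ~ typeRef(ret);
}

void main()
{
    auto fn = ["mod", "func"];
    auto foo = ["mod", "Foo"];
    assert(mangleCurrent(fn, [foo, foo], foo)
        == "_D3mod4funcFPS3mod3FooPS3mod3FooZPS3mod3Foo");
    assert(mangleProposed(fn, [foo, foo], foo)
        == "_D3mod4funcFPS3mod3FooPN0ZPN0");
}
```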

Nested template declarations would get numbered depth first as follows:

S7!(S2!(S0, S1), S6!(S3, S5!(S4)))

I have another idea for reducing the byte impact of template string value parameters, but it is a bit more complicated and I need to finish de-bugging and optimizing some code to make sure it will work as well as I think. I'll post more on that soon, I suspect.
March 25, 2016
On 3/25/16 3:04 PM, Anon wrote:
>
> These repetitions could be eliminated relatively easily (from a user's
> perspective, anyway; things might be more difficult in the actual
> implementation).
>
> Two changes to the mangling:

[snip]

Please add these ideas to the bug report! I'm not sure if it completely fixes this or not, but it's worth exploring options.

This is similar to how I was thinking we should approach this.

-Steve