April 05, 2012
On Thu, 05 Apr 2012 17:02:13 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Thursday, April 05, 2012 15:26:14 Steven Schveighoffer wrote:
>> On Thu, 05 Apr 2012 14:33:22 -0400, Jonathan M Davis <jmdavisProg@gmx.com>
>>
>> wrote:
>> > On Thursday, April 05, 2012 11:30:26 Steven Schveighoffer wrote:
>> >> A couple issues that still need consideration:
>> >>
>> >> 1. If std.algorithm the module becomes std.algorithm the package,  
>> what
>> >> happens with ddoc? We probably *do* need a compiler solution to this.
>> >
>> > That's assuming that you insist on keeping all of the documentation in
>> > one
>> > file. That arguably defeats the purpose of splitting up the modules.  
>> If
>> > there
>> > isn't enough in the module to split the documentation, then why do you
>> > need to
>> > split the module?
>>
>> I thought the whole point was code maintenance? Not documentation
>> splitting... I would have expected people to continue to treat
>> std.algorithm like it was one module, even though it imports several
>> sub-modules for its implementation.
>
> If the module isn't large enough to be split for documentation, I find it hard
> to believe that it needs to be split for maintenance.

Why do we ever need to split modules for documentation?  Just fix the doc generator so it's not as monolithic.  For instance, have one page per class or struct.

> And if all you care
> about is sub-modules for implementation and want all of the functions in the
> same module still, then this DIP is pointless. All you have to do is declare
> undocumented sub-modules which hold the various implementations and have the
> actual module call them. We already do this sort of thing in Phobos to get
> around static destructors screaming about circular dependencies.

You are starting to see my point :)  But I think the issue is not so much that you are splitting the implementation, but splitting up the API into related modules.

For example, std.container.  Imagine we have a robust set of 15 containers.  Why should those all be in one file?  They have nothing to do with each other except that they are in the same namespace.  Why does RedBlackTree need to be able to access the internals of Array?

As the writer of RedBlackTree, I want to be able to test and develop my container without having to worry about the rest of std.container.  But I also would like to have the FQN of it to be std.container.RedBlackTree.  Public imports allow this *today* without any changes.

Note that size isn't so much an issue for me as being able to compartmentalize and develop individual pieces of a module.

-Steve
April 05, 2012
On 4/5/12 4:26 PM, Steven Schveighoffer wrote:
> On Thu, 05 Apr 2012 17:00:56 -0400, Jonathan M Davis
> <jmdavisProg@gmx.com> wrote:
>
>> On Thursday, April 05, 2012 15:30:17 Steven Schveighoffer wrote:
>
>>> I don't see how. Just move the code into another module and publicly
>>> import that module from std/algorithm.d. Problem pretty much solved.
>>
>> The issue is code organization. If you want to split up std.algorithm (or
>> std.datetime or whatever) into multiple modules, you have to create a new
>> package with a completely different name with no connection to the
>> original
>> save for the fact that the original publicly imports it.
>
> My view is that people will not import the smaller modules, they will
> only ever import std.algorithm.

I think we should be looking for a solution that not only allows replacing module -> package transparently, but also allows people to import the newly introduced fine-grained modules.

Andrei

April 05, 2012
On Thursday, April 05, 2012 17:33:50 Steven Schveighoffer wrote:
> On Thu, 05 Apr 2012 17:02:13 -0400, Jonathan M Davis <jmdavisProg@gmx.com>
> > If the module isn't large enough to be split for documentation, I find
> > it hard
> > to believe that it needs to be split for maintenance.
> 
> Why do we ever need to split modules for documentation? Just fix the doc generator so it's not as monolithic. For instance, have one page per class or struct.

That may or may not be desirable (certainly in the case of smaller types, I'd argue that it isn't). By doing it on a module basis, you have far more control. But regardless, that would be a major change to ddoc.

In either case, the size of the documentation page for a module is currently closely tied to the number of public symbols in the module, so if you have a large API, it can become desirable to split it up simply because the documentation page for it is too large. And by splitting up the API, you fix that problem. Not to mention, if the module is mostly free functions, putting the documentation for each class or struct on its own page doesn't help anyway. So, while that may be a good change to ddoc in at least some cases, it doesn't solve the problem in general. For instance, it would help std.datetime, but it wouldn't help std.algorithm at all.

> > And if all you care
> > about is sub-modules for implementation and want all of the functions in
> > the
> > same module still, then this DIP is pointless. All you have to do is
> > declare
> > undocumented sub-modules which hold the various implementations and have
> > the
> > actual module call them. We already do this sort of thing in Phobos to
> > get
> > around static destructors screaming about circular dependencies.
> 
> You are starting to see my point :) But I think the issue is not so much that you are splitting the implementation, but splitting up the API into related modules.

As I understand it, the entire point of this DIP is to enable splitting up the API cleanly without breaking code. The implementation can already be split up seamlessly.

- Jonathan M Davis
April 05, 2012
On Thursday, April 05, 2012 16:43:24 Andrei Alexandrescu wrote:
> On 4/5/12 4:26 PM, Steven Schveighoffer wrote:
> > On Thu, 05 Apr 2012 17:00:56 -0400, Jonathan M Davis
> > 
> > <jmdavisProg@gmx.com> wrote:
> >> On Thursday, April 05, 2012 15:30:17 Steven Schveighoffer wrote:
> >>> I don't see how. Just move the code into another module and publicly import that module from std/algorithm.d. Problem pretty much solved.
> >> 
> >> The issue is code organization. If you want to split up std.algorithm (or
> >> std.datetime or whatever) into multiple modules, you have to create a new
> >> package with a completely different name with no connection to the
> >> original
> >> save for the fact that the original publicly imports it.
> > 
> > My view is that people will not import the smaller modules, they will only ever import std.algorithm.
> 
> I think we should be looking for a solution that not only allows replacing module -> package transparently, but also allows people to import the newly introduced fine-grained modules.

Yeah. If all we want to do is continue to always import std.algorithm, then the DIP is more or less pointless. It's the splitting of the API among multiple packages while allowing the programmer to either call/import it like he has been or to call/import it from the new module explicitly that the DIP is trying to enable.

If we make it possible to split std.algorithm into multiple modules in place, then we avoid breaking code, keep the code organized in the same hierarchy - only more detailed - and allow the programmer to import on either a roughly or finely grained level, depending on which they prefer. And I really like how this could enable us to have package-specific documentation, whereas all documentation is currently module-specific and doesn't enable us to provide a page which gives an overview of a package. That's not always necessary, but there are times when it would be quite nice (e.g. std.datetime).

- Jonathan M Davis
April 05, 2012
On 2012-04-05 21:43:24 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

> I think we should be looking for a solution that not only allows replacing module -> package transparently, but also allows people to import the newly introduced fine-grained modules.

I think it'd be valuable too. But how do you do that without creating ambiguous fully qualified names?

One way would be that by importing a module inside a package, the compiler would check first that the package file contains no symbol with that module name. In other words, if you import std.algorithm.sorting, it'd check that "algorithm" is not a symbol defined in the "std" package file (if it exists for std) and that no "sorting" symbol exists in the "std.algorithm" package file. That would work. But the problem is that if the package file publicly imports all of its submodules, then you'll need to parse all the imported files to determine if the module you're trying to import conflicts with a package-level symbol.

A more practical option would be to limit the package files to only two things:

1. a list of modules to import when you import the package (importing the package would be akin to importing all modules in this list)

2. a list of symbols aliased in the package's namespace (aliased symbols can be accessed using the package name as a prefix and through selective imports)

Since the list of aliases is stored directly in the package file, reading the package file to get the names of each alias is enough to tell there is no conflict with the module name when you're trying to import it. No need to open the other modules the package refers to, or to resolve the symbols the aliases refer to: you only need the name of each alias to verify there is no conflict.

For instance, a package file could look like this:

	package std.algorithm;

	// modules to import when a file is importing the package
	// (those are not imported inside this package's namespace)
	module std.algorithm.sorting;
	module std.algorithm.mapping;
	...

	// symbols to alias to this package namespace
	// (importing a module with one of these names is an error)
	alias std.algorithm.sorting.completeSort  completeSort;
	alias std.algorithm.sorting.isSorted      isSorted;
	alias std.algorithm.sorting.partialSort   partialSort;
	alias std.algorithm.sorting.schwartzSort  schwartzSort;
	alias std.algorithm.sorting.sort          sort;
	...

The drawback is that it's hard to keep the list of fully-qualified names up to date if the symbols change in the various modules. But if the goal is only to preserve backward compatibility, creating that list of aliases could be a one-time thing. New symbols would continue to be imported when you import std.algorithm, but they'd not be available directly under std.algorithm: you'd need to use std.algorithm.sorting.newSort if you find yourself in a situation that requires a fully-qualified name.
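
The conflict check this scheme relies on needs nothing beyond the alias names listed in the package file. A rough sketch in D of that check follows; `PackageFile`, `canImportSubmodule`, and the field names are made up for illustration and are not part of any proposal or compiler.

```d
import std.algorithm.searching : canFind;
import std.stdio : writeln;

// Hypothetical in-memory view of a package file as described above.
struct PackageFile
{
    string name;         // e.g. "std.algorithm"
    string[] modules;    // modules imported along with the package
    string[] aliasNames; // names aliased into the package namespace
}

// Importing the submodule `leaf` of `pkg` is allowed only if no alias in
// the package file already uses that name; no other file has to be opened.
bool canImportSubmodule(PackageFile pkg, string leaf)
{
    return !pkg.aliasNames.canFind(leaf);
}

void main()
{
    auto pkg = PackageFile(
        "std.algorithm",
        ["std.algorithm.sorting", "std.algorithm.mapping"],
        ["completeSort", "isSorted", "partialSort", "schwartzSort", "sort"]);

    writeln(canImportSubmodule(pkg, "sorting")); // no alias is named "sorting"
    writeln(canImportSubmodule(pkg, "sort"));    // clashes with the alias "sort"
}
```

The point of the sketch is that the check is a lookup in a single list, so its cost does not grow with the size of the submodules.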


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

April 06, 2012
On 4/5/12 10:55 PM, Steven Schveighoffer wrote:
> On Fri, 30 Mar 2012 10:46:19 -0400, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>
>> Starting a new thread from one in announce:
>>
>> http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP16
>>
>> Please comment, after which Walter will approve. Walter's approval
>> means that he would approve a pull request implementing DIP16 (subject
>> to regular correctness checks).
>>
>>
>> Destroy!
>
> BTW, this case makes the part of DIP16 which wants to shortcut fully
> qualified names invalid, or at least costly (I posted this code in
> another part of the thread, but I thought I'd bring it up higher).
>
> The following is valid code today:
>
> a/b.d:
>
> module a.b;
>
> void foo() {}
> struct b
> {
> static void foo() {}
> }
>
> main.d:
> import a.b;
>
> void main()
> {
> a.b.foo();
> }
>
> If DIP16 were to be implemented, this becomes ambiguous. Is a.b.foo()
> the module function foo() from a.b, or is it a shortcut for a.b.b.foo()?

It's a shortcut to the module function foo().

If I replace main.d with:

void main() {
  foo();
}

Is foo() the module function foo() from a.b, or is it a shortcut for a.b.b.foo()? Here you have no doubts. What your mind does is: find a top-level symbol "foo" in all imported modules. If you find more than one, it is an error.

You must apply the same logic for "a.b.foo()". First you search "foo" in the "a.b" symbol. Here you find it: it's the top level function "foo" in "a.b". Then you stop searching.

However, if you can't find it in the module "a.b", you search for a top-level symbol "foo" in all modules that are in package "a.b". That's it. You don't search for "foo" in every possible nesting: just module nesting.
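
The two-step lookup described above can be modelled in a few lines of D. This is only a sketch of the rule as stated in this post, not compiler code: the `ModuleTable` alias, the `resolve` function, and the module contents are hypothetical, and it assumes a name such as `a.b` may denote both a module and a package, which is what the DIP would allow.

```d
import std.algorithm.searching : canFind, startsWith;
import std.stdio : writeln;

// Hypothetical model: module name -> names of its top-level symbols.
alias ModuleTable = string[][string];

// Returns every fully qualified symbol that `prefix.name` could denote.
// One result is a unique match, none means undefined, several means ambiguous.
string[] resolve(ModuleTable modules, string prefix, string name)
{
    // Step 1: if `prefix` is a module that declares `name`, stop searching.
    if (auto symbols = prefix in modules)
    {
        if ((*symbols).canFind(name))
            return [prefix ~ "." ~ name];
    }

    // Step 2: otherwise search the top-level symbols of the modules
    // that live inside the package `prefix` (module nesting only).
    string[] matches;
    foreach (modName, symbols; modules)
    {
        if (modName.startsWith(prefix ~ ".") && symbols.canFind(name))
            matches ~= modName ~ "." ~ name;
    }
    return matches;
}

void main()
{
    ModuleTable modules = [
        "a.b":   ["foo", "b"],
        "a.b.c": ["bar"],
        "a.b.d": ["bar", "baz"],
    ];

    writeln(resolve(modules, "a.b", "foo")); // found in module a.b itself
    writeln(resolve(modules, "a.b", "baz")); // found only in a.b.d
    writeln(resolve(modules, "a.b", "bar")); // two candidates: ambiguous
}
```

Note that the model never looks inside types, so a struct b with a static foo() plays no part in resolving a.b.foo().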

>
> The main issue is, because you can shortcut the FQN, and a chain of
> identifiers can have repeated identifiers in them, ambiguity is possible.

As I said before, it's not a shortcut of the FQN: it's just a shortcut for the module name.
April 06, 2012
On 05/04/2012 23:43, Andrei Alexandrescu wrote:
> On 4/5/12 4:26 PM, Steven Schveighoffer wrote:
>> On Thu, 05 Apr 2012 17:00:56 -0400, Jonathan M Davis
>> <jmdavisProg@gmx.com> wrote:
>>
>>> On Thursday, April 05, 2012 15:30:17 Steven Schveighoffer wrote:
>>
>>>> I don't see how. Just move the code into another module and publicly
>>>> import that module from std/algorithm.d. Problem pretty much solved.
>>>
>>> The issue is code organization. If you want to split up std.algorithm
>>> (or
>>> std.datetime or whatever) into multiple modules, you have to create a
>>> new
>>> package with a completely different name with no connection to the
>>> original
>>> save for the fact that the original publicly imports it.
>>
>> My view is that people will not import the smaller modules, they will
>> only ever import std.algorithm.
>
> I think we should be looking for a solution that not only allows
> replacing module -> package transparently, but also allows people to
> import the newly introduced fine-grained modules.
>
> Andrei
>

Why not limit name collisions to names that make sense?

For instance, in import std.a.b.c; the name refers to a module. If it also refers to a function, that reading makes no sense for an import, so even if there is a name collision, it isn't a big deal (except maybe for reflection?).

The same goes for std.a.b.c(); which is a function call and obviously not the module. Here is what I propose for resolving names:

1/ import always finds the corresponding .d file. No exceptions.
2/ Module a.b.c is in packages a, a.b and a.b.c. Any package declaration in a.b.c matches package a.b (one level is removed).
3/ When a name used in the code has to be resolved, the following process occurs:
 - The compiler finds everything that has this name.
 - The compiler discards every match that doesn't make sense in that context.
 - If all remaining items are overloads of the same item, then the standard best-match rule applies.
 - If the remaining items aren't all in the same module, or are overloads of different items, an error occurs. This is never a problem when a big module is split into submodules.

Some examples :

a.d

public import a.b;  // import a/b.d

class b {
    static void foo() {}
}

****************

a/b.d

public import a.b.c;  // import a/b/c.d

void foo(int i) {}

****************

a/b/c.d

void foo() {}

****************

main.d

import a;

foo();  // foo from a/b/c.d
foo(2);  // foo from a/b.d
a.foo();  // foo from a/b/c.d
a.b.foo();  // Error, match both a/b.d and a/b/c.d
a.b.foo(2);  // foo from a/b.d
a.b.c.foo();  // foo from a/b/c.d
April 06, 2012
On 06/04/2012 01:32, Michel Fortin wrote:
> On 2012-04-05 21:43:24 +0000, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> said:
>
>> I think we should be looking for a solution that not only allows
>> replacing module -> package transparently, but also allows people to
>> import the newly introduced fine-grained modules.
>
> I think it'd be valuable too. But how do you do that without creating
> ambiguous fully qualified names?
>

It isn't possible. But as already mentioned, not every name makes sense in every situation, so most of the time disambiguation can be done.

Plus, we want to be able to split modules when they grow, and in such a situation collisions will never happen, because all the symbols come from the same module in the first place.
April 06, 2012
On Thu, 05 Apr 2012 23:18:07 -0400, Ary Manzana <ary@esperanto.org.ar> wrote:

> On 4/5/12 10:55 PM, Steven Schveighoffer wrote:
>> On Fri, 30 Mar 2012 10:46:19 -0400, Andrei Alexandrescu
>> <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> Starting a new thread from one in announce:
>>>
>>> http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP16
>>>
>>> Please comment, after which Walter will approve. Walter's approval
>>> means that he would approve a pull request implementing DIP16 (subject
>>> to regular correctness checks).
>>>
>>>
>>> Destroy!
>>
>> BTW, this case makes the part of DIP16 which wants to shortcut fully
>> qualified names invalid, or at least costly (I posted this code in
>> another part of the thread, but I thought I'd bring it up higher).
>>
>> The following is valid code today:
>>
>> a/b.d:
>>
>> module a.b;
>>
>> void foo() {}
>> struct b
>> {
>> static void foo() {}
>> }
>>
>> main.d:
>> import a.b;
>>
>> void main()
>> {
>> a.b.foo();
>> }
>>
>> If DIP16 were to be implemented, this becomes ambiguous. Is a.b.foo()
>> the module function foo() from a.b, or is it a shortcut for a.b.b.foo()?
>
> It's a shortcut to the module function foo().
>
> If I replace main.d with:
>
> void main() {
>    foo();
> }
>
> Is foo() the module function foo() from a.b, or is it a shortcut for a.b.b.foo()? Here you have no doubts. What your mind does it: find a top level symbol "foo" in all imported modules. If you find more than one, error.

That's slightly different, because you must *always* qualify struct b's foo with a preceding b.

> You must apply the same logic for "a.b.foo()". First you search "foo" in the "a.b" symbol. Here you find it: it's the top level function "foo" in "a.b". Then you stop searching.
>
> However, if you can't find it in the module "a.b", you search a top level symbol "foo" in all modules that are in package "a.b". That's it. You don't search "foo" in every possible nesting: just module nesting.

What if a.b is a struct, and it's the only possible match?  We don't search for it?  At some point the struct b has to come into play.  Or are you saying we cannot shortcut struct FQNs?

I suppose if we prefer to match modules before types, then name lookup for fully qualified names only becomes ambiguous with packages allowed to have their own modules, so it shouldn't affect existing code.

>
>>
>> The main issue is, because you can shortcut the FQN, and a chain of
>> identifiers can have repeated identifiers in them, ambiguity is possible.
>
> As I said before, it's not a shortcut of the FQN: it's just a shortcut for the module name.

My example *was* a shortcut for the module name.  I did not imply that you could shortcut the other parts of the FQN.

What about hijacking though?  For example:

module a.b;

struct c
{
   static void foo() {}
}

People now use a.c.foo() to avoid having to type the whole thing.

But along comes someone who creates:

module a.c;
void foo() {}

Now, doesn't this usurp a.c.foo() without warning?

-Steve
April 06, 2012
On Thu, 05 Apr 2012 19:14:42 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Thursday, April 05, 2012 17:33:50 Steven Schveighoffer wrote:
>>
>> Why do we ever need to split modules for documentation? Just fix the doc
>> generator so it's not as monolithic. For instance, have one page per
>> class or struct.
>
> That may or may not be desirable (certainly in the case of smaller types, I'd
> argue that it isn't). By doing it on a module basis, you have far more
> control. But regardless, that would be a major change to ddoc.

ddoc's output leaves a lot to be desired.  The unorganized links at the top suck.  It lists symbols in module order instead of grouping them by category.

How is a doc page ever "too big"?  Even std.datetime loads in a second.  It's more that it's "too disorganized".

>> > And if all you care
>> > about is sub-modules for implementation and want all of the functions  
>> in
>> > the
>> > same module still, then this DIP is pointless. All you have to do is
>> > declare
>> > undocumented sub-modules which hold the various implementations and  
>> have
>> > the
>> > actual module call them. We already do this sort of thing in Phobos to
>> > get
>> > around static destructors screaming about circular dependencies.
>>
>> You are starting to see my point :) But I think the issue is not so much
>> that you are splitting the implementation, but splitting up the API into
>> related modules.
>
> As I understand it, the entire point of this DIP is to enable splitting up the
> API cleanly without breaking code. The implementation can already be split up
> seemlessly.

As can the API via public imports.

-Steve