April 20, 2014
On 4/20/14, 12:11 AM, Lars T. Kyllingstad wrote:
> The fact that "private" really means "module private" in D means that
> any number of functions can break when a class/struct implementation
> changes.

No, only those in that module. There's no change. -- Andrei
April 21, 2014
On Sunday, 20 April 2014 at 13:01:53 UTC, Gary Willoughby wrote:
> On Sunday, 20 April 2014 at 11:12:42 UTC, Lars T. Kyllingstad wrote:
>> However, in D, all functions defined in the same module as a class will have access to the private state of that class, on an equal footing with its member methods.  Therefore, the above statement doesn't really help in deciding which to use.
>
> Yeah it does. If the function can be used generically across many different parts of the program then it would be much better implemented as a non-member function, even if it's defined in the same module as an associated class.

I agree.  If a function is generally useful outside the context of a class, it should not be defined in the class.


> Functions that deal only with data associated with a particular class would be better implemented as methods of that class.

This is the tricky part, and it is where I have a hard time deciding which to use.  For example:

    struct File {
        private int fileno;
        void read(ubyte[] buf) {
            core.sys.posix.unistd.read(fileno, buf.ptr, buf.length);
        }
    }

Why, or when, is the above preferable to the following?

    struct File {
        private int fileno;
    }
    void read(File f, ubyte[] buf) {
        core.sys.posix.unistd.read(f.fileno, buf.ptr, buf.length);
    }

I still haven't heard any fact-based, logical arguments that advise me on which style to use, and so far it seems to be just that -- a matter of style.

There are a few clear-cut cases, such as when a function should be virtual, or when it is part of a predefined interface (e.g. input range), but in the general case one seems just as "encapsulated" as the other.
April 21, 2014
On Sunday, 20 April 2014 at 20:36:58 UTC, Andrei Alexandrescu wrote:
> On 4/20/14, 12:11 AM, Lars T. Kyllingstad wrote:
>> The fact that "private" really means "module private" in D means that
>> any number of functions can break when a class/struct implementation
>> changes.
>
> No, only those in that module. There's no change. -- Andrei

Ok, so "any number" was poorly phrased.  What I meant was "a large number", because in my experience, modules tend to be quite large.  Specifically, they are rarely limited to containing just a single class.  They often contain multiple classes, along with most related functionality.  In principle, changing the implementation of one class can break the implementation of another class!  Now, you may argue that kitchen sink modules are poor programming style, but it seems to be a common style, with Phobos being a very prominent example. :)
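
A minimal sketch of that last point, using hypothetical Account and Auditor classes that are assumed to live in the same module.  Auditor reaches into Account's private field, which only compiles because D's "private" is module-wide; renaming or retyping that field would break Auditor even though Account's public interface is unchanged.

```d
// Both classes are assumed to be declared in one module.
class Account
{
    private long cents;   // implementation detail of Account

    this(long cents) { this.cents = cents; }
}

class Auditor
{
    // Legal only because Auditor shares a module with Account.
    // If Account renames or retypes `cents`, this stops compiling.
    long peek(Account a) { return a.cents; }
}

unittest
{
    auto auditor = new Auditor;
    assert(auditor.peek(new Account(250)) == 250);
}
```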

I often wish "private" meant class private in D.  I know, the usual argument against this is that someone who writes a module usually has full control of that module, but again, Phobos is an example of the contrary.  Each module has at least a dozen authors, even if they aren't all listed in the documentation.

I also know it's never going to happen due to the amount of code breakage it would cause.  But maybe we could extend the syntax a bit?  E.g. "private(class)" or "class private"?
April 21, 2014
On Monday, 21 April 2014 at 08:33:21 UTC, Lars T. Kyllingstad wrote:
> This is the tricky part, and it is where I have a hard time deciding which to use.  For example:
>
>     struct File {
>         private int fileno;
>         void read(ubyte[] buf) {
>             core.sys.posix.unistd.read(fileno, buf.ptr, buf.length);
>         }
>     }
>
> Why, or when, is the above preferable to the following?
>
>     struct File {
>         private int fileno;
>     }
>     void read(File f, ubyte[] buf) {
>         core.sys.posix.unistd.read(f.fileno, buf.ptr, buf.length);
>     }
>
> I still haven't heard any fact-based, logical arguments that advise me on which style to use, and so far it seems to be just that -- a matter of style.

In this case I would say go with the method and not a non-member function. There are two reasons for this. First, the method uses private state, which is itself a good indicator. Second, 'read' is an action of 'File'. This is encapsulation in action, because you are defining logical actions to be performed on the state of a class. It's a well-formed unit with associated behaviours.

It is confusing to think, 'If this were a non-member function in the same module, I could also access the private state.' Yes, that's true, but just because you *can* access it doesn't mean you should! In fact, you should have a very well-defined reason for doing so.

Classes should nearly always be nouns and methods nearly always verbs. Nearly always, because there are always exceptions: for example, I like to name methods that return a bool starting with 'is', e.g. isOpen, isRunning, etc. The rule is to follow encapsulation[1] as much as possible when designing classes.

I found that accessing private state in a module is useful when *initialising* said state while constructing objects. In this case the module can act like a very helpful factory. In a recent project I had a rigid class design for an application and its child windows. Other windows should be able to be opened with their ids generated automatically and internally. These ids are only available as read-only properties. There was one window in particular that I needed to create with a specific, non-generated id. I couldn't pass this in through the constructor, as I don't want anyone to control the ids, but *I* needed to do so this one time. This case fitted well into the D module design: it allowed me to create a window, override its private id and move on, knowing that it could not be tampered with by anything else. The design turned out to be very clean, without the baggage of an unnecessary setter method or constructor parameter.
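
A rough sketch of that factory idea, not the actual project code. The names Window, id_, nextId and makeWindowWithId are made up for illustration, and everything is assumed to sit in a single module so that the factory can touch the private field.

```d
class Window
{
    private static int nextId = 1;  // hypothetical id generator
    private int id_;

    this() { id_ = nextId++; }

    // Read-only property: there is deliberately no setter.
    @property int id() const { return id_; }
}

// Module-level factory. "private" is module-private in D, so this
// function may overwrite id_ directly; code outside the module cannot.
Window makeWindowWithId(int fixedId)
{
    auto w = new Window;   // still consumes one generated id
    w.id_ = fixedId;
    return w;
}

unittest
{
    auto first = new Window;
    auto second = new Window;
    assert(second.id == first.id + 1);

    auto special = makeWindowWithId(1000);
    assert(special.id == 1000);
}
```

Code in other modules sees only the constructor and the read-only id property, so the override stays an internal detail of the module.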

I used to be of the ilk that thought all programming could be done using only classes and designing everything in a very strict OOP way. D has broken that way of thinking in me, purely because of things like UFCS and a module's private access to members. Now I understand that you can actually achieve a cleaner design by moving towards these things instead of having everything as a class. First and foremost you must try to achieve a good OOP design; this is essential. Then use UFCS and module-private access to keep things clean and simple.

Keep things logical, simple and straightforward.

[1]: http://en.wikipedia.org/wiki/Encapsulation_(object-oriented_programming)
April 21, 2014
On Sun, 20 Apr 2014 03:11:39 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:

> So, can anyone think of some good guidelines for when to make a function a member, when to write it as a free function in the same module, and when to move it to a different module?

First, you rightly destroy the main reason for making a module-level function vs. a method for D -- module-level functions have access to the private data. Therefore, the motivation to make a module-level function is significantly diminished. In my opinion, I would say that you should always first try making them methods, but under certain cases, you should make them functions.

Reasons off the top of my head not to make them module functions:

1. You can import individual symbols from modules. i.e.:

import mymodule: MyType;

If a large portion of your API is module-level functions, this means you have to either import the whole module, or the individual methods you plan to use.

2. You can get delegates to methods. You cannot get delegates to module functions, even if they are UFCS compatible.

3. There is zero chance of a conflict with another type's similarly named method.

4. It enforces the "method call" syntax. I.e. you cannot use foo(obj) call. This may be important for readability.

5. You can only use operator overloads via methods. D is different in this respect from C++.

6. The documentation will be grouped with the object itself. This becomes even more critical with the new doc layout which has one page per type.
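
To illustrate points 2 and 5, here is a small sketch with a hypothetical Counter struct. It shows a delegate bound to a method, a free function that is only reachable through call syntax, and an operator overload written as a member.

```d
struct Counter
{
    int n;

    // Member function: a delegate can be bound to a specific instance.
    void increment() { ++n; }

    // Operator overloads have to be members in D.
    Counter opBinary(string op : "+")(Counter rhs) const
    {
        return Counter(n + rhs.n);
    }
}

// Free function, callable as c.reset() through UFCS.
void reset(ref Counter c) { c.n = 0; }

unittest
{
    auto c = Counter(5);

    void delegate() dg = &c.increment;  // context pointer refers to c
    dg();
    assert(c.n == 6);

    c.reset();                          // UFCS call syntax works
    assert(c.n == 0);

    // A plain function pointer is available, but it carries no instance.
    void function(ref Counter) fp = &reset;
    fp(c);

    assert((Counter(1) + Counter(2)).n == 3);
}
```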

Reasons to make them module functions:

1. You have more than one object in the same file which implements the method identically via duck typing.

2. You want to change how the 'this' type is passed -- in other words, you want to pass a struct by value or by pointer instead of by ref.

3. The complement to #1 in the 'against' list -- you want your module-level API to be selectively enabled!

4. Of course, if you are actually implementing in a different module, Scott Meyers' reasoning applies there.
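
Point 2 could look something like the following sketch, where Point, moveBy and movedBy are hypothetical names. The method receives the instance by reference, while the free function takes a copy and leaves the caller's value untouched.

```d
struct Point
{
    int x, y;

    // Method: `this` is passed by reference, so the caller's value changes.
    void moveBy(int dx, int dy) { x += dx; y += dy; }
}

// Free function: the struct is passed by value, so it works on a copy.
Point movedBy(Point p, int dx, int dy)
{
    p.x += dx;
    p.y += dy;
    return p;
}

unittest
{
    auto a = Point(1, 2);

    auto b = a.movedBy(10, 10);   // UFCS; a is copied
    assert(a == Point(1, 2));
    assert(b == Point(11, 12));

    a.moveBy(1, 1);               // mutates a in place
    assert(a == Point(2, 3));
}
```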

-Steve
April 21, 2014
On Monday, 21 April 2014 at 08:33:21 UTC, Lars T. Kyllingstad wrote:
> On Sunday, 20 April 2014 at 13:01:53 UTC, Gary Willoughby wrote:
>> Yeah it does. If the function can be used generically across many different parts of the program then it would be much better implemented as a non-member function, even if it's defined in the same module as an associated class.
>
> I agree.  If a function is generally useful outside the context of a class, it should not be defined in the class.

I think this view is too simple. Even if a function is generally useful you risk ending up with maintenance problems later on when you need to optimize your code. So if in doubt, make it local.

Unless you write libraries, the primary goal of encapsulation is not reuse, but being able to evolve, modify, refactor and optimize. So having a local wrapper on top of a generic function, or just making the function local until you need it somewhere else, is quite acceptable IMO. (But whether it is inside the class or not is mostly syntactical?)

I've one time too many done too-early refactoring under the assumption that it would lead to better, reusable code. It seldom does. It often leads to a wasted design effort, less intuitive function names and more fragmented code that is harder to understand later on.

Ola.
April 21, 2014
On Monday, 21 April 2014 at 12:45:12 UTC, Steven Schveighoffer wrote:
> [...]
>
> Reasons off the top of my head not to make them module functions:
>
> 1. You can import individual symbols from modules. i.e.:
>
> import mymodule: MyType;
>
> If a large portion of your API is module-level functions, this means you have to either import the whole module, or the individual methods you plan to use.

Based on this, combined with your points 6 and 3 further down -- the second number 3, that is :) -- we can make the following guideline:

Methods which are central to the class' usage, and which are therefore likely to be used often, should be member functions, while auxiliary functions and convenience functions should be non-members.

The same thing was stated earlier in this thread, in different words, and I guess it is the rule most of us use already.  However, this is the first non-subjective rationale I've seen for it so far.  Awesome!

> 2. You can get delegates to methods. You cannot get delegates to module functions, even if they are UFCS compatible.

This is an excellent point.  I would never have thought of that.

> 3. There is zero chance of a conflict with another type's similarly named method.

How?  If you have the following functions:

    void foo(A a);
    void foo(B b);

and you write

    foo(new B);

there is also zero chance of conflict -- even if B happens to be a subclass of A, since the most specialised function is always called.
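
Spelled out as a compilable sketch (the string return values are only there so the result can be checked, and both overloads are assumed to be visible in the same scope):

```d
class A {}
class B : A {}

string foo(A a) { return "foo(A)"; }
string foo(B b) { return "foo(B)"; }

unittest
{
    // Exact match beats the implicit conversion from B to A.
    assert(foo(new B) == "foo(B)");
    assert(foo(new A) == "foo(A)");

    // Overloads are chosen by static type, not by the runtime class.
    A viaBase = new B;
    assert(foo(viaBase) == "foo(A)");
}
```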

> 4. It enforces the "method call" syntax. I.e. you cannot use foo(obj) call. This may be important for readability.

Some would argue that giving users the choice between typing foo(obj) and obj.foo() is a Good Thing, because it doesn't impose your preferences on them.  I'm not going to do that, though. ;)

> 5. You can only use operator overloads via methods. D is different in this respect from C++.

True. Operator overloads fall in the same category as virtuals and interface functions, i.e., the ones that *cannot* be non-members.

> [...]
>
> Reasons to make them module functions:
>
> 1. You have more than one object in the same file which implements the method identically via duck typing.
>
> 2. You want to change how the 'this' type is passed -- in other words, you want to pass a struct by value or by pointer instead of by ref.
>
> 3. The complement to #1 in the 'against' list -- you want your module-level API to be selectively enabled!
>
> 4. Of course, if you are actually implementing in a different module, Scott Meyers' reasoning applies there.

All very good points.  This is exactly what I was looking for, thanks!
April 21, 2014
On Monday, 21 April 2014 at 13:03:50 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 21 April 2014 at 08:33:21 UTC, Lars T. Kyllingstad wrote:
>> On Sunday, 20 April 2014 at 13:01:53 UTC, Gary Willoughby wrote:
>>> Yeah it does. If the function can be used generically across many different parts of the program then it would be much better implemented as a non-member function, even if it's defined in the same module as an associated class.
>>
>> I agree.  If a function is generally useful outside the context of a class, it should not be defined in the class.
>
> I think this view is too simple. Even if a function is generally useful you risk ending up with maintenance problems later on when you need to optimize your code. So if in doubt, make it local.

I agree, but I think that's more a question of *when* a function is considered "generally useful".  To me, that is when I have an actual use case for it beyond the one for which it was originally designed, and not just because I think it might come in handy some time in the future.  "If in doubt, make it private" is always a good guideline.
April 21, 2014
On 4/21/14, Steven Schveighoffer via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> Reasons off the top of my head not to make them module functions

Here's another one, the bug report is about enums but it showcases an issue with module-scoped functions taking struct parameters (in short: function hijacking protection makes defining module-scoped functions problematic):

https://issues.dlang.org/show_bug.cgi?id=10846
April 21, 2014
On Mon, 21 Apr 2014 09:46:18 -0400, Lars T. Kyllingstad <public@kyllingen.net> wrote:

> On Monday, 21 April 2014 at 12:45:12 UTC, Steven Schveighoffer wrote:
>> 3. There is zero chance of a conflict with another type's similarly named method.
>
> How?  If you have the following functions:
>
>      void foo(A a);
>      void foo(B b);
>
> and you write
>
>      foo(new B);
>
> there is also zero chance of conflict -- even if B happens to be a subclass of A, since the most specialised function is always called.

Demonstration:

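// --- file m1.d ---
module m1;
import std.stdio;

class C {}

void foo(C c)
{
    writeln("C.foo");
}

void bar(C c)
{
    writeln("C.bar");
}

// --- file m2.d ---
module m2;
import m1;
import std.stdio;

// Local template: takes precedence over the imported m1.foo.
void foo(T)(T t)
{
    writeln("m2.foo");
}

// Local template with a different arity: hides the imported m1.bar.
void bar(T)(T t, int x)
{
    writeln("m2.bar");
}

void main()
{
    auto c = new C;
    c.foo(); // prints "m2.foo"
    //c.bar(); // error if uncommented!
}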

Basically, I've inadvertently overridden C.foo, without intending to. With bar, I've somehow hidden the inherent functionality of C!

>
>> 4. It enforces the "method call" syntax. I.e. you cannot use foo(obj) call. This may be important for readability.
>
> Some would argue that giving users the choice between typing foo(obj) and obj.foo() is a Good Thing, because it doesn't impose your preferences on them.  I'm not going to do that, though. ;)

You may recall that I am a big proponent of explicit properties, because I think the way a function is called has strong implications for the reader, regardless of the function. This is the same thing. I look at foo(x) much differently than x.foo().

It's the same (though not quite as important) as choosing a good name for a function.

-Steve