November 04, 2012
On 03/11/2012 21:29, bearophile wrote:
> Faux Amis:
>
>> Care to elaborate on that?
>
> They share most of the problems of global variables. While not evil,
> it's better to avoid module-level mutables. This makes the code more
> testable, simpler to understand, less bug prone, and makes functions
> more usable for other purposes. In D the attribute "pure" is
> present also to enforce such better coding style.
>
> Bye,
> bearophile

From a good-coding standpoint, do you think there is a difference between these two options?

--
module a;

int a;
--
module b;

struct S{//otherwise unused wrapper
static int b;
}
--
November 04, 2012
On Sunday, 4 November 2012 at 14:59:24 UTC, Faux Amis wrote:
> I failed to mention that I am mostly talking about private module scope variables. I don't see how private module scoped vars make for less testable, readable or more bug prone code.

It's not like I feel that you should never use them, but what he says is right. The more open access that is given to a variable, the more difficult it is for the programmer to know what will access or change it. That becomes a huge problem in many languages that don't use thread-local variables by default, but it's still a problem in D.

Without using some sort of automated search, you can't know where in the module a variable is accessed or changed. Sometimes even a search will be insufficient:
---
module a;
int b; // the variable we're interested in
int c;

void blah() {
   foo();
   bar();
}

void foo() {
   c = b + 1;
}

void bar() {
   b *= 3;
}
---

So, via search, we can see that b is being accessed in foo, and being changed in bar. However, a simple search will not tell us that blah is accessing and changing b. Furthermore, any function that uses blah will also be accessing and changing b. The reason this is a problem is that it's "hidden" from the people reading the code. You might not know that using "blah" will write and read from b, which might affect the behavior of the call to "bax" later on in your code.
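
To make the hidden dependency concrete, here is a minimal runnable sketch of the same module. The starting value of 5, the `main` function and the asserts are assumptions added for illustration; they are not part of the example above.

```d
int b = 5; // the variable we're interested in (initial value assumed)
int c;

void blah() {
    foo();
    bar();
}

void foo() {
    c = b + 1; // reads b
}

void bar() {
    b *= 3; // writes b
}

void main() {
    blah();          // nothing at this call site mentions b or c
    assert(c == 6);  // foo read b (5) and stored 5 + 1
    assert(b == 15); // bar tripled b
    blah();          // the same call now behaves differently
    assert(c == 16);
    assert(b == 45);
}
```

Two identical calls to `blah()` give different results, and nothing in the call itself hints at why.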

Consider this code, however:

---
void main() {
    int b = 5; // the variable we're interested in
    int c = blah(b);
    // What's the value of b here? How about c?
    bax(b);
    flax();
    bongo();
    foreach(d; sheep(c))
       blast(d);
    // Do you still know what b is? Of course you do!
}

int blah(ref int b) {
    int c = foo(b);
    bar(b);
    return c;
}

int foo(int b) {
    return b + 1;
}

void bar(ref int b) {
   b *= 3;
}

void bax(int b) {
   // large switch statement on b for behaviors
}
---

Now we're explicit in what uses b. All of a sudden, we can reason much more about the code without doing nearly as much searching. We know who depends on the value of b and who might change b. If we allowed it to be a global variable, or a module variable, or even a static struct variable, we might not always have such a good grasp on what is happening to it.
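
A trimmed, runnable sketch of the explicit version, for anyone who wants to check the values. It leaves out flax, bongo, sheep and blast because their bodies are never shown, gives foo an int return type since it returns a value, and adds asserts purely for illustration.

```d
int blah(ref int b) {
    int c = foo(b); // by value: foo can read b but not change it
    bar(b);         // by ref: bar may change b
    return c;
}

int foo(int b) {
    return b + 1;
}

void bar(ref int b) {
    b *= 3;
}

void bax(int b) {
    // takes b by value, so the caller's b cannot change here
}

void main() {
    int b = 5; // the variable we're interested in
    int c = blah(b);
    assert(c == 6);  // foo saw the original value 5
    assert(b == 15); // bar tripled it through the ref parameter
    bax(b);
    assert(b == 15); // by-value call: b is guaranteed unchanged
}
```

Every signature states whether the function can change b, so the answers can be read off without opening the function bodies.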

And if you don't have a good grasp on what's happening to the state of your program, you might introduce bugs. Hence, it's bug prone to use module scoped variables.

As for testability: if the behavior of your code depends on globals (and/or module-scoped variables), then it should be obvious why it's more difficult to test. Tests shouldn't be affected by tests run before, nor should they have an effect on tests run after. When you use globals, your code will violate that by definition. Unless, of course, you spend time being very careful to reset globals before and/or after each test. That certainly makes testing more difficult and error prone, though.
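
A small sketch of what that looks like with D's built-in unittest blocks. The names `counter`, `nextId` and `nextIdFrom` are hypothetical, chosen only for this illustration.

```d
int counter; // hypothetical module-scoped state

// Depends on hidden state: the result changes with every call.
int nextId() {
    return ++counter;
}

// Same logic, but the state is handed in by the caller.
int nextIdFrom(ref int state) pure {
    return ++state;
}

unittest {
    // Passes only while counter is still 0, i.e. if nothing
    // has called nextId() before this test runs.
    assert(nextId() == 1);
}

unittest {
    // Without this reset, the outcome depends on earlier tests.
    counter = 0;
    assert(nextId() == 1);
}

unittest {
    // No reset needed: the test owns its state.
    int state = 0;
    assert(nextIdFrom(state) == 1);
    assert(state == 1);
}

void main() {}
```

Compiling with `dmd -unittest` runs the tests before `main`. Remove the reset in the second test and it fails, even though the function under test hasn't changed.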


Now, that all said, it's not like a velociraptor will jump through your window and eat your face off if you use a global or a module scoped variable. I'm sure you can come up with examples of where it might be beneficial or preferable. But it's something that should be avoided where you can.

Here's a good resource on why global variables are bad: http://c2.com/cgi/wiki?GlobalVariablesAreBad

Most of those reasons are applicable to module-scoped variables as well. Even struct static variables can still be problematic for some of the same reasons.
November 04, 2012
On 04/11/2012 17:05, Chris Cain wrote:
> On Sunday, 4 November 2012 at 14:59:24 UTC, Faux Amis wrote:
>> I failed to mention that I am mostly talking about private module
>> scope variables. I don't see how private module scoped vars make for
>> less testable, readable or more bug prone code.
>
> It's not like I feel that you should never use them, but what he says is
> right. The more open access that is given to a variable, the more
> difficult it is for the programmer to know what will access or change
> it. That becomes a huge problem in many languages that don't use
> thread-local variables by default, but it's still a problem in D.
>
> Without using some sort of automated search, you can't know where in the
> module a variable is accessed or changed. Sometimes even a search will
> be insufficient:
> ---
> module a;
> int b; // the variable we're interested in
> int c;
>
> void blah() {
>     foo();
>     bar();
> }
>
> void foo() {
>     c = b + 1;
> }
>
> void bar() {
>     b *= 3;
> }
> ---
>
> So, via search, we can see that b is being accessed in foo, and being
> changed in bar. However, a simple search will not tell us that blah is
> accessing and changing b. Furthermore, any function that uses blah will
> also be accessing and changing b. The reason this is a problem is that
> it's "hidden" from the people reading the code. You might not know that
> using "blah" will write and read from b, which might affect the behavior
> of the call to "bax" later on in your code.
>
> Consider this code, however:
>
> ---
> void main() {
>      int b = 5; // the variable we're interested in
>      int c = blah(b);
>      // What's the value of b here? How about c?
>      bax(b);
>      flax();
>      bongo();
>      foreach(d; sheep(c))
>         blast(d);
>      // Do you still know what b is? Of course you do!
> }
>
> int blah(ref int b) {
>      int c = foo(b);
>      bar(b);
>      return c;
> }
>
> int foo(int b) {
>      return b + 1;
> }
>
> void bar(ref int b) {
>     b *= 3;
> }
>
> void bax(int b) {
>     // large switch statement on b for behaviors
> }
> ---
>
> Now we're explicit in what uses b. All of a sudden, we can reason much
> more about the code without doing nearly as much searching. We know who
> depends on the value of b and who might change b. If we allowed it to be
> a global variable, or a module variable, or even a static struct
> variable, we might not always have such a good grasp on what is
> happening to it.
>
> And if you don't have a good grasp on what's happening to the state of
> your program, you might introduce bugs. Hence, it's bug prone to use
> module scoped variables.
>
> As for testability: if the behavior of your code depends on globals
> (and/or module-scoped variables), then it should be obvious why it's
> more difficult to test. Tests shouldn't be affected by tests run before
> nor should they have an effect on tests run after. When you use globals,
> your code will violate that by definition. Unless, of course, you spend
> time being very careful to reset globals before and/or after each test.
> That certainly makes testing more difficult and error prone, though.
>
>
> Now, that all said, it's not like a velociraptor will jump through your
> window and eat your face off if you use a global or a module scoped
> variable. I'm sure you can come up with examples of where it might be
> beneficial or preferable. But it's something that should be avoided
> where you can.
>
> Here's a good resource on why global variables are bad:
> http://c2.com/cgi/wiki?GlobalVariablesAreBad
>
> Most of those reasons are applicable to module-scoped variables as well.
> Even struct static variables can still be problematic for some of the
> same reasons.

In your last paragraph you are getting to my point in my other post:
I think there is nothing wrong with a module scope private var as in D a module is the first encapsulation and adding a wrapper only adds noise.

These are equivalent (from a good-coding pov):
---
module a;

private int i;
---
module b;

// annoying wrapper
//which makes it difficult to have a single b in the program
struct S{
 int i;
}
---

These are also equivalent:
---
module a;

int i;
---
module b;

struct S{
 static int i;
}
---

November 04, 2012
Faux Amis:

> I think there is nothing wrong with a module scope private var as in D a module is the first encapsulation and adding a wrapper only adds noise.

Generally it's better to minimize the scope of variables. So if you wrap a variable inside a struct you have often reduced its scope, unless the module contains only one variable and only one struct :-)

Often it's better to pass variables as arguments.

Again, in D "pure" is your good friend.
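
A minimal sketch of what "pure" buys you here. The variable `total` and both functions are hypothetical names used only for illustration.

```d
int total; // hypothetical mutable module-level variable

// Pure: the result depends only on the arguments.
int add(int x, int y) pure {
    // total += x; // rejected by the compiler: pure code may not
    //             // touch mutable module-level data
    return x + y;
}

// Not pure: silently depends on, and changes, module-level state.
int addToTotal(int x) {
    total += x;
    return total;
}

void main() {
    assert(add(2, 3) == 5);
    assert(add(2, 3) == 5);     // same arguments, same result
    assert(addToTotal(2) == 2);
    assert(addToTotal(2) == 4); // same argument, different result
}
```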

Bye,
bearophile
November 05, 2012
On 05/11/2012 00:58, bearophile wrote:
> Faux Amis:
>
>> I think there is nothing wrong with a module scope private var as in D
>> a module is the first encapsulation and adding a wrapper only adds noise.
>
> Generally it's better to minimize the scope of variables. So if you wrap
> a variable inside a struct you have often reduced its scope, unless the
> module contains only one variable and only one struct :-)
>
> Often it's better to pass variables as arguments.
>
> Again, in D "pure" is your good friend.
>
> Bye,
> bearophile

Is there any reason to encapsulate this kind of code in a struct?
---
module a;

private int _a;

int a(){
  return _a;
}

void a(int aIn){
  _a = aIn;
}

void useA(){
}
---
module main;

static import a;

void main(){
  a.useA();
}
---
November 05, 2012
On Sunday, 4 November 2012 at 23:51:15 UTC, Faux Amis wrote:
> In your last paragraph you are getting to my point in my other post:
> I think there is nothing wrong with a module scope private var as in D a module is the first encapsulation and adding a wrapper only adds noise.
>
> These are equivalent (from a good-coding pov):
> ---
> module a;
>
> private int i;
> ---
> module b;
>
> // annoying wrapper
> //which makes it difficult to have a single b in the program
> struct S{
>  int i;
> }
> ---
>
> These are also equivalent:
> ---
> module a;
>
> int i;
> ---
> module b;
>
> struct S{
>  static int i;
> }
> ---

Like I said, I don't think it's something that should be banned outright, but I've seen many examples of code where someone didn't take the time to be explicit about the dependencies in their code and instead chose to "hide" it like you propose is appropriate. Trust me, as the person who has to maintain and extend their code, **I don't appreciate their laziness**.

If you can, you should avoid such things. Module-scoped variables mitigate the problems, but they don't eliminate them.

Also, bearophile's suggestion to try to make everything "pure" is a good idea. I write as much code as I can using immutable and pure and only relax those restrictions when the cost is far too great to maintain that. When I come back to the code later, I find it significantly easier to figure out for many of the reasons I outlined in my previous post.

---
void bar(ref int b) {
    //code using b
}
---

Is generally better than

---
int b;
void bar() {
    // code using b
}
---

Even if b is only private module scoped. I've outlined the reasons why in my previous post.

On Sunday, 4 November 2012 at 23:58:08 UTC, bearophile wrote:
> Generally it's better to minimize the scope of variables.

Quoted for truth.

Heck, I've seen examples of code where instance variables were too liberally applied, like so:

---
struct S {
   int b;
   void bar() {
      // b is initialized and used in this scope, but nowhere else
   }

   // more code where b isn't used
}
---

is worse than:

---
struct S {
   void bar() {
       int b = ...;
       // use b
   }

   // more code
}
---

Now, maybe b used to be used elsewhere, (I can't say for sure...) but sometimes I wonder why people are so willing to let things like that leak out of the scope where it's used... it makes the next person's job much harder for no reason.
November 05, 2012
Chris Cain:

>> Generally it's better to minimize the scope of variables.
>
> Quoted for truth.

:-)
I was mostly quoting from this blog post, that shows the point is not just scope:

http://blog.knatten.org/2011/11/11/disempower-every-variable/

It's a simple couple of rules, useful in all languages, even functional ones.

Bye,
bearophile
November 05, 2012
On 05/11/2012 01:45, Chris Cain wrote:
> On Sunday, 4 November 2012 at 23:51:15 UTC, Faux Amis wrote:
>> In your last paragraph you are getting to my point in my other post:
>> I think there is nothing wrong with a module scope private var as in D
>> a module is the first encapsulation and adding a wrapper only adds noise.
>>
>> These are equivalent (from a good-coding pov):
>> ---
>> module a;
>>
>> private int i;
>> ---
>> module b;
>>
>> // annoying wrapper
>> //which makes it difficult to have a single b in the program
>> struct S{
>>  int i;
>> }
>> ---
>>
>> These are also equivalent:
>> ---
>> module a;
>>
>> int i;
>> ---
>> module b;
>>
>> struct S{
>>  static int i;
>> }
>> ---
>
> Like I said, I don't think it's something that should be banned
> outright, but I've seen many examples of code where someone didn't take
> the time to be explicit about the dependencies in their code and instead
> chose to "hide" it like you propose is appropriate. Trust me, as the
> person who has to maintain and extend their code, **I don't appreciate
> their laziness**.
>
> If you can, you should avoid such things. Module-scoped variables
> mitigate the problems, but they don't eliminate them.
>
> Also, bearophile's suggestion to try to make everything "pure" is a good
> idea. I write as much code as I can using immutable and pure and only
> relax those restrictions when the cost is far too great to maintain
> that. When I come back to the code later, I find it significantly easier
> to figure out for many of the reasons I outlined in my previous post.
>
> ---
> void bar(ref int b) {
>      //code using b
> }
> ---
>
> Is generally better than
>
> ---
> int b;
> void bar() {
>      // code using b
> }
> ---
>
> Even if b is only private module scoped. I've outlined the reasons why
> in my previous post.
>
> On Sunday, 4 November 2012 at 23:58:08 UTC, bearophile wrote:
>> Generally it's better to minimize the scope of variables.
>
> Quoted for truth.
>
> Heck, I've seen examples of code where instance variables were too
> liberally applied, like so:
>
> ---
> struct S {
>     int b;
>     void bar() {
>        // b is initialized and used in this scope, but nowhere else
>     }
>
>     // more code where b isn't used
> }
> ---
>
> is worse than:
>
> ---
> struct S {
>     void bar() {
>         int b = ...;
>         // use b
>     }
>
>     // more code
> }
> ---
>
> Now, maybe b used to be used elsewhere, (I can't say for sure...) but
> sometimes I wonder why people are so willing to let things like that
> leak out of the scope where it's used... it makes the next person's job
> much harder for no reason.

Ok, good to see that you are replying to incorrectly scoped variables, but this is not the point I am trying to make. I know you should always keep the scope as small as possible.

Can you think of a setting in which we have legitimate private struct members? If so, then add to this setting that you only want one instantiation of the data in this struct. As a solution, what is wrong with dropping the struct encapsulation? There is no other code except the struct in the module.

I sincerely want to know if there is any difference. As I understand it the scope is exactly the same or even smaller as you can't leak instances of modules as you can structs.


November 06, 2012
On Monday, 5 November 2012 at 08:37:49 UTC, Faux Amis wrote:
> Ok, good to see that you are replying to incorrectly scoped variables, but this is not the point I am trying to make. I know you should always keep the scope as small as possible.

Eh? I'm confused. The second half of my post certainly was a bit of a rant on incorrectly scoped variables (which is related to the discussion, but it was my response to bearophile), but the first part of my post is supporting the viewpoint that you should avoid using module-scoped variables (and even static struct member variables) and suggesting an alternative.

> Can you think of a setting in which we have legitimate private struct members? If so, then add to this setting that you only want one instantiation of the data in this struct. As a solution, what is wrong with dropping the struct encapsulation? There is no other code except the struct in the module.
>
> I sincerely want to know if there is any difference. As I understand it the scope is exactly the same or even smaller as you can't leak instances of modules as you can structs.

From my understanding, you're trying to get a specific viewpoint on this idea?

> From a good-coding standpoint, do you think there is a difference between these two options?
>
> --
> module a;
>
> int a;
> --
> module b;
>
> struct S{//otherwise unused wrapper
> static int b;
> }
> --

I think the first option is "better" than the second. The second seems to be a misuse of struct to me. I can't see why you'd use a struct in the second option.

That said, they're effectively equivalent pieces of code. That is, they're both static data that is a shared resource among all functions that have access to them. And, thus, they're both "equally bad" in terms of how they will affect the understandability and testability of the code. It's possible that it would cause the code to have more bugs in it than it would otherwise.
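
A quick sketch of why I call them equivalent, with both options folded into one file for brevity. The `touchBoth` function and the asserts are additions made for illustration.

```d
int a; // option 1: module-scoped variable

struct S { // option 2: otherwise unused wrapper
    static int b; // one copy, not tied to any instance
}

// Any function in scope can reach either one without
// mentioning it in its signature.
void touchBoth() {
    a += 1;
    S.b += 1;
}

void main() {
    S first, second;
    touchBoth();
    assert(a == 1);
    assert(S.b == 1);
    // Every instance sees the same static b.
    assert(first.b == 1 && second.b == 1);
    touchBoth();
    assert(a == 2 && S.b == 2);
}
```

In both cases the state is shared by everything that can name it; the struct only changes how the name is spelled.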

That is, neither of them is as good a choice as this:
---
void good(ref int b) pure {
    // code using/setting b
}
void good2(int b) pure {
    // code using b, but not setting it
}

// No b in module scope or statically allocated

// Sometimes a better idea, depending on circumstances:
int better(int b) pure {
   int changedB = b; // work on a local copy; the caller's b is untouched
   // code using b, and putting changes into changedB
   return changedB;
}

// example uses:
void main() {
    int a = 1, b = 2, c = 3;
    good(a);
    good2(b);
    c = better(c);
}
---

Because this creates code that is honest about its dependencies and allows for the overall state of the program to be consistent between runs of functions. This is essential for testability, but it's also important for a programmer to reason about the behavior of their code.

Of course, I'm sure you can give examples of code that couldn't be written like that, and that's okay. I'm only arguing that you should avoid static data when it's realistic to do so, not that it will open a black hole in your living room if you use it under any circumstances :-). Though if you're using it as your "primary data encapsulation," I have to wonder whether you're using it in instances it could have been avoided.
November 06, 2012
On 06/11/2012 07:46, Chris Cain wrote:
> On Monday, 5 November 2012 at 08:37:49 UTC, Faux Amis wrote:
>> Ok, good to see that you are replying to incorrectly scoped variables,
>> but this is not the point I am trying to make. I know you should
>> always keep the scope as small as possible.
>
> Eh? I'm confused. The second half of my post certainly was a bit of a
> rant on incorrectly scoped variables (which is related to the
> discussion, but it was my response to bearophile), but the first part of
> my post is supporting the viewpoint that you should avoid using
> module-scoped variables (and even static struct member variables) and
> suggesting an alternative.
>
>> Can you think of a setting in which we have legitimate private struct
>> members? If so, then add to this setting that you only want one
>> instantiation of the data in this struct. As a solution, what is wrong
>> with dropping the struct encapsulation? There is no other code except
>> the struct in the module.
>>
>> I sincerely want to know if there is any difference. As I understand
>> it the scope is exactly the same or even smaller as you can't leak
>> instances of modules as you can structs.
>
>  From my understanding, you're trying to get a specific viewpoint on
> this idea?
>
>> From a good-coding standpoint, do you think there is a difference
>> between these two options?
>>
>> --
>> module a;
>>
>> int a;
>> --
>> module b;
>>
>> struct S{//otherwise unused wrapper
>> static int b;
>> }
>> --
>
> I think the first option is "better" than the second. The second seems
> to be a misuse of struct to me. I can't see why you'd use a struct in
> the second option.
>
> That said, they're effectively equivalent pieces of code. That is,
> they're both static data that is a shared resource among all functions
> that have access to them. And, thus, they're both "equally bad" in terms
> of how they will affect the understandability and testability of the
> code. It's possible that it would cause the code to have more bugs in it
> than it would otherwise.
>
> That is, neither of them is as good a choice as this:
> ---
> void good(ref int b) pure {
>      // code using/setting b
> }
> void good2(int b) pure {
>      // code using b, but not setting it
> }
>
> // No b in module scope or statically allocated
>
> // Sometimes a better idea, depending on circumstances:
> int better(int b) pure {
>     // code using b, and putting changes into changedB
>     return changedB;
> }
>
> // example uses:
> void main() {
>      int a = 1, b = 2, c = 3;
>      good(a);
>      good2(b);
>      c = better(c);
> }
> ---
>
> Because this creates code that is honest about its dependencies and
> allows for the overall state of the program to be consistent between
> runs of functions. This is essential for testability, but it's also
> important for a programmer to reason about the behavior of their code.
Yes, this is obvious.

>
> Of course, I'm sure you can give examples of code that couldn't be
> written like that, and that's okay. I'm only arguing that you should
> avoid static data when it's realistic to do so, not that it will open a
> black hole in your living room if you use it under any circumstances
> :-). Though if you're using it as your "primary data encapsulation," I
> have to wonder whether you're using it in instances it could have been
> avoided.

I would have loved an answer to this:

Is there any reason to encapsulate this kind of code in a struct?
---
module a;

private int _a;

int a(){
  return _a;
}

void a(int aIn){
  _a = aIn;
}

void useA(){
}
---
module main;

static import a;

void main(){
  a.useA();
}
---

What I am trying to get answered here is whether there is something special about a struct or a class that makes it a 'correct' data encapsulator where a module is not.