DIP idea: q{}-inspired block mixins
June 10, 2020
Mixin declarations and statements are often used in conjunction with large literals containing lots of code, where stuff is inserted at specific points using interruptions like mixin("...", identifier, "...") or %s format specifiers and .format afterwards.
It hurts readability. In the best case, code should directly express what we mean.
We mean in those cases: Insert the value of identifier and not "identifier" here.
Another observation: the value of identifier and the literal "identifier" are hardly ever needed together in the same piece of code.

Suggestion:

    mixin[op]
    {
        lhs op= rhs;
    }

It parses the stuff between the { and } the same as a q{}-string, but any identifier token found that is identical to the one between [ and ] will not be used literally, but be evaluated (mixed in). So the above is equivalent to:

    mixin("lhs ", op, "= rhs;");

For a simple idea(*): Pack everything between the braces in a q{} string. "Interrupt" the q{} string anywhere an identifier `ident` of the bracketed list is found by "},ident,q{".
(*) Doesn't work when braces are involved.
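
For comparison, here is a minimal sketch of that lowering written by hand in today's D. The struct name Accumulator and the main function are made up for illustration; only the mixin line corresponds to the block mixin above, with the q{} pieces split exactly where op would be substituted.

```d
struct Accumulator
{
    int lhs;

    // One template covers +=, -=, *= and so on; op arrives as a string.
    void opOpAssign(string op)(int rhs)
    {
        // Hand-written form of the lowering: q{} pieces, interrupted by
        // the value of op. The arguments are concatenated to something
        // like " lhs += rhs; " and compiled as a statement.
        mixin(q{ lhs }, op, q{= rhs; });
    }
}

void main()
{
    auto acc = Accumulator(10);
    acc += 5;   // instantiates opOpAssign!"+"
    acc *= 2;   // instantiates opOpAssign!"*"
    assert(acc.lhs == 30);
}
```

The block mixin would let the body read as plain `lhs op= rhs;` while producing the same argument list.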

A prime example is pseudo-code we have on the spec that doesn't compile. We often write stuff like op= and expect people to interpret it properly. With that, the compiler could, too.

FAQ.

Q: Can I use multiple identifiers?
A: Yes. Comma-separated in the brackets. E.g. [ident1, ident2]

Q: Can I use expressions?
A: Instead of a plain identifier, use [ident = expression()].

Q: So [ident] alone is equivalent to [ident = ident]?
A: You got the idea right, but you cannot actually write that. Use [ident] or [ident2 = ident].

Q: Can I mix assignments and single identifiers?
A: Sure.

Q: Where can I use it?
A: Anywhere a mixin() statement or declaration can be. It's just an alternative syntax to mixin per se. Potentially also everywhere a {}-block can be. (Let's see what others think about that.)

Q: Wait, what about mixin() expressions?
A: Unfortunately not those. Block mixins are for complete statements or declarations (depending on the context). Note that the examples didn't use a semicolon at their end.

Q: Can I use the identifiers in [ and ] in the block mixin without being mixed in (i.e. "escaped" in some sense)?
A: No. Use [ident2 = ident]. Then ident will not be replaced by its value.

Syntax up for debate.

Disclaimer: I don't really know how q{} strings are lexed and parsed. I'd assume they're lexed as separate tokens, parsed token-wise until the closing brace is found.

What do you think about it?
June 10, 2020
On Wednesday, 10 June 2020 at 03:03:49 UTC, Q. Schroll wrote:
> Suggestion:
>
>     mixin[op]
>     {
>         lhs op= rhs;
>     }
> ...
> evaluated (mixed in). So the above is equivalent to:
>
>     mixin("lhs ", op, "= rhs;");
>
> For a simple idea(*): Pack everything between the braces in a q{} string. "Interrupt" the q{} string anywhere an identifier `ident` of the bracketed list is found by "},ident,q{".
> (*) Doesn't work when braces are involved.
>
> A prime example is pseudo-code we have on the spec that doesn't compile. We often write stuff like op= and expect people to interpret it properly. With that, the compiler could, too.
>
> FAQ.
>
> Q: Can I use multiple identifiers?
> A: Yes. Comma-separated in the brackets. E.g. [ident1, ident2]

I like it. Actually I was thinking about it since my previous question and answer from Ali:

https://forum.dlang.org/post/ravtsk$1uav$1@digitalmars.com

The reason that template is so hard to write tidily is that D does not have an easy way to stringify a token.

As a D beginner, I can come up with this simple version:
https://forum.dlang.org/post/cggmhdgmsxsutangohlf@forum.dlang.org

--------------------
import std.format : format;  // format() is used at compile time below

enum RW(string T, string name) =
  format(q{
    private %1$s _%2$s;
    public  %1$s  %2$s()        {return _%2$s;}
    public  auto  %2$s(%1$s v)  {_%2$s = v;  return this;}
  }, T, name);


class Point {
  mixin(RW!("int",     "x"));
  mixin(RW!("double",  "y"));
  mixin(RW!("string",  "z"));
}
--------------------

Although it gets the job done, it's ugly:
1) in the implementation, format strings and format markers (%1$s, _%2$s, etc.) are all over the place.
2) at the mixin call site, strings are passed (mixin(RW!("int", "x"))) instead of simple tokens.

Other people (esp. Ali) tried to work around this problem by using a template parameter T for the type and opDispatch(name) to stringify the token. It is the only way to do it now in D, and it took a D language expert (Ali, book author) quite some time to achieve this:

  mixin RW!int.x;    // <-- NICE :)

in his second attempt. And even he said it's "convoluted".

The language shouldn't make it so hard to achieve something this simple; e.g. a novice C++ programmer can write the equivalent without too much effort:

-----------------------------
#define ATTR_NAME(name)  _##name

#define READER(type, name)      \
public:    type name () const {return ATTR_NAME(name);}

#define WRITER(type, name)      \
public:  void name(type val)  {ATTR_NAME(name) = val;}

#define READ_ONLY_ATTR(type, name)      \
protected: type ATTR_NAME(name);        \
           READER(type, name)

#define DECL_ATTR(type, name)   \
   READ_ONLY_ATTR(type, name)   \
           WRITER(type, name)

class Point {
  DECL_ATTR(int, x);
  DECL_ATTR(int, y);
  DECL_ATTR(int, z);
};
-----------------------------

And don't you think this simple C++ macro version is much more readable & maintainable than the D expert's version? (copied from: https://forum.dlang.org/post/ravtsk$1uav$1@digitalmars.com)

-----------------------------
Ok, I solved that too with a very convoluted "eponymous mixin template opDispatch." :)

struct RW(T) {
  template opDispatch(string name) {
    static codeImpl() {
      import std.format;

      return format!q{
        private %s _%s;
        public auto %s() { return _%s; }
        public auto %s(%s val) { _%s = val; return this; }
      }(T.stringof, name,
        name, name,
        name, T.stringof, name);
    }

    mixin template opDispatch(alias code = codeImpl()) {
      mixin (code);
    }
  }
}

struct Point {
  mixin RW!int.x;    // <-- NICE :)
  mixin RW!int.y;
}

-----------------------------

Actually, opDispatch already plays the role of stringifying a token, so *why not* just give it a proper mechanism of its own? Otherwise, people are forced to write such *convoluted* code just to stringify a token.

In short, we have the mechanism already (through opDispatch(...)), but please export it as a first-class D citizen.

I'm glad you made this proposal, and I hope it will be adopted.


June 10, 2020
On Wednesday, 10 June 2020 at 03:03:49 UTC, Q. Schroll wrote:

I like it, but I have some Qs for a Q

> Q: Can I use expressions?
> A: Instead of a plain identifier, use [ident = expression()].

Why expression()? I.e. why parentheses?
Evaluated how many times? Eagerly once? Lazily every time 'ident' is encountered within the {} ? Lazily once, the first time 'ident' is encountered within the {}?
June 10, 2020
On Wednesday, 10 June 2020 at 12:49:47 UTC, Stanislav Blinov wrote:
> On Wednesday, 10 June 2020 at 03:03:49 UTC, Q. Schroll wrote:
>
> I like it, but I have some Qs for a Q
>
>> Q: Can I use expressions?
>> A: Instead of a plain identifier, use [ident = expression()].
>
> Why expression()? I.e. why parentheses?

To indicate that an evaluation may take place. I thought it would be easier to understand with the parentheses than without.

> Evaluated how many times? Eagerly once? Lazily every time 'ident' is encountered within the {} ? Lazily once, the first time 'ident' is encountered within the {}?

I'd suggest every assignment eagerly once. That's the version that seems the least confusing in most cases.

Imagine it like this:

    enum __ident = ident;

or

    enum __ident = expression();

depending on which one is used, and splicing "...", __ident, "..." into the parameters to regular mixin().
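
A minimal sketch of that model in today's D, under a few assumptions: fieldName is a hypothetical stand-in for expression(), Stats is a made-up struct, and the enum is called ident rather than __ident because identifiers starting with two underscores are reserved.

```d
// Hypothetical stand-in for expression() in [ident = expression()].
string fieldName(string base)
{
    return base ~ "Count";
}

struct Stats
{
    // Evaluated once, eagerly, at compile time; plays the role of __ident.
    private enum ident = fieldName("hit");

    // The stored value is spliced in at each use; fieldName is not run again.
    mixin("int ", ident, ";");
    mixin("void bump() { ++", ident, "; }");
}

void main()
{
    Stats s;
    s.bump();
    s.bump();
    assert(s.hitCount == 2);
}
```

Both mixins see the same string "hitCount", which is the behavior eager-once evaluation is meant to guarantee.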

The same is done in foreach loops to the range expression; in foreach (element; range()), (range() is an expression), the range expression is evaluated once eagerly. Maybe, if you (or someone else) can make a compelling point to add `lazy ident = expr()` for the case where it should be evaluated lazily every time it's spliced in, I'd put it in the DIP.

Also:

Q: Why brackets?
A: Because any other syntax that came to my mind is either less readable or entails code breakage.
June 10, 2020
On Wednesday, 10 June 2020 at 04:03:46 UTC, mw wrote:
> I like it. Actually I was thinking about it since my previous question and answer from Ali:
>
> https://forum.dlang.org/post/ravtsk$1uav$1@digitalmars.com
>
> The reason that template is so hard to write tidily is that D does not have an easy way to stringify a token.
>
> As a D beginner, I can come up with this simple version:
> https://forum.dlang.org/post/cggmhdgmsxsutangohlf@forum.dlang.org
>
> --------------------
> enum RW(string T, string name) =
>   format(q{
>     private %1$s _%2$s;
>     public  %1$s  %2$s()        {return _%2$s;}
>     public  auto  %2$s(%1$s v)  {_%2$s = v;  return this;}
>   }, T, name);

Please don't do `string T`, typeName would be way better. Actually, my proposal doesn't improve that in length, but you could do

    // Assuming typeName and name in scope:
    mixin[typeName, name, _name = '_' ~ name]
    {
        private typeName _name;
        public typeName name() { return _name; }
        public ref name(return typeName value) { _name = value; return value; }
    }

In contrast to format() approaches, it at least can be readable.

At the points you're doing these:

> class Point {
>   mixin(RW!("int",     "x"));
>   mixin(RW!("double",  "y"));
>   mixin(RW!("string",  "z"));
> }

you'd need to copy the parts from above over and over. There's no (easy) way to make the stuff in the mixin[]{} a string literal that can be mixed in without resorting to, well, format() again. You could keep it to the []-list at least:

    enum RW(string typeName, string myname) = "
        mixin[tName = \"%s\", name = \"%s\", _name = \"_%s\"]
        {
            ...
        }
    ".format(typeName, myname, myname);

    mixin(RW!("int",     "x"));
    ...

I do think it's an improvement.

> Other people (esp. Ali) tried to work around this problem by using a template parameter T for the type and opDispatch(name) to stringify the token. It is the only way to do it now in D, and it took a D language expert (Ali, book author) quite some time to achieve this:
>
>   mixin RW!int.x;    // <-- NICE :)
>
> in his second attempt. And even he said it's "convoluted".
>
> The language shouldn't make it so hard to achieve something this simple; e.g. a novice C++ programmer can write the equivalent without too much effort:
>
> -----------------------------
> #define ATTR_NAME(name)  _##name
>
> #define READER(type, name)      \
> public:    type name () const {return ATTR_NAME(name);}
>
> #define WRITER(type, name)      \
> public:  void name(type val)  {ATTR_NAME(name) = val;}
>
> #define READ_ONLY_ATTR(type, name)      \
> protected: type ATTR_NAME(name);        \
>            READER(type, name)
>
> #define DECL_ATTR(type, name)   \
>    READ_ONLY_ATTR(type, name)   \
>            WRITER(type, name)
>
> class Point {
>   DECL_ATTR(int, x);
>   DECL_ATTR(int, y);
>   DECL_ATTR(int, z);
> };
> -----------------------------
>
> And don't you think this simple C++ macro version is much more readable & maintainable than the D expert's version?

I think the C++ macro does a good job here. Macros are not evil, just easy to misuse. C++-Macros have a ton of issues, but the mere idea of token-based replacement is not bad at all.

> struct RW(T) {
>   template opDispatch(string name) {
>     static codeImpl() {
>       import std.format;
>
>       return format!q{
>         private %s _%s;
>         public auto %s() { return _%s; }
>         public auto %s(%s val) { _%s = val; return this; }
>       }(T.stringof, name,
>         name, name,
>         name, T.stringof, name);
>     }
>
>     mixin template opDispatch(alias code = codeImpl()) {
>       mixin (code);
>     }
>   }
> }
>
> struct Point {
>   mixin RW!int.x;    // <-- NICE :)
>   mixin RW!int.y;
> }

I like that approach, too. So much better than the above! Yes, here, in the opDispatch, my proposal wouldn't shorten anything (it would never do this), but make it much more readable.

> In short, we have the mechanism already (through opDispatch(...)), but please export it as a first-class D citizen.
>
> I'm glad you made this proposal, and I hope it will be adopted.

I'd also hope it flies.
June 11, 2020
On Wednesday, 10 June 2020 at 23:51:45 UTC, Q. Schroll wrote:

About this:

>     enum RW(string typeName, string myname) = "
>         mixin[tName = \"%s\", name = \"%s\", _name = \"_%s\"]
>         {
>             ...
>         }
>     ".format(typeName, myname, myname);

It shouldn't be a template but rather a function. (For Stefan)
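
A rough sketch of the function variant in today's D, without block mixins and without std.format. The helper name rw and the void setter are assumptions made for illustration; the string is built by plain concatenation and evaluated through CTFE at the mixin site.

```d
// Hypothetical helper: builds the declarations as a string via CTFE,
// using plain concatenation instead of std.format.
string rw(string typeName, string name)
{
    string field = "_" ~ name;
    return "private " ~ typeName ~ " " ~ field ~ ";\n"
         ~ "public " ~ typeName ~ " " ~ name ~ "() { return " ~ field ~ "; }\n"
         ~ "public void " ~ name ~ "(" ~ typeName ~ " value) { " ~ field ~ " = value; }\n";
}

class Point
{
    // An ordinary function call, evaluated at compile time by the mixin.
    mixin(rw("int", "x"));
    mixin(rw("double", "y"));
}

void main()
{
    auto p = new Point;
    p.x(3);
    p.y(1.5);
    assert(p.x() == 3);
    assert(p.y() == 1.5);
}
```

Being a function, rw is not instantiated once per argument pair the way the enum template is; it is simply called whenever a mixin needs its result.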
June 11, 2020
On Wednesday, 10 June 2020 at 03:03:49 UTC, Q. Schroll wrote:
> Mixin declarations and statements are often used in conjunction with large literals containing lots of code, where stuff is inserted at specific points using interruptions like mixin("...", identifier, "...") or %s format specifiers and .format afterwards.
> It hurts readability. In the best case, code should directly express what we mean.
> We mean in those cases: Insert the value of identifier and not "identifier" here.
> Another observation: the value of identifier and the literal "identifier" are hardly ever needed together in the same piece of code.
>
> Suggestion:
>
>     mixin[op]
>     {
>         lhs op= rhs;
>     }
>
> It parses the stuff between the { and } the same as a q{}-string, but any identifier token found that is identical to the one between [ and ] will not be used literally, but be evaluated (mixed in). So the above is equivalent to:
>
>     mixin("lhs ", op, "= rhs;");
>
> For a simple idea(*): Pack everything between the braces in a q{} string. "Interrupt" the q{} string anywhere an identifier `ident` of the bracketed list is found by "},ident,q{".
> (*) Doesn't work when braces are involved.
>
> A prime example is pseudo-code we have on the spec that doesn't compile. We often write stuff like op= and expect people to interpret it properly. With that, the compiler could, too.
>
> FAQ.
>
> Q: Can I use multiple identifiers?
> A: Yes. Comma-separated in the brackets. E.g. [ident1, ident2]
>
> Q: Can I use expressions?
> A: Instead of a plain identifier, use [ident = expression()].
>
> Q: So [ident] alone is equivalent to [ident = ident]?
> A: You got the idea right, but you cannot actually write that. Use [ident] or [ident2 = ident].
>
> Q: Can I mix assignments and single identifiers?
> A: Sure.
>
> Q: Where can I use it?
> A: Anywhere a mixin() statement or declaration can be. It's just an alternative syntax to mixin per se. Potentially also everywhere a {}-block can be. (Let's see what others think about that.)
>
> Q: Wait, what about mixin() expressions?
> A: Unfortunately not those. Block mixins are for complete statements or declarations (depending on the context). Note that the examples didn't use a semicolon at their end.
>
> Q: Can I use the identifiers in [ and ] in the block mixin without being mixed in (i.e. "escaped" in some sense)?
> A: No. Use [ident2 = ident]. Then ident will not be replaced by its value.
>
> Syntax up for debate.
>
> Disclaimer: I don't really know how q{} strings are lexed and parsed. I'd assume they're lexed as separate tokens, parsed token-wise until the closing brace is found.
>
> What do you think about it?

This and/or string interpolation would make writing mixins more comfortable.
Another good point of this solution is that the mixin code gets syntax-highlighted, and IDEs could work their magic (autocomplete, refactoring, etc.) on the mixed-in code.
June 11, 2020
On Thursday, 11 June 2020 at 00:20:35 UTC, Q. Schroll wrote:
> On Wednesday, 10 June 2020 at 23:51:45 UTC, Q. Schroll wrote:
>
> About this:
>
>>     enum RW(string typeName, string myname) = "
>>         mixin[tName = \"%s\", name = \"%s\", _name = \"_%s\"]
>>         {
>>             ...
>>         }
>>     ".format(typeName, myname, myname);
>
> It shouldn't be a template but rather a function. (For Stefan)

std.format.format is mighty expensive ;)

Your proposal would make that a built-in, right?
June 20, 2020
On Thursday, 11 June 2020 at 13:20:43 UTC, Stefan Koch wrote:
> On Thursday, 11 June 2020 at 00:20:35 UTC, Q. Schroll wrote:
>> On Wednesday, 10 June 2020 at 23:51:45 UTC, Q. Schroll wrote:
>>
>> About this:
>>
>>>     enum RW(string typeName, string myname) = "
>>>         mixin[tName = \"%s\", name = \"%s\", _name = \"_%s\"]
>>>         {
>>>             ...
>>>         }
>>>     ".format(typeName, myname, myname);
>>
>> It shouldn't be a template but rather a function. (For Stefan)
>
> std.format.format is mighty expensive ;)
>
> Your proposal would make that a built-in, right?

Not really. What format does here is putting strings together. My proposal never intended to splice in numeric or other things.
November 05, 2020
Here is the DIP PR: https://github.com/dlang/DIPs/pull/194 and the DIP document: https://github.com/Bolpat/DIPs/blob/TokenMixins/DIPs/DIP-1NN2-QFS.md

Please let me know of any suggestions that come to your mind in this thread or the PR discussion.