Thread overview
Make an enum idiomatic D - enhancing GNU Bison's Dlang support
Sep 01, 2020
Adela Vais
Sep 01, 2020
H. S. Teoh
Sep 02, 2020
Adela Vais
Sep 02, 2020
Alexandru Ermicioi
Sep 03, 2020
Bastiaan Veelo
Sep 03, 2020
Bastiaan Veelo
September 01, 2020
Hello,

I am working on GNU Bison, in order to enhance the current Dlang support.
The issue concerns only D, so there is no need to be familiar with Bison.

I need some advice on how to modify the current form of an enum, SymbolKind, to make it more idiomatic D (or leave it as it is - similar to how it is handled for the C++ support).

The problem is explained here [0]. (Java allows methods in enums [1].)

The way I modified it is by wrapping it in a struct and adding functionality so that the existing interaction with the enum changes as little as possible [2]. The code I changed in the repository is written in M4, so I will add the output after Bison is run:

  /* Symbol kinds.  */
  struct SymbolKind
  {
    public enum SymbolKindEnum
    {
    S_YYEMPTY = -2,  /* No symbol.  */
    S_YYEOF = 0,                   /* "end of file"  */
    S_YYerror = 1,                 /* error  */
    ...
    S_input = 14,                  /* input  */
    S_line = 15,                   /* line  */
    S_exp = 16,                    /* exp  */
    }

    private SymbolKindEnum yycode_;
    alias SymbolKindEnum this;

    this(int code)
    {
      yycode_ = cast(SymbolKindEnum) code;
    }

    public void opAssign(in SymbolKindEnum code)
    {
      yycode_ = code;
    }

    public bool opEquals(const int s)
    {
      return yycode_ == s;
    }

    SymbolKindEnum value() const
    {
      return yycode_;
    }

    /* YYTNAME[SYMBOL-NUM] -- String name of the symbol SYMBOL-NUM.
       First, the terminals, then, starting at \a YYNTOKENS_, nonterminals.  */
    private static string[] yytname_ = [
  "\"end of file\"", "error", "\"invalid token\"", "\"=\"", "\"+\"",
  "\"-\"", "\"*\"", "\"/\"", "\"(\"", "\")\"", "\"end of line\"",
  "\"number\"", "UNARY", "$accept", "input", "line", "exp", null
    ];

    public string toString() const
    {
      return yytname_[yycode_];
    }
  }

Should I modify it like this, or is it overkill? Is there a better way to deal with this?


[0]: https://github.com/akimd/bison/blob/master/TODO#L255
[1]: https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html
[2]: https://github.com/adelavais/bison/tree/symbol-kind
September 01, 2020
On Tue, Sep 01, 2020 at 05:47:24PM +0000, Adela Vais via Digitalmars-d wrote: [...]
> I need some advice on how to modify the current form of an enum, SymbolKind, to make it more idiomatic D (or leave it as it is - similar to how it is handled for the C++ support).
> 
> The problem is explained here [0]. (Java allows methods in enums [1].)
> 
> The way I modified it is by wrapping it in a struct and adding functionality so that the existing interaction with the enum changes as little as possible [2]. The code I changed in the repository is written in M4, so I will add the output after Bison is run:

There is no need to wrap SymbolKind in any way.  Idiomatic D code can easily introspect the enum to obtain its string representation; e.g., the standard library std.format already prints the string representation when handed an enum:

	enum SymbolKind {
		S_YYEMPTY = -2,
		S_YYEOF = 0,
		S_YYerror = 1,
	}

	import std;
	pragma(msg, format("%d", SymbolKind.S_YYEMPTY));
	pragma(msg, format("%s", SymbolKind.S_YYEMPTY));

Output:
	-2
	S_YYEMPTY

All you have to do is to ensure that when the enum value is to be printed out, you specify to use the string representation.  Depending on whether you choose to have @nogc compatibility, you may or may not use std.format, but either way, it's not hard to obtain the string representation of an enum value in D.

I'd only wrap an enum if I need to do something really rare and unusual, like operator overloading or enum inheritance. For normal usage, I'd just leave the enum the way it is. (Even adding methods to an enum can be simulated with UFCS, so I wouldn't even wrap an enum in that case. Only for truly unusual use cases would I do it.)


T

-- 
We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare.  Now, thanks to the Internet, we know this is not true. -- Robert Wilensky
September 02, 2020
On Tuesday, 1 September 2020 at 18:18:06 UTC, H. S. Teoh wrote:
> 	enum SymbolKind {
> 		S_YYEMPTY = -2,
> 		S_YYEOF = 0,
> 		S_YYerror = 1,
> 	}
>
> 	import std;
> 	pragma(msg, format("%d", SymbolKind.S_YYEMPTY));
> 	pragma(msg, format("%s", SymbolKind.S_YYEMPTY));
>
> Output:
> 	-2
> 	S_YYEMPTY
>
> All you have to do is to ensure that when the enum value is to be printed out, you specify to use the string representation.  Depending on whether you choose to have @nogc compatibility, you may or may not use std.format, but either way, it's not hard to obtain the string representation of an enum value in D.


Thank you for your answer!

The problem is that I need specifically the Bison-generated string representation, so S_YYEOF.toString should give "\"end of file\"", not "S_YYEOF".

I think I *can* change the name of the elements of the structure, but I don't think it's a great idea, as they are also automatically generated by Bison and the other languages' parsers leave them as is.

The string representation is needed only in case of an error output message. The enum's members are used as integers throughout the program, so I can't do something like:

  enum SymbolKind {
    S_YYEOF = "\"end of file\"",
    S_YYerror = "error",
    ...
  }

Based on what you said about using UFCS to my advantage, I thought about adding a method (to the class that contains the enum) that gives the string representation. But I have to think about how I can avoid a call like this.toString(token). Should I try to do this?
September 02, 2020
On Wednesday, 2 September 2020 at 13:20:14 UTC, Adela Vais wrote:
> On Tuesday, 1 September 2020 at 18:18:06 UTC, H. S. Teoh wrote:
>> [...]
>
>
> Thank you for your answer!
>
> The problem is that I need specifically the Bison-generated string representation, so S_YYEOF.toString should give "\"end of file\"", not "S_YYEOF".
>
> I think I *can* change the name of the elements of the structure, but I don't think it's a great idea, as they are also automatically generated by Bison and the other languages' parsers leave them as is.
>
> The string representation is needed only in case of an error output message. The enum's members are used as integers throughout the program, so I can't do something like:
>
>   enum SymbolKind {
>     S_YYEOF = "\"end of file\"",
>     S_YYerror = "error",
>     ...
>   }
>
> Based on what you said about using UFCS to my advantage, I thought about adding a method (to the class that contains the enum) that gives the string representation. But I have to think about how I can avoid a call like this.toString(token). Should I try to do this?

One design pattern that you can use is attaching additional information to each enum member by using UDAs [1]. Here's an example: https://run.dlang.io/gist/PetarKirov/f725d1f7e22d7a31860898d2d37489d0

That way you can keep the same integer-based enum, but associate with each element arbitrary things like strings, struct objects, types, functions, etc.
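
For readers who cannot open the gist, here is a minimal sketch of the UDA pattern. It is not the gist's code: the `Name` struct and the `displayName` function are hypothetical names chosen for illustration, the string attached to S_YYEMPTY is a placeholder taken from the enum's comment, and it assumes a compiler that accepts UDAs on enum members (DMD 2.082 or later, as far as I recall).

```d
import std.stdio : writeln;

// Hypothetical UDA type carrying the Bison-generated display string.
struct Name { string text; }

// The enum stays integer-based; each member only gains an attribute.
enum SymbolKind
{
    @Name("no symbol")     S_YYEMPTY = -2,
    @Name(`"end of file"`) S_YYEOF   = 0,
    @Name("error")         S_YYerror = 1,
}

// Hypothetical free function; callable as kind.displayName via UFCS.
string displayName(SymbolKind kind) pure nothrow @nogc @safe
{
    final switch (kind)
    {
        // Generate one case per member at compile time.
        static foreach (member; __traits(allMembers, SymbolKind))
        {
            case __traits(getMember, SymbolKind, member):
                // Read the first attribute attached to this member.
                return __traits(getAttributes,
                    __traits(getMember, SymbolKind, member))[0].text;
        }
    }
}

void main()
{
    auto kind = SymbolKind.S_YYEOF;
    assert(kind == 0);            // still usable as an integer
    writeln(kind.displayName);    // the attached string, not the identifier
}
```

The sketch assumes every member carries exactly one `Name` attribute; a member without one would fail to compile at the `[0]` index.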

[1]: http://ddili.org/ders/d.en/uda.html
September 02, 2020
On Wednesday, 2 September 2020 at 13:20:14 UTC, Adela Vais wrote:
> ...

You could try using a struct as the enum's base type, holding both the ordinal and the string message. It can then also have a toString method that returns your message when printing (or in any other string operation), while being used as a plain value everywhere else in your code.

You may also make it immutable if needed.
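
A rough sketch of what this could look like, under some assumptions: the struct name `Symbol` and its fields `code` and `name` are made up for this example, and the string for S_YYEMPTY is a placeholder.

```d
import std.stdio : writeln;

// Hypothetical payload type: integer code plus display string.
struct Symbol
{
    int code;     // value used by the parser tables
    string name;  // Bison-generated display string

    string toString() const pure nothrow @nogc @safe
    {
        return name;
    }
}

// An enum whose base type is the struct; every member needs an initializer.
enum SymbolKind : Symbol
{
    S_YYEMPTY = Symbol(-2, "no symbol"),
    S_YYEOF   = Symbol(0, `"end of file"`),
    S_YYerror = Symbol(1, "error"),
}

void main()
{
    immutable kind = SymbolKind.S_YYEOF;
    assert(kind.code == 0);      // the integer is now behind .code
    writeln(kind.toString());    // explicit call returns the message
}
```

Two caveats with this sketch: the members are no longer plain integers, so existing code has to go through `.code`, and as far as I know std.format prints the member identifier for any enum type, so the explicit `toString()` call matters.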

- Alex.
September 03, 2020
On Wednesday, 2 September 2020 at 13:20:14 UTC, Adela Vais wrote:
> Thank you for your answer!
>
> The problem is that I need specifically the Bison-generated string representation, so S_YYEOF.toString should give "\"end of file\"", not "S_YYEOF".

I think what you want is also suggested by H.S. Teoh:

>> (Even adding methods to an enum can be simulated with UFCS, so I wouldn't even wrap an enum in that case. Only for truly unusual use cases would I do it.)

What he means is you can define a free function that can be used as if it were a member function or property of a type by using UFCS. This is how (https://run.dlang.io/is/u3EBeW):

import std.stdio;

enum SymbolKindEnum {
    S_YYEMPTY = -2,  /* No symbol.  */
    S_YYEOF = 0,     /* "end of file"  */
    S_YYerror = 1,   /* error  */
    /* ... */
    S_input = 14,    /* input  */
    S_line = 15,     /* line  */
    S_exp = 16,      /* exp  */
}

string toString(SymbolKindEnum kind) pure @nogc
{
    final switch (kind) with(SymbolKindEnum) {
        case S_YYEMPTY:
            assert(false, "Tried representing no symbol as a string.");
        case S_YYEOF:
            return `"end of file"`;
        case S_YYerror:
            return `error`;
        /*...*/
        case S_input:
            return `input`;
        case S_line:
            return `line`;
        case S_exp:
            return `exp`;
    }
}

void main()
{
    auto kind = SymbolKindEnum.S_YYEOF;
    writeln(kind.toString); // "end of file"
}



-- Bastiaan.
September 03, 2020
On Thursday, 3 September 2020 at 14:05:10 UTC, Bastiaan Veelo wrote:
> enum SymbolKindEnum {
>     S_YYEMPTY = -2,  /* No symbol.  */
>     S_YYEOF = 0,     /* "end of file"  */
>     S_YYerror = 1,   /* error  */
>     /* ... */
>     S_input = 14,    /* input  */
>     S_line = 15,     /* line  */
>     S_exp = 16,      /* exp  */
> }

By the way, since you are specifically asking about idiomatic D, the common prefix "S_" in enum members is unnecessary in D, since D already puts these in a separate namespace. I can imagine you'd prefer to keep things as they are to have some uniformity among back ends, but if D were the only one you could do this instead:

enum S {
     YYEMPTY = -2,  /* No symbol.  */
     YYEOF = 0,     /* "end of file"  */
     YYerror = 1,   /* error  */
     /* ... */
     input = 14,    /* input  */
     line = 15,     /* line  */
     exp = 16,      /* exp  */
}

And refer to enum members as S.YYEOF (instead of S_YYEOF in other back ends).
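
Where even the `S.` qualifier gets noisy, a `with` block brings the members into scope. A small sketch, with a hypothetical helper `isEndOfInput` that exists only for this example:

```d
enum S
{
    YYEMPTY = -2,  /* No symbol.  */
    YYEOF = 0,     /* "end of file"  */
    YYerror = 1,   /* error  */
}

// Hypothetical helper, only here to show unqualified member access.
bool isEndOfInput(S kind) pure nothrow @nogc @safe
{
    with (S)
    {
        return kind == YYEOF;  // no "S." needed inside the with block
    }
}

void main()
{
    assert(isEndOfInput(S.YYEOF));
    assert(!isEndOfInput(S.YYerror));
}
```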

-- Bastiaan.