enum ubyte[] vs enum ubyte[3]
December 20, 2010
Hi,
I'm currently patching Ragel (http://www.complang.org/ragel/) to generate D2 compatible code. Right now it creates output like this for static arrays:
------------------------
enum ubyte[] _parseResponseLine_key_offsets = [
	0, 0, 17, 18, 37, 41, 42, 44,
	50, 51, 57, 58, 78, 98, 118, 136,
	138, 141, 143, 146, 148, 150, 152, 153,
	159, 160, 160, 162, 164
];
------------------------
Making it output "enum ubyte[30]" would be more complicated, so I wonder if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?

-- 
Johannes Pfau
December 20, 2010
Johannes Pfau:

Hello Johannes and thank you for developing your tool for D2 too :-)


> Making it output "enum ubyte[30]" would be more complicated, so I wonder if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?

In D1 an enum ubyte[] is a compile-time constant dynamic array of unsigned bytes; it is a two-word struct that contains a pointer and a length.
In D1 you express the same thing with "const ubyte[]".

In D2 a "enum ubyte[30]" is a compile-time constant fixed size array of 30 unsigned bytes that gets passed around by value.
In D1 a "const ubyte[30]" is a compile-time constant fixed size array of 30 unsigned bytes that gets passed around by reference.

So they are two different things, and you use one or the other according to your needs. Currently there are also some performance issues in D with enums that get re-created each time you use them (this is true for associative arrays, but I don't remember if it is true for dynamic arrays too). So it's better to take a look at the produced asm to be sure, if you want to avoid performance pitfalls.

Regardless of the array kind you want to use, also take a look at "Hex Strings":
http://www.digitalmars.com/d/2.0/lex.html
They allow you to write byte arrays as hex data:
x"00 FBCD 32FD 0A"

Bye,
bearophile
December 20, 2010
At 20.12.2010, 11:02, bearophile wrote <bearophileHUGS@lycos.com>:
>
> Hello Johannes and thank you for developing your tool for D2 too :-)
>
Actually it's not mine, I'm just a regular user. I don't think I could ever understand the finite state machine code (especially because it's C++), but patching the C/D1 codegen to output D2 code is easy enough ;-)

> In D1 an enum ubyte[] is a compile-time constant dynamic array of unsigned bytes; it is a two-word struct that contains a pointer and a length.
Did you mean in D2? I feared that, so I'll have to do some extra work...

> In D2 a "enum ubyte[30]" is a compile-time constant fixed size array of 30 unsigned bytes that gets passed around by value.
Yep, that's what I want.

>
> Regardless of the array kind you want to use, also take a look at "Hex Strings":
> http://www.digitalmars.com/d/2.0/lex.html
> They allow you to write byte arrays as hex data:
> x"00 FBCD 32FD 0A"
That's interesting, I'll have a look at it, but ragel shares big parts of the c/c++/d code, so as long as the C syntax works there's no need to change that.

>
> Bye,
> bearophile

Thanks for your help!

-- 
Johannes Pfau
December 20, 2010
Johannes Pfau:

> Did you mean in D2?

Right, sorry.
Bye,
bearophile
December 20, 2010
On Mon, 20 Dec 2010 10:26:16 +0100
"Johannes Pfau" <spam@example.com> wrote:

> Hi,
> I'm currently patching Ragel (http://www.complang.org/ragel/) to generate
> D2 compatible code.

Interesting. Ragel-generated code works fine for me in D2. I suppose it mostly uses such a restricted C-like subset of the language that it didn't change much from D1 to D2. But if you are going to patch it, please make it add extra {} around action code! When there is a label before a {} block (and in Ragel-generated code I saw that's always the case), the block isn't considered a new scope, which causes problems when you have local variable declarations inside actions.

Anyway, good luck with whatever you plan :) Ragel is cool.

> Right now it creates output like this for static arrays:
> ------------------------
> enum ubyte[] _parseResponseLine_key_offsets = [
> 	0, 0, 17, 18, 37, 41, 42, 44,
> 	50, 51, 57, 58, 78, 98, 118, 136,
> 	138, 141, 143, 146, 148, 150, 152, 153,
> 	159, 160, 160, 162, 164
> ];
> ------------------------
> Making it output "enum ubyte[30]" would be more complicated, so I wonder if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?

One is a fixed-size array and the other is dynamic. Honestly I doubt that it matters for code generated by Ragel, since this is constant and won't be passed around. If it's harder to make it fixed-size then don't bother.

-- 
Nick Voronin <elfy.nv@gmail.com>
December 20, 2010
On Monday 20 December 2010 01:26:16 Johannes Pfau wrote:
> Hi,
> I'm currently patching Ragel (http://www.complang.org/ragel/) to generate
> D2 compatible code. Right now it creates output like this for static
> arrays:
> ------------------------
> enum ubyte[] _parseResponseLine_key_offsets = [
> 	0, 0, 17, 18, 37, 41, 42, 44,
> 	50, 51, 57, 58, 78, 98, 118, 136,
> 	138, 141, 143, 146, 148, 150, 152, 153,
> 	159, 160, 160, 162, 164
> ];
> ------------------------
> Making it output "enum ubyte[30]" would be more complicated, so I wonder if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?

ubyte[] is a dynamic array. ubyte[30] is a static array. They are inherently different types. The fact that you're dealing with an enum is irrelevant. So, the code that you're generating is _not_ a static array. It's a dynamic array. This is inherently different from C or C++ where having [] on a type (whether it has a number or not) is _always_ a static array.
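
A minimal sketch of that distinction, checked at compile time. The names `dynTable`, `fixedTable`, and `bumpFirst` are made up for illustration and are not Ragel output; the only difference between the two tables is the declared type.

```d
// Hypothetical tables; only the declared type differs.
enum ubyte[]  dynTable   = [0, 0, 17, 18];
enum ubyte[4] fixedTable = [0, 0, 17, 18];

// The declared type is kept, whether the constant is an enum or not.
static assert(is(typeof(dynTable) == ubyte[]));
static assert(is(typeof(fixedTable) == ubyte[4]));

// A dynamic array is a length plus a pointer; a static array is just its elements.
static assert(typeof(dynTable).sizeof == 2 * size_t.sizeof);
static assert(typeof(fixedTable).sizeof == 4 * ubyte.sizeof);

// Static arrays are passed by value, so the callee works on a copy.
void bumpFirst(ubyte[4] arr) { arr[0] = 99; }

void main()
{
    ubyte[4] local = fixedTable;
    bumpFirst(local);
    assert(local[0] == 0); // the caller's array is unchanged
}
```

The `static assert` lines fail the build rather than the run if either type is not what the declaration says, which is a cheap way to check generated tables.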

- Jonathan M Davis
December 20, 2010
On Monday, December 20, 2010, Nick Voronin <elfy.nv@gmail.com> wrote:

> On Mon, 20 Dec 2010 10:26:16 +0100
> "Johannes Pfau" <spam@example.com> wrote:
>
>> Hi,
>> I'm currently patching Ragel (http://www.complang.org/ragel/) to generate
>> D2 compatible code.
>
> Interesting. Ragel-generated code works fine for me in D2. I suppose it mostly uses such a restricted C-like subset of language that it didn't change much from D1 to D2.
The most important change is const correctness. Because of that, table-based output didn't work with D2. And you couldn't directly pass const data (like string.ptr) to Ragel.
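
A small sketch of the const issue, assuming a hand-written scanner rather than real Ragel output. The function `countSpaces` is hypothetical; `p` and `pe` merely follow the usual Ragel naming for the current and end pointers.

```d
// Hypothetical scanner: p and pe are pointers to const data,
// so it accepts mutable, const and immutable input alike.
size_t countSpaces(const(char)* p, const(char)* pe)
{
    size_t n;
    for (; p != pe; ++p)
        if (*p == ' ')
            ++n;
    return n;
}

void main()
{
    string line = "HTTP/1.1 200 OK";

    // line.ptr is immutable(char)*: it converts to const(char)* but not to char*.
    static assert(is(typeof(line.ptr) == immutable(char)*));
    static assert(is(typeof(line.ptr) : const(char)*));
    static assert(!is(typeof(line.ptr) : char*));

    assert(countSpaces(line.ptr, line.ptr + line.length) == 2);
}
```

Code generated with plain `char*` pointers would reject `line.ptr` in D2, which is the kind of breakage described above.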

> But if you are going to patch it, please make it add extra {} around action code! The thing is that when there is a label before {} block (and in ragel generated code I saw it's always so) the block isn't considered as a new scope which causes problems when you have local variables declaration inside actions.

You mean like this code:
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
    {
        if(start != p)
        {
            key = line[(start - line.ptr) .. (p - line.ptr)];
        }
    }
---------------------------------
should become: ?
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
    {{
        if(start != p)
        {
            key = line[(start - line.ptr) .. (p - line.ptr)];
        }
    }}
---------------------------------

> One is fixed size array and other is dynamic. Honestly I doubt that it matters for code generated by Ragel, since this is constant and won't be passed around. If it's harder to make it fixed-size then don't bother.
>
Could a dynamic array cause heap allocations, even if its data is never changed? If not, dynamic arrays would work fine.

-- 
Johannes Pfau
December 21, 2010
On Mon, 20 Dec 2010 17:17:05 +0100
"Johannes Pfau" <spam@example.com> wrote:

> > But if you are going to patch it, please make it add extra {} around action code! The thing is that when there is a label before {} block (and in ragel generated code I saw it's always so) the block isn't considered as a new scope which causes problems when you have local variables declaration inside actions.
> 
> You mean like this code:
> ---------------------------------
> tr15:
> #line 228 "jpf/http/parser.rl"
>      {
>          if(start != p)
>          {
>              key = line[(start - line.ptr) .. (p - line.ptr)];
>          }
>      }
> ---------------------------------
> should become: ?
> ---------------------------------
> tr15:
> #line 228 "jpf/http/parser.rl"
>      {{
>          if(start != p)
>          {
>              key = line[(start - line.ptr) .. (p - line.ptr)];
>          }
>      }}
> ---------------------------------

Yes. This way it becomes a scope which is kind of what one would expect from it.
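
A sketch of the doubled-brace form with two actions that each declare a local of the same name. The labels, the function `runActions`, and the variable `tmp` are invented for illustration. The inner braces give every action its own scope, so the declarations cannot collide however the compiler treats a block that directly follows a label.

```d
int runActions()
{
    int total = 0;

tr1:
    {{
        int tmp = 1;   // local to the inner block
        total += tmp;
    }}
tr2:
    {{
        int tmp = 2;   // same name, separate scope
        total += tmp;
    }}
    return total;
}

void main()
{
    assert(runActions() == 3);
}
```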


> > One is fixed size array and other is dynamic. Honestly I doubt that it matters for code generated by Ragel, since this is constant and won't be passed around. If it's harder to make it fixed-size then don't bother.
> >
> Could a dynamic array cause heap allocations, even if its data is never changed? If not, dynamic arrays would work fine.

Sorry, I can't provide reliable information on what can happen in general, but right now there is no difference in produced code accessing elements of enum ubyte[] and enum ubyte[30]. In both cases constants are directly embedded in code.

In fact, as long as you only access its elements (no passing the array as an argument, no assignment to another variable and no accessing .ptr) there is no array object at all. If you do any of those, a new object is created every time. I believe Ragel doesn't generate code which passes tables around, so it doesn't matter.
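
A minimal sketch of that behaviour, using made-up table names. It also shows a `static immutable` array as one possible alternative when a single stored copy is wanted; that alternative is an assumption of this sketch, not something Ragel emits.

```d
// Hypothetical tables for illustration.
enum ubyte[] enumTable = [10, 20, 30];               // manifest constant: a literal at each use
static immutable ubyte[] storedTable = [10, 20, 30]; // one copy in static data

void main()
{
    // Indexing an enum array only needs the element value.
    static assert(enumTable[1] == 20);

    // Using the enum as a whole array yields a fresh array each time,
    // so the two variables below do not share storage.
    ubyte[] a = enumTable;
    ubyte[] b = enumTable;
    a[0] = 99;
    assert(b[0] == 10);
    assert(enumTable[0] == 10);

    // Slices of a static immutable array all point at the same storage.
    immutable(ubyte)[] s1 = storedTable;
    immutable(ubyte)[] s2 = storedTable;
    assert(s1.ptr is s2.ptr);
}
```

Writing to `a` leaves `b` untouched because each use of `enumTable` as an array builds a new one, which is where the allocation cost would come from if generated code ever passed the tables around.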

-- 
Nick Voronin <elfy.nv@gmail.com>
December 21, 2010
On Tuesday, December 21, 2010, Nick Voronin <elfy.nv@gmail.com> wrote:

> On Mon, 20 Dec 2010 17:17:05 +0100
> "Johannes Pfau" <spam@example.com> wrote:
>
>> > But if you are going to patch it, please make it add extra {} around
>> > action code! The thing is that when there is a label before {} block
>> > (and in ragel generated code I saw it's always so) the block isn't
>> > considered as a new scope which causes problems when you have local
>> > variables declaration inside actions.
>>
>> You mean like this code:
>> ---------------------------------
>> tr15:
>> #line 228 "jpf/http/parser.rl"
>>      {
>>          if(start != p)
>>          {
>>              key = line[(start - line.ptr) .. (p - line.ptr)];
>>          }
>>      }
>> ---------------------------------
>> should become: ?
>> ---------------------------------
>> tr15:
>> #line 228 "jpf/http/parser.rl"
>>      {{
>>          if(start != p)
>>          {
>>              key = line[(start - line.ptr) .. (p - line.ptr)];
>>          }
>>      }}
>> ---------------------------------
>
> Yes. This way it becomes a scope which is kind of what one would expect from it.
OK, I sent an updated patch to the ragel mailing list.

>
>> > One is fixed size array and other is dynamic. Honestly I doubt that it
>> > matters for code generated by Ragel, since this is constant and won't be
>> > passed around. If it's harder to make it fixed-size then don't bother.
>> >
>> Could a dynamic array cause heap allocations, even if its data is never
>> changed? If not, dynamic arrays would work fine.
>
> Sorry, I can't provide reliable information on what can happen in general, but right now there is no difference in produced code accessing elements of enum ubyte[] and enum ubyte[30]. In both cases constants are directly embedded in code.
>
> In fact as long as you only access its elements (no passing array as an argument, no assignment to another variable and no accessing .ptr) there is no array object at all. If you do -- new object is created every time you do. I believe Ragel doesn't generate code which passes tables around, so it doesn't matter.
>
Well Adrian Thurston said he'd look into this issue when he merges the D2 patch, so I guess we'll get the correct arrays anyway ;-)

-- 
Johannes Pfau