Regarding hex strings (page 5) - D Programming Language Discussion Forum

October 20, 2012

Re: Regarding hex strings

Posted by Nick Sabalausky
in reply to foobar

On Fri, 19 Oct 2012 15:07:09 +0200
"foobar" <foo@bar.com> wrote:

> On Friday, 19 October 2012 at 00:14:18 UTC, Nick Sabalausky wrote:
> > On Thu, 18 Oct 2012 12:11:13 +0200
> > "foobar" <foo@bar.com> wrote:
> >> 
> >> How often are large binary blobs literally spelled out in the source code (as opposed to just being read from a file)?
> >
> >
> > Frequency isn't the issue. The issues are "*Is* it ever needed?" and "When it is needed, is it useful enough?" The answer to both is most certainly "yes". (Remember, D is supposed to be usable as a systems language, it's not merely a high-level-app-only language.)
> 
> Any real-world use cases to support this claim?

I've used it. And Denis just posted an example of where it was used to make code far more readable.

> Does C++ have such a feature?

It does not. As one consequence off the top of my head, including binary data into GBA homebrew became more of an awkward bloated mess than it needed to be.

> My limited experience with kernels is that this feature is not needed.

"I haven't needed it" isn't remotely sufficient to demonstrate that something doesn't "pull its own weight".

> The solution we used for this was to define an extern symbol and load it with a linker script (the binary data was of course stored in separate files).
> 

Yuck!

s/solution/workaround/

> >
> > Keep in mind, the question "Does it pull its own weight?" is for adding new features, not for going around gutting the language just because we can.
> 
> Ok, I grant you that, but remember that the whole thread started because the feature _doesn't_ work, so let's rephrase - is it worth the effort to fix this feature?
> 

The only bug is that it tries to validate it as UTF contrary to the spec. Making it *not* try to validate it sounds like a very minor effort. I think you're blowing it out of proportion.

And yes, I think it's definitely worth it.

> >
> >> In any case, I'm not opposed to such a utility library, in
> >> fact I think it's a rather good idea and we already have a
> >> precedent with "oct!"
> >> I just don't think this belongs as a built-in feature in the
> >> language.
> >
> > I think monarch_dodra's test proves that it definitely needs to be built-in.
> 
> It proves that DMD has bugs that should be fixed, nothing more.

Right, so let's jettison x"..." just because *someday* CTFE might become
good enough that we can bring the feature back. How does that make
any sense?

We already have it, it basically works (aside from only a fairly
trivial issue). *When* CTFE is good enough to replace it, *then* we can
have a sane debate about actually doing so. Until then, "Let's get
rid of x"..." because it can be done in the library" is a pointless
argument because at least for now it's NOT TRUE.

October 20, 2012

Re: Regarding hex strings

Posted by foobar
in reply to H. S. Teoh

On Saturday, 20 October 2012 at 21:03:20 UTC, H. S. Teoh wrote:
> On Sat, Oct 20, 2012 at 04:39:28PM -0400, Nick Sabalausky wrote:
>> On Sat, 20 Oct 2012 14:59:27 +0200
>> "foobar" <foo@bar.com> wrote:
>> > On Saturday, 20 October 2012 at 10:51:25 UTC, Denis Shelomovskij
>> > wrote:
>> > >
>> > > Maybe. Just an example of real-world code:
>> > >
>> > > Arrays:
>> > > https://github.com/D-Programming-Language/druntime/blob/fc45de1d089a1025df60ee2eea66ba27ee0bd99c/src/core/sys/windows/dll.d#L110
>> > >
>> > > vs
>> > >
>> > > Hex strings:
>> > > https://github.com/denis-sh/hooking/blob/69105a24d77fcb6eca701282a16dd5ec7311c077/tlsfixer/ntdll.d#L130
>> > >
>> > > By the way, current code isn't affected by the topic issue.
>> > 
>> > I personally find the former more readable but I guess there would always be someone to disagree. As they say, YMMV.
>> 
>> Honestly, I can't imagine how anyone wouldn't find the latter vastly
>> more readable.
>
> If you want vastly human readable, you want heredoc hex syntax,
> something like this:
>
> 	ubyte[] = x"<<END
> 	32 2b 32 3d 34 2e 20 32 2a 32 3d 34 2e 20 32 5e
> 	32 3d 34 2e 20 54 68 65 72 65 66 6f 72 65 2c 20
> 	2b 2c 20 2a 2c 20 61 6e 64 20 5e 20 61 72 65 20
> 	74 68 65 20 73 61 6d 65 20 6f 70 65 72 61 74 69
> 	6f 6e 2e 0a 22 36 34 30 4b 20 6f 75 67 68 74 20
> 	74 6f 20 62 65 20 65 6e 6f 75 67 68 22 20 2d 2d
> 	20 42 69 6c 6c 20 47 2e 2c 20 31 39 38 34 2e 20
> 	22 54 68 65 20 49 6e 74 65 72 6e 65 74 20 69 73
> 	20 6e 6f 74 20 61 20 70 72 69 6d 61 72 79 20 67
> 	6f 61 6c 20 66 6f 72 20 50 43 20 75 73 61 67 65
> 	END";
>
> (I just made that syntax up, so the details are not final, but you get
> the idea.) I would propose supporting this in D, but then D already has
> way too many different ways of writing strings, some of questionable
> utility, so I will refrain.
>
> Of course, the above syntax might actually be implementable with a
> suitable mixin template that takes a compile-time string. Maybe we
> should lobby for such a template to go into Phobos -- that might
> motivate people to fix CTFE in dmd so that it doesn't consume
> unreasonable amounts of memory when the size of CTFE input gets
> moderately large (see other recent thread on this topic).
>
>
> T

Yeah, I like this. I'd prefer brackets over quotes but it's not a big dig as the quotes in the above are not very noticeable. It should look distinct from textual strings.
As you said, this could/should be implemented as a template.

Vote++
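
To make the template idea concrete, here is a minimal sketch, assuming a hand-written parser that the compiler evaluates through CTFE. The names `decodeHex` and `hexBytes` are hypothetical (they are not Phobos functions), and D's existing delimited string syntax stands in for the made-up heredoc form above. How well this scales to large blobs depends on the CTFE memory behaviour discussed in this thread.

```d
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): decodes pairs of hex digits,
/// skipping whitespace, into bytes. It is an ordinary function, so it can
/// run at run time or at compile time.
immutable(ubyte)[] decodeHex(string text) pure @safe
{
    immutable(ubyte)[] bytes;
    int high = -1; // pending high nibble, -1 when none is pending
    foreach (char c; text)
    {
        int nibble;
        if (c >= '0' && c <= '9') nibble = c - '0';
        else if (c >= 'a' && c <= 'f') nibble = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') nibble = c - 'A' + 10;
        else if (c == ' ' || c == '\t' || c == '\r' || c == '\n') continue;
        else assert(0, "invalid character in hex data");

        if (high < 0)
            high = nibble;
        else
        {
            bytes ~= cast(ubyte)((high << 4) | nibble);
            high = -1;
        }
    }
    assert(high < 0, "odd number of hex digits");
    return bytes;
}

/// Hypothetical template wrapper: the argument is a compile-time string,
/// so the decoding is forced to happen during compilation.
enum hexBytes(string text) = decodeHex(text);

// A delimited string (existing D syntax) plays the role of the heredoc.
static immutable ubyte[] blob = hexBytes!(q"END
32 2b 32 3d 34
END");

// Checked by the compiler, not at run time.
static assert(blob.length == 5 && blob[0] == 0x32 && blob[4] == 0x34);

void main()
{
    writeln(blob);
}
```

A malformed digit or an odd digit count trips one of the asserts during CTFE, so the mistake surfaces as a compile error rather than as bad data at run time.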

October 20, 2012

Re: Regarding hex strings

Posted by foobar
in reply to foobar

On Saturday, 20 October 2012 at 21:16:44 UTC, foobar wrote:
> On Saturday, 20 October 2012 at 21:03:20 UTC, H. S. Teoh wrote:
>> On Sat, Oct 20, 2012 at 04:39:28PM -0400, Nick Sabalausky wrote:
>>> On Sat, 20 Oct 2012 14:59:27 +0200
>>> "foobar" <foo@bar.com> wrote:
>>> > On Saturday, 20 October 2012 at 10:51:25 UTC, Denis Shelomovskij
>>> > wrote:
>>> > >
>>> > > Maybe. Just an example of real-world code:
>>> > >
>>> > > Arrays:
>>> > > https://github.com/D-Programming-Language/druntime/blob/fc45de1d089a1025df60ee2eea66ba27ee0bd99c/src/core/sys/windows/dll.d#L110
>>> > >
>>> > > vs
>>> > >
>>> > > Hex strings:
>>> > > https://github.com/denis-sh/hooking/blob/69105a24d77fcb6eca701282a16dd5ec7311c077/tlsfixer/ntdll.d#L130
>>> > >
>>> > > By the way, current code isn't affected by the topic issue.
>>> > 
>>> > I personally find the former more readable but I guess there would always be someone to disagree. As they say, YMMV.
>>> 
>>> Honestly, I can't imagine how anyone wouldn't find the latter vastly
>>> more readable.
>>
>> If you want vastly human readable, you want heredoc hex syntax,
>> something like this:
>>
>> 	ubyte[] = x"<<END
>> 	32 2b 32 3d 34 2e 20 32 2a 32 3d 34 2e 20 32 5e
>> 	32 3d 34 2e 20 54 68 65 72 65 66 6f 72 65 2c 20
>> 	2b 2c 20 2a 2c 20 61 6e 64 20 5e 20 61 72 65 20
>> 	74 68 65 20 73 61 6d 65 20 6f 70 65 72 61 74 69
>> 	6f 6e 2e 0a 22 36 34 30 4b 20 6f 75 67 68 74 20
>> 	74 6f 20 62 65 20 65 6e 6f 75 67 68 22 20 2d 2d
>> 	20 42 69 6c 6c 20 47 2e 2c 20 31 39 38 34 2e 20
>> 	22 54 68 65 20 49 6e 74 65 72 6e 65 74 20 69 73
>> 	20 6e 6f 74 20 61 20 70 72 69 6d 61 72 79 20 67
>> 	6f 61 6c 20 66 6f 72 20 50 43 20 75 73 61 67 65
>> 	END";
>>
>> (I just made that syntax up, so the details are not final, but you get
>> the idea.) I would propose supporting this in D, but then D already has
>> way too many different ways of writing strings, some of questionable
>> utility, so I will refrain.
>>
>> Of course, the above syntax might actually be implementable with a
>> suitable mixin template that takes a compile-time string. Maybe we
>> should lobby for such a template to go into Phobos -- that might
>> motivate people to fix CTFE in dmd so that it doesn't consume
>> unreasonable amounts of memory when the size of CTFE input gets
>> moderately large (see other recent thread on this topic).
>>
>>
>> T
>
> Yeah, I like this. I'd prefer brackets over quotes but it's not a big dig as the quotes in the above are not very noticeable. It should look distinct from textual strings.
> As you said, this could/should be implemented as a template.
>
> Vote++

** not a big deal

October 20, 2012

Re: Regarding hex strings

Posted by Nick Sabalausky
in reply to H. S. Teoh

On Sat, 20 Oct 2012 14:05:21 -0700
"H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote:

> On Sat, Oct 20, 2012 at 04:39:28PM -0400, Nick Sabalausky wrote:
> > On Sat, 20 Oct 2012 14:59:27 +0200
> > "foobar" <foo@bar.com> wrote:
> > > On Saturday, 20 October 2012 at 10:51:25 UTC, Denis Shelomovskij wrote:
> > > >
> > > > Maybe. Just an example of real-world code:
> > > >
> > > > Arrays: https://github.com/D-Programming-Language/druntime/blob/fc45de1d089a1025df60ee2eea66ba27ee0bd99c/src/core/sys/windows/dll.d#L110
> > > >
> > > > vs
> > > >
> > > > Hex strings: https://github.com/denis-sh/hooking/blob/69105a24d77fcb6eca701282a16dd5ec7311c077/tlsfixer/ntdll.d#L130
> > > >
> > > > By the way, current code isn't affected by the topic issue.
> > > 
> > > I personally find the former more readable but I guess there would always be someone to disagree. As they say, YMMV.
> > 
> > Honestly, I can't imagine how anyone wouldn't find the latter vastly more readable.
> 
> If you want vastly human readable, you want heredoc hex syntax, something like this:
> 
> 	ubyte[] = x"<<END
> 	32 2b 32 3d 34 2e 20 32 2a 32 3d 34 2e 20 32 5e
> 	32 3d 34 2e 20 54 68 65 72 65 66 6f 72 65 2c 20
> 	2b 2c 20 2a 2c 20 61 6e 64 20 5e 20 61 72 65 20
> 	74 68 65 20 73 61 6d 65 20 6f 70 65 72 61 74 69
> 	6f 6e 2e 0a 22 36 34 30 4b 20 6f 75 67 68 74 20
> 	74 6f 20 62 65 20 65 6e 6f 75 67 68 22 20 2d 2d
> 	20 42 69 6c 6c 20 47 2e 2c 20 31 39 38 34 2e 20
> 	22 54 68 65 20 49 6e 74 65 72 6e 65 74 20 69 73
> 	20 6e 6f 74 20 61 20 70 72 69 6d 61 72 79 20 67
> 	6f 61 6c 20 66 6f 72 20 50 43 20 75 73 61 67 65
> 	END";
> 
> (I just made that syntax up, so the details are not final, but you get the idea.) I would propose supporting this in D, but then D already has way too many different ways of writing strings, some of questionable utility, so I will refrain.
> 
> Of course, the above syntax might actually be implementable with a suitable mixin template that takes a compile-time string. Maybe we should lobby for such a template to go into Phobos -- that might motivate people to fix CTFE in dmd so that it doesn't consume unreasonable amounts of memory when the size of CTFE input gets moderately large (see other recent thread on this topic).
> 

Can't you already just do this?:

 	auto blah = x"
 	32 2b 32 3d 34 2e 20 32 2a 32 3d 34 2e 20 32 5e
 	32 3d 34 2e 20 54 68 65 72 65 66 6f 72 65 2c 20
 	2b 2c 20 2a 2c 20 61 6e 64 20 5e 20 61 72 65 20
 	74 68 65 20 73 61 6d 65 20 6f 70 65 72 61 74 69
 	6f 6e 2e 0a 22 36 34 30 4b 20 6f 75 67 68 74 20
 	74 6f 20 62 65 20 65 6e 6f 75 67 68 22 20 2d 2d
 	20 42 69 6c 6c 20 47 2e 2c 20 31 39 38 34 2e 20
 	22 54 68 65 20 49 6e 74 65 72 6e 65 74 20 69 73
 	20 6e 6f 74 20 61 20 70 72 69 6d 61 72 79 20 67
 	6f 61 6c 20 66 6f 72 20 50 43 20 75 73 61 67 65
 	";

I thought all string literals in D accepted embedded newlines?
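
For what it's worth, a small sketch of the embedded-newline behaviour for an ordinary double-quoted literal (the helper `countNewlines` is a hypothetical name used only for this demonstration). In a regular literal the line breaks become part of the data; for x"..." the spec, as far as I recall, says whitespace and newlines between the hex digits are ignored, which is what makes the multi-line layout above practical.

```d
import std.stdio : writeln;

/// Counts line breaks in a string (helper for the demonstration only).
size_t countNewlines(string s) pure @safe
{
    size_t n;
    foreach (char c; s)
        if (c == '\n')
            ++n;
    return n;
}

// An ordinary double-quoted literal spanning three source lines.
// The two line breaks are kept as part of the string's contents.
enum sample = "32 2b 32
3d 34
2e 20";

// Evaluated by the compiler via CTFE.
static assert(countNewlines(sample) == 2);

void main()
{
    writeln(countNewlines(sample));
}
```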

October 22, 2012

Re: Regarding hex strings

Posted by Dejan Lekic
in reply to H. S. Teoh

>
> If you want vastly human readable, you want heredoc hex syntax,
> something like this:
>
> 	ubyte[] = x"<<END
> 	32 2b 32 3d 34 2e 20 32 2a 32 3d 34 2e 20 32 5e
> 	32 3d 34 2e 20 54 68 65 72 65 66 6f 72 65 2c 20
> 	2b 2c 20 2a 2c 20 61 6e 64 20 5e 20 61 72 65 20
> 	74 68 65 20 73 61 6d 65 20 6f 70 65 72 61 74 69
> 	6f 6e 2e 0a 22 36 34 30 4b 20 6f 75 67 68 74 20
> 	74 6f 20 62 65 20 65 6e 6f 75 67 68 22 20 2d 2d
> 	20 42 69 6c 6c 20 47 2e 2c 20 31 39 38 34 2e 20
> 	22 54 68 65 20 49 6e 74 65 72 6e 65 74 20 69 73
> 	20 6e 6f 74 20 61 20 70 72 69 6d 61 72 79 20 67
> 	6f 61 6c 20 66 6f 72 20 50 43 20 75 73 61 67 65
> 	END";
>

Having a heredoc syntax for hex-strings that produce ubyte[] arrays is confusing for people who would (naturally) expect a string from a heredoc string. It is not named hereDOC for no reason. :)

October 22, 2012

Re: Regarding hex strings

Posted by Dejan Lekic
in reply to bearophile

On Thursday, 18 October 2012 at 00:45:12 UTC, bearophile wrote:
> (Repost)
>
> hex strings are useful, but I think they were invented in D1 when strings were convertible to char[]. But today they are an array of immutable UTF-8, so I think this default type is not so useful:
>
> void main() {
>     string data1 = x"A1 B2 C3 D4"; // OK
>     immutable(ubyte)[] data2 = x"A1 B2 C3 D4"; // error
> }
>
>
> test.d(3): Error: cannot implicitly convert expression ("\xa1\xb2\xc3\xd4") of type string to ubyte[]
>
>
> Generally I want to use hex strings to put binary data in a program, so usually it's a ubyte[] or uint[].
>
> So I have to use something like:
>
> auto data3 = cast(ubyte[])(x"A1 B2 C3 D4".dup);
>
>
> So maybe the following literals are more useful in D2:
>
> ubyte[] data4 = x[A1 B2 C3 D4];
> uint[]  data5 = x[A1 B2 C3 D4];
> ulong[] data6 = x[A1 B2 C3 D4 A1 B2 C3 D4];
>
> Bye,
> bearophile

+1 on this one
I also like the x[ ... ] literal because it makes it obvious that we are dealing with an array.
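
As a side note on the cast-and-dup workaround quoted above, here is a minimal sketch of two ways to get bytes out of a string today. It assumes `std.string.representation` for the non-copying view; `asBytes` and `raw` are hypothetical names, and escape sequences stand in for the hex literal so the example does not depend on how a given compiler version treats x"...".

```d
import std.stdio : writeln;
import std.string : representation;

// Escape sequences stand in for a hex literal here (assumption made so the
// sketch does not depend on x"..." support in a particular compiler).
enum string raw = "\x32\x2b\x32\x3d\x34";

/// Views the characters of a string as bytes without copying;
/// immutability is preserved, so no cast is needed.
immutable(ubyte)[] asBytes(string s)
{
    return s.representation;
}

void main()
{
    immutable(ubyte)[] view = asBytes(raw); // read-only view of the same data
    ubyte[] copy = cast(ubyte[]) raw.dup;   // mutable copy, as in the workaround
    copy[0] = 0xFF;                         // only the copy can be modified
    writeln(view);
    writeln(copy);
}
```

The view is enough when the data is read-only; the copy is only needed if the bytes must be changed afterwards.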

October 22, 2012

Re: Regarding hex strings

Posted by Simen Kjaeraas
in reply to bearophile

On 2012-10-18 02:45, bearophile <bearophileHUGS@lycos.com> wrote:

> So maybe the following literals are more useful in D2:
>
> ubyte[] data4 = x[A1 B2 C3 D4];
> uint[]  data5 = x[A1 B2 C3 D4];
> ulong[] data6 = x[A1 B2 C3 D4 A1 B2 C3 D4];

That syntax is already taken, though.

Still, I see no reason for x"..." not to return ubyte[].

-- 
Simen

October 22, 2012

Re: Regarding hex strings

Posted by H. S. Teoh
in reply to Dejan Lekic

On Mon, Oct 22, 2012 at 01:14:21PM +0200, Dejan Lekic wrote:
> >
> >If you want vastly human readable, you want heredoc hex syntax, something like this:
> >
> >	ubyte[] = x"<<END
> >	32 2b 32 3d 34 2e 20 32 2a 32 3d 34 2e 20 32 5e
> >	32 3d 34 2e 20 54 68 65 72 65 66 6f 72 65 2c 20
> >	2b 2c 20 2a 2c 20 61 6e 64 20 5e 20 61 72 65 20
> >	74 68 65 20 73 61 6d 65 20 6f 70 65 72 61 74 69
> >	6f 6e 2e 0a 22 36 34 30 4b 20 6f 75 67 68 74 20
> >	74 6f 20 62 65 20 65 6e 6f 75 67 68 22 20 2d 2d
> >	20 42 69 6c 6c 20 47 2e 2c 20 31 39 38 34 2e 20
> >	22 54 68 65 20 49 6e 74 65 72 6e 65 74 20 69 73
> >	20 6e 6f 74 20 61 20 70 72 69 6d 61 72 79 20 67
> >	6f 61 6c 20 66 6f 72 20 50 43 20 75 73 61 67 65
> >	END";
> >
> 
> Having a heredoc syntax for hex-strings that produce ubyte[] arrays is confusing for people who would (naturally) expect a string from a heredoc string. It is not named hereDOC for no reason. :)

What I meant was, a syntax similar to heredoc, not an actual heredoc, which would be a string.


T

-- 
Knowledge is that area of ignorance that we arrange and classify. -- Ambrose Bierce
