std.digest can't CTFE? - D Programming Language Discussion Forum

June 01, 2018

Re: std.digest can't CTFE?

Posted by Johannes Pfau

On Thu, 31 May 2018 18:12:35 -0700, Manu wrote:

> Hashing's not low-level. It would be great if these did CTFE; generating
> compile-time hashes is a thing that would be really useful!
> Right here, I have a string class that carries a hash around with it for
> comparison reasons. Such string literals would prefer to have CT hashes.
> 

As I was the one who wrote that doc comment: basically all hash implementations cast from an integer type to its raw byte representation somewhere. As the binary representation needs to be portable, you need to be aware of the endianness of the system you're running your code on. AFAIR CTFE does (did?) not provide any way to do endianness-dependent conversions at all, and there's also no way to know the CTFE endianness, so this is a fundamental limitation. (E.g. if you have a cross-compiler targeting a system with a different endianness, version(BigEndian) will give you the target endianness. But what will actually be used in CTFE?)
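To make the concern concrete, here is a small sketch (not taken from std.digest) of the kind of cast being described: reinterpreting an integer's storage as raw bytes. The names `rawBytesOf` and `expected` are made up for illustration. The byte order it observes is decided by the machine executing the code, which is why code written this way needs version(LittleEndian) / version(BigEndian) branches.

```d
import std.stdio : writeln;

// Hypothetical helper: view the storage of a uint as four raw bytes.
// The result depends on the byte order of the machine running the code.
ubyte[4] rawBytesOf(uint value)
{
    return *cast(ubyte[4]*) &value;
}

// What the cast is expected to yield for 0x01020304 on each kind of target.
version (LittleEndian)
{
    immutable ubyte[4] expected = [0x04, 0x03, 0x02, 0x01];
}
else
{
    immutable ubyte[4] expected = [0x01, 0x02, 0x03, 0x04];
}

void main()
{
    const raw = rawBytesOf(0x01020304);
    assert(raw == expected);
    writeln(raw);
}
```

The version blocks describe the compilation target, which is the open question raised above for a cross-compiler evaluating such code at compile time.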

I don't know if anything changed in this regard since std.digest was written some time ago. But if you get the std.bitmanip nativeTo*Endian and *EndianToNative functions to work in CTFE, std.digest should work as well.

There may be some workaround, as IIRC druntime's core.internal.hash works in CTFE? It's either this, or it's buggy in that cross-compilation scenario ;-)

-- 
Johannes

June 01, 2018

Re: std.digest can't CTFE?

Posted by Kagamin
in reply to Johannes Pfau

On Friday, 1 June 2018 at 08:37:33 UTC, Johannes Pfau wrote:
> I don't know if anything changed in this regard since std.digest was written some time ago. But if you get the std.bitmanip nativeTo*Endian and *EndianToNative functions to work in CTFE, std.digest should work as well.

Standard cryptographic algorithms are by design not dependent on endianness; rather, they settle on a specific endianness.

June 01, 2018

Re: std.digest can't CTFE?

Posted by Johannes Pfau
in reply to Kagamin

On Fri, 01 Jun 2018 08:50:19 +0000, Kagamin wrote:

> On Friday, 1 June 2018 at 08:37:33 UTC, Johannes Pfau wrote:
>> I don't know if anything changed in this regard since std.digest was written some time ago. But if you get the std.bitmanip nativeTo*Endian and *EndianToNative functions to work in CTFE, std.digest should work as well.
> 
> Standard cryptographic algorithms are by design not dependent on endianness; rather, they settle on a specific endianness.

However you want to call it, the algorithms interpret data as numbers, which means that the binary representation differs based on endianness. If you want portable results, you can't ignore that fact in the implementation. So even though the algorithms are not dependent on the endianness, the representation of the result is. Therefore standards usually specify an internal byte order.

-- 
Johannes

June 01, 2018

Re: std.digest can't CTFE?

Posted by Kagamin
in reply to Johannes Pfau

On Friday, 1 June 2018 at 10:04:52 UTC, Johannes Pfau wrote:
> However you want to call it, the algorithms interpret data as numbers, which means that the binary representation differs based on endianness. If you want portable results, you can't ignore that fact in the implementation. So even though the algorithms are not dependent on the endianness, the representation of the result is. Therefore standards usually specify an internal byte order.

Huh? The algorithm packs bytes into integers and does it independently of platform. Once the integers are formed, the arithmetic operations are independent of endianness. It works this way even in pure JavaScript, which is not sensitive to endianness.
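A minimal sketch of the packing step described here, with hypothetical function names. The bytes are combined with shifts and ORs only, so the result is the same on any host, and the functions can be evaluated at compile time (the static asserts below are checked by the compiler).

```d
// Assemble a 32-bit word from four bytes stored in big-endian order.
uint loadBigEndian32(const(ubyte)[] bytes) pure nothrow @nogc @safe
{
    assert(bytes.length >= 4);
    return (cast(uint) bytes[0] << 24)
         | (cast(uint) bytes[1] << 16)
         | (cast(uint) bytes[2] << 8)
         | (cast(uint) bytes[3]);
}

// Assemble a 32-bit word from four bytes stored in little-endian order.
uint loadLittleEndian32(const(ubyte)[] bytes) pure nothrow @nogc @safe
{
    assert(bytes.length >= 4);
    return (cast(uint) bytes[0])
         | (cast(uint) bytes[1] << 8)
         | (cast(uint) bytes[2] << 16)
         | (cast(uint) bytes[3] << 24);
}

// Evaluated during compilation, with no knowledge of host or target byte order.
static assert(loadBigEndian32([0x01, 0x02, 0x03, 0x04]) == 0x01020304);
static assert(loadLittleEndian32([0x01, 0x02, 0x03, 0x04]) == 0x04030201);

void main()
{
    ubyte[4] data = [0x01, 0x02, 0x03, 0x04];
    assert(loadBigEndian32(data[]) == 0x01020304);
    assert(loadLittleEndian32(data[]) == 0x04030201);
}
```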

June 02, 2018

Re: std.digest can't CTFE?

Posted by Atila Neves
in reply to Kagamin

On Friday, 1 June 2018 at 20:12:23 UTC, Kagamin wrote:
> On Friday, 1 June 2018 at 10:04:52 UTC, Johannes Pfau wrote:
>> However you want to call it, the algorithms interpret data as numbers, which means that the binary representation differs based on endianness. If you want portable results, you can't ignore that fact in the implementation. So even though the algorithms are not dependent on the endianness, the representation of the result is. Therefore standards usually specify an internal byte order.
>
> Huh? The algorithm packs bytes into integers and does it independently of platform. Once the integers are formed, the arithmetic operations are independent of endianness. It works this way even in pure JavaScript, which is not sensitive to endianness.

It's a common programming misconception that endianness matters much. It's one of those that just won't go away, like "GC languages are slow" or "C is magically fast". I recommend reading this:

https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html

In short, unless you're a compiler writer or implementing a binary protocol, endianness only matters if you cast between pointers and integers. So... Don't.

Atila

June 08, 2018

Re: std.digest can't CTFE?

Posted by Johannes Pfau
in reply to Atila Neves

On Sat, 02 Jun 2018 06:31:37 +0000, Atila Neves wrote:

> On Friday, 1 June 2018 at 20:12:23 UTC, Kagamin wrote:
>> On Friday, 1 June 2018 at 10:04:52 UTC, Johannes Pfau wrote:
>>> However you want to call it, the algorithms interpret data as numbers, which means that the binary representation differs based on endianness. If you want portable results, you can't ignore that fact in the implementation. So even though the algorithms are not dependent on the endianness, the representation of the result is. Therefore standards usually specify an internal byte order.
>>
>> Huh? The algorithm packs bytes into integers and does it independently of platform. Once the integers are formed, the arithmetic operations are independent of endianness. It works this way even in pure JavaScript, which is not sensitive to endianness.
> 
> It's a common programming misconception that endianness matters much. It's one of those that just won't go away, like "GC languages are slow" or "C is magically fast". I recommend reading this:
> 
> https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html
> 
> In short, unless you're a compiler writer or implementing a binary protocol, endianness only matters if you cast between pointers and integers. So... Don't.
> 
> Atila

That's an interesting point. When I said the algorithm depends on the
system endianness I was indeed always thinking in terms of machine code
(i.e. if system endianness = data endianness you hopefully do nothing at
all, otherwise you need some conversion).
But it is indeed true that describing the conversion as mathematical shift
operations + indexing will leave handling these differences to the
compiler. So you can probably say the algorithm doesn't depend on system
endianness, although a low-level representation of implementations will. I
guess this is what Kagamin wanted to explain; please excuse me for not
getting the point.

So in our case, we can obviously use that higher-abstraction-level interpretation, and the idiom used in the article indeed works fine in CTFE. So somebody (@Manu?) just has to fix the std.bitmanip *EndianToNative and nativeTo*Endian functions to use this (probably benchmarking the performance impact). Then std.digest should simply start working, or should at least be easy to fix for CTFE support.
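As a rough sketch of what such a shift-based conversion could look like: `toBigEndianBytes` and `toLittleEndianBytes` below are hypothetical helpers written for illustration, not the Phobos implementation. They never inspect the native byte order, so they can be evaluated in CTFE, and at run time they can be checked against the existing std.bitmanip functions.

```d
import std.bitmanip : nativeToBigEndian, nativeToLittleEndian;
import std.traits : isIntegral, Unsigned;

// Hypothetical helper: most significant byte first, computed with shifts only.
ubyte[T.sizeof] toBigEndianBytes(T)(const T value) pure nothrow @nogc @safe
if (isIntegral!T)
{
    // Use the unsigned counterpart so right shifts never sign-extend.
    const Unsigned!T bits = value;
    ubyte[T.sizeof] result;
    foreach (i; 0 .. T.sizeof)
    {
        // Byte i, counted from the least significant end, goes to the far end.
        result[T.sizeof - 1 - i] = cast(ubyte) (bits >> (8 * i));
    }
    return result;
}

// Hypothetical helper: least significant byte first.
ubyte[T.sizeof] toLittleEndianBytes(T)(const T value) pure nothrow @nogc @safe
if (isIntegral!T)
{
    const Unsigned!T bits = value;
    ubyte[T.sizeof] result;
    foreach (i; 0 .. T.sizeof)
        result[i] = cast(ubyte) (bits >> (8 * i));
    return result;
}

// Compile-time evaluation: no host or target endianness is consulted.
enum be = toBigEndianBytes(0x01020304u);
static assert(be[0] == 0x01 && be[1] == 0x02 && be[2] == 0x03 && be[3] == 0x04);

void main()
{
    // At run time the sketch should agree with the Phobos functions.
    assert(toBigEndianBytes(0x01020304u) == nativeToBigEndian(0x01020304u));
    assert(toLittleEndianBytes(0x01020304u) == nativeToLittleEndian(0x01020304u));
    assert(toBigEndianBytes(0x0102030405060708UL)
        == nativeToBigEndian(0x0102030405060708UL));
}
```

Whether a loop like this matches the speed of a byte-swap based version is exactly the benchmarking question mentioned above; the sketch makes no claim about that.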

-- 
Johannes

June 08, 2018

Re: std.digest can't CTFE?

Posted by Manu
in reply to Johannes Pfau

On Fri, 8 Jun 2018 at 11:35, Johannes Pfau via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Sat, 02 Jun 2018 06:31:37 +0000, Atila Neves wrote:
>
> > On Friday, 1 June 2018 at 20:12:23 UTC, Kagamin wrote:
> >> On Friday, 1 June 2018 at 10:04:52 UTC, Johannes Pfau wrote:
> >>> However you want to call it, the algorithms interpret data as numbers, which means that the binary representation differs based on endianness. If you want portable results, you can't ignore that fact in the implementation. So even though the algorithms are not dependent on the endianness, the representation of the result is. Therefore standards usually specify an internal byte order.
> >>
> >> Huh? The algorithm packs bytes into integers and does it independently of platform. Once the integers are formed, the arithmetic operations are independent of endianness. It works this way even in pure JavaScript, which is not sensitive to endianness.
> >
> > It's a common programming misconception that endianness matters much. It's one of those that just won't go away, like "GC languages are slow" or "C is magically fast". I recommend reading this:
> >
> > https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html
> >
> > In short, unless you're a compiler writer or implementing a binary protocol, endianness only matters if you cast between pointers and integers. So... Don't.
> >
> > Atila
>
> That's an interesting point. When I said the algorithm depends on the
> system endianness I was indeed always thinking in terms of machine code
> (i.e. if system endianness = data endianness you hopefully do nothing at
> all, otherwise you need some conversion).
> But it is indeed true that describing the conversion as mathematical shift
> operations + indexing will leave handling these differences to the
> compiler. So you can probably say the algorithm doesn't depend on system
> endianness, although a low-level representation of implementations will. I
> guess this is what Kagamin wanted to explain; please excuse me for not
> getting the point.
>
> So in our case, we can obviously use that higher-abstraction-level interpretation, and the idiom used in the article indeed works fine in CTFE. So somebody (@Manu?) just has to fix the std.bitmanip *EndianToNative and nativeTo*Endian functions to use this (probably benchmarking the performance impact). Then std.digest should simply start working, or should at least be easy to fix for CTFE support.

I'm already burning about 3x my reasonably allocatable free time on
DMD PRs...
I'd really love it if someone else would look at that :)

I'm not quite sure what you mean though; endian conversion functions
are still endian conversion functions, and they shouldn't be affected
here.
The problem is in the std.digest code where it *calls* endian
functions (or makes endian assumptions). There need be no reference to
endian in std.digest... if code is pulling bytes from an int (i.e.
cast(byte*)) or something, just use ubyte[4] and index it instead of
uint, etc. I'm surprised that digest code would use anything other
than byte buffers.
It may be that some optimised version()-ed fast paths are
endian-conscious, but the default path has no reason not to work.
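For illustration only, here is a byte-at-a-time hash evaluated at compile time, in the spirit of the hashed string literals mentioned at the start of the thread. FNV-1a is used purely as a stand-in because it is short; it is not part of std.digest as discussed here, and `HashedString` and `hashed` are hypothetical names. Since the function only ever indexes bytes, there is no endianness to consider.

```d
// 32-bit FNV-1a over the bytes of a string (stand-in algorithm for illustration).
uint fnv1a32(const(char)[] text) pure nothrow @nogc @safe
{
    uint hash = 0x811C9DC5;      // FNV-1a 32-bit offset basis
    foreach (char c; text)       // iterates code units, i.e. raw bytes
    {
        hash ^= c;               // mix in one byte
        hash *= 0x01000193;      // FNV 32-bit prime; wraps modulo 2^32
    }
    return hash;
}

// Hypothetical string type that carries its hash around for fast comparison.
struct HashedString
{
    string text;
    uint hash;
}

HashedString hashed(string text) pure nothrow @nogc @safe
{
    return HashedString(text, fnv1a32(text));
}

// The hash of the literal is computed by the compiler, not at run time.
enum literalA = hashed("a");
static assert(literalA.hash == 0xE40C292C);
static assert(fnv1a32("") == 0x811C9DC5);

void main()
{
    // The run-time result matches the compile-time one.
    assert(hashed("a").hash == literalA.hash);
}
```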

June 10, 2018

Re: std.digest can't CTFE?

Posted by Johannes Pfau
in reply to Manu

On Fri, 08 Jun 2018 11:46:41 -0700, Manu wrote:
> 
> I'm already burning about 3x my reasonably allocatable free time on
> DMD PRs...
> I'd really love it if someone else would look at that :)

I'll see if I can allocate some time for that. Should be a mostly trivial change.

> I'm not quite sure what you mean though; endian conversion functions are still endian conversion functions, and they shouldn't be affected here.

Yes, but the point made in that article is that you can implement *Endian<=>native conversions without knowing the native endianness. This would immediately make these functions CTFE-able.

> The problem is in the std.digest code where it *calls* endian functions (or makes endian assumptions). There need be no reference to endian in std.digest... if code is pulling bytes from an int (i.e. cast(byte*)) or something, just use ubyte[4] and index it instead of uint, etc. I'm surprised that digest code would use anything other than byte buffers. It may be that some optimised version()-ed fast paths are endian-conscious, but the default path has no reason not to work.

That's not how hash algorithms are usually specified. These algorithms perform bit rotations, additions, and multiplications on these values*. You could probably implement these on byte[4] values instead, but you'll waste time porting the algorithm and benchmarking possible performance impacts, and it will be more difficult to compare the implementation to the reference implementation (think of audits).

So it's not realistic to change this.
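The two positions in this thread can be combined: keep the word-oriented arithmetic as specified, and confine byte handling to the load and store boundaries. The following toy sketch shows that shape. `toyRound` and `toyDigest` are invented for illustration and are NOT a real hash algorithm (trailing bytes are simply ignored, there is no padding). The whole thing runs in CTFE.

```d
// Rotate a 32-bit word left; count must be in 1 .. 31.
uint rotl32(uint x, uint count) pure nothrow @nogc @safe
{
    return (x << count) | (x >> (32 - count));
}

// Toy mixing step on 32-bit words, in the style of a reference algorithm.
uint toyRound(uint state, uint word) pure nothrow @nogc @safe
{
    state += word;               // wraps modulo 2^32
    state = rotl32(state, 7);
    state *= 0x9E3779B1;         // arbitrary odd constant
    return state;
}

// Toy digest: words are loaded and stored big-endian with shifts only.
ubyte[4] toyDigest(const(ubyte)[] data) pure nothrow @nogc @safe
{
    uint state = 0x6A09E667;     // arbitrary initial value
    for (size_t i = 0; i + 4 <= data.length; i += 4)
    {
        // Load boundary: bytes -> word, independent of host byte order.
        const uint word = (cast(uint) data[i] << 24)
                        | (cast(uint) data[i + 1] << 16)
                        | (cast(uint) data[i + 2] << 8)
                        | (cast(uint) data[i + 3]);
        state = toyRound(state, word);
    }

    // Store boundary: word -> bytes, again with shifts only.
    ubyte[4] digest;
    digest[0] = cast(ubyte) (state >> 24);
    digest[1] = cast(ubyte) (state >> 16);
    digest[2] = cast(ubyte) (state >> 8);
    digest[3] = cast(ubyte) state;
    return digest;
}

// Evaluated at compile time.
enum ubyte[4] compileTimeDigest = toyDigest([0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68]);

void main()
{
    ubyte[8] input = [0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68];
    const ubyte[4] runTimeDigest = toyDigest(input[]);
    const ubyte[4] expected = compileTimeDigest;
    assert(runTimeDigest == expected);
}
```

The round function stays directly comparable to a word-based reference description, which addresses the audit concern, while nothing in the code depends on the native byte order.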

* An interesting question here is whether you could actually always ignore system endianness and do simple casts by cleverly adjusting all constants in the algorithm to fit?
-- 
Johannes

Copyright © 1999-2021 by the D Language Foundation