November 27, 2015
On Thursday, 26 November 2015 at 11:12:07 UTC, tcak wrote:
> What is needed is to be able to bind a hash value to any block with a name.

I've thought about this too in the past and asked on the forums but I haven't gotten any response.

It is possible. The problem is easier in dynamic languages. See, for instance, the following solution built into a specific Python runtime: http://pgbovine.net/incpy.html

`hashOf` is for AAs, not for content digests.

I believe the only realistic solution to this problem is to implement a specific pass in the D compiler that recursively calculates hash-digests (hash-chains) for all the code and data involved in a function call. It should probably only work for pure functions. AFAICT, it is possible but it's far from easy to get 100% correct :)

DMD pull requests would be very welcome, at least by me ;)

See also: https://en.wikipedia.org/wiki/Hash_chain
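
A minimal sketch of the recursive digest idea, to make it concrete. It is not a compiler pass: the `Func` struct, `chainedHash`, and `program` are hypothetical names, the call graph is written out by hand, and recursion (cycles) is assumed away. "Hash chain" is used loosely here, meaning a function's digest also covers the digests of everything it calls.

```d
import std.digest.md : md5Of;

/// Hypothetical model of one function: its source text and the names it calls.
struct Func
{
    string source;
    string[] callees;
}

/// MD5 of a function's text chained with the digests of all its callees.
/// Assumes the call graph is acyclic (no recursion), which a real pass
/// could not assume.
ubyte[16] chainedHash(const Func[string] funcs, string name)
{
    const f = funcs[name];
    ubyte[] material = cast(ubyte[]) f.source.dup;
    foreach (callee; f.callees)
    {
        const digest = chainedHash(funcs, callee);
        material ~= digest[];
    }
    return md5Of(material);
}

/// Builds a two-function example program with the given body for bar().
Func[string] program(string barSource)
{
    Func[string] funcs;
    funcs["foo"] = Func("void foo() { if (bar()) deleteSomeFiles(); }", ["bar"]);
    funcs["bar"] = Func(barSource, null);
    return funcs;
}

void main()
{
    const before = program("int bar() { return 0; }");
    const after = program("int bar() { return 1; }");
    // foo's own text is unchanged, but its chained digest still differs.
    assert(chainedHash(before, "foo") != chainedHash(after, "foo"));
}
```

The hard part a real implementation would face is everything this sketch skips: building the call graph, handling recursion, templates, and data that the function reads.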
November 27, 2015
On Friday, 27 November 2015 at 08:09:27 UTC, tcak wrote:
> Yours are not helping, making everything more complex.

Yes, because to achieve what you're asking for, you NEED a complex solution.

The code WILL change with every release. That's the point of a release. So any hashing mechanism like the one you're describing will just trigger every time, making it useless. Even if this were not the case, you still wouldn't know where the changes were.

    Bit
November 27, 2015
On Friday, 27 November 2015 at 16:18:52 UTC, bitwise wrote:
> On Friday, 27 November 2015 at 08:09:27 UTC, tcak wrote:
>> Yours are not helping, making everything more complex.
>
> Yes, because to achieve what you're asking for, you NEED a complex solution.
>
> The code WILL change with every release..thats the point of a release.. so any hashing mechanism like you're describing will just trigger every time, making it useless. Even if this was not the case, you still wouldn't know where the changes were.
>
>     Bit

Let me explain:

It is not complex. What makes it complex is that you envision a very detailed thing.

Hash of a Function = MD5( Token List of Function /* but ignore comments */ );


You do not have to know where the changes are. You only need to know what has changed and, briefly, how it behaves now.


If the behaviour of the code changes, it is good to know about it. With the hashing method above, a piece of code that hasn't changed would always have the same hash value. And if you do not like it, don't check the hash value; just continue writing your code as you wish. But from a business perspective, if the software's consistency is worth millions of dollars, a software engineer would want it to raise an error whenever the code changes. Do we want D to be a child language, or to have more useful features?
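
For illustration, the "MD5 of the token list, ignoring comments" formula could be approximated outside the compiler roughly as follows. This is only a sketch: `roughTokens` and `tokenHash` are hypothetical helpers, and the tokenizer is deliberately crude (it does not understand string literals or nested `/+ +/` comments), so it stands in for a real D lexer rather than replacing one.

```d
import std.algorithm.comparison : min;
import std.algorithm.searching : startsWith;
import std.ascii : isAlphaNum, isWhite;
import std.digest.md : MD5;

/// Very rough tokenizer: drops // and /* */ comments and whitespace, keeps
/// identifier/number runs and single punctuation characters.
string[] roughTokens(string src)
{
    string[] tokens;
    size_t i = 0;
    while (i < src.length)
    {
        if (src[i .. $].startsWith("//"))
        {
            // Line comment: skip to the end of the line.
            while (i < src.length && src[i] != '\n') ++i;
        }
        else if (src[i .. $].startsWith("/*"))
        {
            // Block comment: skip to the closing marker (or end of input).
            i += 2;
            while (i < src.length && !src[i .. $].startsWith("*/")) ++i;
            i = min(i + 2, src.length);
        }
        else if (isWhite(src[i]))
        {
            ++i;
        }
        else if (isAlphaNum(src[i]) || src[i] == '_')
        {
            const start = i;
            while (i < src.length && (isAlphaNum(src[i]) || src[i] == '_')) ++i;
            tokens ~= src[start .. i];
        }
        else
        {
            tokens ~= src[i .. i + 1];
            ++i;
        }
    }
    return tokens;
}

/// MD5 over the token list; a zero byte separates tokens so that
/// ["ab", "c"] and ["a", "bc"] do not collide.
ubyte[16] tokenHash(string src)
{
    MD5 md5;
    md5.start();
    foreach (tok; roughTokens(src))
    {
        md5.put(cast(const(ubyte)[]) tok);
        md5.put(cast(ubyte) 0);
    }
    return md5.finish();
}

void main()
{
    // Comments and spacing do not affect the hash...
    assert(tokenHash("int bar() { return 0; }")
        == tokenHash("int bar()  { return 0; /* zero */ } // done"));
    // ...but any token change does.
    assert(tokenHash("int bar() { return 0; }")
        != tokenHash("int bar() { return 1; }"));
}
```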
November 27, 2015
On Friday, 27 November 2015 at 18:51:54 UTC, tcak wrote:
> On Friday, 27 November 2015 at 16:18:52 UTC, bitwise wrote:
>> On Friday, 27 November 2015 at 08:09:27 UTC, tcak wrote:
>>> Yours are not helping, making everything more complex.
>>
>> Yes, because to achieve what you're asking for, you NEED a complex solution.
>>
>> The code WILL change with every release..thats the point of a release.. so any hashing mechanism like you're describing will just trigger every time, making it useless. Even if this was not the case, you still wouldn't know where the changes were.
>>
>>     Bit
>
> Let me explain:
>
> It is not complex. What makes it complex is that you envision a very detailed thing.
>
> Hash of a Function = MD5( Token List of Function /* but ignore comments */ );
>
>
> You do not have to know where the changes are. You need to know what has changed,
> how it acts currently briefly.
>
>
> If behaviour of code changes, it is good that you know it. With above hashing method, a piece of code that hasn't changed would have same hash value always. And
> if you do not like it, don't check the hash value. Just continue writing your codes as you wish. But in business perspective, if the software's consistency is worth millions of dollars, a software engineer would want it to be giving error whenever
> codes change. Do we want D to be a child language, or have more useful features?

Your approach is prone to false positives.

if(1) doSomething();
if(1) { doSomething(); }

Same behaviour, different code.
I hope you have a heck of a coding standard written up ;)

Worse still, consider the following example:

void foo() { if(bar()) deleteSomeFiles(); }
int bar() { return 0; }

Your proposed approach would not notify you that foo(), a potentially dangerous function, has changed its behaviour if someone made bar() return 1.

*insert witty comeback to your comment about "business perspective" here*

    Bit
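
Both failure modes can be reproduced with a stand-in for the per-function hash. The sketch below assumes a hypothetical `sourceHash` helper that takes the MD5 of the source with whitespace removed; a real token-based hash would behave the same way on these inputs.

```d
import std.algorithm.iteration : filter;
import std.array : array;
import std.ascii : isWhite;
import std.digest.md : md5Of;
import std.string : representation;

/// Stand-in for a per-function token hash: MD5 of the source with all
/// whitespace removed. Hypothetical helper, not a real compiler trait.
ubyte[16] sourceHash(string src)
{
    return md5Of(src.representation.filter!(b => !isWhite(b)).array);
}

void main()
{
    // False positive: same behaviour, different hash.
    assert(sourceHash("if(1) doSomething();")
        != sourceHash("if(1) { doSomething(); }"));

    // Missed change: foo's text is the same before and after bar changes,
    // so a per-function hash of foo cannot notice the difference.
    enum foo = "void foo() { if(bar()) deleteSomeFiles(); }";
    assert(sourceHash("int bar() { return 0; }")
        != sourceHash("int bar() { return 1; }")); // bar's hash changes
    assert(sourceHash(foo) == sourceHash(foo));    // foo's hash does not
}
```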

November 27, 2015
On Friday, 27 November 2015 at 20:00:16 UTC, bitwise wrote:
> On Friday, 27 November 2015 at 18:51:54 UTC, tcak wrote:
>> On Friday, 27 November 2015 at 16:18:52 UTC, bitwise wrote:
>>> On Friday, 27 November 2015 at 08:09:27 UTC, tcak wrote:
>>>> Yours are not helping, making everything more complex.
>>>
>>> Yes, because to achieve what you're asking for, you NEED a complex solution.
>>>
>>> The code WILL change with every release..thats the point of a release.. so any hashing mechanism like you're describing will just trigger every time, making it useless. Even if this was not the case, you still wouldn't know where the changes were.
>>>
>>>     Bit
>>
>> Let me explain:
>>
>> It is not complex. What makes it complex is that you envision a very detailed thing.
>>
>> Hash of a Function = MD5( Token List of Function /* but ignore comments */ );
>>
>>
>> You do not have to know where the changes are. You need to know what has changed,
>> how it acts currently briefly.
>>
>>
>> If behaviour of code changes, it is good that you know it. With above hashing method, a piece of code that hasn't changed would have same hash value always. And
>> if you do not like it, don't check the hash value. Just continue writing your codes as you wish. But in business perspective, if the software's consistency is worth millions of dollars, a software engineer would want it to be giving error whenever
>> codes change. Do we want D to be a child language, or have more useful features?
>
> Your approach is prone to false positives.
>
> if(1) doSomething();
> if(1) { doSomething(); }
>
> Same behaviour, different code.
> I hope you have a heck of a coding standard written up ;)
>
> Worse still, consider the following example:
>
> void foo() { if(bar()) deleteSomeFiles(); }
> int bar() { return 0; }
>
> Your proposed approach would not notify you that foo(), a potentially dangerous function, has changed it's behaviour if someone made bar() return 1.
>
> *insert witty comeback to your comment about "business perspective" here*
>
>     Bit

Question: Has the behaviour of foo changed?

If foo cares about bar's behaviour, foo checks bar's hash value.

--

if(1) doSomething();
if(1) { doSomething(); }

You are correct here about the hash calculation, but unless someone touches the code, this never happens, and no hash change would be seen. If someone does touch it as in your example, checking the documentation to see what has happened would be the correct approach. The importance of a behaviour change is a matter of perception; a computer cannot judge that by itself.
November 27, 2015
On Friday, 27 November 2015 at 20:19:40 UTC, tcak wrote:
> if(1) doSomething();
> if(1) { doSomething(); }
>
> You are correct here about hash calculation, but unless someone touches to codes, this never happens, and no hash changes would be seen. If someone is touching it as you exampled, checking the documentation about what has happened would be the correct approach. Importance of behaviour change is perceptional, computer cannot know that already.

If you really want to integrate this into the language, you should consider future improvements.

Hashing the tokens is a conservative approximation of "behavior change", as the example above shows. Another example would be variable renames. The specification of the hash algorithm should provide the freedom that both variants above get the same hash, but still be correct in the sense that different behavior always yields different hashes.

Overall, I'm not convinced that this needs to be a language extension or trait. It could simply be a static analysis tool independent of the compiler.
November 27, 2015
On Friday, 27 November 2015 at 08:09:27 UTC, tcak wrote:
> On Friday, 27 November 2015 at 05:33:52 UTC, deadalnix wrote:
>> I see many solution here that do not require any language change. To start, have a linter yell at the programmer when (s)he submit a diff. Dev commit directly ? What the fuck are you doing ? Do code review and get a linter.
>>
>> Alternatively, generate a di file and hash it. You can have a bot do it and commit with a commit hook.
>>
>> DMD can dump infos about the program in json format. hash this and run with it.
>>
>> You may also change your strategy in term of source control: https://www.youtube.com/watch?v=W71BTkUbdqE . Unified source code aleviate completely these kind of issues to boot.
>
> Not one thing in your solutions give any simple solution like:
>
> static assert( __traits( hashOf, std.file.read ) == 0x1234, "They have changed implementation again." );
>
> static assert( __traits( hashOf, facebook.apis.addUser ) == 0x5543, "Check API documentation again for addUser." );
>
>
>
> di file wouldn't work. It doesn't contain implementation code. Also, all APIs are in it. We need specific hash for each API, so it doesn't take long time to find where the problem is.
>
> JSON is same as di. No difference.
>
>
> Yours are not helping, making everything more complex.

If the API signature changes, the type system will yell at you. All the proposed solutions will work.

If the implementation changes, you can apply the same solution to the binary, tadaaa! If you want fewer hash changes, a good idea can be to dump LLVM IR from ldc and run canonicalization on it using opt.

Also, if you have so much code that relies on implementation details that aren't in the API, to the extent that it is such a problem that you need a language extension to handle it, you are doing something very, very wrong.

Indeed, I'm not helping. You think you need a language extension, when it is quite obvious you have a methodology problem on your side and refuse to reconsider.

What about, I know it is crazy, using a unified repository, having tests and continuous integration, and submitting diffs with code review? If someone changes an API in a way that breaks the client code, the client will fail and the CI tool will warn the developer that he needs to fix the client code or rework his API change. If the client code was not tested, then the problem is clearly not the API hash.

Not only does this not require a language extension, it also solves way more problems than the one you want to solve here.

Now, don't get me wrong, I know how it is. Companies with a broken work culture won't change anything unless they are on the edge of bankruptcy. I understand. This is how it works.

Please understand that, on the other side, it doesn't seem like the right move to export a broken work environment as language features.
