November 01, 2018
On Thu, Nov 01, 2018 at 01:59:35PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> On 11/1/18 10:29 AM, Atila Neves wrote:
[...]
> > Unit tests should go through the public interface like anyone else. All private members and functions are implementation details that shouldn't matter and shouldn't break any tests if they're all removed.
> 
> So... don't test private functions? I do all the time. I want to test every single function I can.

Exactly.  There's value to white-box testing. For one thing, it allows you to guard yourself against regressions in implementation details that would otherwise be hard or infeasible to test from outside.

But black-box testing is also needed, e.g., to specify expected external behaviours that should not break when the implementation details change.

IMO, we should have both.


[...]
> However, I would *love* to be able to specify unit tests that *can't* access private internals for testing purposes. IIRC, it was proposed at some point to require this for documented unittests.
[...]

That's a good idea.  Nothing is worse than documentation that references private members, that you copy-n-paste only to find that you can't write the code that way.

Also, for proper black-box testing, it would be nice to have the assurance that it's actually specifying outside-observable behaviour rather than implementation details.

The two seem related, though.  If a unittest tests outside behaviour, chances are it should be a ddoc'd unittest too, so that users know what the expected outside behaviour is.  Perhaps it makes sense to conflate the two, so you could specify a black-box test just by attaching a ddoc comment to it.


T

-- 
Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis
November 01, 2018
On 11/1/18 2:46 PM, Patrick Schluter wrote:
> On Thursday, 1 November 2018 at 14:28:27 UTC, Atila Neves wrote:
>> On Tuesday, 30 October 2018 at 08:18:57 UTC, Bastiaan Veelo wrote:
>>> On Tuesday, 30 October 2018 at 00:01:18 UTC, unprotected-entity wrote:
>>>> On Monday, 29 October 2018 at 22:24:22 UTC, Bastiaan Veelo wrote:
>>>>>
>>>>> I hear you and understand you, but personally I still prefer the D spec as it is in this regard, except for the confusing aspect that you shouldn’t `alias this` a private member; but that is rather a limitation of alias, not of encapsulation.
>>>>>
>>>>
>>>> If you were sitting on a plane that was running the following D code, you might think differently ;-)
>>>>
>>>> --------
>>>> class Plane
>>>> {
>>>>     private static int MAX_SPEED = 1000;
>>>>
>>>>     public void change_max_speed(int speed)
>>>>     {
>>>>         if(speed >= 1000)
>>>>             MAX_SPEED = speed;
>>>>     }
>>>>
>>>> }
>>>>
>>>> immutable Plane p = new Plane();
>>>>
>>>> // god no! why aren't you using the public interface of the class for this!
>>>> void SetMaxSpeed() { p.MAX_SPEED = -1; }
>>>>
>>>> void Bar() { SetMaxSpeed(); } // mmm...trouble is about to begin...
>>>>
>>>> void main()
>>>> {
>>>>     import std.stdio;
>>>>
>>>>     Bar(); // oohps. thanks D module. we're all going to die!
>>>>
>>>>     // by the time you see this, it's too late for you, and your passengers.
>>>>     writeln(p.MAX_SPEED);
>>>>
>>>> };
>>>>
>>>>
>>>> -------
>>>
>>> :-) Why is MAX_SPEED mutable or even a runtime value, let alone signed?
>>
>> Strictly my opinion below, but I believe wholeheartedly in this:
>>
>> Using unsigned integers for anything except bit patterns and talking to hardware is an open invitation to bugs. The C++ community has recognised this and admitted that using size_t everywhere in the STL was a mistake.
>>
>> "Ah, but I want type safety!". Uh-huh:
>>
>> -----------------
>> int[] ints;
>> // indexing uses size_t, but...
>> auto i = ints[-1];  // look ma, no errors or warnings!
>> -----------------
>>
>> The problem in D and C++ is that they inherit behaviour from C, and in this case the important part of that inheritance is that integers of different types convert implicitly to each other. There is *no* type-safety when using unsigned integers. And because bit-patterns and actual integers share the same type now, people will invariably do something "they're not supposed to" and perform arithmetic with these values and usually wrap around. It's not pretty and it's not fun to debug.
>>
>> I have literally lost count of how many bugs I've had to fix due to unsigned integers. `long` is large enough for anything you need, as Java has shown for 2 decades now.
>>
>> Friends don't let friends use unsigned types for counting.
> 
> My experience is exactly the opposite. I inherited a code base of ~200K lines of C written over 3 decades by several developers of varying skills. You can not imagine how many bugs I found thanks to replacing int and long by size_t and uint32_t/uint64_t.
> 
> int i = 0;
> ints[--i];  // look ma, no errors or warnings! indeed
> 
> With size_t i, at least you get a segfault or a bus error.

My testing looks like it's the same:

#include <stdio.h>
int main()
{
    int arr[5];
    int *ptr = arr + 1; // point at second element so we can do negative indexing legally
    int idx1 = 0;
    ptr[--idx1] = 5;
    printf("%d\n", arr[0]); // 5
    size_t idx2 = 0;
    ptr[--idx2] = 6;
    printf("%d\n", arr[0]); // 6
    return 0;
}

-Steve
November 01, 2018
On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
>
> My testing looks like it's the same:
>
> #include <stdio.h>
> int main()
> {
>     int arr[5];
>     int *ptr = arr + 1; // point at second element so we can do negative indexing legally
>     int idx1 = 0;
>     ptr[--idx1] = 5;
>     printf("%d\n", arr[0]); // 5
>     size_t idx2 = 0;
>     ptr[--idx2] = 6;
>     printf("%d\n", arr[0]); // 6
>     return 0;
> }
>
> -Steve

`ptr[--idx2]` is undefined behavior (out-of-bounds array access), so the fact that it works that way is just a coincidence.
November 01, 2018
On 11/1/18 4:15 PM, Paul Backus wrote:
> On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
>>
>> My testing looks like it's the same:
>>
>> #include <stdio.h>
>> int main()
>> {
>>     int arr[5];
>>     int *ptr = arr + 1; // point at second element so we can do negative indexing legally
>>     int idx1 = 0;
>>     ptr[--idx1] = 5;
>>     printf("%d\n", arr[0]); // 5
>>     size_t idx2 = 0;
>>     ptr[--idx2] = 6;
>>     printf("%d\n", arr[0]); // 6
>>     return 0;
>> }
>>
> 
> `ptr[--idx2]` is undefined behavior (out-of-bounds array access), so the fact that it works that way is just a coincidence.

Either way, I don't see it causing a segfault -- the purported benefit.

But maybe on Patrick's actual arch it did. In any case, undefined behavior is bad whether you do it with an int or size_t :)

-Steve
November 01, 2018
On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
> On 11/1/18 2:46 PM, Patrick Schluter wrote:
>> [...]
>
> My testing looks like it's the same:
>
> #include <stdio.h>
> int main()
> {
>     int arr[5];
>     int *ptr = arr + 1; // point at second element so we can do negative indexing legally
>     int idx1 = 0;
>     ptr[--idx1] = 5;
>     printf("%d\n", arr[0]); // 5
>     size_t idx2 = 0;
>     ptr[--idx2] = 6;
>     printf("%d\n", arr[0]); // 6
>     return 0;
> }
>
Compiled with -m64? On a 32-bit machine it indeed makes no difference, as the effective calculated address is in any case taken modulo 2^32:
(a + 2^32 - 1) % 2^32 == a - 1
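
The modular arithmetic above can be sketched in C. This is only an illustration, not code from the thread: `offset32` is a hypothetical helper that treats an address as a plain 32-bit byte offset, ignoring element scaling and real pointer semantics.

```c
#include <stdint.h>

/* Hypothetical helper: add an unsigned 32-bit index to a 32-bit "address".
   The result is converted back to uint32_t, so it is reduced modulo 2^32. */
uint32_t offset32(uint32_t base, uint32_t index)
{
    return (uint32_t)(base + index);
}

/* An index that was decremented below zero holds UINT32_MAX (2^32 - 1).
   offset32(1000u, UINT32_MAX) yields 999, i.e. base - 1, which is why the
   wrapped index is indistinguishable from -1 on a 32-bit target. */
```

Under these assumptions, a wrapped unsigned index and a signed -1 land on the same address, so no fault is expected on a 32-bit machine.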

November 01, 2018
On 11/1/18 4:47 PM, Patrick Schluter wrote:
> On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
>> On 11/1/18 2:46 PM, Patrick Schluter wrote:
>>> [...]
>>
>> My testing looks like it's the same:
>>
>> #include <stdio.h>
>> int main()
>> {
>>     int arr[5];
>>     int *ptr = arr + 1; // point at second element so we can do negative indexing legally
>>     int idx1 = 0;
>>     ptr[--idx1] = 5;
>>     printf("%d\n", arr[0]); // 5
>>     size_t idx2 = 0;
>>     ptr[--idx2] = 6;
>>     printf("%d\n", arr[0]); // 6
>>     return 0;
>> }
>>
> Compiled with -m64? On a 32-bit machine it indeed makes no difference, as the effective calculated address is in any case taken modulo 2^32:
> (a + 2^32 - 1) % 2^32 == a - 1
> 

amd64 system (Intel Mac)

I wouldn't be surprised if on other architectures, you get a bus error.

-Steve
November 01, 2018
On Thursday, 1 November 2018 at 20:15:57 UTC, Paul Backus wrote:
> On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
>>
>> My testing looks like it's the same:
>>
>> #include <stdio.h>
>> int main()
>> {
>>     int arr[5];
>>     int *ptr = arr + 1; // point at second element so we can do negative indexing legally
>>     int idx1 = 0;
>>     ptr[--idx1] = 5;
>>     printf("%d\n", arr[0]); // 5
>>     size_t idx2 = 0;
>>     ptr[--idx2] = 6;
>>     printf("%d\n", arr[0]); // 6
>>     return 0;
>> }
>>
>> -Steve
>
> `ptr[--idx2]` is undefined behavior (out-of-bounds array access), so the fact that it works that way is just a coincidence.

It also works because the optimizer elides the ptr accesses. So to test the example, at least compile with -O0.

        mov     esi, 5
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        call    printf
        mov     esi, 6
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        call    printf

And in real life, the erroneous accesses where I had the index underflowing the array were much more complicated and had not been found for decades. We had sometimes noticed that things were wrong in the documents processed, but it was impossible to trace back where the error was. After refactoring to use size_t more aggressively, a whole bunch of these glitches turned into nice, actionable, debuggable core dumps.
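
Whether an underflowed index actually faults depends on the platform, so one way to make it fail loudly everywhere is an explicit check. The sketch below is not what the code base described here did; `get_checked` and `get_checked_signed` are hypothetical names used only to show why an unsigned index needs a single comparison while a signed one needs two.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checked accessor with an unsigned index. An index that was
   decremented below zero has wrapped to a huge value, so the single
   upper-bound test rejects it as well. */
bool get_checked(const int *items, size_t len, size_t index, int *out)
{
    if (index >= len)
        return false;
    *out = items[index];
    return true;
}

/* Same accessor with a signed index. Both bounds must be tested; if the
   lower-bound test is forgotten, index -1 silently reads before the array. */
bool get_checked_signed(const int *items, size_t len, long index, int *out)
{
    if (index < 0 || (size_t)index >= len)
        return false;
    *out = items[index];
    return true;
}
```

A caller would test the return value and treat `false` as the actionable error, rather than relying on the hardware to trap.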
November 01, 2018
On Thursday, 1 November 2018 at 21:00:21 UTC, Steven Schveighoffer wrote:
> On 11/1/18 4:47 PM, Patrick Schluter wrote:
>> On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
>>>[...]
>> Compiled with -m64? On a 32-bit machine it indeed makes no difference, as the effective calculated address is in any case taken modulo 2^32:
>> (a + 2^32 - 1) % 2^32 == a - 1
>> 
>
> amd64 system (Intel Mac)
>
> I wouldn't be surprised if on other architectures, you get a bus error.
>
The optimizer removed all the array code in your example and kept only the printf calls with precalculated constants, in both gcc and clang.

November 01, 2018
On 11/1/18 5:08 PM, Patrick Schluter wrote:
> On Thursday, 1 November 2018 at 21:00:21 UTC, Steven Schveighoffer wrote:
>> On 11/1/18 4:47 PM, Patrick Schluter wrote:
>>> On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
>>>> [...]
>>> Compiled with -m64? On a 32-bit machine it indeed makes no difference, as the effective calculated address is in any case taken modulo 2^32:
>>> (a + 2^32 - 1) % 2^32 == a - 1
>>>
>>
>> amd64 system (Intel Mac)
>>
>> I wouldn't be surprised if on other architectures, you get a bus error.
>>
> The optimizer removed all the array code in your example and kept only the printf calls with precalculated constants, in both gcc and clang.
> 

Well, I didn't specify any optimizations, so I'm not sure that's the case.

-Steve
November 01, 2018
On 11/1/18 5:03 PM, Patrick Schluter wrote:
> On Thursday, 1 November 2018 at 20:15:57 UTC, Paul Backus wrote:
>> On Thursday, 1 November 2018 at 19:26:05 UTC, Steven Schveighoffer wrote:
>>>
>>> My testing looks like it's the same:
>>>
>>> #include <stdio.h>
>>> int main()
>>> {
>>>     int arr[5];
>>>     int *ptr = arr + 1; // point at second element so we can do negative indexing legally
>>>     int idx1 = 0;
>>>     ptr[--idx1] = 5;
>>>     printf("%d\n", arr[0]); // 5
>>>     size_t idx2 = 0;
>>>     ptr[--idx2] = 6;
>>>     printf("%d\n", arr[0]); // 6
>>>     return 0;
>>> }
>>>
>>
>> `ptr[--idx2]` is undefined behavior (out-of-bounds array access), so the fact that it works that way is just a coincidence.
> 
> It also works because the optimizer elides the ptr accesses. So to test the example, at least compile with -O0.
> 
>          mov     esi, 5
>          mov     edi, OFFSET FLAT:.LC0
>          xor     eax, eax
>          call    printf
>          mov     esi, 6
>          mov     edi, OFFSET FLAT:.LC0
>          xor     eax, eax
>          call    printf

As I said, I didn't use optimizations. And I still didn't get any segfaults.

> 
> And in real life, the erroneous accesses where I had the index underflowing the array were much more complicated and had not been found for decades. We had sometimes noticed that things were wrong in the documents processed, but it was impossible to trace back where the error was. After refactoring to use size_t more aggressively, a whole bunch of these glitches turned into nice, actionable, debuggable core dumps.

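Well, one thing that is very possible: if you use uint as the index on a 64-bit system, you will (potentially) get segfaults, because a wrapped 32-bit index is zero-extended to a huge positive offset instead of ending up equivalent to -1.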
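
To make the difference between index types concrete, here is a rough sketch of the byte offset a 64-bit target would compute for element `index` of an `int` array. The three function names are hypothetical, and the code only computes offsets; it never dereferences anything.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit unsigned index: zero-extended to 64 bits before scaling, so a
   wrapped index becomes a huge positive offset (likely unmapped memory). */
uint64_t byte_offset_u32(uint32_t index)
{
    return (uint64_t)index * sizeof(int);
}

/* 32-bit signed index: sign-extended, so -1 stays one element back. */
int64_t byte_offset_i32(int32_t index)
{
    return (int64_t)index * (int64_t)sizeof(int);
}

/* size_t index: the multiplication wraps at the full width of size_t, so a
   wrapped index yields the same bit pattern as minus one element. */
size_t byte_offset_size(size_t index)
{
    return index * sizeof(int);
}
```

If this reading is right, it would explain why the size_t test above printed 6 without faulting on amd64, while a 32-bit unsigned index in the same position could point far outside the array.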

-Steve