August 11, 2012
Thiez:

> A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems like a small price to pay if it can prevent bugs. Why hasn't the Python-modulo been made the default back when D was designed?

Maybe because C99 requires that kind of modulus, and D tries to act as C99 where possible. It's sad.

Bye,
bearophile
August 11, 2012
> A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems like a small price to pay if it can prevent bugs.

The price would really be quite insignificant since IDIV takes tens of cycles and the additional work needed to make module behave intuitively would be just a few cycles. Besides, modulo of unsigned integers would still work the same, so you could just use that when you cared about the difference in performance. So the only situation where you couldn't avoid performance penalty would be when you needed to calculate modulo of signed integers and you actually wanted the current behavior. I think that is a pretty rare situtation.

> Why hasn't the Python-modulo been made the default back when D was designed?

Probably because this is one of the design goals of D: Where D code looks the same as C code, have it either behave the same or issue an error.
August 12, 2012
On 11-08-2012 20:54, Thiez wrote:
> On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec wrote:
>> On 11.08.2012 18:13, bearophile wrote:
>>> David:
>>>
>>>> Thanks! I thought modulo should alawys yield the same ... seems like I
>>>> was wrong ;)
>>>
>>> It's C design that's wrong.
>>
>> And it's the processor design that makes it inefficient to correct it
>> nowadays.
>>
>> Python's definition of modulo is far more useful than C's. Implemented
>> in machine code, however, it takes several additional commands because
>> the integer division is hardwired to the C definition. I guess that
>> hardware integer division in processors became popular only when C was
>> already widely in use.
>
> A few extra instructions (a CMOV followed by ADD should suffice, yes?)
> seems like a small price to pay if it can prevent bugs. Why hasn't the
> Python-modulo been made the default back when D was designed? The
> ever-so-slightly more efficient C-modulo could be provided in a library.
> Of course it's way too late to change it now...

Keep in mind that not all CPUs have cmov, so you end up having to branch based on whether the CPU has it or not.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
August 12, 2012
> Besides, modulo of unsigned integers would
> still work the same, so you could just use that when you cared about the
> difference in performance. So the only situation where you couldn't
> avoid performance penalty would be when you needed to calculate modulo
> of signed integers and you actually wanted the current behavior. I think
> that is a pretty rare situtation.

My implementation of py_mod:

int py_mod(int a, int b) {
    return cast(uint)a % b;
}

So a simple cast from signed to unsigned does exactly what I need (Prf_Jakob showed me that in IRC)

How much does a cast cost? I mean is that even an extra asm instruction?

>> Why hasn't the Python-modulo been made the default back when D was
>> designed?
>
> Probably because this is one of the design goals of D: Where D code
> looks the same as C code, have it either behave the same or issue an error.


Unfortunatly.
August 12, 2012
On Sunday, 12 August 2012 at 08:43:53 UTC, Alex Rønne Petersen wrote:
> On 11-08-2012 20:54, Thiez wrote:
>> On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec wrote:
>>> On 11.08.2012 18:13, bearophile wrote:
>>>> David:
>>>>
>>>>> Thanks! I thought modulo should alawys yield the same ... seems like I
>>>>> was wrong ;)
>>>>
>>>> It's C design that's wrong.
>>>
>>> And it's the processor design that makes it inefficient to correct it
>>> nowadays.
>>>
>>> Python's definition of modulo is far more useful than C's. Implemented
>>> in machine code, however, it takes several additional commands because
>>> the integer division is hardwired to the C definition. I guess that
>>> hardware integer division in processors became popular only when C was
>>> already widely in use.
>>
>> A few extra instructions (a CMOV followed by ADD should suffice, yes?)
>> seems like a small price to pay if it can prevent bugs. Why hasn't the
>> Python-modulo been made the default back when D was designed? The
>> ever-so-slightly more efficient C-modulo could be provided in a library.
>> Of course it's way too late to change it now...
>
> Keep in mind that not all CPUs have cmov, so you end up having to branch based on whether the CPU has it or not.

I think the following trick would work even on the 80386 (386+ should have the SETcc instructions):
assume divident in AEX, divisor in EBX or ECX (not EDX, obviously). I'll assume it's in EBX.
 CDQ // sign extend EAX to EDX:EAX
 IDIV EBX // signed division
 XOR EAX,EAX // clear EAX
 BT EDX,31 // check if sign-bit of remainder is set
 SETNC AL // AL = (EDX negative ? 0 : 1)
 DEC AEX // EAX = (EDX negative ? 0xFFFF : 0)
 AND EAX,EBX // EAX = (EDX negative ? EBX : 0)
 ADD EDX,EAX // EAX = the correct modulo result

Obviously it's less appealing than a CMOV, but it should work (in theory, I didn't test it...) and perhaps someone smarter could make it even shorter. Apart form the IDIV, all instructions should take only a single clock cycle on most x86s as far as I'm aware.

As for other CPU architectures, perhaps those actually have a 'sane' modulo operator, which would make the hypothetical presence or absence of a CMOV irrelevant :)
August 12, 2012
On Sunday, 12 August 2012 at 13:11:36 UTC, David wrote:
> My implementation of py_mod:
>
> int py_mod(int a, int b) {
>     return cast(uint)a % b;
> }
>
> So a simple cast from signed to unsigned does exactly what I need (Prf_Jakob showed me that in IRC)

It seems to work when b is a power of 2, but not for other numbers. Example: -3 % 7 = 4, but cast(uint)-3 = 4294967293, and 4294967293 % 7 = 1.

> How much does a cast cost? I mean is that even an extra asm instruction?

Casting ints to uints and back is free because the bits can remain the same (only the interpretation changes).
August 12, 2012
> It's a localized but important design bug of languages like C that D has carried over for backward compatibility reasons. D % has caused bugs in my code. I suggest to not use the built-in % on numbers that can be negative,

Better to show an example, reduced from a larger program. This
little function has to look for the next true value in a circular
array 'data', before or after a given starting index, according
to a int 'movement' variable that is allowed to be only +1 or -1:


This is Python code, it correctly performs the wrap-around in all
cases:


def find_next(data, index, movement):
     assert movement == 1 or movement == -1
     while True:
         index = (index + movement) % len(data)
         if data[index]:
             break
     return index

#         0      1      2      3      4     5
data = [False, False, False, False, True, False]
print find_next(data, 5, -1) # OK, prints 4
print find_next(data, 0, -1) # OK, prints 4


The D code doesn't work, because of the C %:


size_t findNext(in bool[] data, size_t index, in int movement)
pure nothrow in {
     assert(movement == 1 || movement == -1);
} body {
     do {
         index = (index + movement) % data.length;
     } while (!data[index]);
     return index;
}

void main() {
     import std.stdio;
     //              0      1      2      3      4     5
     const data = [false, false, false, false, true, false];

     writeln(findNext(data, 5, -1)); // OK, prints 4
     writeln(findNext(data, 0, -1)); // not OK
}


Bye,
bearophile
September 07, 2012
On Sat, 11 Aug 2012 09:48:15 -0400, David <d@dav1d.de> wrote:

> -1 % 16 = -1
>
> Shouldn't that be 15? It seems like the sign is ignored for the modulo.
>
> Is this a bug or intended behaviour? The Python implementation returns here, as expected, 15.

In case you are still looking for an expression which mods always positive (even for non-powers of 2)

(a % b + b) % b

Obviously, b must be positive.

The direct translation of this to assembly isn't the most efficient way.

Technically, in assembly we could check for the sign bit, and only add b again (without doing the second mod) if the value was negative.  But I'm not sure you can write an expression that causes that, maybe:

a % b < 0 ? a % b + b : a % b

Certainly not as appealing.  To get a direct translation from the above assembly, it would be something like:

typeof(a) x;

auto expr = (x = a % b) < 0 ? x + b : x

-Steve
September 07, 2012
On Saturday, 11 August 2012 at 18:54:09 UTC, Thiez wrote:
> A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems like a small price to pay if it can prevent bugs. Why hasn't the Python-modulo been made the default back when D was designed? The ever-so-slightly more efficient C-modulo could be provided in a library. Of course it's way too late to change it now...

According to TDPL regarding integral math: "Anything that compiles in both C and D WILL produce the same results".

Anything less would make porting code a nightmare.
=>It is absolutely not conceivable to change "%" semantics'.

However, D *could* provide the "Mathematician's Modulo" operator, either as a library functionality, or as a new operator.

I guess there hasn't been "a strong need" yet.
September 07, 2012
monarch_dodra:

> I guess there hasn't been "a strong need" yet.

Or some people have fulfilled this need since years (D1) writing
their own function.

Regarding an operator, %% ? :-)

Bye,
bearophile