最近在用python,偶尔有一次涉及到了负数取余操作: -3 % 4
按照之前写c/c++、java的习惯,这里肯定是等于-3。但是python下返回的结果竟然是1!
为什么会有不同的表现呢?先从取余操作本身说起:
关于取余(modulo operation)
取余的定义
由于不同的架构有不同的数字表示法和运算法,因此他们对于取余的定义也有可能略有不同。
一般来说,公认的取余操作(被除数a、除数n、商q以及余数r,n % a = r)需要满足以下三条:
然而这个定义是非常宽泛的。以 -3 % 4为例:
-3既可以表示为:-3 = 4 * -1 + 1
也可以表示为:-3 = 4 * 0 + (-3)
这两种表示法分别对应了两种实现方法:Truncated Division及 Floored Division。两种实现方法的求商方式不同,导致其最终结果也不同。
Truncated Division
在Truncated Division下,求商时,采用的是q = trunc(a/n) ,即将算数除法结果的小数部分移除,余下的整数部分作为商。
(也有一种更形象的说法,除法取整时,是朝着靠近0的方向取整)
以-3 % 4为例:
q = trunc(-3/4) = trunc(-0.75) = 0
因此,余数r = a - nq = -3 - (0 * 4) = -3
此实现有如下特点:
1. 余数的符号与被除数相同
2. 这种实现在某些地方被称之为取模
Floored Division
在Floored Division下,求商时,采用的是q = ⌊a/n⌋,即将算数除法结果进行向下取整作为商。
以-3 % 4为例:
q = ⌊-3/4⌋ = ⌊-0.75⌋ = -1
因此,余数r = a - nq = -3 - (-1 * 4) = 1
此实现有如下特点:
1. 余数的符号与除数相同
2. 这种实现在某些地方被称之为求余
关于语言标准
由上可知,存在着多种实现方式,那么具体地,c和python是如何选取的呢?
C语言标准中关于取余运算的说明
C89中,并未做严格规定:
When integers are divided and the division is inexact, if both operands are positive the result of the/ operator is the largest integer less than the algebraic quotient and the result of the % operator is positive. If either operand is negative, whether the result of the / operator is the largest integer less than the algebraic quotient or the smallest integer greater than the algebraic quotient is implementation-defined, as is the sign of the result of the % operator. If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a.
而C99,明确指定了需要使用Truncated Version:
When integers are divided, the result of the / operator is the algebraic quotient with any fractional part discarded.88) If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a.
88) This is often called "truncation toward zero".
Python标准中关于取余运算的说明
Python 2:从说明中关于余数符号的部分,可以推断出其指定了Floored Version
The % (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type. A zero right argument raises the ZeroDivisionError exception. The arguments may be floating point numbers, e.g., 3.14%0.7 equals 0.34 (since 3.14 equals 4*0.7 + 0.34.) The modulo operator always yields a result with the same sign as its second operand (or zero); the absolute value of the result is strictly smaller than the absolute value of the second operand [2].
The integer division and modulo operators are connected by the following identity: x == (x/y)*y + (x%y). Integer division and modulo are also connected with the built-in function divmod(): divmod(x, y) == (x/y,x%y). These identities don’t hold for floating point numbers; there similar identities hold approximately where x/y is replaced by floor(x/y) or floor(x/y) - 1 [3].
Python 3:和2一样,可推断出指定Floored Version
The % (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type. A zero right argument raises the ZeroDivisionError exception. The arguments may be floating point numbers, e.g., 3.14%0.7 equals 0.34 (since 3.14 equals 4*0.7 + 0.34.) The modulo operator always yields a result with the same sign as its second operand (or zero); the absolute value of the result is strictly smaller than the absolute value of the second operand [1].
The floor division and modulo operators are connected by the following identity: x == (x//y)*y + (x%y). Floor division and modulo are also connected with the built-in function divmod(): divmod(x, y) == (x//y, x%y). [2].
结论
一些推断及其他
最后,来小小地猜测一下,为什么c语言和python会选取不同的实现方案。
对于c:
在X86架构下,对于整数的除法和取余操作,都是通过同一个指令进行的(有符号除法使用idiv,而无符号为div),商和余数会同时作为结果返回在寄存器AX和DX中。
这个特性被c的库函数divmod所利用,导致编译器有可能会将除法及取余操作都优化为使用一个指令完成。
这样一来,通过如下三条:
1. c的除法正好为trunc除(即Truncated Version中取商的方式)
2. 通常程序里除法出现的频率都远远高过取余的频率,取余可能会被优化为一个指令
3. c语言设计风格为性能和简洁优先,而c++又要与c兼容
因此c语言采用了Truncated Version。
而这也为一个常用的判断埋下了坑:
如果你想判断一个数是否是奇数,
1 | bool isEven(int x){ return x % 2 = 1} |
会埋下大坑(对负数存在bug)
而需要写成
1 | bool isEven(int x){ return x % 2 != 0} |
对于python:
不会像c那样将性能放到第一位,会比c有更多的语法糖,更贴近常用思维,因此使用Floored Version会更加适合。
参考文献
https://en.wikipedia.org/wiki/Modulo_operation#Remainder_calculation_for_the_modulo_operation
https://docs.python.org/2/reference/expressions.html#arithmetic-conversions
http://stackoverflow.com/questions/3609572/does-either-ansi-c-or-iso-c-specify-what-5-10-should-be
http://stackoverflow.com/questions/11630321/why-does-c-output-negative-numbers-when-using-modulo
7 条评论
bool isOdd(int x){ return x % 2 != 0}
这样写,是奇数返回True吧?
已经改了!
不错,不错,看看了!
博客多久更新一次?
看看!
博客不错,嘎嘎!
讲的太棒了。