Binary GCD algorithm

The binary GCD algorithm is an algorithm published by J. Stein in 1967 which computes the greatest common divisor of two positive integers. It gains a measure of efficiency over the ancient Euclidean algorithm by avoiding divisions and replacing them with bitwise operations that are cheaper when operating on the binary representation used by modern computers. This is particularly critical on embedded platforms that have no direct processor support for division.

Contents

Concept

The algorithm reduces the problem of finding the GCD by repeatedly applying these identities:

  1. If u = 0, then gcd(u,v) = v, since everything divides zero, and v is the largest number that divides v.
  2. If u is even and v is even, then GCD(u,v) = 2·GCD(u ÷ 2, v ÷ 2), since we know 2 is a common divisor.
  3. If u is even and v is odd, then GCD(u,v) = GCD(u ÷ 2, v), since we know 2 is not a common divisor.
  4. If u is odd and v is even, then GCD(u,v) = GCD(u, v ÷ 2), similarly.
  5. If u is odd and v is odd, then GCD(u,v) = GCD(|u-v| ÷ 2, v) = GCD(u, |u-v| ÷ 2). This is a combination of one step of the simple Euclidean algorithm, which uses subtraction at each step, and an application of identity 3 or 4 above. We know the division by 2 is possible because the difference of two odd numbers is even. We choose the version that makes the new operand smallest.

The algorithm

Following is a straightforward implementation of the algorithm in the C programming language, taking two positive arguments u and v and using only bit operations and subtraction. It first removes all common factors of 2 using identity 1, computes the gcd of the remaining numbers using identities 2 through 5, then combines these to form the final answer:

 unsigned int gcd(unsigned int u, unsigned int v) {
      unsigned int k = 0;
      while ((u & 1) == 0  &&  (v & 1) == 0) { /* while both u and v are even */
          u >>= 1;   /* shift u right, dividing it by 2 */
          v >>= 1;   /* shift v right, dividing it by 2 */
          k++;       /* add a power of 2 to the final result */
      }
      /* At this point either u or v (or both) is odd */
      do {
          if ((u & 1) == 0)      /* if u is even */
              u >>= 1;           /* divide u by 2 */
          else if ((v & 1) == 0) /* else if v is even */
              v >>= 1;           /* divide v by 2 */
          else if (u >= v)       /* u and v are both odd */
              u = (u-v) >> 1;
          else                   /* u and v both odd, v > u */
              v = (v-u) >> 1;
      } while (u > 0);
      return v << k;  /* returns v * 2^k */
  }
 

Efficiency can be enhanced on platforms with special instructions to find the least significant 1 bit in a word, allowing trailing zero bits to be removed in much larger groups. The latter while loop also benefits greatly from branch predication, which enables it to avoid many of the expensive branches caused by the many conditionals.

The algorithm requires O((log2 uv)2) worst-case time, or in other words time proportional to the square of the number of bits in u and v together. Although each step reduces at least one of the operands by at least 2 times, the subtract and shift operations do not take constant time for very large integers (although they're still quite fast in practice, requiring about one operation per word of the representation).

Although this algorithm can outperform the traditional Euclidean algorithm on many platforms, its asymptotic performance is the same, it is considerably more complex, and it does not generalize to negative integers or other fields such as polynomials where division with remainder is defined, nor is its extended version (analogous to the extended Euclidean algorithm) as well-known. (An algorithm is given on page 646 of Knuth's "The Art of Computer Programming, Volume 2: Seminumerical Algorithms (3rd Edition)," along with pointers to other versions.)

Implementations in other languages

A simple functional version of the binary GCD algorithm, such as this version in ML, more directly reflects the original identities, but is inefficient unless modified to use tail recursion and unless the implementation transforms the multiplies, divides, and mod operations to bit operations:

 fun gcd(u,v) =
      if u = 0 then
          v                            (* Identity 5 *)
      else if u mod 2 = 0 andalso v mod 2 = 0 then
          2*gcd(u div 2, v div 2)      (* Identity 1 *)
      else if u mod 2 = 0 then
          gcd(u div 2, v)              (* Identity 2 *)
      else if v mod 2 = 0 then
          gcd(u, v div 2)              (* Identity 3 *)
      else if u >= v then
          gcd((u-v) div 2, v)          (* Identity 4 *)
      else
          gcd(u, (v-u) div 2)          (* Identity 4 *)
 

This version of binary GCD in ARM assembly with GNU Assembler syntax highlights the benefit of branch predication:

gcd:
     @ Arguments arrive in registers r0 and r1
     mov     r3, #0            @ Initialize r3, the power of 2 in the result
     tst     r0, #1            @ Check least significant bit of r0
     bne     gcd_loop          @ If nonzero, r0 is odd, jump into middle of next loop
     tst     r1, #1            @ Check least significant bit of r1
     bne     gcd_loop          @ If nonzero, r1 is odd, jump to beginning of next loop
 remove_twos_loop:
     mov     r0, r0, lsr #1    @ Shift r0 right, dividing it by 2
     mov     r1, r1, lsr #1    @ Shift r1 right, dividing it by 2
     add     r3, r3, #1        @ Increment r3
     tst     r0, #1            @ Check least significant bit of r0
     bne     gcd_loop          @ If nonzero, r0 is odd, jump into middle of next loop
     tst     r1, #1            @ Check least significant bit of r1
     beq     remove_twos_loop  @ If zero, r0 and r1 still even, keep looping
 gcd_loop:                     @ Loop finding gcd of r0, r1 not both even
     tst     r0, #1            @ Check least significant bit of r0
     moveq   r0, r0, lsr #1    @ If r0 is even, shift r0 right, dividing it by 2, and...
     beq     gcd_loop_continue @ ... continue to next iteration of loop.
     tst     r1, #1            @ Check least significant bit of r1
     moveq   r1, r1, lsr #1    @ If r1 is even, shift it right by 1 and...
     beq     gcd_loop_continue @ ... continue to next iteration of loop.
     cmp     r0, r1            @ Compare r0 to r1
     subcs   r2, r0, r1        @ If r0 >= r1, take r0 minus r1, and...
     movcs   r0, r2, lsr #1    @ ... shift it right and put it in r0.
     subcc   r2, r1, r0        @ If r0 < r1, take r1 minus r0, and...
     movcc   r1, r2, lsr #1    @ ... shift it right and put it in r1.
 gcd_loop_continue:
     cmp     r0, #0            @ Has r0 dropped to zero?
     bne     gcd_loop          @ If not, iterate
     mov     r0, r1, asl r3    @ Put r1 << r3 in the result register
     bx      lr                @ Return to caller
 

External links

See also: Binary GCD algorithm, 1967, ARM architecture, Big-O notation, Bitwise operation, Branch predication, C programming language, Euclidean algorithm, Extended Euclidean algorithm, Functional programming language