"bobbruce000" <bobbruce000@...> writes:
> Anyone have fast working code for unsigned 32 bit division?
> Currently, I am using the __udivsi3 routine that comes with
> gcc, but it is slow and appears to be written to work on
> early ARMs without hardware multipliers.
Below my sig is the division routine, in ARM assembler, that I am using
in my Forth. I don't know that it is any faster. It doesn't use the
multiplier. How would you use the multiplier such that it would be
faster? I can see how to do it with a known divisor (by multiplying by
the reciprocal of the divisor) but not with a routine that accepts an
arbitrary divisor. Do you have some sort of Newton's Method that would
be faster? I don't actually do enough division at the moment to need it
any faster, but it is an interesting topic.
Also, see the question in the code in case anyone knows a way to reduce
the two instructions to just one, repeated here:
@mov r1, r1, rol #1 @ this is what we'd like to do
mov r1, r1, lsl #1 @ but we accomplish it with these
adc r1, r1, #0 @ two instructions. Is there a
@ way to do it in just one?
In the listing, 'DSTK' is the register R10 and 'nxt' is more or less the
Forth equivalent of a subroutine return in that it transfers control to
the next Forth word.
I suspect there are some "early out" improvements that could be made for
smaller divisors, etc. Also, I think ARM may have had some sample code
including a division algorithm.
--
Frank
http://pygmy.utoh.org/riscy
@@@@ U/MOD ( dividend divisor - uR uQ)
@ Divide 32bits by 32bits with 32bit remainder and 32bit quotient.
@ See Patterson & Hennessy, Computer Organization & Design, around
@ pp. 219-221. This algorithm is a cross between the book's
@ second and third versions.
UslashMOD:
@ divisor is in r0 (TOS)
@ put dividend into r2 and extend it to 64 bits with r1=0
ldr r2, [DSTK]
mov r1, #0
@ Divide r1:r2 by r0
@ For each of the 32 bits,
@ shift r1:r2 left
@ subtract r0 from r1
@ if result is negative
@ undo subtraction and new quotient bit (lsbit of r2) gets 0
@ otherwise
@ new quotient bit (lsbit of r2) gets 1
@ Remainder is in r1 and quotient is in r2
@ Use r3 to count the iterations
mov r3, #32
@ First, shift dividend (r1:r2 left by one bit)
1: movs r2, r2, lsl #1 @ msbit is now in Carry
@mov r1, r1, rol #1 @ this is what we'd like to do
mov r1, r1, lsl #1 @ but we accomplish it with these
adc r1, r1, #0 @ two instructions. Is there a
@ way to do it in just one?
@ Subtract divisor from dividend. If result is not negative,
@ set least significant bit of quotient (in r2). Otherwise,
@ undo the subtraction by adding divisor back to r1.
subs r1, r1, TOS
@ new quotient bit should be 1 only if subtraction result
@ was non-negative, i.e. add complement of sign to r2
addpl r2, r2, #1
@ if subtraction result was negative, add back the divisor
@ to r1, i.e. undo the subtraction
addmi r1, r1, TOS
subs r3, r3, #1
bne 1b
@ At this point, r2 contains the quotient and r1 the remainder
str r1, [DSTK]
mov TOS, r2
nxtMessage
Re: 32 bit Division
2004-09-07 by Frank Sergeant
Attachments
- No local attachments were found for this message.