Yahoo Groups archive

Lpc2000

Re: Example of C and inline ASM in a file?

2006-04-11 by Stephen Pelc

> From: Robert Adsett <subscriptions@...>
> Subject: Re: Re: Example of C and inline ASM in a file?
> 
> At 02:43 PM 4/10/2006 +0100, Stephen Pelc wrote:
> >When I look at where MPE (as opposed to its clients) uses in-
> >line assembly, we find it in device-specific drivers, e.g. UART
> >and Ethernet drivers, and in CPU specific sections, e.g.
> >schedulers, where we are striving for performance. In our world,
> >interrupt latency and task-switching speed really matter,
> >regardless of CPU performance - the faster the CPU, the more we
> >are asked to do with it.
> 
> Why not do them in assembly rather than inline?  I don't expect either of
> us to change the other's mind but I am curious as to the reasoning.

Here's my standard response to this question. Just to confuse 
you, the example is not in C, and it's not for an ARM, but the 
principles apply. The example is for a very simple UART driver 
with interrupts enabled on receive.

: key0		\ -- char
\ *G Wait for character from USART0 and return it.
  begin
    di  rx0-avail c@ 0=
  while
    [asm  bis # _cpuoff _gie + sr  asm]	\ cpu to sleep, GIE set
  repeat
  rx0-char c@  0 rx0-avail c!
  ei
;

The compiler 'knows' about DI and EI to disable/enable 
interrupts. The ways to put an MSP430 to sleep are myriad. 
The objective is to minimise the time for which interrupts are 
disabled. Putting in a single assembler instruction achieves 
that objective and documents everything the user needs to know.
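
For readers more at home in C, the same pattern can be sketched 
roughly as follows. This is a host-side simulation under stated 
assumptions, not MPE's code: every name in it (irq_disable, 
irq_enable, sleep_with_irq_enabled, uart_rx_isr, pending_byte) is 
hypothetical, and the sleep function stands in for the single 
inline instruction that sets CPUOFF and GIE together. On a real 
target that stub would be one inline assembly statement.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side stand-ins for the hardware. All names are hypothetical. */
static volatile bool    irq_enabled = true;
static volatile uint8_t rx_avail = 0;    /* set by the receive ISR */
static volatile uint8_t rx_char  = 0;    /* last byte received */
static uint8_t pending_byte = 'A';       /* byte the simulated UART delivers */

static void irq_disable(void) { irq_enabled = false; }
static void irq_enable(void)  { irq_enabled = true; }

/* Simulated receive interrupt: store the byte and flag it. */
static void uart_rx_isr(void)
{
    rx_char  = pending_byte;
    rx_avail = 1;
}

/* Stand-in for the one inline instruction: enable interrupts and
 * sleep in a single step. The simulation "wakes up" by running the
 * ISR immediately; real hardware would wake when the UART interrupts. */
static void sleep_with_irq_enabled(void)
{
    irq_enabled = true;
    uart_rx_isr();
}

/* Wait for a character and return it. */
uint8_t key0(void)
{
    uint8_t c;

    for (;;) {
        irq_disable();               /* test the flag with interrupts off */
        if (rx_avail)
            break;
        sleep_with_irq_enabled();    /* re-enable and sleep, no gap between */
    }
    c = rx_char;                     /* still disabled: read and clear */
    rx_avail = 0;
    irq_enable();
    return c;
}
```

The point of the structure is that the flag test and the 
enable-plus-sleep step leave no window in which a receive 
interrupt could arrive unnoticed before the CPU goes to sleep.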

Doing it this way gives faster code, better interrupt latency, 
shorter source, and keeps everything together. Yes, I could have 
tweaked the compiler to do this, but I prefer to do that only 
when the required functionality is itself portable across CPUs. 

Where I will code a complete routine is for something like the 
TCP/IP checksum, where a hand-coded assembly routine will beat 
the pants off the output of most compilers.
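
For reference, the kind of routine in question looks roughly like 
this in portable C. This is a minimal sketch of the Internet 
checksum as described in RFC 1071, not MPE's implementation; the 
function name inet_checksum is an assumption made for 
illustration. A hand-coded assembly version would typically 
replace the inner loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): one's complement of the one's
 * complement sum of the data taken as big-endian 16-bit words.
 * The 32-bit accumulator is adequate for buffers up to 64 KiB. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len  -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;   /* pad odd trailing byte with zero */

    while (sum >> 16)                    /* fold carries back into low 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}
```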

For an ARM7, each call/ret pair costs at least six cycles. Matched 
entry/exit routines need two such pairs, so they cost at least 12 
additional cycles. On the LPC2xxx there are four potential MAM stalls.

When you have an effectively 12MHz ARM (not an LPC2xxx), 
thirteen tasks, and a bomb-disposal machine to control, you 
learn to be paranoid!

Stephen


--
Stephen Pelc, stephen@...
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
