Synth-DIY Yahoo! Groups Archives

Hi Martin,

the answer to your question about the number of cycles needed for
certain instructions is "it depends". First of all, the LPC2100 devices
are regular ARM7 devices, so if you have a general ARM7 cycle table, it
is valid. In particular, though, there are several accelerators in the
LPC2100 that speed up execution.
First, it has a 128-bit wide Flash interface. This means that you will
always fetch 128 bits (four 32-bit ARM instructions) after a branch. In
fact, the device even has two memory blocks of 128-bit width, which are
accessed alternately. Long story short, you get the best performance
out of the device if you locate the start address of a time-critical
function on a 128-bit boundary!
This is my best available hint for generally speeding up the micro.
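
A minimal sketch of the alignment hint, assuming a GCC-compatible
compiler: the "aligned" function attribute requests a 16-byte (128-bit)
boundary for the function's entry point. The names time_critical_copy
and on_flash_line_boundary are hypothetical, chosen only for
illustration, and the linker script must not undo the alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 128 bits = 16 bytes, the Flash line width discussed in this thread. */
#define FLASH_LINE_BYTES 16u

/* GCC/Clang extension: place the function entry on a 16-byte boundary. */
__attribute__((aligned(16)))
void time_critical_copy(volatile const uint32_t *port, uint8_t *dst, size_t n)
{
    assert(port != NULL && dst != NULL);
    while (n--) {
        *dst++ = (uint8_t)*port;   /* one port read per stored byte */
    }
}

/* Returns nonzero if addr is a multiple of the Flash line width. */
int on_flash_line_boundary(uintptr_t addr)
{
    return (addr % FLASH_LINE_BYTES) == 0;
}
```

Checking the address in the linker map file is probably the most
reliable way to confirm where the function actually landed.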

Cheers, Bob


--- In lpc2100@yahoogroups.com, capiman@t... wrote:
> This was a great idea !
> 
> I am now at around 5,8956 MBytes / second, which is close to
> 5,898 MBytes / sec. ( = Fosc * 4 / 10).
> So the two operations ( ldr ip, [r0, #0] and strb ip, [r2], #1) seem to
> take 10 cycles in total.
> Is this correct ?
> 
> Re-adding my shift instruction gives me around 5,360400 MBytes / s,
> so 11 cycles. So the shift itself takes 1 cycle ?
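
The cycle estimates above follow from dividing the core clock by the
measured throughput. A rough sketch of that arithmetic in C, assuming
a core clock of 58.9824 MHz (Fosc * 4 with Fosc = 14.7456 MHz; this
value is inferred from the 5,898 MBytes/sec figure and is not stated
in the thread). The function name cycles_per_byte is hypothetical.

```c
#include <assert.h>

/* Assumed core clock in Hz: Fosc * 4, with Fosc = 14.7456 MHz (inferred). */
#define CCLK_HZ 58982400.0

/* Core clock cycles spent per captured byte at a measured throughput
 * given in bytes per second. */
double cycles_per_byte(double bytes_per_second)
{
    assert(bytes_per_second > 0.0);
    return CCLK_HZ / bytes_per_second;
}
```

With the two measurements quoted here, cycles_per_byte(5895600.0)
comes out at about 10.0 and cycles_per_byte(5360400.0) at about 11.0.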
> 
> BTW: Is there a cycle table somewhere for LPC21xx on the net or in an
> appnote / manual ? Is it the same as for original ARMs ?
> 
> Greetings,
> 
>           Martin
> 
> ----- Original Message ----- 
> From: "Ben Dooks" <ben@f...>
> To: <lpc2100@yahoogroups.com>
> Sent: Sunday, February 01, 2004 11:55 AM
> Subject: Re: [lpc2100] Optimization of Capture Routine
> 
> 
> > On Sat, Jan 31, 2004 at 09:01:05PM +0100, capiman@t... wrote:
> > > Hello,
> > >
> > > I want to read in 1 byte multiple times from the port pins as fast
> > > as possible.
> > > Currently I have the following C code:
> > >
> > > unsigned char Data[60000];
> > >
> > > void CaptureBuffer()
> > > {
> > >     unsigned char *ptr = &Data[0];
> > >     unsigned char *ptrend = &Data[60000];
> > >
> > >     while(ptr < ptrend)
> > >     {
> > >         (*ptr) = (IOPIN >> LA_D0_BIT) & 0xff;
> > >         ptr++;
> > >     }
> > > }
> > >
> > > When I compile it with gcc and option -O3, I can capture at around
> > > 3,9 MBytes/sec.
> > > Avoiding the shift (by using P0.0 - P0.7) gives me 4,2 MBytes/sec.
> > > Leaving out the (*ptr) = (IOPIN...) instruction gives me 11,8
> > > MBytes/sec, but no more functionality :-)
> > >
> > > Can I improve the speed with inline assembler ? The generated
> > > assembler code already looks very compact...
> > >
> > > .L142:
> > >  ldr ip, [r0, #0]
> > >  strb ip, [r2], #1
> > >  cmp r2, r1
> > >  bcc .L142
> > >
> > > Are there any other tricks ?
> >
> > unrolling the loop a bit may help, as it reduces the number of branch
> > instructions needed.
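
A minimal sketch of that unrolling in C, assuming a factor of four and
a port register passed in as a pointer. The name capture_unrolled and
the port parameter are hypothetical (on the target, port would be the
address of IOPIN), and this variant assumes the data sits on
P0.0 - P0.7 so no shift is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Capture n bytes from *port into dst, four reads per loop iteration,
 * so the compare and branch are paid once per four bytes. */
void capture_unrolled(volatile const uint32_t *port, uint8_t *dst, size_t n)
{
    assert(port != NULL && dst != NULL);

    uint8_t *const end = dst + n;
    uint8_t *const unrolled_end = dst + (n & ~(size_t)3);

    while (dst < unrolled_end) {
        dst[0] = (uint8_t)*port;   /* volatile: each line is a real read */
        dst[1] = (uint8_t)*port;
        dst[2] = (uint8_t)*port;
        dst[3] = (uint8_t)*port;
        dst += 4;
    }
    while (dst < end) {            /* remaining 0..3 bytes */
        *dst++ = (uint8_t)*port;
    }
}
```

A larger unroll factor trades code size for fewer branches; how much
it helps depends on how the compiler schedules the loads and stores.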
> >
> > -- 
> > Ben
> >
> > Q:      What's a light-year?
> > A:      One-third less calories than a regular year.
> >
Lpc2000

Re: Optimization of Capture Routine -> cycle table ?
