Synth-DIY Yahoo! Groups Archives

Tom,

Great contribution! It is indeed difficult to count cycles for
instructions on the ARM. However, the LPC2000 family does not use
a cache. The internal SRAM is what ARM calls tightly coupled
memory (TCM). It offers the same speed as a cache but with far
less nondeterministic behavior. When a real cache misses, the
next instruction can take 20 or more cycles. When there is a branch in
the SRAM or even the flash of the LPC2000, it costs just the regular 2
cycles of the branch in SRAM + 1 cycle for the first instruction from
flash. All subsequent linear instructions, from either SRAM or flash,
execute at the same speed. This is something special in the LPC2000
series: one flash fetch cycle loads 16 bytes, and a fetch can be
done every 50 ns. The flash bandwidth is therefore up to
320 MBytes/sec, faster than anything else with flash in this market.
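
The bandwidth figure above is plain arithmetic, and a minimal C sketch makes it easy to check. The function names are made up for illustration. The second function assumes the core fetches at most one 4-byte ARM-mode instruction per clock, which is a simplification and not a statement from the data sheet.

```c
#include <stdint.h>

/* Peak flash bandwidth in bytes per second: one fetch delivers
 * bytes_per_fetch bytes every fetch_time_ns nanoseconds. */
uint64_t flash_bandwidth_bytes_per_sec(uint32_t bytes_per_fetch,
                                       uint32_t fetch_time_ns)
{
    return (uint64_t)bytes_per_fetch * 1000000000ULL / fetch_time_ns;
}

/* Rough upper bound on what linear ARM-mode code can consume,
 * ASSUMING at most one 4-byte instruction is fetched per core clock. */
uint64_t arm_fetch_demand_bytes_per_sec(uint32_t cclk_hz)
{
    return (uint64_t)cclk_hz * 4u;
}
```

With 16 bytes per fetch and 50 ns per fetch, the first function returns 320,000,000. With the 4 x 14.7456 MHz = 58,982,400 Hz clock from the quoted question, the second returns 235,929,600, which is below the flash figure and consistent with linear code running at full speed.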

Bob


--- In lpc2000@yahoogroups.com, Tom Walsh <tom@o...> wrote:
>
> rtstofer wrote:
> 
> >I have spent literally hours searching the Philips web site for 
> >documentation other than the User Manual for the LPC2106 and a few app 
> >notes.
> >
> >Is there a document that has the electrical and timing specs for this 
> >chip?
> >
> >One reason I am curious is that I don't seem to be able to get very 
> >short output pulses when I wiggle a pin.  Something on the order of 
> >100 ns is about the shortest I can get.  Now, I am running at 4x 
> >14.7456 MHz (I think!) and the VPBDIV is set to 0x01.  It would seem 
> >that, at 59 MHz, I should be able to get very short pulses.
> >
> >This isn't a problem, just a curiosity.  But, I would like to review 
> >the specs.
> >
> >  
> >
> That is something that is hard to quantify on a pipelined processor.
> This could be predicted when using a simple CPU like a 68HC11 or 8051
> processor.  There, the CPI (Clocks Per Instruction) is governed by a
> fixed set of conditions.  And the IPC (Instructions Per Clock) is
> always one (1).  You could go to a data sheet and look up how many
> clock cycles a "mov a, r0" would take, then, using your calculator,
> determine the period of your clock, multiply by 12 (8051) to get one
> machine cycle time, then multiply by the number of machine cycles for
> the "mov acc, rn" opcode.
> 
> You cannot really do that with a modern processor, and yes, ARM falls
> under that category.  Modern CPUs do two things differently which make
> computing finite execution times difficult: they cache instructions and
> execute instructions in parallel.  While looping inside a cache
> boundary, you get your best performance time; the CPU doesn't need to
> reload / dump cache from external RAM (in the case of the LPC2xxx, it
> is the on-chip SRAM, no difference, just a bit faster than external
> [S]DRAM).
> 
> ARM processors also execute opcodes in parallel with each other:
> predictive execution.  Take the "moveq r1,r1,#0" instruction; that is
> a conditional instruction based on the result of the zero flag.  While
> the previous instruction is executing, ARM pipelines the next
> instruction into the microcode unit and sets it up.  In this case, it
> gets a value of ZERO all set to be put into R1, but the instruction is
> held up until the value of the zero flag is known to be stable.  Once
> it is time to execute the move, the processor either does the
> instruction or discards it.
> 
> Meanwhile, another instruction has already been loaded and it, too, is 
> ready to go!  The predictive execution can extend beyond just a few 
> instructions, and can encompass the width of the cache.
> 
> TomW
> 
> 
> 
> -- 
> Tom Walsh - WN3L - Embedded Systems Consultant
> http://openhardware.net, http://cyberiansoftware.com
> "Windows? No thanks, I have work to do..."
> ----------------------------------------------------
>
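
The cycle arithmetic in the quoted messages can be sketched in a few lines of C. This only illustrates the calculations described (how many core clocks fit in the 100 ns pulse from the question, and the 8051 recipe of clock period times 12 times machine cycles). The function names are hypothetical, and the 12 MHz crystal in the usage note is an assumed example value, not a number from the thread.

```c
/* Number of core clock cycles that fit in an interval of interval_ns. */
double cycles_in_interval(double clock_hz, double interval_ns)
{
    return clock_hz * interval_ns * 1e-9;
}

/* Classic 8051 timing as described in the quoted reply: one machine
 * cycle is 12 oscillator periods, and an instruction takes a fixed
 * number of machine cycles looked up in the data sheet. */
double mcs51_instruction_time_ns(double osc_hz, unsigned machine_cycles)
{
    return (double)machine_cycles * 12.0 * 1e9 / osc_hz;
}
```

At 4 x 14.7456 MHz = 58.9824 MHz, cycles_in_interval(58982400.0, 100.0) comes to roughly 5.9, so the 100 ns pulse is about six core clocks wide. For an assumed 12 MHz 8051, mcs51_instruction_time_ns(12e6, 1) gives 1000 ns for a one-machine-cycle instruction.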
Lpc2000

Re: LPC2106 Electrical & Timing Specs
