Thanks a lot!
Got it!
So for sequential code reads we can reach 0 wait states at full speed,
but for non-sequential operations, with MAMTIM=3, 2 wait states are
added to each read. Am I correct?
And according to the errata, to reach 0 wait states at 60 MHz we must
set MAMTIM=3 and have the MAM enabled. Does that mean we must use the
MAM in all cases?
--- In lpc2000@yahoogroups.com, "Karl Olsen" <kro@...> wrote:
>
> ---- Original Message ----
> From: "croquettegnu" <croquettegnu@...>
> To: <lpc2000@yahoogroups.com>
> Sent: Wednesday, April 12, 2006 4:32 PM
> Subject: [lpc2000] Flash Access at 0 wait state at 60 MHz ?
>
> > I understand that the MAM module allows operation up to 60 MHz, but
> > is that possible with 0 wait states?
> >
> > The MAM usage notes state that MAMTIM must be configured according
> > to the operating frequency; for example, for a system clock faster
> > than 40 MHz, MAMTIM must be set to 3 CCLKs. Does that mean we
> > introduce 2 wait states?
>
> Yes. But the point of the MAM is that you often can read with 0
> waitstates from the MAM buffers instead of from flash with 2 waitstates.
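
As a rough illustration of the MAMTIM / waitstate relationship above, here is a minimal C sketch. The function names are hypothetical. Only the "faster than 40 MHz means MAMTIM=3" row comes from this thread; the 20 MHz boundary is an assumption that should be checked against the user manual and errata for your part.

```c
#include <stdint.h>

/* Hypothetical helper: pick a MAMTIM value (flash fetch time in CCLKs).
 * Only the "> 40 MHz -> 3" case is stated in the thread; the 20 MHz
 * boundary below is an assumption, not a confirmed figure. */
unsigned mamtim_for_cclk(uint32_t cclk_hz)
{
    if (cclk_hz < 20000000u) {
        return 1u;   /* assumed: slow clocks need no extra fetch time */
    }
    if (cclk_hz <= 40000000u) {
        return 2u;   /* assumed middle range */
    }
    return 3u;       /* from the thread: faster than 40 MHz -> 3 CCLKs */
}

/* A flash fetch that takes MAMTIM clocks costs MAMTIM - 1 waitstates. */
unsigned flash_wait_states(unsigned mamtim)
{
    return mamtim - 1u;
}
```

With these assumptions, flash_wait_states(mamtim_for_cclk(60000000)) gives 2, which matches the "Yes" above.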
>
> Most important is the 128-bit prefetch buffer, which prefetches
> instructions so that normal sequential instructions can be fetched and
> executed without waitstates. Jumps take 5 clocks (1 base clock plus 2
> clocks pipeline flush plus 2 clocks MAM waitstates) instead of 3 (if we
> had 0-waitstate memory). Since sequential instructions are the most
> common, the net performance of instruction fetching is close to 0
> waitstates.
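
The jump arithmetic above can be written out as a small C sketch. This is a simplified model with hypothetical function names: it assumes every sequential instruction takes exactly 1 clock (prefetch always hides the flash latency) and that every jump pays the full flash waitstates.

```c
/* Clocks for one jump: 1 base clock + 2 clocks pipeline flush
 * + the flash waitstates (2 when MAMTIM=3, 0 for ideal memory). */
unsigned jump_clocks(unsigned wait_states)
{
    return 1u + 2u + wait_states;
}

/* Simplified total: each sequential instruction is assumed to take
 * 1 clock, each jump is assumed to pay the full penalty. */
unsigned long code_clocks(unsigned long sequential,
                          unsigned long jumps,
                          unsigned wait_states)
{
    return sequential + jumps * jump_clocks(wait_states);
}
```

For example, 100 sequential instructions plus 5 jumps cost 125 clocks with 2 waitstates versus 115 with ideal 0-waitstate memory in this model.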
>
> There is a separate 128-bit data buffer so that when reading constants
> from flash, you often do that with 0 waitstates instead of 2. You get 0
> waitstates when you read a word in the same 16-byte block as the
> previous flash data read. This is often the case when reading constants
> from the literal pool.
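
The "same 16-byte block" rule can be modelled on a PC as follows. This is only a sketch of the behaviour described above, not of the real hardware; the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the single 128-bit (16-byte) data buffer. */
typedef struct {
    bool     valid;  /* false until the first flash data read */
    uint32_t block;  /* address / 16 of the buffered block */
} data_buf_t;

/* Returns the waitstates for a data read at addr and updates the buffer.
 * miss_ws is the flash penalty (2 when MAMTIM=3). */
unsigned data_read_wait_states(data_buf_t *buf, uint32_t addr,
                               unsigned miss_ws)
{
    uint32_t block = addr >> 4;      /* 16-byte block number */

    if (buf->valid && buf->block == block) {
        return 0u;                   /* same block as the previous read */
    }
    buf->valid = true;               /* miss: buffer now holds this block */
    buf->block = block;
    return miss_ws;
}
```

Reading 0x1000 and then 0x100C hits the buffer on the second read, while a following read at 0x1010 misses again.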
>
> There is also a branch target buffer that remembers the 16-byte block at
> the last jump destination (non-sequential code read). This optimizes
> simple loops so that the first jump back to the loop top takes the full
> 5 clocks, but the subsequent jumps to the loop top only take 3 clocks
> because the instructions at the loop top can be fetched from the branch
> target buffer. This of course only works when there are no other jumps
> in the loop body.
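
The loop case above works out as follows in a small C sketch. It assumes a simple loop with no other jumps in the body, so only the first jump back misses the branch target buffer; the function name is hypothetical.

```c
/* Total clocks spent on the jumps back to the loop top.
 * First jump: 1 base + 2 flush + wait_states (5 clocks when MAMTIM=3).
 * Later jumps: 3 clocks each, served from the branch target buffer.
 * Assumes no other jumps inside the loop body. */
unsigned long loop_jump_clocks(unsigned long back_jumps,
                               unsigned wait_states)
{
    if (back_jumps == 0ul) {
        return 0ul;
    }
    return (3ul + wait_states) + (back_jumps - 1ul) * 3ul;
}
```

So 10 jumps back cost 5 + 9 * 3 = 32 clocks with 2 waitstates, against 30 clocks for ideal 0-waitstate memory.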
>
> The MAM gets you close to 0-waitstate flash on average, with more
> predictability and fewer transistors than a cache. It is much easier to
> predict when you get the 2-clock penalty with the MAM than with a cache.
>
> Karl Olsen
>