Hi Robert,
Thanks for the insights. Of course, while the group was off the air for a
while, I got pretty near to getting on top of it.
It turns out it's not the INTs that are causing this after all.
As you might know, this is to do with the BASIC interpreter, and I can
execute Basic that prints data really fast with no problems at all.
Also, when I just loop really fast in a printf() with a long string,
it works as good as gold too.
I've narrowed it down to one specific procedure call.
I think it's a CG issue, but I'm not sure; it's very hard to track down :-(
It might be my interpreter code too, but I doubt it.
Maybe to clarify a bit:
> Not really related to your actual problem but since you are using printf
> why bother with a transmit interrupt? Unless you are planning on
> multithreading? I've always found that all a serial transmit interrupt on
> single threaded apps does is introduce needless complexity (read bugs).
There are a few very specific reasons it's set up that way.
That UART channel normally carries ASCII data, and then of course no INTs
would be needed, given that it's a human I/F. However, once I add the RF
front end it is very important to have these INTs. For example, you can give
a Connect statement to the interpreter, and that immediately sets up a TDD
session between 2 nodes, so I need to carry binary protocols along, and I
can't afford to waste CPU time on polling. (The polling loop on tx_chars
dropping down is temporary, of course. I find it's a good test to see if
everything is up to scratch; obviously something's not :-)
The large (sort of) buffer is the minimum needed to create a pseudo-full
duplex RS232 wireless link between 2 nodes. While data is being serviced on
the RF in one direction, the other side's RS232 RX buffer can fill up and
compensate for the half-duplex nature and line turnaround delays.
When it's in Binary mode, the scheme above isn't used: the buffer is filled
up, and then the INTs are enabled, so the CPU can spend most of its time
doing other things. RTS/CTS is used there.
And yes, afterwards my own small but fast RTOS will be plugged back in, so
I'll use counting semaphores to handle the ring buffers.
> That raises a bunch of design issues to mind. Why such a large gap? Why
> not add to buffer as soon as any room is available? And finally why two
> different variables to keep track of the room in the ring buffer?
Partially as above: the large gap is there to reduce CPU loading if too
much data is being written too fast into the ring buffer.
I find using a Head and a Tail more convenient to manage than having to
reset the pointer all the time. It's also easier to do the groundwork w/o
the RTOS, but have the whole system built up so it's easy to assign tasks
and only change the "foreground" code, which then _actually_ becomes
foreground code instead of some sort of superloop. The dead CPU time is
then put to better use. I'm trying to optimise what will be in what task,
and minimise the # of tasks, because many full-featured OSs reschedule as
slow as buggery. Latency issues galore then.
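
For illustration only, here is a minimal sketch of a Head/Tail ring buffer with a fill count and a refill "gap" of the kind described above. It is not the actual driver code: the names tx_ring_t, tx_ring_put, tx_ring_get and tx_ring_ready_for_refill, and the sizes TX_BUF_SIZE and TX_LOW_WATER, are all hypothetical. Only tx_chars comes from the discussion. On real hardware the count update in tx_ring_put would also need protecting against the TX interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_BUF_SIZE  16u  /* hypothetical buffer size */
#define TX_LOW_WATER 4u   /* hypothetical level below which refilling resumes */

static_assert(TX_LOW_WATER < TX_BUF_SIZE, "low-water mark must fit in buffer");

typedef struct {
    uint8_t data[TX_BUF_SIZE];
    volatile size_t head;      /* next slot the foreground code writes */
    volatile size_t tail;      /* next slot the TX interrupt reads */
    volatile size_t tx_chars;  /* number of bytes currently queued */
} tx_ring_t;

static void tx_ring_init(tx_ring_t *rb)
{
    rb->head = 0;
    rb->tail = 0;
    rb->tx_chars = 0;
}

/* Foreground side: queue one byte; returns false if the buffer is full. */
static bool tx_ring_put(tx_ring_t *rb, uint8_t c)
{
    if (rb->tx_chars == TX_BUF_SIZE) {
        return false;
    }
    rb->data[rb->head] = c;
    rb->head = (rb->head + 1u) % TX_BUF_SIZE;  /* wrap, no pointer reset */
    rb->tx_chars++;  /* read-modify-write: guard against the ISR on target */
    return true;
}

/* Interrupt side: fetch one byte; returns false if nothing is queued. */
static bool tx_ring_get(tx_ring_t *rb, uint8_t *c)
{
    if (rb->tx_chars == 0) {
        return false;
    }
    *c = rb->data[rb->tail];
    rb->tail = (rb->tail + 1u) % TX_BUF_SIZE;
    rb->tx_chars--;
    return true;
}

/* The "gap": only start refilling once the queue has drained this far. */
static bool tx_ring_ready_for_refill(const tx_ring_t *rb)
{
    return rb->tx_chars <= TX_LOW_WATER;
}
```

With these assumed sizes, a writer that finds the buffer full would wait until only 4 bytes remain queued before writing again, instead of waking up for every freed slot.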
> I don't see how that could happen given your outline. The pointer is
> updated exactly once, as is the count. One possibility is that there is an
> access control problem.
>
> Are your pointers and counters declared as volatile?
Of course.
> Is access to both protected with interrupt disabling and re-enabling
> sequences?
That's a good idea, but I know now that's not where this specific problem
is. It's very repeatable with differing Basic application programs, so I
doubt it's just that.
It's mainly that a CG problem with the context switch threw me off.
> One thing that can happen on the ARM that won't on micros with small
> register sets is that the pointer and counters can be held in the register
> sets and won't get spilled out to memory unless they are declared as
> volatile. I'd expect the issue to be a little more dramatic in that case
> but...
>
> This does 'feel' more like an access control race where the interrupt
> decrements the counter and returns to normal execution which immediately
> overwrites it with an old updated value. Something like:
>
> r1 = tx_chars
> - jump to transmit interrupt
> - Save appropriate registers
> - Perform transmit
> - r1 = tx_chars
> - r1 -= 1
> - tx_chars = r1
> - restore saved registers
> - return from interrupt
> r1 += 1
> tx_chars = r1
>
> And now tx_chars is one higher than it should be. As bugs go these ones
> tend to be very timing sensitive.
>
> Ring any bells?
That's a very good point you make Robert.
I will make sure TX INTs are disabled when I modify tx_chars.
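
As a rough sketch of that guard (not the real driver code): the foreground update of tx_chars is wrapped in a save/disable/restore sequence so the load-modify-store cannot be interrupted. Everything except tx_chars is a hypothetical stand-in. In particular, tx_irq_enabled models the UART's TX interrupt-enable bit as a plain flag so the example can run on a host, and tx_isr_try_run models the interrupt firing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the UART TX interrupt-enable bit. */
static volatile bool tx_irq_enabled = true;

/* Shared between the foreground code and the TX interrupt. */
static volatile uint32_t tx_chars = 0;

/* Disable the TX interrupt and report whether it was enabled before. */
static bool tx_irq_save_and_disable(void)
{
    bool was_enabled = tx_irq_enabled;
    tx_irq_enabled = false;
    return was_enabled;
}

/* Put the TX interrupt back into the state it was in before the guard. */
static void tx_irq_restore(bool was_enabled)
{
    tx_irq_enabled = was_enabled;
}

/* Model of the TX interrupt: it only runs while enabled and data is queued. */
static bool tx_isr_try_run(void)
{
    if (!tx_irq_enabled || tx_chars == 0) {
        return false;
    }
    tx_chars--;
    return true;
}

/* Foreground side: the load / add / store on tx_chars is now indivisible. */
static void tx_chars_increment(void)
{
    bool was_enabled = tx_irq_save_and_disable();
    uint32_t r1 = tx_chars;  /* load */
    r1 += 1u;                /* modify: an ISR cannot slip in here */
    tx_chars = r1;           /* store */
    tx_irq_restore(was_enabled);
}
```

Restoring the previous state, rather than unconditionally re-enabling, keeps the guard safe to use from code that already runs with the TX interrupt off.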
I've had it sitting there dumping data out while executing Basic for hours,
and it ran with no problems, whereas a specific Basic program caused it to
hang in that "loop" within 30-40 secs at the most.
It must be a library call somewhere that messes up something.
I'll certainly keep you posted.
B regards,
Kris