Yahoo Groups archive

AVR-Chat

Re: Understanding, trapping and debugging stack overflow on an AVR

2011-01-15 by Chuck Hackett

> From: Gary Skinner
> 
> Are you sure the breakpoint at main is not because of a watchdog timeout?

Good thought, but yes, I'm sure, for a couple of reasons:

1) I disable the WDT when debugging (more below), and

2) I had a breakpoint in the 'init3' section, where I decipher the reason for a reset
(power, WDT, brownout, or software trap).  If the code reaches a "This can never
happen" condition, I use a macro to store a trap code in no-init SRAM and then loop
until the WDT resets the processor.  After the reset I pick up the (non-zero) trap
code and flash it on an on-board status LED, and the controller recovers if the trap
was caused by a particular event timing/sequence.  If a trap condition happens while
debugging, I can break into the infinite loop (since the WDT is disabled) and look
at the execution context.
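
For readers who want to try the trap-code idea, here is a minimal sketch of how it
might look.  It assumes avr-gcc with avr-libc (the .noinit section and cli() are
real avr-libc features).  The names trap_code, trap_record, trap_fetch and TRAP are
hypothetical, invented for this illustration; this is not the original code from the
controller.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/interrupt.h>
/* .noinit variables are not cleared by the startup code, so they survive a reset. */
#define NOINIT __attribute__((section(".noinit")))
#else
#define NOINIT /* host build: ordinary storage, only for trying out the logic */
#endif

/* Hypothetical name: holds the last trap code across a watchdog reset. */
volatile uint8_t trap_code NOINIT;

/* Store a non-zero code identifying which "can never happen" check fired. */
static inline void trap_record(uint8_t code)
{
    trap_code = code;
}

#ifdef __AVR__
/* Record the code, then spin until the watchdog resets the processor.
 * Assumption: interrupts are masked so that nothing else kicks the watchdog. */
#define TRAP(code) do { trap_record(code); cli(); for (;;) { } } while (0)
#endif

/* Call early after reset.  .noinit contents are garbage after power-up, so the
 * stored code is only trusted when the reset was caused by the watchdog. */
uint8_t trap_fetch(int was_watchdog_reset)
{
    uint8_t code = was_watchdog_reset ? trap_code : 0;
    trap_code = 0;
    return code;
}
```

With the WDT disabled under the debugger, the for (;;) loop in TRAP simply spins, so
breaking in there shows the context that raised the trap.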

> From: David VanHorn
> ....
> Suggestion:  Put a breakpoint at the reset vector, if you get there,
> you will know.

Yup, I did that, and I also added the "Catch All" ISR as suggested by Don (below).
No interrupts were occurring ...

> From: Don Kinzer
>
> ....
> > Other than that, any suggestions on tracking this
> > down would be welcome.
> It is often useful to capture the "reset flags" (typically in the MCUSR
> register but check your datasheet) and then clear them.  This register has
> bits for each reset cause.  If you get back to main() and no reset flags are
> set then you know that you got there by a means other than a reset.

Yup, had this code in for other reasons (see my answer to Gary above) and this is
how I determined that the 'restart' was not due to "the usual suspects".
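
A minimal sketch of the reset-flag check discussed above.  The bit positions follow
the MCUSR layout found on many ATmega parts (PORF, EXTRF, BORF, WDRF in bits 0 to 3);
that layout, and the register name itself, are assumptions to verify against the
datasheet, since some older parts call the register MCUCSR.  The enum and function
names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit layout (typical ATmega MCUSR); check the datasheet for your part. */
enum {
    FLAG_POWER_ON = 1u << 0, /* PORF  */
    FLAG_EXTERNAL = 1u << 1, /* EXTRF */
    FLAG_BROWNOUT = 1u << 2, /* BORF  */
    FLAG_WATCHDOG = 1u << 3  /* WDRF  */
};

enum reset_cause {
    CAUSE_NONE,      /* no flag set: we did not get here through a reset */
    CAUSE_POWER_ON,
    CAUSE_BROWNOUT,
    CAUSE_EXTERNAL,
    CAUSE_WATCHDOG,
    CAUSE_UNKNOWN    /* some other bit, e.g. a JTAG reset flag */
};

#ifdef __AVR__
#include <avr/io.h>
/* Read the flags once, early in startup, then clear them so that the next
 * pass through main() only sees flags from a genuinely new reset. */
uint8_t reset_flags_capture(void)
{
    uint8_t flags = MCUSR;
    MCUSR = 0;
    return flags;
}
#endif

/* Decode a captured flag byte.  Power-on is tested first because a power-up
 * can leave the brown-out flag set as well. */
enum reset_cause reset_cause_decode(uint8_t flags)
{
    if (flags == 0)            return CAUSE_NONE;
    if (flags & FLAG_POWER_ON) return CAUSE_POWER_ON;
    if (flags & FLAG_BROWNOUT) return CAUSE_BROWNOUT;
    if (flags & FLAG_EXTERNAL) return CAUSE_EXTERNAL;
    if (flags & FLAG_WATCHDOG) return CAUSE_WATCHDOG;
    return CAUSE_UNKNOWN;
}
```

If main() runs again and the decoded cause is CAUSE_NONE, execution arrived there by
a jump or a corrupted return address rather than by a hardware reset.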

> You can set a breakpoint on the "catch all" ISR to detect when that is the
> cause of restarting.

Thanks for that suggestion - I have now added that code.

> You can "seed" the stack area with a known value and then inspect it at
> various times to detect when the stack is all "used up".  This will be more
> complicated in a multi-tasking environment because you undoubtedly have
> multiple stacks.

FreeRTOS does a little of this for you even when 'stack checking' is not turned on
(it is a compile-time option).  With it turned on, it fills each task stack with a
known value so it can report the 'high-water' mark for each task.
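
The seeding technique can be sketched in a few lines of plain C.  This is a generic
illustration, not FreeRTOS source: the function names are hypothetical, and it
assumes a downward-growing stack (as on AVR) whose memory region is known to the
caller.  The fill value 0xA5 is the same byte FreeRTOS uses for its own stack fill.
In FreeRTOS itself, uxTaskGetStackHighWaterMark() reports the equivalent figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5u /* known seed value */

/* Seed a stack region before the task that owns it starts running. */
void stack_seed(uint8_t *base, size_t size)
{
    assert(base != NULL);
    memset(base, STACK_FILL, size);
}

/* Count the bytes that still hold the seed value, starting from the lowest
 * address.  With a downward-growing stack, base[0] is the last byte that would
 * be reached, so this is the remaining headroom (the "high-water" margin). */
size_t stack_unused(const uint8_t *base, size_t size)
{
    size_t n = 0;
    while (n < size && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

A result of zero means the region has been used all the way to its limit, which
strongly suggests an overflow into whatever lies below it.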

> >[...] but very shortly after that it re-executes the "main"
> >function (as shown by a breakpoint).
> That sounds like it could be data corruption due to the stack colliding with
> statically allocated data.  It could, however, be an interrupt for which you
> have defined no handler.

Or, in my case, since FreeRTOS allocates task stacks from the heap, it could be a
stack growing out of its allotted area, or a "wayward pointer" or "buffer overflow"
in another allocated area.  I was suspecting stack overflow, but ...

I ended up finding the problem - it was caused by a wayward "&" in pointer
calculations within one of my doubly-linked list routines.  The clue was that the
problem occurred when the second item was added to a particular linked list.  The
code was using the location of the pointer (on the stack) as opposed to the contents
of the pointer.  The return PC value happened to be located such that it was
clobbered when the function set the "previous" pointer of the linked element.
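
To make this class of bug concrete, here is a hypothetical reconstruction.  The
original routine is not shown in this thread, so the struct layout, the names, and
the exact form of the faulty expression are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;
    struct node *prev;
};

/* Insert n at the front of the list headed by *head. */
void list_push_front(struct node **head, struct node *n)
{
    assert(head != NULL && n != NULL);

    struct node *old = *head; /* local pointer: it lives on the stack */

    n->prev = NULL;
    n->next = old;

    if (old != NULL) {
        /* Only reached from the second insertion onward.
         *
         * Faulty form with a stray '&' (do NOT use):
         *     ((struct node *)&old)->prev = n;
         * That treats the stack address of 'old' as if it were a node and
         * writes at an offset from it, corrupting whatever is stored there.
         *
         * Correct form: follow the pointer's contents. */
        old->prev = n;
    }

    *head = n;
}
```

The first insertion never enters the if branch, which fits the symptom of the
failure appearing only when a second item is added.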

One of those head-slapping "dah!" moments :-(

Thanks to all for the assist.  Now I can get back to making progress instead of
beating my head against the wall - or at least move on to another wall :-)

I've got about three weeks to get as much functionality done as possible before I
have to have the next version of the controller in the field for testing during a
large gathering we'll be having in February.  Lots of trains running, lots of
opportunities to shake things out ...
 
Cheers,

Chuck Hackett
"Good judgment comes from experience, experience comes from bad judgment"
7.5" gauge Union Pacific Northern (4-8-4) 844 http://www.whitetrout.net/Chuck
