Yahoo Groups archive

AVR-Chat

Re: [AVR-Chat] Re: STK500, STK501 & AVR Studio not a complete environment

2004-04-25 by erikc

Gotcha.  Much more sensible.  In fact, that looks a little
like FORTH because that VM (and hardware based on it) also
uses two stacks.  BTW, has anybody implemented a FORTH
interpreter on the AVR?

Erikc  - firewevr@airmail.net
///
"An Fhirinne in aghaidh an tSaoil."
"The Truth against the World."
        -- Bardic Motto
/// Wanted: Will trade illusion of safety for slightly used
governmental system w/ checks and balances.

----- Original Message -----
From: "Paul Curtis" <plc@rowley.co.uk>
To: <AVR-Chat@yahoogroups.com>
Sent: Sunday, April 25, 2004 18:13
Subject: RE: [AVR-Chat] Re: STK500, STK501 & AVR Studio not a complete environment



RE: [AVR-Chat] Re: STK500, STK501 & AVR Studio not a complete environment

2004-04-25 by Paul Curtis

Eric,

> > > > As for a compiler, I find GCC code to be unsuited to
> > > > the AVR
> > >
> > > What about gcc do you find unsuitable for the avr?
> >
> > The fact that it unifies the two stacks; the entry/exit code for a
> > function is particularly unappealing.
> 
> Can you explain this, perhaps with an example?

Sure.

Take this simple source code:

void g(int *);
void h(int);

void foo(void)
{
  int x;
  g(&x);
  h(x);
}

Gcc generates this with -O3, an abomination in my eyes:

foo:
/* prologue: frame size=2 */
        push r28
        push r29
        in r28,__SP_L__
        in r29,__SP_H__
        sbiw r28,2
        in __tmp_reg__,__SREG__
        cli
        out __SP_H__,r29
        out __SREG__,__tmp_reg__
        out __SP_L__,r28
/* prologue end (size=10) */
        mov r24,r28
        mov r25,r29
        adiw r24,1
        rcall g
        ldd r24,Y+1
        ldd r25,Y+2
        rcall h
/* epilogue: frame size=2 */
        adiw r28,2
        in __tmp_reg__,__SREG__
        cli
        out __SP_H__,r29
        out __SREG__,__tmp_reg__
        out __SP_L__,r28
        pop r29
        pop r28
        ret

== 26 words ==

It uses Y as a frame pointer and combines what should be two separate
stacks (a hardware call stack and a software data stack) into one
unified stack, managing the hardware stack with software and bringing a
lot of overhead with it.  Not only that, it needs to disable interrupts
on function entry and exit, something that's bad form when you need fast
interrupt response.
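
To make the interrupt point concrete, here is a minimal sketch in C (a host-side toy model, not AVR code) of why the listing brackets the stack pointer update with cli. The stack pointer is written as two separate 8-bit registers, so an interrupt landing between the two writes would see a half-updated value. The names sp_regs, sp_read and sp_torn_value are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the AVR stack pointer as two 8-bit I/O registers. */
typedef struct {
    uint8_t sph;  /* high byte, written by "out __SP_H__,r29" */
    uint8_t spl;  /* low byte,  written by "out __SP_L__,r28" */
} sp_regs;

/* Combine the two halves into the 16-bit value the hardware would use. */
uint16_t sp_read(const sp_regs *sp)
{
    return (uint16_t)((sp->sph << 8) | sp->spl);
}

/*
 * Return the stack pointer value an interrupt handler would see if it
 * fired after the high byte was written but before the low byte was.
 * The high-byte-first order follows the listing above.
 */
uint16_t sp_torn_value(uint16_t old_sp, uint16_t new_sp)
{
    sp_regs sp = { (uint8_t)(old_sp >> 8), (uint8_t)(old_sp & 0xFF) };

    sp.sph = (uint8_t)(new_sp >> 8);  /* first OUT has happened */
    /* ... hypothetical interrupt arrives here ... */
    return sp_read(&sp);              /* low byte is still the old one */
}
```

With an old value of 0x0401 and a 2-byte frame (new value 0x03FF), the torn value is 0x0301, which is neither the old nor the new stack pointer. When the high byte does not change, the intermediate value is simply the old one and nothing goes wrong, so the hazard only bites when the adjustment crosses a 256-byte boundary; the compiler cannot know that at compile time, hence the unconditional cli.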

CrossWorks, on the other hand, does much better:

_foo
        SBIW    R28, 2
        MOVW    R26, R28
        CALL    _g
        LD      R26, Y
        LDD     R27, Y+1
        CALL    _h
        ADIW    R28, 2
        RET

== 8 words ==
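
Read in C terms, the listing above treats Y as a pointer into a separate software data stack and never touches the hardware stack pointer. The following is only a rough sketch of that idea on a host machine, under stated assumptions: data_stack, dsp, last_h_arg and the bodies of g and h are hypothetical stand-ins, and int16_t is used to mirror the AVR's 16-bit int.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_STACK_WORDS 32

/* Software data stack, separate from the CPU's call/return stack. */
int16_t data_stack[DATA_STACK_WORDS];

/* Data stack pointer; grows downward and plays the role of Y (R29:R28). */
int16_t *dsp = data_stack + DATA_STACK_WORDS;

/* Hypothetical bodies for g and h so the sketch can be exercised. */
int16_t last_h_arg;
void g(int16_t *p) { *p = 1234; }
void h(int16_t v)  { last_h_arg = v; }

void foo(void)
{
    dsp -= 1;   /* SBIW R28, 2      : reserve one 16-bit local (x)   */
    g(dsp);     /* MOVW R26, R28    : pass &x, then CALL _g          */
    h(*dsp);    /* LD / LDD from Y  : load x, then CALL _h           */
    dsp += 1;   /* ADIW R28, 2      : release the local              */
}
```

Because the local lives on the data stack, allocating and freeing it is a plain pointer adjustment, and return addresses stay on the hardware stack untouched. That is why no interrupt masking appears in this version.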

Now, which would you think is more efficient from an execution
standpoint, a code density standpoint, and an interrupt latency
standpoint?  No contest.

-- Paul.

RE: [AVR-Chat] Re: STK500, STK501 & AVR Studio not a complete environment

2004-04-26 by mpdickens

--- Paul Curtis <plc@rowley.co.uk> wrote:

> Gcc generates this with -O3, an abomination in my
> eyes:

Larger numbers behind the -O option do not
automatically cause "better" optimization. There is no
universal definition of "better" optimized code; it
depends on what the application is and what you are
trying to accomplish. Any way you look at it,
optimization is a speed vs. code size tradeoff.

With that said, in the example that you cited, using
-O3, gcc attempts to inline all "simple" functions.
So, you got what you asked for: a lot of inline
functions resulting in large code size. For the AVR
target, this normally constitutes a large
pessimization due to the increase in code size. However,
there are exceptions (but your example is not one of
them).

So, to sum up your example, when you used the -O3
switch, you told gcc to include TONS of inline code.
gcc did exactly what you asked it to do: increase code
size.

To check my statements, query the internet for:

avr-libc-user-manual

and read the section of the manual on:

Selected general compiler options



Regards

Marvin Dickens

=====
Registered Linux User No. 80253
If you use linux, get counted at: 
http://www.linuxcounter.org


