  From: WolfWings ShadowFlight <wolfwings@lightspeed.net>
  To  : ggi-develop@eskimo.com
  Date: Fri, 7 Aug 1998 02:09:57 -0700 (PDT)

Re: ggiGetPixelFormat and proposed new DirectBuffer scheme

On Fri, 7 Aug 1998, Jim Kjellin wrote:

>>On 386, the fastest way is a trio of bit-shifts, I believe.
>>ror ax, 8
>>ror eax, 16
>>ror ax, 8
>
>
>You should get another speedup using "xchg ah,al" instead of ror ax,8 since ax
>is 16-bit.

On a 386, the ror takes 2+1 cycles (the shift plus the opcode-size
override), according to the Intel tech-spec manual for the 386 that I
have. The xchg takes 4 cycles, so the ror is 1 cycle faster on the 386.
The same holds on the 486, where both xchg and ror are sped up by 1
cycle apiece. But yes, on Pentium and above the xchg is faster, as the
newer CPUs seem to positively _hate_ opcode-size overrides.
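
For anyone following along without an assembler handy, here is a rough
C model of what the three-rotate sequence quoted above does to EAX: it
reverses the four bytes. This is only a sketch. The helper names
(ror_ax_8, ror_eax_16, swap_bytes) are made up for illustration, and
the code models the result of the rotates, not their cycle counts.

```c
#include <stdint.h>

/* Models "ror ax, 8" (and equally "xchg ah, al"): swap the two bytes
   of the low 16 bits, leaving the upper 16 bits of EAX untouched. */
static uint32_t ror_ax_8(uint32_t eax)
{
    uint32_t ax = eax & 0xFFFFu;
    ax = ((ax >> 8) | (ax << 8)) & 0xFFFFu;
    return (eax & 0xFFFF0000u) | ax;
}

/* Models "ror eax, 16": swap the upper and lower 16-bit halves. */
static uint32_t ror_eax_16(uint32_t eax)
{
    return (eax >> 16) | (eax << 16);
}

/* The full sequence: ror ax,8 / ror eax,16 / ror ax,8. */
static uint32_t swap_bytes(uint32_t eax)
{
    eax = ror_ax_8(eax);    /* B3 B2 B1 B0 -> B3 B2 B0 B1 */
    eax = ror_eax_16(eax);  /* B3 B2 B0 B1 -> B0 B1 B3 B2 */
    eax = ror_ax_8(eax);    /* B0 B1 B3 B2 -> B0 B1 B2 B3 */
    return eax;
}
```

Since "xchg ah, al" produces the same value as "ror ax, 8", swapping one
for the other changes only the timing, which is the whole point of the
discussion above.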
             _
     _     _|_  WolfWings ShadowFlight
| | | | | | | | wolfwings@lightspeed.net
| | | | | | | | "Love is a bird,
|_|_| |_|_| | |  She needs to fly...
 _           /   Let all the hurt,
 \-.______,-'    Inside of you die..." - Madonna, Frozen
