Index: [thread] [date] [subject] [author]
  From: wolfwings@lightspeed.net
  To  : ggi-develop@eskimo.com
  Date: Thu, 9 Jul 98 12:15:31 pst

Re: Fast lines

>Why bother optimizing for these ancient machines anyway.

Because they're still in use? You sound like Microsoft. "User: I can't run your
program. Microsoft: Buy a new system."

I'm sorry, but if it can be optimized to run well on those older systems, then it
should be. I could re-hash the same arguments as to why we include support for
text-mode consoles even though that's a video mode _nobody_ uses anymore today,
as I think I heard one person put it quite a while back when they offered to
implement a chipset driver, but said they wouldn't implement support for any
non-graphical modes.

>Shifts aren't necessary in the fixed-point inner loop.
>You could either use an if, or maybe use the special x86 instruction
>that sign extends a number to twice the width.

That special x86 instruction is the same speed as a shift, actually. :-) And
if you're using ifs, then it's not fixed-point but just fancy carry-over and
wrap-around logic, i.e. plain straight Bresenham.
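
To make the distinction concrete, here is a minimal C sketch (not code from
GGI) of the two inner loops for an X-major line, assuming 0 <= dy <= dx and
non-negative coordinates. pixel_log and put_pixel are hypothetical stand-ins
for a real framebuffer write. The first version keeps Y in 16.16 fixed point
and shifts on every pixel; the second uses the usual Bresenham error term
and an if.

```c
#include <stdint.h>

/* Hypothetical pixel sink, for illustration only: records plotted points. */
typedef struct { int x[256], y[256]; int n; } pixel_log;

static void put_pixel(pixel_log *log, int x, int y)
{
    if (log->n < 256) { log->x[log->n] = x; log->y[log->n] = y; log->n++; }
}

/* X-major line with a 16.16 fixed-point Y accumulator.
 * Assumes 0 <= dy <= dx and non-negative coordinates.
 * The inner loop needs a shift to recover the integer row. */
static void line_fixed(pixel_log *log, int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0, dy = y1 - y0;
    int32_t y = ((int32_t)y0 << 16) + 0x8000;            /* pixel centre */
    int32_t step = dx ? (int32_t)(((int64_t)dy << 16) / dx) : 0;
    for (int x = x0; x <= x1; x++) {
        put_pixel(log, x, (int)(y >> 16));               /* shift per pixel */
        y += step;
    }
}

/* Same line with plain Bresenham: integer error term and an if. */
static void line_bresenham(pixel_log *log, int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0, dy = y1 - y0;
    int err = 2 * dy - dx;
    int y = y0;
    for (int x = x0; x <= x1; x++) {
        put_pixel(log, x, y);
        if (err > 0) { y++; err -= 2 * dx; }             /* step to next row */
        err += 2 * dy;
    }
}
```

Both loops do one add per pixel; the difference is only whether the row
comes out of a shift or out of a compare-and-branch.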

>On the Y-major lines it is possible to use the carry bit, which 
>simplifies things.

Oh? How so? Never seen that before. *shrug*
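
One possible reading of the carry-bit suggestion, sketched in C under my own
assumptions (0 <= dx < dy, and a hypothetical xs[] output array holding one
column per row): keep only the 16-bit fraction of X, add the per-row step to
it, and let the overflow of that add bump X. On x86 that would be an add
followed by an adc; C has no carry flag, so the sketch detects the
wrap-around by comparison instead.

```c
#include <stdint.h>

/* Y-major line: writes the X column for each of the dy + 1 rows into xs.
 * Assumes 0 <= dx < dy so the per-row step fits in 16 fractional bits. */
static void line_ymajor_carry(int *xs, int x0, int dx, int dy)
{
    uint16_t frac = 0x8000;                      /* start at pixel centre */
    uint16_t step = dy ? (uint16_t)(((uint32_t)dx << 16) / (uint32_t)dy) : 0;
    int x = x0;
    for (int i = 0; i <= dy; i++) {
        xs[i] = x;
        uint16_t next = (uint16_t)(frac + step);
        x += (next < frac);                      /* carry out of 16-bit add */
        frac = next;
    }
}
```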

>Have you got any numbers for those processors.

No, I don't anymore, except for the 286/7. I know that before I upgraded it
to a 486, it was able to crank out 14000 100-pixel lines per second with a
sliced-Bresenham algo, and that speed was mostly limited by memory bandwidth.
I also had 8 "axes" for the slices to run along: not just plus/minus X/Y, but
diagonal slices too, so the slices always stayed at least 2 pixels long.
PiTA to write that code, but the performance was worth it. :-)
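
For anyone who hasn't seen it, here is a rough C sketch of the commonly
published run-length-slice form of Bresenham for a single X-major octant.
It is not the 8-axis assembler described above, and the span type and the
line_sliced name are made up for illustration. The point is that the loop
runs once per scanline, not once per pixel, and each run can then be filled
with a block store.

```c
#include <stddef.h>

/* Hypothetical span record: `len` pixels starting at (x, y), going right. */
typedef struct { int x, y, len; } span;

static size_t emit(span *out, size_t cap, size_t n, int x, int y, int len)
{
    if (n >= cap) return n;                      /* drop spans past capacity */
    out[n].x = x; out[n].y = y; out[n].len = len;
    return n + 1;
}

/* X-major line as horizontal runs. Assumes x0 <= x1 and
 * 0 <= y1 - y0 <= x1 - x0. Returns the number of spans written. */
static size_t line_sliced(span *out, size_t cap,
                          int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0, dy = y1 - y0;
    size_t n = 0;
    if (dy == 0)                                 /* horizontal: one run */
        return emit(out, cap, n, x0, y0, dx + 1);

    int whole    = dx / dy;                      /* minimum run length */
    int adj_up   = (dx % dy) * 2;                /* error gained per run */
    int adj_down = dy * 2;                       /* error lost on a long run */
    int err      = (dx % dy) - adj_down;
    int first    = whole / 2 + 1;                /* first and last runs are */
    int last     = first;                        /* roughly half length */
    if (adj_up == 0 && (whole & 1) == 0) first--;
    if (whole & 1) err += dy;

    int x = x0, y = y0;
    n = emit(out, cap, n, x, y, first); x += first; y++;
    for (int i = 0; i < dy - 1; i++) {           /* one iteration per row */
        int run = whole;
        err += adj_up;
        if (err > 0) { run++; err -= adj_down; }
        n = emit(out, cap, n, x, y, run); x += run; y++;
    }
    return emit(out, cap, n, x, y, last);
}
```

On a near-diagonal line the runs shrink to a single pixel each, which is the
case the extra diagonal axes were there to avoid.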


The code was roughly 2.5k of assembler, I believe. In the stub library, though,
sliced Bresenham is just fine, and I think someone else just pointed out exactly why.

In target-specific accelerator libs, like plain VGA, yes, special code
should be put in place. We're talking about the stub libs, though. :-) They're
meant as "fallback" libs, not "always use these" libs; that's what the special
accelerator libs are for. :-)

>On the machines that I have tested (Pentium, MIPS R4600, R5000, R10000,
>microsparc)
>both of my straight Bresenham algorithms were faster than a sliced
>Bresenham.
>The fixed-point algorithm was fast too, especially on the longer lines.

That may well be. On the CPUs I've tested line drawing on (8086->80486, P200MMX,
PII), sliced Bresenham was nice and speedy, though. :-) We're talking from different
experiences; mine is mostly x86-based, with quite a bit of assembler code written
for it, not just a high-level language like C.

>PS The x86 architecture sucks, because it has too few registers,
>   so it isn't possible for the compiler to unroll the loops.

Blah, then the compiler doesn't optimize well enough. That's what assembly
code is for, though again, that's not for a stub library.

I have that old 8086 MCGA assembler line-drawing code somewhere on floppy, in
which I managed to keep all the values in registers. Mind you, this is on the
old 8086 real mode, which has all the limitations as to what registers can be
used for what purpose and all that rot, on top of the limited register count.

Today's compilers, for all their claims otherwise, often still can't optimize
code worth beans on architectures like the x86.

I think I used all but ES, SP, and SS, actually, and I restored all the register
values on return. And yes, it was a sliced-Bresenham algo, the one that special-cased
dual-axis-major lines as well as single-axis-major lines. I'll try to hunt it
up among the mountain of 5-year-old floppies I have lying around, if you'd
like. :-)

WolfWings ShadowFlight
wolfwings@lightspeed.net
