Vertical retrace
  several
  (ed. 98/2/4)

  A collection of ideas and possibilities to synchronise a graphics
  application with the monitor ray.

  1.  Summarizing what has been discussed

  By Brian Julin, Jan 28, 1998:


  1. If an IRQ is available (we haven't seen many SVGA cards that do
     more than _say_ they can activate IRQ), the kernel _could_ catch
     it. However, latency is high for a signal to get to user space, so
     the IRQs are of limited use.  We would actually want the signal to
     arrive when _blanking_, not vretrace, is active anyway, and for the
     signal to be issued even before that so the application gets it at
     or right before blanking starts.

  2. Some common uses for vretrace (page flipping) should be available
     as kernel-space driver functions to avoid this latency.

  3. If using the RTC, it has to be set up for one-shot mode and reset
     on every trigger, as well as calibrated with the hardware, the way
     the PC speaker driver does it, but fortunately at a much lower
     frequency :-).  The periodic RTC freqs don't yield good results
     because they are not flexible enough.

  4. Very few cards seem to allow you to read the CRTC address counter
     back out of the card.  This being the case, the only way to know
     when blanking is asserted is to calibrate with the vsync pulse via
     polling the "retrace active" registers, and calculate the amount of
     time to offset the event from the mode parameters.  Thus you have
     something like:

     ___________________________________________________________________

              (RTC IRQ)  -> Send Event
                            Set one-shot to 5ms
                            return
              (RTC IRQ)  -> Poll until vretrace bit active
                            unset vretrace bit latch
                            do some calibration calculations.
                            Set one-shot to 60ms + error term
                            return
     ___________________________________________________________________


  5. The kernel clock calibrator might benefit from using the "hretrace
     active" register once it has locked onto the right scanline if it
     does not drift too far.  It still must check back occasionally and
     verify sync with the vretrace.

     The new method of sharing the GC page could allow for some driver-
     libs to aid calibration from user space -- a field could be set in
     the shared page by the driver when the event is issued.  The
     driver-lib could compare jiffies and pass back an average latency
     for the signal and context switch.  Then the clock sync loop would
     be able to work that latency into the offset it applies to the
     vsync pulse.  Thus only the kernel ever does active polling, and
     does so very efficiently, we can hope.
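
  As a rough illustration of the "calculate the amount of time to offset
  the event from the mode parameters" step in point 4, the sketch below
  computes a one-shot delay from the start of the vsync pulse to the
  start of the next blanking period, and shortens it by the average
  signal latency mentioned in point 5.  The struct, its field names and
  both functions are hypothetical and not part of any driver interface;
  the calculation assumes the event is wanted at the moment the last
  visible scanline ends.

```c
#include <stdint.h>

/* Hypothetical mode description; field names are illustrative only. */
struct mode_timing {
    uint32_t pixel_clock_hz; /* dot clock in Hz */
    uint32_t htotal;         /* dot clocks per scanline, blanking included */
    uint32_t vdisplay;       /* number of visible scanlines */
    uint32_t vsync_start;    /* scanline on which the vsync pulse begins */
    uint32_t vtotal;         /* scanlines per frame, blanking included */
};

/* Duration of one complete scanline in nanoseconds (rounded down). */
uint64_t line_time_ns(const struct mode_timing *m)
{
    return (uint64_t)m->htotal * 1000000000ull / m->pixel_clock_hz;
}

/*
 * Delay from the start of the vsync pulse to the start of the next
 * vertical blanking period, minus the measured delivery latency.
 * Returns 0 if the latency is larger than the delay itself.
 */
uint64_t oneshot_delay_ns(const struct mode_timing *m, uint64_t latency_ns)
{
    /* rest of this frame after vsync starts, plus the visible lines */
    uint64_t lines = (uint64_t)(m->vtotal - m->vsync_start) + m->vdisplay;
    uint64_t delay = lines * line_time_ns(m);

    return delay > latency_ns ? delay - latency_ns : 0;
}
```

  With the standard 640x480 VGA timing (25.175 MHz dot clock, 800 dot
  clocks per line, 525 lines per frame, vsync starting on line 490) and
  no latency correction, this gives a delay of roughly 16.4 ms.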


  2.  Using the vertical blank


  Sengan Baring-Gould found this method, using not the vertical retrace,
  but the vertical blank. The author is Bruce Foley,
  brucef@central.co.nz.

  Sengan quotes:

  This whole issue of snow while using mode 13h baffles me a bit.

  I never really had trouble with this even when I was using the
  vertical retrace to time the writing of my double-buffer out to video
  memory.

  I did have a problem with slightly glitchy scrolling though.  Not so
  much in native DOS, but in a Windows 95 DOS box it was noticeable
  enough to be annoying.  I guess the main reason for this was that a
  Win95 DOS box steals back chunks of system resources via interrupts...
  The price we pay for preemptive multitasking I guess.

  Anyway, using the vertical non-display has all but eliminated this
  last little problem.  I think the main reason for this is the huge
  amount of extra time you get to write out your data.  Here are the
  timing comparisons on a VGA, according to Wilton:

  Vertical Retrace:       0.064 Milliseconds

  Vertical Nondisplay:    1.430 Milliseconds (!)

  (timings based on 640x480)

  In computer terms, that's a big difference!!! 1.366 Milliseconds extra
  for writing out that doublebuffer - awesome.
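
  The two figures are consistent with the standard VGA 640x480 timing,
  which is assumed (not quoted from Wilton) in the small sketch below:
  a 25.175 MHz dot clock, 800 dot clocks per scanline, 525 scanlines
  per frame of which 480 are visible, and a vsync pulse lasting 2
  scanlines.  The macro and function names are made up for the example.

```c
/* Standard VGA 640x480 timing, assumed for this example. */
#define DOT_CLOCK_HZ   25175000.0  /* dot clock */
#define DOTS_PER_LINE  800.0       /* dot clocks per scanline */
#define LINES_TOTAL    525         /* scanlines per frame */
#define LINES_VISIBLE  480         /* displayed scanlines */
#define LINES_VSYNC    2           /* length of the vsync pulse */

/* Time taken by one scanline, in milliseconds. */
double line_ms(void)
{
    return DOTS_PER_LINE / DOT_CLOCK_HZ * 1000.0;
}

/* Length of the vertical retrace (vsync pulse), in milliseconds. */
double retrace_ms(void)
{
    return LINES_VSYNC * line_ms();
}

/* Length of the whole vertical non-display period, in milliseconds. */
double nondisplay_ms(void)
{
    return (LINES_TOTAL - LINES_VISIBLE) * line_ms();
}
```

  Under these assumptions retrace_ms() comes to about 0.064 and
  nondisplay_ms() to about 1.430, so the non-display period is 45
  scanlines long against 2 for the retrace itself.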

  The price comes in the form of more complex code.  This is because
  unlike the vertical retrace, there is no VGA status register that we
  can read to find out when the vertical non-display is actually
  happening.

  The solution to this is bit 0 of the VGA CRT status register (Input
  Status Register 1, port 3DAh in the code below).  This bit signifies
  the Display Enable state.  0 means enabled, 1 means disabled.  The
  bit goes to 1 for every horizontal retrace, and guess what?  Once the
  gun hits the bottom of the screen, the bit stays at 1 until the gun
  starts drawing lines at the top again.

  This means all we have to do is time how long it takes for a
  horizontal retrace to occur, and store this value as a trigger.  When
  the Display Enable is disabled (1) for longer than our trigger
  value (which we double - just to be sure) then we know that the
  Vertical Non-display has occurred, and can safely write to video
  memory without risk of getting caught in the middle of a redraw.  Cool
  eh?

  Implementation will come in two components then.  One to record the
  trigger value, and another one, which will delay writing to Video
  Memory until the trigger value has been exceeded.

  Your client program (probably C, like mine) will look like this.


  ______________________________________________________________________
  extern int vndtimeout(void);    // assembler routine, see below

  int vndtime;            // Our trigger variable

  void Initialize(void)
  {
  ....
    vndtime = vndtimeout();       // trigger returned as a 16 bit value
  ...
  }
  ______________________________________________________________________


  Then, somewhere in your main line, when you are ready to write out
  your double buffer, you call the routine (in external 386 enabled
  Assembler I hope!)  to do it.  You will need to amend the routine to
  examine bit 0 of the CRT register, and do some time-out logic against
  the previously stored trigger.

  Knowing that you have definitely read the above before looking at the
  code below, it should all make perfect sense.  Here is the code
  fragment for getting the trigger value:


  ______________________________________________________________________
  VGA_INPUT_STATUS_1              EQU     3DAh

  PUBLIC _vndtimeout

  _vndtimeout PROC

  mov     dx, VGA_INPUT_STATUS_1

  ; wait for vertical retrace

  L101:
  in      al, dx
  test    al, 8
  jz      L101

  ; initialize the loop counter
  mov     cx, 0FFFFh

  ; wait for display enable

  L102:
  in      al, dx
  test    al, 1
  jnz     L102

  ; wait for the end of display enable

  cli

  L103:
  in      al, dx
  test    al, 1
  jz      L103

  ; loop until display enable becomes active

  L104:
  in      al, dx
  test    al, 1
  loopnz  L104

  sti

  neg     cx      ; make CX positive
  add     cx, cx  ; double it for safety
  mov     ax, cx  ; save the result in return register

  ret

  _vndtimeout ENDP
  ______________________________________________________________________


  Ok.  ax now holds the trigger, which MSC & Borland interpret as the
  return variable for word sized values.
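
  For readers who prefer C, here is a sketch of the same measurement.
  It is not Wilton's or Bruce's code: the port read is replaced by a
  hypothetical callback (status_reader) so that the logic can be tried
  without VGA hardware, and the replay helper is a made-up simulator.
  A real version would read port 3DAh with interrupts disabled, as the
  assembler does.  Also note that the result counts loop iterations,
  not time, so it is only meaningful to a waiting loop that costs about
  the same per iteration.

```c
#include <stddef.h>
#include <stdint.h>

#define VR_BIT 0x08   /* bit 3: vertical retrace in progress */
#define DE_BIT 0x01   /* bit 0: 1 while the display is disabled */

/* Hypothetical stand-in for "in al, dx" on port 3DAh. */
typedef uint8_t (*status_reader)(void *ctx);

/* Count the status reads spanning one horizontal blanking interval
   and return twice that count as the trigger value. */
uint16_t measure_trigger(status_reader read_status, void *ctx)
{
    uint16_t polls = 0;

    while (!(read_status(ctx) & VR_BIT))
        ;                           /* wait for vertical retrace */
    while (read_status(ctx) & DE_BIT)
        ;                           /* wait until a line is being drawn */
    while (!(read_status(ctx) & DE_BIT))
        ;                           /* wait for that line to end */
    do {
        polls++;                    /* count reads until blanking ends */
    } while ((read_status(ctx) & DE_BIT) && polls < 0x7FFF);

    return (uint16_t)(polls * 2);   /* doubled for safety */
}

/* Simulated port: replays a byte sequence, then keeps returning
   the last byte. */
struct replay {
    const uint8_t *bytes;
    size_t len;
    size_t pos;
};

uint8_t replay_read(void *ctx)
{
    struct replay *r = ctx;
    uint8_t v = r->bytes[r->pos];

    if (r->pos + 1 < r->len)
        r->pos++;
    return v;
}

/* Example run: retrace, one drawn line, then a horizontal blank. */
uint16_t example_trigger(void)
{
    static const uint8_t seq[] = {
        0x00, 0x09, 0x09, 0x00, 0x00, 0x01, 0x01, 0x01, 0x00
    };
    struct replay r = { seq, sizeof seq, 0 };

    return measure_trigger(replay_read, &r);
}
```

  In the example run the final loop makes three reads before it sees
  the display enabled again, so example_trigger() returns 6.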

  The next code fragment makes use of this variable by pausing within a
  loop until the trigger value is exceeded.  Note that it should be used
  just prior to your access to video memory.  I actually included this
  code in the same routine that does the double buffer copy (via MOVSD).
  The reason being that if your code was in another routine, then there
  is overhead code being executed before Video Memory is accessed.  A
  waste of precious time!!!

  ______________________________________________________________________
  EXTRN _vndtime:WORD             ; Our C trigger value

  PUBLIC _write_double_buffer

  _write_double_buffer PROC

  mov     dx, VGA_INPUT_STATUS_1
  cli                             ; Stop interrupts

  ; wait for display enable

  L201:
  in      al, dx
  test    al, 1
  jnz     L201

  ; wait for the end of display enable

  L202:
  mov     cx, _vndtime    ; CX = maximum number of loops

  L203:
  in      al, dx
  test    al, 1
  jz      L203

  ; wait for display enable

  L204:
  in      al, dx
  test    al, 1
  loopnz  L204
  jz      L202    ; jump if display enable detected
  sti
  ______________________________________________________________________


  If we get to here, then it means the Display Enable Bit has been set
  to 1 for longer than our trigger value.  The Vertical Non-display is
  now happening, so you can start writing out to video memory....


  Well, that's it.  Even if you don't understand the above straight
  away, I can vouch for its correctness.  In all honesty, that is
  pretty much how it appears in the book.  So as long as you understand
  the principle of what it is doing and how to use it, then you are away
  and laughing.

  Since I have just spent the last hour or so of my time putting this
  all together, perhaps you can return the favor and tell me about what
  you do, and perhaps why it is that you spend hours staring into a
  computer screen (much like myself).

  I have been a programmer for many years, but have only recently got
  into PC based programming.  I love it, since it is so much less
  restrictive than what I am used to.  I dream of being a games
  programmer, and plan to go part time in my job after I have released
  my first game (probably as shareware) and after I have finished a
  major project I am heavily involved in at work at the moment.

  Anyway, write back, and let me know how you guys get along,

  All the best,


  Bruce.

  --------------------------------------------------------

  This next mail is in response to questions about the above...

  -----------------------------------------------------------

  I will try to answer some of your questions...

  As you know, the Vertical Retrace (VR) is defined by bit 3 (I think)
  of the CRT register being set to on (1).  When this bit is on, we know
  that the display gun is doing a diagonal retrace from the bottom right
  of the screen back to the top left.

  Ok, so now you would like a definition of the VN right?  Consider it
  to be a superset of the VR.  The VN includes the VR but also includes
  some extra time BEFORE the VR has even started.  The key to
  understanding this lies in bit 0 of the CRT register.  As you know,
  this bit is like a mini VR, except it is for the horizontal retrace
  (HR).  The time it takes for each HR is constant.  Therefore, we can
  measure the amount of time that a single HR takes while the screen
  is being drawn.  That is what the first routine I gave you does.  If
  you look near the end of the code in the first routine, you will see
  me add cx to cx.  cx holds the time that the HR took.  It is doubled,
  simply as a safety precaution - since we will be using it as a trigger
  to tell us when the VN is actually happening.  Consider this: once the
  gun hits the bottom of the screen, the HR stays set to 1 until the VR
  has completed and starts drawing lines at the top of the screen again.
  All we need to do is check to see if it has been set to 1 for an
  amount of time that EXCEEDS the value computed (and stored in cx) from
  the first routine.  If it does, then we know that it must have
  finished drawing the screen.  At this point, it is still a long way
  off from starting the VR, as the timings indicate.

  As for the second routine, this was to demonstrate how you would use
  the value calculated in the first routine.  The variable _vndtime is a
  C variable that holds cx from the first routine.  You will notice I
  moved it from cx to ax just before the ret.  This is because C
  programs that accept 16 bit return values expect them to be in ax.

  The second routine is somewhat confusing to read and understand.  This
  is because of its subtle use of loopnz and jz instructions.  In
  English, the logic flow goes something like this:


  1. Load cx with the trigger value.

  2. Loop 1: Repeat until the scan line being drawn has finished, that
     is, until bit 0 reads 1.

  3. Loop 2: Repeat until either cx = 0 or a scan line is being drawn.

     (The key to this loop is the loopnz instruction.  It decrements cx
     and also checks the zero flag)

  4. Go back to step 1 IF a scan line was being drawn.

     (The zero flag would have been set via the test instruction if this
     was the case)

  5. If we are here, then cx must have been decremented to zero!  This
     means our trigger timeout value has been reached and the VN is now
     official.  We can start writing to video memory right now, even
     though the VR has not started yet.  This gives us a big head start
     in terms of the amount of time we have available before the screen
     starts to repaint itself.
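
  The same flow, following the labels of the assembler routine, can be
  sketched in C as below.  This is an illustration only: the port read
  is replaced by a hypothetical callback (status_reader), the replay
  helper is a made-up simulator so the loop can be tried without VGA
  hardware, and the cli/sti pair of the original is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define DE_BIT 0x01   /* bit 0: 1 while the display is disabled */

/* Hypothetical stand-in for "in al, dx" on port 3DAh. */
typedef uint8_t (*status_reader)(void *ctx);

/* Returns once bit 0 has read 1 for `trigger` consecutive polls.
   `trigger` is expected to be nonzero. */
void wait_for_nondisplay(status_reader read_status, void *ctx,
                         uint16_t trigger)
{
    while (read_status(ctx) & DE_BIT)
        ;                               /* L201: wait for a drawn line */
    for (;;) {
        uint16_t left = trigger;        /* step 1 (L202) */

        while (!(read_status(ctx) & DE_BIT))
            ;                           /* loop 1 (L203): line ends */
        for (;;) {                      /* loop 2 (L204) */
            if (!(read_status(ctx) & DE_BIT))
                break;                  /* new line: back to step 1 */
            if (--left == 0)
                return;                 /* step 5: timeout reached */
        }
    }
}

/* Simulated port: replays a byte sequence, then keeps returning
   the last byte; also counts how many reads were made. */
struct replay {
    const uint8_t *bytes;
    size_t len;
    size_t pos;
    size_t reads;
};

uint8_t replay_read(void *ctx)
{
    struct replay *r = ctx;
    uint8_t v = r->bytes[r->pos];

    if (r->pos + 1 < r->len)
        r->pos++;
    r->reads++;
    return v;
}

/* Example run: a short horizontal blank, another line, then a long
   blank; returns the number of reads made before the wait ended. */
size_t example_wait(void)
{
    static const uint8_t seq[] = {
        0x01, 0x00, 0x00, 0x01, 0x01, 0x00, 0x00,
        0x01, 0x01, 0x01, 0x01, 0x01, 0x01
    };
    struct replay r = { seq, sizeof seq, 0, 0 };

    wait_for_nondisplay(replay_read, &r, 4);
    return r.reads;
}
```

  In the example run the two-read horizontal blank is too short to use
  up a trigger of 4, so the loop starts over; the call returns only in
  the long blank that follows, after 12 reads in total.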