From: Steffen Seeger <s.seeger@physik.tu-chemnitz.de>
To : ggi-develop@eskimo.com
Date: Wed, 4 Aug 1999 17:06:25 +0200 (CEST)
Re: Accelleration: A summary
Hello,
> Ping-Pong buffers are implemented by taking two memory pages (isn't that a
> little huge for 2D acceleration?). One of the pages is mapped into user
> space, the other only in kernel space. Because this needs segfault trapping
> in the kernel, KGICON is not suitable, but KGI is the way to go.
>
> User level drivers get a remapped region, with size 2x the memory page
> size. This mmapped region is defined by a base pointer. Writes to this
> region are done in linear (increasing) order with wrap around after 2x
> memory page size, like:
>
> Offset = (Offset + CommandSize) & (2 * Pagesize - 1)
>
> Each time a write is done at the start of a new page (that is, at offset
> "0" and "Pagesize"), a page fault is generated (because the kernel
> driver has unmapped that page). This page fault is caught by the
> kernel driver, which maps the page, unmaps the other page, and tells some
> accelerator procedure it has work to do. That procedure takes the data of
> the unmapped page and does something with that data.
>
> The data can consist of two things:
>
> 1) Command structures which have to be interpreted by the kernel
> driver. The downside is that this makes the kernel accelerator
> bigger compared with the 2nd option. The upside is that the kernel only has
> to check coordinates for security (if this can crash the system;
> otherwise even that might be omitted). A note from me: 2D acceleration
> in the kernel doesn't take that much space, see for example the ViRGE
> kgicon driver.
>
> 2) A list of registers and data. The upside is that the kernel driver
> can be very small, since no interpretation is needed. The downside is that
> the registers and data must be checked. (No DMA commands allowed, maybe
> coordinate checking if this can crash the video chipset.)
>
> Am I right so far ?
Almost. This is the ggi-0.0.9 scheme. In short, the KGI-0.9 extensions are:
1) Pagesize can vary, but must be a power of two no smaller than CPU_PAGE_SIZE,
e.g. 4k, 8k, ..., 128k
2) more than one buffer (Page) is allowed (the total size of the
linear region seen by the application is accordingly bigger).
3) partial buffers (Offset % Pagesize bytes) can be flushed by accessing
(Offset+Pagesize) & (Regionsize-1)
4) given the proper hardware/trusted code, buffers can be fed directly
to the accelerator using DMA transfers (== the data protocol
is driver/hardware specific). Wrong data here may crash the system,
and if it can, the resource may only be exported to trusted code
(e.g. indicated by a suid-graphics bit).
5) each mapping is assigned a context and a priority, with context
switches done transparently by the driver. This allows you to
have a virtual accelerator for the 2D driver in X, a separate
(virtual) accelerator for the Mesa3D driver inside X, and other
processes may have their own mappings as well (direct rendering).
Planned, but only partially implemented:
6) dependent contexts may be attached to a context, e.g. to have
graphics-process-local AGP texture memory, etc.
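The address arithmetic behind points 1) to 3) can be sketched roughly as
follows. This is only an illustration: BUFFER_SIZE, NUM_BUFFERS and the
function names are made up here and are not KGI identifiers, and the sketch
assumes the number of buffers is itself a power of two so that the
(Regionsize-1) mask works.

```c
#include <stddef.h>

/* Hypothetical values for illustration, not taken from KGI. */
#define BUFFER_SIZE ((size_t)8192) /* "Pagesize": power of two, >= CPU page size */
#define NUM_BUFFERS ((size_t)4)    /* assumed to be a power of two as well */
#define REGION_SIZE (BUFFER_SIZE * NUM_BUFFERS) /* "Regionsize" */

/* Advance the write offset by one command, wrapping at the region end. */
static size_t next_offset(size_t offset, size_t command_size)
{
    return (offset + command_size) & (REGION_SIZE - 1);
}

/* Bytes already written into the current, partially filled buffer
 * (Offset % Pagesize; a mask suffices because BUFFER_SIZE is a power of two). */
static size_t partial_bytes(size_t offset)
{
    return offset & (BUFFER_SIZE - 1);
}

/* Offset to access in order to flush the partial buffer (point 3):
 * the same position inside the following buffer. */
static size_t flush_offset(size_t offset)
{
    return (offset + BUFFER_SIZE) & (REGION_SIZE - 1);
}
```

With these values, a write position of 100 bytes into the second buffer
(offset 8292) would be flushed by touching offset 16484, i.e. 100 bytes into
the third buffer.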
Well-designed hardware can easily use these mechanisms to provide an
API-independent, very low-level, high-performance interface with minimal
kernel-driver code. Broken hardware may need work-arounds, but
that's a problem with this broken hardware, not the concept.
This is basically the core KGI acceleration concept.
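For illustration, the fault-driven buffer handoff from the quoted description
can be modelled as a user-space toy. This is not driver code: the struct, the
function names and the sizes are invented for the sketch, the "fault" is just
a function call, and it assumes that buffer 0 starts out mapped and that
commands never straddle a buffer boundary.

```c
#include <stddef.h>

/* Hypothetical values for illustration, not taken from KGI. */
#define BUFFER_SIZE ((size_t)4096)
#define NUM_BUFFERS ((size_t)2) /* the plain ping-pong case */
#define REGION_SIZE (BUFFER_SIZE * NUM_BUFFERS)

struct toy_region {
    size_t mapped;         /* buffer currently writable by the application */
    size_t offset;         /* next write offset within the region */
    unsigned submitted;    /* buffers handed to the accelerator so far */
    size_t last_submitted; /* index of the most recently submitted buffer */
};

/* Stand-in for the driver's page-fault handler: hand the buffer the
 * application just left to the accelerator, then map the touched one. */
static void toy_fault(struct toy_region *r, size_t touched)
{
    r->last_submitted = r->mapped;
    r->submitted++;
    r->mapped = touched;
}

/* Model one command write of `size` bytes at the current offset. */
static void toy_write(struct toy_region *r, size_t size)
{
    size_t buffer = r->offset / BUFFER_SIZE;

    if (buffer != r->mapped) /* first touch of an unmapped buffer */
        toy_fault(r, buffer);
    r->offset = (r->offset + size) & (REGION_SIZE - 1);
}
```

In this model the application never calls the driver explicitly; filling one
buffer and writing the first command into the next is what triggers
submission of the full one.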
I did not get any reasonable arguments from the 'kernel-gurus' why this is
bad except that it is KGI.
Steffen
PS: The Permedia is well-designed, which is why I am using it for a reference
driver.
----------------- e-mail: seeger@physik.tu-chemnitz.de -----------------