From: Jos Hulzink <josh@stack.nl>
To: ggi-develop@eskimo.com
Date: Tue, 3 Aug 1999 16:26:24 +0200 (CEST)
Subject: Re: Matrox GGI accelerator
On Tue, 3 Aug 1999, Rodolphe Ortalo wrote:
> However, I suspect that, to provide such mechanisms, in fact, we
> need to make assumptions on the way the hardware is designed.
> IF the 'safe' acceleration registers are well isolated, say
> between 0x800 and 0x8ff (offsets in the MMIO aperture) then
> we can allow direct access from userspace in that area. (At least
> from the security point of view: safe means that a malicious
> program will not burn your monitor or lock the PCI bus.)
> BUT this is apparently not the case with 'common' hw.
With many video processors, you can forget that. Internally the huge
address range is converted to a small register mapping. The registers for
DMA access are the same as for line drawing (on a ViRGE you even need to
reinitialize the engine if you use 2D and 3D accel at the same time,
because many 2D configuration registers are also Z-buffer configuration
registers). So disabling the DMA address space cannot prevent someone
from using the 2D line address space to do DMA! (I know... shoot S3 and
the others.)
<SNIP>
> First:
>
> It seems people always think of virtualization at the register
> level. But in fact, this is not mandatory. E.g. to access the
> useful acceleration features of a card like a ViRGE or
> Cirrus Logic, you need to virtualize things like
> 'DRAW_BOX', 'COPY_BOX' (2D) or 'LINE', 'TRIANGLE'
> (3D), or TEXTURE_SETUP, etc.
I thought that was called GGI ? :)
>
> So, first, I'd say that the accesses that could be convenient
> may not always be at the registers level. So, a single
> command unit may be longer than a byte, it may be a
> whole struct.
> (I don't say that exporting acceleration registers to
> userspace is bad: I say that there may exist different
> access granularities that may be satisfying enough not
> to bother doing something differently.)
>
> So, in fact, you need a good way to communicate such commands
> from the userspace application to the graphic card.
>
> If the command is simply:
> {register_offset, value}
> then.... well, you will have a lot of them, and in that situation,
> it may be slow, and difficult to control. So, it's typically NOT the
> ideal case. (Except for criticism of the whole kernel
> mediation interest. ;-)
>
> But if commands look like:
> { DRAW_BOX, X1, X2, Y1, Y2, COLOR }
> or {TRIANGLE, ... etc... [1] }
> you will have fewer commands than in the above case. Especially
> when they are 2D commands. With 3D you may have a lot of
> commands (1000 triangles for one picture is not unusual) - but
> typically they come in a long stream.
The point is clear, but for huge commands you will need a kernel-space
interpreter, increasing the size of the acceleration driver. And that
must be small.
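
To make the two command shapes concrete, here is a rough C sketch of a
{register_offset, value} pair next to a whole-struct DRAW_BOX command,
plus the kind of small check a kernel-side interpreter would have to run.
All names (accel_op, reg_cmd, box_cmd, box_cmd_ok, box_cmd_pixels) are made
up for illustration and do not come from any GGI or KGI header; the field
order simply follows the { DRAW_BOX, X1, X2, Y1, Y2, COLOR } example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; not taken from any real GGI/KGI header. */
enum accel_op { ACCEL_DRAW_BOX = 1, ACCEL_COPY_BOX = 2 };

/* Fine-grained form: one register write per command. */
struct reg_cmd {
    uint32_t register_offset;
    uint32_t value;
};

/* Coarse-grained form: one whole drawing operation per command. */
struct box_cmd {
    uint32_t op;               /* expected to be ACCEL_DRAW_BOX */
    uint16_t x1, x2, y1, y2;   /* inclusive corners */
    uint32_t color;
};

/* Returns 1 if the box is well formed and lies inside a screen of
 * width x height pixels, 0 otherwise. */
int box_cmd_ok(const struct box_cmd *c, uint16_t width, uint16_t height)
{
    if (c == NULL || c->op != ACCEL_DRAW_BOX)
        return 0;
    if (c->x1 > c->x2 || c->y1 > c->y2)
        return 0;
    if (c->x2 >= width || c->y2 >= height)
        return 0;
    return 1;
}

/* Number of pixels covered by a box that already passed box_cmd_ok(). */
uint32_t box_cmd_pixels(const struct box_cmd *c)
{
    assert(c->x1 <= c->x2 && c->y1 <= c->y2);
    return (uint32_t)(c->x2 - c->x1 + 1) * (uint32_t)(c->y2 - c->y1 + 1);
}
```

The check for one box is only a few comparisons, but every extra command
type needs its own struct and its own check, which is where the driver
size worry comes from.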
>
> Note that, even in this case, you _may_ need to do some
> access control (cf: the example I gave you for the 546x).
> In case someone tries to use 'strange' accelerator
> commands. [2]
Is it possible to create 'trusted software'? I mean, knowing for sure
that the accel driver you give your file handle to for mmapping accel
space is the GGI driver you wrote yourself?
And by knowing for sure I mean knowing 22898902490247890% sure, not
just 100%...
Hmm, just thinking... Open source isn't that cool at all here: some user
can hack the GGI libs and thus still crash the system. Damn.
>
> We end up with something that resembles ioctl, but
> the problem is that we face long streams of commands,
> and _very_ _very_ impatient users.
>
> So, if, each time one command is issued by the application
> software there is a switch to kernel context, an access
> control check, and the emission of an accelerated
> command to the graphic card before going back to
> the application software for the next command....
> Well, we end up with only 4x or 5x acceleration in
> 2D. (No figures available yet for 3D. And that's a
> pity.)
> [This is the 'ioctl' way.]
I got 10 to 11 times here for 2D. I never thought a ViRGE was that much
faster at copying data than a Pentium II 233, so I'm happy with it :) I
must say, I have a kernel and libs built with pgcc -O7 -march=pentiumpro,
so ioctl will be fast here I guess...
> However, we can take advantage of the fact that
> these commands come in 'stream' to process
> them in whole batches.
> This is the idea of the pingpong buffer. It's pretty
> similar to buffering in fact.
> 2 mmap pages are set up. But only one of them
> is available in userspace. Both are swapped as soon
> as the userspace page is filled:
> The empty page is put up in userspace and the application
> program can continue to put its commands in it.
> The other page is handled by the card driver which
> reads the commands, controls them, and issues them
> to the card.
Heh, I finally got the idea. But define mmap pages please. Are these 1)
part of the framebuffer (the video card's RAM), 2) kernel memory, or 3)
something very logical my stupid mind forgets?
And how do you tell a program that a mmapped space it has a pointer to is
no longer valid? How does the kernel know that the buffer is full? Still
ioctl? I'm thinking of an ioctl that asks the kernel to switch buffers and
does not return until the other buffer is empty and ready to be filled by
the program again.
>
> You can put whatever command you want in that
> page (reg+value or complex commands). But the
> idea is that the latency of issuing ONE command
> will be greatly reduced, and this is very desirable,
> especially for 3D.
> Now, of course, you will always find someone who
> needs something even faster and who wants full
> control over this or that register. Well... I'd give
> it then (or cancel the account).
Direct register access? Well... there is an OS for that, called Windows.
It has some more nice features, including crashing every 5 minutes. Maybe
we can include some crashing features for compatibility (see notes below).
I'm just wondering how big a pingpong buffer must be for optimal
speed. You don't want a sync to take ages just because a huge buffer
that was still waiting to be filled completely now has to be executed.
> Rodolphe (Ortalo)
>
>
> [1] I don't remember the parameters in fact... You need
> the top, left, right and 2 slopes, no? And the color or
> texture of course? :-(
> [2] Note that both the commands AND the accelerator
> are strange. ;-)
[1] You forget the fogging, alpha, and time-before-crashing parameters
(the last for Win98 compatibility).
[2] Hey, aren't accelerator and strange the same? Besides, you forget
the accelerator programmers.
Jos
P.S. Sorry for this huge amount of questions. I'm an electrical engineer
learning kernel programming, not a real engineer in computer science.