I put the blob (multi-touch) stuff online. If you're insane enough, you can build it yourself.
It lives on the "blob" branch of:
git://people.freedesktop.org/~whot/xserver.git
git://people.freedesktop.org/~whot/libXi.git
git://people.freedesktop.org/~whot/inputproto.git
An example driver is on git://people.freedesktop.org/~whot/xf86-input-blob.git
The driver listens to events on the network. Which means you can hook up an external machine that does the tracking for you, or a simple test program (see xf86-input-blob/test/). Nothing special, but enough to test the events.
I need to go back to Austria on very short notice. I'm flying out tomorrow and won't be able to do much coding for at least two weeks. I spent a lot of time this weekend getting things ready, but didn't get done everything I wanted to. So I'm just dumping what I have online now. This gives you a chance to have a think about it and tell me if I've forgotten something crucial.
XBlobEvents
XBlobEvents are the new type of events.
typedef struct {
    int type;                  /* GenericEvent */
    unsigned long serial;      /* # of last request processed by server */
    Bool send_event;           /* true if this came from a SendEvent request */
    Display *display;          /* Display the event was read from */
    int extension;             /* XI extension offset */
    int evtype;                /* XI_BlobEvent */
    XID deviceid;
    Time time;
    Window root;
    Window event;
    int hot_x;
    int hot_y;
    int hot_x_root;
    int hot_y_root;
    int bb_x1;
    int bb_y1;
    int bb_x2;
    int bb_y2;
    int bb_x1_root;
    int bb_y1_root;
    int bb_x2_root;
    int bb_y2_root;
    XBlobEventLongData* ldata;
} XBlobEvent;

typedef struct {
    XID subdevice;
    XID blobid;
    int format;
    int bits_per_pixel;
    char* data;
} XBlobEventLongData;
(Xlib internals are the reason that the event is split up into two parts).
The important fields are:
- blobid ... blob id (see below).
- hot_* ... hotspot coordinates. These coordinates are used for mouse emulation.
- bb_* ... bounding box coordinates.
- format ... BlobFormatMonochrome, BlobFormatRGB, BlobFormatRGBA.
- subdevice ... subdevice of the blob (see below).
- data ... the actual data the touchscreen can provide.
The blobid needs to be assigned by the driver. As a blob appears on the touch table, the driver assigns it a 30 bit id. When the blob moves, the id stays the same, but is flagged with BlobContinue. The client can thus keep track of multiple blobs simultaneously, and link them to gestures. Once a blob disappears, the id has to be flagged with BlobStop. The client knows that the blob disappeared. From this point on, you can re-use the blob-id.
The BlobContinue and BlobStop flags are crucial! Without them a client will think each blob sent by the device is a new touch. And it will likely lock up mouse emulation, as you're effectively sending lots of button presses, but no releases.
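To make the id-plus-flags idea concrete, here is a minimal sketch of how a 30 bit id and the two flags could share one 32 bit value. The bit positions and every SKETCH_/sketch_ name are assumptions made up for illustration; the real BlobContinue and BlobStop values come from the inputproto headers and may well differ.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: the two flag bits sit above the 30 bit id. These values are
 * illustrative only and are not taken from inputproto. */
#define SKETCH_BLOB_ID_MASK  0x3FFFFFFFu
#define SKETCH_BLOB_CONTINUE 0x40000000u
#define SKETCH_BLOB_STOP     0x80000000u

/* Hypothetical helper: combine a 30 bit id with zero or more flags. */
uint32_t sketch_blob_pack(uint32_t id, uint32_t flags)
{
    assert((id & ~SKETCH_BLOB_ID_MASK) == 0);   /* id must fit in 30 bits */
    assert((flags & SKETCH_BLOB_ID_MASK) == 0); /* flags must not touch the id */
    return id | flags;
}

/* Strip the flags, leaving the id a client would use to track the blob. */
uint32_t sketch_blob_id(uint32_t blobid)
{
    return blobid & SKETCH_BLOB_ID_MASK;
}

/* No flag set: this is the first event of a new touch. */
int sketch_blob_is_new(uint32_t blobid)
{
    return (blobid & (SKETCH_BLOB_CONTINUE | SKETCH_BLOB_STOP)) == 0;
}

/* Stop flag set: the blob has left the table, the id may be re-used. */
int sketch_blob_is_stop(uint32_t blobid)
{
    return (blobid & SKETCH_BLOB_STOP) != 0;
}
```

A driver following this scheme would send the bare id on first contact, the id with the continue flag while the blob moves, and the id with the stop flag once the finger lifts. A client that only ever sees bare ids has to treat every event as a new touch, which is exactly the lock-up described above.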
Hotspot and bounding box should be self-explanatory. Subdevice may need more explanation. Right now, none of my tests use it, but it is passed to the client as detected by the device. So eventually we'll have a set of flags in the form of SubdevRH_Index, for index finger right hand. If your device is good enough, you can supply the client with a lot of information here.
If not, SubdevUnknown will do. These defines don't exist yet, I need to think about a reasonable naming scheme.
The actual data should be a bitmap as detailed as the device can supply. So if you have a video-based FTIR table or an iPhone, you could send BlobFormatMonochrome and send the outline of the finger. If you have a MS Surface table, you can send full RGBA data. The client could then even do fingerprint detection.
All the smart things will still have to be done by the client. MPX just provides the layers to abstract the device-dependent stuff away.
A very simple example client can be viewed here. This should give you a clue how to use them.
(btw. there is a bug in the server that causes stack corruption. I spent 6 hours debugging it today and came to the conclusion that it's probably not my fault. I think it's in pixman. If you find it, I'd appreciate if you tell me)
Wow. That video yesterday did have some impact, my page hits went up exponentially (and so did the number of emails in my inbox). Yesterday's post was mostly motivational, here are the hard facts.
Hardware setup and restrictions
The hardware I use at the moment is a MERL DiamondTouch, which my supervisor organised with MERL as part of MERL's university loan program. Such a table has a number of horizontal and vertical antennas. Touching the table closes a circuit, and the table can tell where the touch happened (see the original paper). It supports up to 4 users, and each user has to touch a different conductive pad. This way, the device can differentiate between users.
The kernel driver (written by MERL) spits out a 900 byte packet every time the table delivers one (a few times a second). I've written an X driver that disassembles the packet, extracts the blobs and passes them up to the X server. That's all the driver does. The X server does pointer emulation and the rest.
Now - the DT is good at detecting multi-touch from different users but not good at detecting multi-touch from the same user. There's only one set of antennas for x and y, so when I press two fingers down, there are 4 possible touch-points. There are ways to work around that (as long as x/y of two touches aren't overlapping) and you may have seen me using two hands in the video.
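Where the 4 possible touch-points come from is easiest to see in code. With one antenna set per axis, the table only reports which x positions and which y positions are active, so every pairing of the two is a candidate. The sketch below just enumerates those pairings; it illustrates the ambiguity and is not taken from the DiamondTouch driver, and all names in it are made up.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int x;
    int y;
} SketchPoint;

/* Enumerate every (x, y) pairing of the active antenna positions.
 * Writes at most max_out candidates to out and returns how many were written. */
size_t sketch_candidate_points(const int *xs, size_t nx,
                               const int *ys, size_t ny,
                               SketchPoint *out, size_t max_out)
{
    size_t n = 0;

    assert(xs != NULL && ys != NULL && out != NULL);

    for (size_t i = 0; i < nx; i++) {
        for (size_t j = 0; j < ny; j++) {
            if (n == max_out)
                return n;   /* output buffer is full */
            out[n].x = xs[i];
            out[n].y = ys[j];
            n++;
        }
    }
    return n;
}
```

Two fingers at (100, 50) and (300, 200) light up x = {100, 300} and y = {50, 200}. The function returns four candidates, and nothing in the antenna data says which two are real and which two are ghosts.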
The DT also doesn't give me real bitmaps for the touch, a rectangular bounding box is all I can get out of it.
If you have an FTIR table or some other vision-based system you won't have these problems and nothing stops you from doing true multi-touch. You may have issues identifying different users though.
The DiamondTouch X driver was the first real X input driver I've ever written, so there are still some bugs in it. Need to check with MERL whether I can put it online, one of the header files I used is from them.
Software
The applications in the video are google earth, firefox and gnome-calculator. All unmodified of course.
The drawing app is a quick test app I wrote a few days ago. It's less than 400 lines of C code, and that includes debug messages and stuff to open/display the window. It's also one of the worst apps I've ever written and until I've cleaned up the code a bit I'm too ashamed to put the source online.
It simply adjusts the line width to the dimensions of the touch bounding box and draws a line from the last point to the current. Originally I used the bitmap as the line shape, but as I said before, my hardware doesn't give me usable bitmaps.
The window manager is my hacked-up window manager I always used to test MPX. It doesn't know about touch events.
So during the video you can see touch applications, standard single-user apps and an MPX app simultaneously working. This is the really novel thing. Not the blob drawing app...
The in-betweens
So what's actually happening between a touch and the client doing stuff? Easy: the driver fetches the bounding box of the touch, selects a hotspot (in my driver this is the center of the bounding box) and creates the bitmap. This information is passed on to the X server. The X server creates a BlobEvent from this information and also emulates a core pointer event and an X Input pointer event on the hotspot.
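The hotspot selection is simple enough to sketch. The function below picks the center of a bounding box given as two corners, mirroring the bb_x1/bb_y1/bb_x2/bb_y2 fields of the event. The type and function names are invented for this example and this is not the driver's actual code.

```c
#include <assert.h>

typedef struct {
    int x1, y1;   /* top-left corner */
    int x2, y2;   /* bottom-right corner */
} SketchBox;

typedef struct {
    int x;
    int y;
} SketchHotspot;

/* Pick the center of the bounding box as the hotspot.
 * Integer division rounds towards the top-left corner. */
SketchHotspot sketch_hotspot_center(SketchBox bb)
{
    SketchHotspot h;

    assert(bb.x1 <= bb.x2 && bb.y1 <= bb.y2);

    h.x = bb.x1 + (bb.x2 - bb.x1) / 2;
    h.y = bb.y1 + (bb.y2 - bb.y1) / 2;
    return h;
}
```

Other drivers are free to choose differently, for example the tip of a finger outline if the hardware delivers a real bitmap. The server only needs some point on which to emulate the pointer events.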
During the event processing stage, the X server checks where the event needs to go. If it's a standard app, all events except the core event are discarded. If it's a touch application, the blob event is delivered. This is just normal stuff, the same thing X does with pointer and keyboard events anyway.
An app can listen to both pointer and blob events of course.
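Stripped of everything else the server does, the delivery decision boils down to a mask intersection: one touch yields a set of events, each client has selected a set of events, and a client gets the overlap. The following is a grossly simplified model with made-up names; real event delivery in the server also involves window hierarchy, propagation and grabs.

```c
#include <assert.h>

/* Illustrative event classes, not real protocol values. */
enum {
    SKETCH_EV_CORE = 1 << 0,   /* emulated core pointer event */
    SKETCH_EV_XI   = 1 << 1,   /* emulated X Input pointer event */
    SKETCH_EV_BLOB = 1 << 2    /* the blob event itself */
};

#define SKETCH_EV_ALL (SKETCH_EV_CORE | SKETCH_EV_XI | SKETCH_EV_BLOB)

/* generated: events produced for one touch.
 * selected:  events the client asked for on its window.
 * Returns the events that actually reach the client. */
unsigned sketch_deliver(unsigned generated, unsigned selected)
{
    assert((generated & ~(unsigned)SKETCH_EV_ALL) == 0);
    return generated & selected;
}
```

A legacy app that selected only core events ends up with SKETCH_EV_CORE, a pure touch app with SKETCH_EV_BLOB, and an app that selected both gets both.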
Now one of the reasons why the whole thing actually works is XGE. This is an extension I wrote a while ago that allows the X server to send events longer than 32 bytes (something that isn't in the original specs). We need that for the bitmaps after all.
Pointer emulation?
Pointer emulation is central to the whole principle. There are a few applications out there that only know about pointers and keyboards. If you show firefox the middle finger as a touch gesture, it will ignore it. It doesn't know anything but pointer and keyboard events. So we can either say "Hey, there's this great new touch-toolkit, rewrite all your apps. And btw. you can't use your email client or web browser until you've rewritten it to understand touch events". Good luck with that.
I don't have time to rewrite tens of thousands of apps. So what MPX does is give those apps the environment they know. Multi-user apps that know about multiple pointers can use additional features. Apps that know about blob events can use those.
So stop whining about pointer emulation and rewrite your apps.
Or maybe not. Not everybody has a touch table, but everybody has a mouse. So your app has to now support both mice and touch screens. And multiple users. And hotplugging. Welcome to hell :)
Choose what your app needs. Do you need touch-support in a terminal? probably not. In google earth? Well, maybe.
btw. a lot of UI elements perform an action on a button release event (clicking links for example). It's actually not easy to press and release on the same tiny button with an input device the size of your finger. That's why sometimes nothing happens although you think you've pressed a button.
When is it available?
I'm still cleaning up and checking the code. I will hopefully merge it into the mpx branch next week.
MPX already supported multiple input devices. Which blows pretty much all assumptions in user interfaces (input) out of the water. Now I've gone one step further and added support for multi-touch displays. Have a look at this video: http://www.youtube.com/watch?v=olWjnfBoY8E
Upfront: I did not build some kind of touchscreen or tracking system. I did not build some kind of gesture recognition system.
I built the stuff in between.
A while ago I started thinking about what multi-touch and gesture support could look like. Looking around on the web and in the research literature, I found that all the multi-touch systems are a hack (I'm talking about software integration here, not the hardware!). Multi-touch support needs to be in the windowing system. Any client-side approach is wrong. (Feel free to disagree with me on that)
So how can we get multi-touch gestures into the windowing system?
We don't need gesture support in X. Gestures depend a lot on the context. A gesture in one context can mean something different in a different context. And the only thing that knows the context is the application. This is very similar to a button press. Pressing a mouse button can mean a zillion different things, depending where and when it happens. That's why all X does is relay the button press to a client application, which then does the right thing.
What we really need is a way to convey events from a touch device to a client application. MPX now has a new type of events: BlobEvents. These events specify a bounding box, a hotspot and a bitmap specifying the contents of the bounding box. The device driver generates these events and passes them to X. X relays them to the correct client.
A BlobEvent has a device-id and a field to specify a subdevice. The device-id should map to the user. Each user represents one device. The subdevice is the body-part the user is using. If your device is smart enough, it can specify that the blob was generated by User 1, middle-finger right hand.
The BlobEvent bitmap can have multiple formats. 1 bit per pixel, 8 bits, 32 bits per coordinate, whatever your device can think up. Just remember: the client has to know how to read the data.
Windowing system benefits
Now all that could be seen as just another way of transporting touch events. Right. But remember, it's in the windowing system. That means we know exactly what application is where on the screen and thus we know which application to send the event to. So we can run two or more touch-applications at the same time and it'll just work.
The other thing about BlobEvents is that MPX can automatically emulate a core pointer event for each BlobEvent. And an X Input pointer event. This is where things start getting interesting. Using BlobEvents you can have your multi-touch photo-sorting-lava-lamp application running on the same screen as your standard GNOME, KDE, etc. applications and use them all at the same time.
To your multi-touch driver all this doesn't matter. It sends blob events, the server takes care of the rest. You're guaranteed to be able to interact with any X application. X doesn't care about the hardware. You can use your DiamondTouch, your FTIR table or - if you can afford one - your MS Surface table.
Oh. And by the way. You can use a standard mouse and keyboard on the same box as you use the touchscreen. After all, everything is just a device.
This is the last big change. I'll now focus on getting MPX stable enough to put it upstream.
Contains the following video drivers: nv, ati, i810, vesa, fbdev
Contains the following input drivers: mouse, keyboard, evdev
Don't expect to run GNOME with it. Nautilus doesn't like multiple cursors.
And I had all of 10 minutes to test the packages. They shouldn't interfere with your standard X installation.
Whoops. I discovered today that XI keyboard events were completely broken, the actual keycode was never sent to the client. Seems to be broken in master too, is now amongst some other changes I'm supposed to push soon.
And some hardcoded goodness in XKB was resetting some fields in the internal device struct, so XI events were never being processed after the first key press. Not too good either.
Now I just need to figure out how the XLookupString() works with XI.
I just pushed recent changes. Includes working passive grabs and a solution to yesterday's problems.
One rule I added was: if a client issues a GrabPointer or GrabKeyboard request, all passive core grabs on the same client are deactivated*. This removes the race condition and is - I think - closer to the original intent of the specs. Now when a client grabs the pointer/keyboard, it is guaranteed that it only gets events from the grabbed device.
Generally the rules now are:
- core grabs (for legacy apps) mean only one device can interact (usually the ClientPointer and its paired keyboard).
- device grabs (for multi-user apps) mean the device only sends to our client. All other devices can do whatever they want.
A full rebuild (incl drivers) is necessary, some struct sizes have changed again.
* GrabPointer only deactivates passive grabs from pointer devices, GrabKeyboard behaves likewise.
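As a toy model of the new rule, assume the server keeps a flat list of passive grabs and that "deactivated" means a currently activated passive grab is released. Both of those are assumptions for the sake of the example, as are all names below; the server's real data structures look different.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SKETCH_POINTER, SKETCH_KEYBOARD } SketchDevKind;

typedef struct {
    int client;          /* id of the client owning the passive grab */
    SketchDevKind kind;  /* class of device the grab is on */
    int is_core;         /* 1 = core grab, 0 = device (XI) grab */
    int active;          /* 1 while the passive grab is activated */
} SketchPassiveGrab;

/* Model of a GrabPointer (kind == SKETCH_POINTER) or GrabKeyboard
 * (kind == SKETCH_KEYBOARD) request from `client`: release that client's
 * activated passive core grabs on devices of the same class.
 * Returns the number of grabs that were deactivated. */
size_t sketch_on_active_core_grab(SketchPassiveGrab *grabs, size_t n,
                                  int client, SketchDevKind kind)
{
    size_t released = 0;

    assert(grabs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        SketchPassiveGrab *g = &grabs[i];
        if (g->client == client && g->is_core && g->kind == kind && g->active) {
            g->active = 0;
            released++;
        }
    }
    return released;
}
```

Device grabs and grabs held by other clients are left alone, which matches the two rules above: legacy clients end up talking to exactly one device, while multi-user clients keep their per-device grabs.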
After a fix to the passive grab system, I can lock up nautilus:
I press button on pointer 2. Press event being sent.
pointer 1, which is Nautilus' ClientPointer, is actively grabbed
I press button on pointer 1. Press event being sent.
Nautilus goes into "I'm never ever gonna release my grab and live happily ever after" mode.
The problem seems to be that the ButtonPressMask is specified in the grab's event mask, but Nautilus, being used to one pointer only, doesn't expect to actually receive a ButtonPress now.
Another race condition I found in a simple test app.
Create a window with ButtonPressMask.
On button press GrabPointer (ClientPointer again set to some other pointer) with ButtonReleaseMask.
Tell app to UngrabPointer on release event.
Now what happens is that an implicit passive grab is created on a button press. An implicit passive grab takes the window's event mask (ButtonPressMask). The ClientPointer is set to something else, so we grab some other pointer. If we release our button from the first pointer, the release event is never sent, because the event mask is not the grab's event mask but the window's event mask. So our app hogs the pointer.
I had an apt-repository a while ago, when MPX was still very very basic. But it took a lot of time maintaining it, since the server I had it on was ... difficult. So eventually I removed it.
ATM there are two reasons why I don't provide packages. One is political and won't matter anymore from next week on. The other one is that passive grabs in MPX aren't working right now and it does change the experience a lot. I notice that those who expect prepackaged software are also harsh in judging it, so I'd rather spare them disappointment :)
Once that is fixed, I'll push out MPX packages (for Ubuntu Feisty on x86) into some repository. Not sure how long it will take me, a lot of other stuff was wasting my time over the last weeks.
And I've started writing on my PhD a few weeks ago, so that's taking quite some time off my hacking time as well.
Yeah, I've seen it too and the marketing videos look neat. Haven't seen the actual product, but I did manage to find a paper that (probably) describes the hardware. Track down Andrew Wilson's paper "TouchLight: an imaging touch screen and display for gesture-based interaction". I assume that this is the technology they use.
And no, it's not like Jeff Han's FTIR technology. FTIR uses internal reflections, TouchLight uses IR illumination only. And a big difference is that TouchLight does not use a diffuser, thus allowing a much higher resolution of the objects on the table. Anyway, that's what I got out of the paper. The really interesting things I found while trying to find as much information as I could afford.
However, the thing that leads me to writing this entry is something else:
Jeff Han's FTIR demo video does not show MPX. I'm pretty sure of that. Ross Burton speculated and now lots of people think this is a fact*. I heavily doubt it. At the time the video was published MPX had ... issues ... running GNOME. So I think it was just clever use of maybe XInput or some software tool (and heavy video editing. there's actually hardly any _concurrent_ input in the 3 seconds of using X. look at the video again). Not that I would have a problem with it, in fact I would be interested in how he did it. But to be honest, I don't want MPX to get the credit for something it didn't do.
MS Surface and the FTIR tables give you data in the form of bitmaps where blobs represent the touch points. They don't give user identification (although Surface has some potential there). So it'll be hard to map from a touchpoint to a distinct pointer.
MPX only supports distinct devices at the moment. Blobs don't fit well into the event system and you have to emulate a mouse (Andrew Wilson actually mentions a mouse emulator in the paper).
I do have plans to make FTIR tables work with MPX, but I have other stuff to do and need experience with them beforehand anyway.
* "the plural of anecdote is not fact". Read that in a signature somewhere today.
Took a bit longer than I hoped for (stupid interruptions), but I just pushed a big set of changes. Fairly fundamental too.
Long event support with XGE
A new extension, the X Generic Event extension (XGE) has been added. XGE should solve the issues we have with a shrinking number of event opcodes and the 32 byte limit to events. XGE reuses opcode 35, and hides the real opcode inside the event. We can have 2^16 event opcodes per extension.
Also fairly important: the length field in a GenericEvent works just like the length field in a reply. So events can be >32 bytes now. MPX already uses these new GenericEvents, although unless you have a device with a lot of axes, you won't see long events floating past.
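For anyone not familiar with how replies encode their size: the length field counts the 4-byte units that follow the fixed 32 byte part. Applied to a GenericEvent, the arithmetic looks like the sketch below. The helper names are made up; only the 32 + 4 * length rule is taken from the protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BASE_EVENT_BYTES 32u

/* Total size on the wire of an event whose length field is `length`. */
size_t sketch_ge_wire_bytes(uint32_t length)
{
    return SKETCH_BASE_EVENT_BYTES + (size_t)length * 4u;
}

/* Length field needed for an event of total_bytes. Anything beyond the
 * first 32 bytes is rounded up to a multiple of 4, as the wire format
 * can only express whole 4-byte units. */
uint32_t sketch_ge_length_for(size_t total_bytes)
{
    assert(total_bytes >= SKETCH_BASE_EVENT_BYTES);
    return (uint32_t)((total_bytes - SKETCH_BASE_EVENT_BYTES + 3u) / 4u);
}
```

An event with length 0 is an ordinary 32 byte event. An event that carries 8 extra bytes has length 2 and takes 40 bytes on the wire.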
You need to update x11proto, libX11, xextproto, libXext and apply a patch to libxcb (see here).
Raw device events
The integration of XGE made it fairly easy to add raw device events. Instead of the fairly abstract MotionNotify, ButtonPress, etc. you can register for raw events that contain the data provided by the driver. With the new XFakeDeviceEvent() call in libXi, you can forward these events to another X server and effectively hook up a driver from one display to a device on another display. Handy. Depending on the driver, it will also provide you with relative events, which comes in handy for games. Currently only for pointers.
It's been running stable here for a few days. And considering it's getting more complicated to compile MPX, I'll try to provide tarballs and debs as soon as I get round to it.