Thursday, September 29, 2011

Blog #13


Combining multiple depth cameras and projectors for interactions on, above and between surfaces



Authors:
Andrew D. Wilson - Microsoft Research, Redmond, WA, USA
Hrvoje Benko - Microsoft Research, Redmond, WA, USA

Proceeding
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology

Summary
Depth cameras and projectors are used to make an interactive room named "LightSpace." Interactions are done on plain, non-electronic surfaces in a natural fashion that requires only the cameras and projectors. The point of this approach is to limit external devices: the surfaces need no instrumentation, and nothing needs to be worn by the user.

Hypothesis
To the authors' knowledge, no research has been done on making an interactive space that uses only depth cameras and projectors.

Methods
The system is composed of a centrally placed set of 3 depth cameras that view the entire room, and 3 projectors. Two projectors are for the table and walls, and one is for projections onto the person, such as a virtual object held in the hand. After calibration of the depth cameras using 3 points on the two interaction surfaces, a mesh model of the person is created. For the table, all interaction is analyzed only within a 10 cm volume above the surface. The resolution is high enough to determine touch on the table, which essentially creates a multitouch interface on any surface (a sketch of this touch-detection step appears after the list below). There are three types of interactions possible with this system:

  • Multi-touch interaction on the "dumb" surfaces
  • Holding and transferring virtual icons of objects by grabbing them off the side of an interface
  • Activating a menu by holding a hand in a column of air above an icon on a surface.
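
To give a rough idea of the touch step mentioned above, here is a minimal sketch that thresholds a per-pixel height map above the calibrated table plane and groups the remaining pixels into contact blobs. The height map, the slab limits, and the function names are my own assumptions for illustration, not the paper's actual implementation.

    import numpy as np
    from scipy import ndimage

    # Hypothetical touch-detection sketch: only pixels inside a thin slab above
    # the calibrated table plane are considered, then grouped into contacts.
    # 'height_above_table' is an assumed per-pixel height map (in metres)
    # derived from the calibrated depth image; threshold values are illustrative.

    TOUCH_MIN = 0.005   # ignore depth noise right at the surface (assumed)
    TOUCH_MAX = 0.10    # the ~10 cm interaction volume described in the paper

    def find_contacts(height_above_table, min_area=50):
        """Return (x, y) centroids of connected pixel regions inside the touch slab."""
        touch_mask = (height_above_table > TOUCH_MIN) & (height_above_table < TOUCH_MAX)
        labels, n = ndimage.label(touch_mask)        # group touching pixels
        contacts = []
        for i in range(1, n + 1):
            region = labels == i
            if region.sum() >= min_area:             # discard small noise blobs
                ys, xs = np.nonzero(region)
                contacts.append((xs.mean(), ys.mean()))
        return contacts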
Instead of performing computation directly on the mesh generated from the cameras, "virtual cameras" were used: orthographic projections of the mesh. Three virtual cameras were generated: one for each of the two surfaces, and one for the entire room.
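
Below is a minimal sketch of what a top-down orthographic virtual camera of the whole room might look like, assuming the depth cameras already produce points in a shared room coordinate system; the resolution, axis conventions, and names are assumptions on my part.

    import numpy as np

    # Hypothetical sketch of a top-down orthographic "virtual camera": world-space
    # points from the merged depth data are flattened along the vertical axis into
    # a 2D height image, so 2D image techniques can be applied to the 3D scene.

    def render_plan_view(points_xyz, room_min, room_max, width=320, height=240):
        """Bin (x, z) positions into an image, keeping the max height y per pixel."""
        image = np.zeros((height, width), dtype=np.float32)
        extent = room_max - room_min
        # Map world x/z coordinates to pixel columns/rows of the virtual image
        u = ((points_xyz[:, 0] - room_min[0]) / extent[0] * (width - 1)).astype(int)
        v = ((points_xyz[:, 2] - room_min[2]) / extent[2] * (height - 1)).astype(int)
        valid = (u >= 0) & (u < width) & (v >= 0) & (v < height)
        # For each pixel, keep the tallest point that projects into it
        np.maximum.at(image, (v[valid], u[valid]), points_xyz[valid, 1])
        return image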


Discussion
When I was watching the video presentation of the LightSpace concept, I couldn't help noticing how rough the interactions were on the surfaces. This is probably due either to the low resolution of the prototype cameras or to the fact that the underside of the hands cannot be seen. One solution would be to place depth cameras in more diverse locations, but that would add complexity to the system.

This paper was published approximately one month before the release of the Kinect system. Since then an SDK has been released for it, and many people have used it for creative hacks. In my opinion, to stay within the paper's main idea of reducing external complexity, future 3D interaction would have to use actual 3D projection, since currently this can only be emulated by actively measuring the person's position in space. That, however, requires external hardware to be worn on the person.
