In Augmented TV Sports Coverage & Live TV Graphics and From Sports Tracking to Surveillance Tracking…, we started to see how objects in the real world could be tracked and highlighted as part of a live sports TV broadcast. In this post, we’ll see how the movement of objects tracked in the real world, including articulated objects such as people, can be sampled into a digital representation that effectively allows us to transform them into digital objects that can be used to augment the original scene.
Motion capture and, more recently, performance capture techniques have been used for several years by the film and computer games industries to capture human movements and use them to animate what amounts to a virtual puppet that can then be skinned as required within an animated scene. Typically, this would occur in post-production, where any latency associated with registering and tracking the actor, or with animating and rendering the final scene, could largely be ignored.
However, as motion and performance capture systems have improved, so too has the responsiveness of these systems, allowing them to be used to produce live “rushes” of the captured performance within a rendered virtual scene. But let’s step back a little and look at the origins of motion capture.
Motion capture – or mo-cap – refers to digitally capturing the movement of actors or objects for the purposes of animating a digital model. Markers placed on the actor or object allow the object to be tracked and its trajectory recorded. Associating points on a digital object with the recorded points allows the trajectory to be replayed by the digital object. Motion capture extends the idea of tracking a single marker that might be used to locate a digital object in an augmented reality setting by tracking multiple markers with a known relationship to each other, such as different points on the body of a particular human actor.
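The basic idea can be sketched in a few lines of code. This is a deliberately minimal illustration, not a real mocap pipeline: the marker names, trajectories and rig bindings are all made up, and a production system would track dozens of markers at high frame rates.

```python
# Recorded trajectories: marker name -> list of (x, y, z) positions,
# one position per captured frame. (Illustrative data only.)
recorded = {
    "left_wrist":  [(0.0, 1.0, 0.0), (0.1, 1.1, 0.0), (0.2, 1.2, 0.0)],
    "right_wrist": [(0.0, 1.0, 0.5), (0.0, 0.9, 0.5), (0.0, 0.8, 0.5)],
}

# The digital object associates its own control points with marker names,
# establishing the "known relationship" between tracked and digital points.
rig_bindings = {"left_wrist": "hand_L", "right_wrist": "hand_R"}

def replay(recorded, bindings):
    """Yield, frame by frame, the pose of the digital object as a dict
    mapping control-point names to 3D positions."""
    n_frames = len(next(iter(recorded.values())))
    for frame in range(n_frames):
        yield {bindings[m]: traj[frame] for m, traj in recorded.items()}

poses = list(replay(recorded, rig_bindings))
print(poses[0])  # the digital puppet's pose at frame 0
```

In effect, the digital object becomes a puppet whose strings are the recorded trajectories: any skin or character model attached to the control points inherits the captured movement.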
An example of how motion capture techniques are used to animate the movement of objects, rather than actors, is provided by the BLACKBIRD adjustable electric car rig. This rig provides a customisable chassis – the length of the vehicle can be modified and the suspension is fully adjustable – that can be used to capture vehicle movements, and footage from within the vehicle. Markers placed on the rig are tracked in the normal way and then a digital body shell is overlaid on the tracked registration points. The adaptable size of the rig allows marked points on differently sized vehicles to be accurately tracked. According to its designers, The Mill, an augmented reality application further “allows you to see the intended vehicle in CG, tracked live over the rig on location”.
Motion capture is a relatively low resolution, or low fidelity, technique that captures tens of points that can be used to animate a relatively large mass, such as a human character. However, whereas markers on the torso and limbs have a relatively limited range of free movement, animating facial expressions is far more complex, not least because the human brain is finely tuned to reading expressions on highly expressive human faces. Which is where performance capture comes in.
Performance capture blends body motion capture at a relatively low resolution – typically, the orientation and relative placement of markers placed around limb joints – with more densely placed markers on the face. Facial markers are tracked using a head-mounted camera, along with any vocal performance provided by the actor.
Performance capture allows the facial performance of human actors to drive the facial performance of a digital character. By recording the vocal performance alongside the facial performance, “lip synch” between the voice and mouth movements of the character can be preserved.
As real-time image processing techniques have developed, markerless performance capture systems now exist, particularly for facial motion capture, that do not require any markers to be placed on the actor’s face.
In the case of facial markerless motion capture, multiple facial features are detected automatically and used to implicitly capture the motion of those features relative to each other.
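One simple way to picture what “motion of those features relative to each other” means is to express each detected feature's position relative to a stable anchor point, so that whole-head movement is discounted and only the expression remains. The sketch below assumes a face tracker has already detected a few named features per frame; the feature names, coordinates and the choice of the nose as anchor are all illustrative (real systems such as OpenCV or MediaPipe detect dozens to hundreds of landmarks automatically).

```python
# Hypothetical tracker output: feature name -> (x, y) image coordinates,
# one dict per video frame. Between these two frames the whole head
# shifts right, and the mouth corners also spread (a widening smile).
frames = [
    {"nose": (100, 100), "mouth_corner_l": (80, 130), "mouth_corner_r": (120, 130)},
    {"nose": (105, 100), "mouth_corner_l": (82, 128), "mouth_corner_r": (128, 128)},
]

def relative_motion(frames, anchor="nose"):
    """For each frame, return each feature's offset from the anchor
    feature, removing the effect of overall head movement."""
    result = []
    for f in frames:
        ax, ay = f[anchor]
        result.append({name: (x - ax, y - ay)
                       for name, (x, y) in f.items() if name != anchor})
    return result

rel = relative_motion(frames)
# The right mouth corner's offset from the nose grows between frames,
# capturing the expression change even though the head also moved.
print(rel[0]["mouth_corner_r"], rel[1]["mouth_corner_r"])
```

It is these relative offsets, rather than the raw image coordinates, that can then drive the corresponding points on a digital character's face.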
As well as real-time mocap and markerless performance capture, real-time previews of the digitally rendered backlot are also possible. Andy Serkis’ tour of his Imaginarium performance capture studio for Empire magazine demonstrates this to full effect.
Virtual cameras are described in more detail in the following clip.
SAQ: What is a virtual camera? To what extent do virtual cameras provide an augmented or mixed reality view of the world?
Originally developed as techniques for animating movements and facial performances in games or films that were then rendered as part of a time-consuming post-production process, the technology has developed to such an extent that motion and performance capture now allow objects and actors to be tracked in real time. Captured data points can be used to animate the behaviour of digital actors, on digital backlots, providing a real-time preview of what the final rendered scene might actually look like.
For the actor in such a performance space, there is an element of make believe about the setting and the form of the other actors they are performing with – the actors can’t actually see the world they are supposed to be inhabiting, although the virtual cameraman, and the director, can. Instead, the actors perform in what is effectively a featureless space.
For the making of the film Gravity, a new rig known as the Light Box was developed that presented the actors with a view of the digitally created world they were to be rendered in, as a side effect of lighting the actors in such a way that it looked as if the light was coming from the photorealistic, digital environment they would be composited with.
SAQ: how might performance capture and augmented reality be used as part of a live theatrical experience? What challenges would such a performance present? Feel free to let your imagination run away with you!
Answer: As Andy Serkis’ Imaginarium demonstrates, facilities already exist where photorealistic digital worlds populated by real world characters can be rendered in real time, so the director can get a feel for how the finished scene will look as it is being shot. However, the digital sets and animated characters are only observable to third parties, rather than the actual participants in the scene, and then only from the perspective of a virtual camera. But what would it take to provide an audience with a real-time rendered view of an Imaginarium styled theatre set? For this to happen at a personal level would require multiple camera views, one for each seat in the audience, the computational power to render the scene for each member of the audience from their point-of-view, and personal, see-through augmented reality displays for each audience member.
Slightly simpler might be personally rendered views of the scene for each of the actors, so that they themselves could see the digital world they were inhabiting, from their own perspective. As virtual reality goggles would be likely to get in the way of facial motion capture, augmented reality displays capable of painting the digital scene from the actor’s perspective in real time would be required. For film-makers, though, the main question to ask would be: what would such immersion mean for the actors in terms of their performance? And it’s hard to see what the benefit might be for the audience.
But perhaps there is a middle ground that would work? For example, the use of projection-based augmented reality might be able to render the digital backlot, at least for a limited field of view. Many stage magicians create illusions that only work from a particular perspective, although this limits the audience size. Another approach might be to use a Pepper’s Ghost style effect, or even to hide the cast on-stage behind an opaque projection screen and play out their digitally rendered performance on the screen. Live animated theatre, or a digital puppet show. A bit like the Gorillaz…
Motion and performance capture are now an established part of film making, at least for big budget film producers, with digital rushes of digital backlots and digital characters previewed in real time alongside the actors’ performances. It will be interesting to see the extent to which similar techniques might be used as part of a live performance in front of a live audience.