In the post From Magic Lenses to Magic Mirrors and Back Again we saw how magic lenses allow users to look through a screen at a mediated view of the scene in front of them, and magic mirrors allow users to look at a mediated view of themselves. In this post, we will look at how a remote camera might capture a scene that is then mediated in some way before being presented to the viewer in near-real-time. In particular, we will consider how live televised sporting events may be augmented to enhance the viewer’s understanding or appreciation of the event.
Ever since the early days of television, TV graphics have been used to overlay information – often in the “lower third” of the screen – to provide a mediated view of the scene being displayed. For example, one of the most commonly seen lower-third effects is to display a banner giving the name and affiliation of a “talking head”, such as a politician being interviewed in a news programme.
But in recent years, realtime annotation of elements within the visual scene has become possible, providing the producers of sports television in particular with a very rich and powerful way of enhancing the coverage of a particular event with live TV graphics.
EXERCISE: from your own experience, try to recall two or three examples of how “augmented reality” style effects can be used to enhance televised sporting events in a real-time or near-realtime way.
Educators often use questions to focus the attention of the learner on a particular matter. For example, an educator reading an academic paper may identify things of interest (to them) that they want the learner to pick up on. The educator then needs to find a way of directing the attention of the learner to those points of interest. This is often what motivates the questions they set around a resource: by casting a question to which an item in the paper is the answer, they help students learn how to focus their attention on a resource and reflect on why something in the paper might be interesting. When addressing a question, the learner also needs to appreciate that they are expected to answer the question in an academic way. More generally, when you read something, read it with a set of questions in mind that may have been raised by reading the abstract. You can also annotate the reading with the questions that each part of the reading answers. Another trick is to spot when part of the reading answers a question or addresses a topic you didn’t fully understand: “Ah, so that means if this, then that…”. This is a simple trick, but a really powerful one nonetheless, and it can help you develop your own self-learning skills.
EXERCISE: Read through the following abstract taken from a BBC R&D department white paper written in 2012 (Sports TV Applications of Computer Vision, originally published in ‘Visual Analysis of Humans: Looking at People’, Moeslund, T. B.; Hilton, A.; Krüger, V.; Sigal, L. (Eds.), Springer 2011):
This chapter focuses on applications of Computer Vision that help the sports broadcaster illustrate, analyse and explain sporting events, by the generation of images and graphics that can be incorporated in the broadcast, providing visual support to the commentators and pundits. After a discussion of simple graphics overlay on static images, systems are described that rely on calibrated cameras to insert graphics or to overlay content from other images. Approaches are then discussed that use computer vision to provide more advanced effects, for tasks such as segmenting people from the background, and inferring the 3D position of people and balls. As camera calibration is a key component for all but the simplest applications, an approach to real-time calibration of broadcast cameras is then presented. The chapter concludes with a discussion of some current challenges.
How might the techniques described be relevant to / relate to AR?
Now read through the rest of the paper, and try to answer the following questions as you do so:
- what is a “free viewpoint”?
- what is a “telestrator” – to what extent might you claim this is an example of AR?
- what approaches were taken to providing “Graphics overlay on a calibrated camera image”? How does this compare with AR techniques? Is this AR?
- what is Foxtrax and how does it work?
- what effects are possible once you “segment people or other moving objects from the background”? What practical difficulties must be overcome when creating such an effect?
- how might prior knowledge help when constructing tracking systems? What additional difficulties arise when tracking people?
- how can environmental features/signals be used to help calibrate camera settings? what does it even mean to calibrate a camera?
- what difficulties are associated with Segmentation, identification and tracking?
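To get a feel for what camera calibration involves in the simplest (planar) case, the sketch below fits a homography from four known pitch positions to the pixel positions where they appear in the image, then uses it to project a virtual line onto the picture. This is a minimal, pure-Python illustration with made-up coordinates, not the white paper’s method – real broadcast systems estimate full camera models, typically by detecting pitch lines in real time.

```python
# Sketch: planar "calibration" via a homography (illustrative only).
# Given four pitch landmarks (metres) and where they appear in the
# image (pixels), solve for the 3x3 homography H, then project a
# virtual graphic (e.g. an offside line) into the image.

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def homography(world, image):
    """Direct Linear Transform: 4 point pairs -> H, with H[2][2] fixed at 1."""
    A, b = [], []
    for (X, Y), (x, y) in zip(world, image):
        A.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y]); b.append(x)
        A.append([0, 0, 0, X, Y, 1, -y * X, -y * Y]); b.append(y)
    h = solve(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def project(H, X, Y):
    """Map a pitch coordinate (metres) to a pixel position."""
    w = H[2][0] * X + H[2][1] * Y + H[2][2]
    return ((H[0][0] * X + H[0][1] * Y + H[0][2]) / w,
            (H[1][0] * X + H[1][1] * Y + H[1][2]) / w)

# Four pitch landmarks and their observed pixel positions
# (illustrative numbers, not taken from any real broadcast).
world = [(0, 0), (40, 0), (40, 20), (0, 20)]
image = [(120, 600), (900, 580), (820, 220), (180, 240)]
H = homography(world, image)

# A virtual line drawn at X = 25 m across the pitch:
p0 = project(H, 25, 0)
p1 = project(H, 25, 20)
```

Once H is known, any graphic defined in pitch coordinates can be drawn in correct perspective – which is why calibration is described in the paper as a key component for all but the simplest applications.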
The white paper also identifies the following challenges to “successfully applying computer vision techniques to applications in TV sports coverage”:
- The environment in which the system is to be used is generally out of the control of the system developer, including aspects such as lighting, appearance of the background, clothing of the players, and the size and location of the area of interest. For many applications, it is either essential or highly desirable to use video feeds from existing broadcast cameras, meaning that the location and motion of the cameras is also outside the control of the system designer.
- The system needs to fit in with existing production workflows, often needing to be used live or with a short turn-around time, or being able to be applied to a recording from a single camera.
- The system must also give good value-for-money or offer new things compared to other ways of enhancing sports coverage. There are many approaches that may be less technically interesting than applying computer vision techniques, but nevertheless give significant added value, such as miniature cameras or microphones placed in a cricket stump, a ‘flying’ camera suspended on wires above a football pitch, or high frame-rate cameras for super-slow-motion.
To what extent do you think those sorts of issues apply more generally to augmented and mediated reality systems?
In the rest of this post, you will see some examples of how computer-vision-driven television graphics have been used in recent years. As you watch the videos, try to relate the techniques demonstrated with the issues raised in the white paper.
From 2004 to 2010, the BBC R&D department, in association with Red Bee Media, worked on a system known as Piero, now owned by Ericsson, that explored a wide range of augmentation techniques. Watch the following videos and see how many different sorts of “augmentation” effect you can identify. In each case, what sorts of enabling technology do you think are required in order to put together a system capable of generating such an effect?
In the US, SportVision provide a range of real-time enhancements for televised sports coverage. The following video demonstrates car and player tracking in motor-racing and football respectively, ball tracking in baseball and football (soccer), and a range of other “event” related enhancements, such as offside lines or player highlighting in football (soccer).
EXERCISE: watch the SportVision 2012 showreel on the SportVision website. How many different augmented reality style effects did you see demonstrated in the showreel?
For further examples, see the case studies published by vizrt.
The videos include several examples of how items tracked in realtime can be visualised, either to highlight a particular object or feature (such as tracking a player, or highlighting the position of a ball, puck, or car), or to trace out the trajectory followed by an object (for example, highlighting in realtime the path followed by a ball).
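The trajectory-trace effect itself is conceptually simple once per-frame positions are available: keep a short history of tracked positions and draw older points with fading opacity. The sketch below is illustrative only – a simulated parabolic ball flight stands in for the output of a real tracker.

```python
# Sketch of a trajectory-trace overlay: keep a short history of tracked
# positions and render each with an alpha that fades with age. The
# position data here is simulated; a real system would take positions
# from a per-frame ball/player tracker.

from collections import deque

TRAIL_LEN = 10  # frames of history kept in the trail

def ball_position(t):
    """Simulated ball flight: constant horizontal speed, gravity in y."""
    return (30.0 * t, 20.0 * t - 4.9 * t * t)

def trail_overlay(history):
    """Return (x, y, alpha) draw commands, newest point fully opaque."""
    n = len(history)
    return [(x, y, (i + 1) / n) for i, (x, y) in enumerate(history)]

history = deque(maxlen=TRAIL_LEN)  # old positions drop off automatically
for frame in range(25):
    history.append(ball_position(frame / 25.0))
overlay = trail_overlay(history)   # ready to composite over the frame
```

The hard part in practice is not the drawing but the tracking feeding it: segmentation failures, occlusions, and fast motion all corrupt the position history, which is why the white paper devotes so much attention to those stages.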
Having seen some examples of the techniques in action, and perhaps started to ask yourself “how did they do that?”, skim back over the BBC white paper to see if any of the sections jump out at you in answer to your self-posed questions.
In the UK, Hawk-Eye Innovations is one of the best-known providers of such services to UK TV sports viewers.
The following video describes in a little more detail how the Hawk-Eye system can be used to enhance snooker coverage.
And how Hawk-Eye is used in tennis:
In much the same way as sportsmen compete on the field of play, so too do rival technology companies. In the 2010 Ashes series, Hawk-Eye founder Paul Hawkins suggested that a system provided by rivals VirtualEye could lead to inaccurate adjudications due to human operator error compared to the (at the time) more completely automated Hawk-Eye system (The Ashes 2010: Hawk-Eye founder claims rival system is not being so eagle-eyed).
The following video demonstrates how the Virtual Eye ball tracking software worked to highlight the path of a cricket ball as it is being bowled:
EXERCISE: what are the benefits to sports producers from using augmented reality style, realtime television graphics as part of their production?
The following video demonstrates how the SportVision Liveline effect can be used to help illustrate what’s actually happening in an America’s Cup yacht race, which can often be hard for the casual viewer to follow:
EXERCISE: To what extent might such effects be possible in a magic lens style application that could be used by a spectator actually witnessing a live sporting event?
EXERCISE: review some of the video graphics effects projects undertaken in recent years by the BBC R&D department. To what extent do the projects require: a) the modeling of the world with a virtual representation of it; b) the tracking of objects within the visual scene; c) the compositing of multiple video elements, or the introduction of digital objects within the visual scene?
As a quick review of the BBC R&D projects in this area suggests, the development of on-screen graphics that can track objects in real time may be complemented by the development of 3D models of the televised view, so that the scene can be inspected from virtual camera positions, providing a view reconstructed from a model built up from the real camera positions.
Once again, though, there may be a blurring of reality: is the view actually taken from a virtual camera, or from a real one, such as a Spidercam?
As well as overlaying actual footage with digital effects, sports producers are also starting to introduce virtual digital objects into the studio to provide an augmented reality style view of the studio to the viewer at home.
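At the heart of such virtual-studio effects is a keying step that decides, pixel by pixel, whether the camera sees the presenter or the (typically green) backdrop that the virtual set should replace. The crude RGB test below is only a sketch of the idea; production keyers use calibrated hardware and far more sophisticated colour models.

```python
# Sketch of the chroma-key step behind virtual-studio compositing
# (illustrative only; the green-dominance test is a crude stand-in).

def chroma_key(fg_pixels, bg_pixels, threshold=80):
    """Replace 'green enough' camera pixels with the virtual background."""
    out = []
    for (r, g, b), virtual in zip(fg_pixels, bg_pixels):
        if g - max(r, b) > threshold:  # crude green-dominance test
            out.append(virtual)        # show the virtual set
        else:
            out.append((r, g, b))      # keep the real foreground
    return out

studio = [(10, 200, 10), (200, 50, 50)]  # one green-screen pixel, one presenter pixel
virtual = [(1, 2, 3), (4, 5, 6)]         # virtual set rendered for this frame
composited = chroma_key(studio, virtual)
```

A threshold keyer like this fails at exactly the points the white paper flags: shadows, motion blur, and green spill at object edges, which is where more advanced segmentation techniques come in.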
3D graphics are also increasingly being used to dress other elements of the TV studio set. In addition, graphics are being used to enhance TV sports coverage through virtual advertising. Both these approaches will be discussed in another post.
More generally, digital visual effects are used widely across film and television, as we shall also explore in a later post…