Digital Worlds – The Blogged Uncourse

Digital Worlds – Interactive Media and Game Design was originally developed as a free learning resource on computer game design, development and culture, authored as part of an experimental approach to the production of online distance learning materials. Many of the resources presented on this blog also found their way into a formal, for-credit course from the UK’s Open University.

This blog was rebooted at the start of summer 2016 to act as a repository for short pieces relating to mixed and augmented reality, and related areas of media/reality distortion, as preparation for a unit on the subject in a forthcoming first level Open University course.

Smart Hearing

As we have already seen, there are several enabling technologies that need to be in place in order to put together an effective mediated reality system. In a visual augmented reality system, this includes having some sort of device for recording the visual scene, tracking objects within it, rendering augmented features in the scene, and some means of displaying the scene to the user. We reviewed a range of approaches for rendering augmented visual scenes in the post Taxonomies for Describing Mixed and Alternate Reality Systems, but how might we go about implementing an audio based mediated reality?

In Noise Cancellation – An Example of Mediated Audio Reality?, we saw how headphone based systems could be used to present a processed audio signal to a subject directly – a proximal form of mediation, akin to a head mounted display – or how a speaker could be used to provide a more environmental form of mediation, rather more akin to a projection based system in the visual sense.

Whilst enabling technologies for video based proximal AR systems are still at the clunky prototype stage, at best, discreet solutions for realtime, daily use, audio based mediation already exist, complemented in recent years by advanced digital signal processing techniques, in the form of hearing aids.

The following promotional video shows how far hearing aids have developed in recent years, moving from simple amplifiers to complex devices combining digital signal processing of the audio environment with integration with other audio generating devices, such as phones, radios and televisions.

To manage the range of features offered by such devices, they are complemented by full featured remote control apps that allow the user to control what they hear, as well as how they hear it – audio hyper-reality:

The following video review of the here “Active Listening” earbuds further demonstrates how “audio wearables” can provide a range of audio effects – and capabilities – that can augment the hearing of a wearer who does not necessarily suffer from hearing loss or some other form of hearing impairment. (If you’d rather read a review of the same device, the Vice Motherboard blog has one – These Earbuds Are Like Instagram Filters for Live Music.)

SAQ: What practical challenges face designers of in-ear, wirelessly connected audio devices?
Answer: I can think of two immediately: how is the wireless signal received (what sort of antenna is required?) and how is the device powered?

Hearing aids are typically comprised of several elements: an earpiece that transmits sound into the ear; a microphone that receives the sound; an amplifier that amplifies the sound; and a battery pack that powers the device. Digital hearing aids may also include remote control circuitry to allow the hearing aid to be controlled remotely; circuitry to support digital signal processing of the received sound; and even a wireless receiver capable of receiving and then replaying sound files or streams from a mobile phone or computer.

Digital hearing aids can be configured to tune the frequency response of the device to suit the needs of each individual user as the following video demonstrates.
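
By way of illustration, the following Python fragment is a minimal sketch of the general idea: derive a different gain for each frequency band from an "audiogram" of per-band hearing loss, then apply those gains to an audio signal in the frequency domain. The band boundaries, the simple "half-gain" rule of thumb and the test tone are all illustrative assumptions, not any manufacturer's fitting algorithm.

```python
import numpy as np

def fit_gains(audiogram_db):
    # Illustrative "half-gain" rule of thumb: amplify each band by half the
    # measured hearing loss in that band (a textbook simplification, not a
    # clinical prescription).
    return {band: loss_db / 2.0 for band, loss_db in audiogram_db.items()}

def apply_band_gains(signal, sample_rate, band_gains_db):
    # Apply a different gain to each frequency band in the FFT domain.
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    for (lo, hi), gain_db in band_gains_db.items():
        mask = (freqs >= lo) & (freqs < hi)
        spectrum[mask] *= 10 ** (gain_db / 20.0)   # convert dB to linear gain
    return np.fft.irfft(spectrum, n=len(signal))

# An illustrative audiogram: high-frequency loss of the kind associated with
# age-related hearing loss, expressed as dB of loss per frequency band (Hz).
audiogram = {(250, 1000): 10, (1000, 2000): 25, (2000, 4000): 40, (4000, 8000): 55}
gains = fit_gains(audiogram)

sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
test_tone = np.sin(2 * np.pi * 3000 * t)          # a 3 kHz test tone
boosted = apply_band_gains(test_tone, sample_rate, gains)
print(round(boosted.max() / test_tone.max(), 1))  # roughly 10x, i.e. about +20 dB
```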

Hearing aids come in a range of form factors – NHS Direct describes the following:

  • Behind-the-ear (BTE): rests behind the ear and sends sound into the ear through either an earmould or a small, soft tip (called an open fitting)
  • In-the-ear (ITE): sits in the ear canal and the shell of the ear
  • In-the-canal (ITC): working parts in the earmould, so the whole hearing aid fits inside the ear canal
  • Completely-in-the-canal (CIC): fits further into your ear canal than an ITC aid

Age UK further identify two forms of spectacle hearing aid systems – bone conduction devices and air conduction devices – that are suited to different forms of hearing loss:

With a conductive hearing loss there is some physical obstruction to conducting the sound through the outer ear, eardrum or middle ear (such as a wax blockage, or perforated eardrum). This can mean that the inner ear or nerve centre on that ear is in good shape, and by sending sound straight through the bone behind a patient’s ear the hearing loss can effectively be bypassed. Bone Conduction or “BC” spectacle hearing aids are ideal for this because a transducer is mounted in the arm of the glasses behind the ear that will transmit the sound through the bone to the inner ear instead of along the ear canal.

Sensorineural hearing loss occurs when the anatomical site responsible for the deficiency is the inner ear or further along the auditory pathway (such as age related loss or noise induced hearing loss). Delivering the sound via a route other than the ear canal will not help in these cases, so Air Conduction “AC” spectacle hearing aids are utilised with a traditional form of hearing aid discreetly mounted in the arm of the glasses and either an earmould or receiver with a soft dome in the ear canal.

The following video shows how the frames of digital hearing glasses can be used to package the components required to implement the hearing aid.

And the following promotional video shows in a little more detail how the glasses are put together – and how they are used in everyday life (with a full range of digital features included!).

EXERCISE: Read the following article from The Atlantic – “What My Hearing Aid Taught Me About the Future of Wearables”. What does the author think wearable devices need to offer to make the user want to wear them? How does the author’s experience of wearing a hearing aid colour his view of how wearable devices might develop in the near future?

Many people wear spectacles and/or hearing aids as part of their everyday life, “boosting” their perception of the reality around them in particular ways in order to compensate for less than perfect eyesight or hearing. Advances in hearing aids suggest that many hearing aid users may already be benefiting from reality augmentations that people without hearing difficulties may also value. And whilst wearing spectacles to correct for poor vision is commonplace, it is also possible to wear eyewear without a corrective function as a fashion item or accessory. Devices such as hearing spectacles already provide a means of combining battery powered, wirelessly connected audio with “passive” visual enhancements (corrective lenses). So might we start to see those sorts of device evolving into augmented reality headwear?

Even if the Camera Never Lies, the Retouched Photo Might…

In Hyper-reality Offline – Creating Videos from Photos, we saw how a single flat image could be transformed in order to provide a range of video effects. In this post, we’ll review some of the other ways in which photographs of real objects may be transformed to slightly less real – or should that be hyper-real – objects, and consider some of the questions such manipulations raise about the authenticity of photographic images.

In 2014, an unsuccessful bill was introduced to the US House of Representatives that sought to introduce controls around “photoshopping”, based on the principle that “altered images [of models’ faces and bodies] can create distorted and unrealistic expectations and understandings of appropriate and healthy weight and body image” (Slate: Legislating Realism). The Bill reappeared in the 114th session of Congress in 2016 as H.R.4445 – Truth in Advertising Act of 2016:

This bill directs the Federal Trade Commission (FTC) to submit a report to Congress assessing the prevalence, in advertisements and other media for the promotion of commercial products and services in the United States, of images that have been altered to materially change the appearance and physical characteristics of the faces and bodies of the individuals depicted. The report must contain: (1) an evaluation of the degree to which such use of altered images may constitute an unfair or deceptive act or practice, (2) guidelines for advertisers regarding how the FTC determines whether such use constitutes an unfair or deceptive act or practice, and (3) recommendations reflecting a consensus of stakeholders and experts to reduce consumer harm arising from such use.

OPTIONAL READING: Forsey, Logan A., “Towards a Workable Rubric for Assessing Photoshop Liability” (2013). Law School Student Scholarship. Paper 222. http://scholarship.shu.edu/student_scholarship/222

Photoshopping – the use of digital photograph editors such as Adobe Photoshop – is a form of reality mediation in which a photograph is transformed to “improve it”. In many cases, photoshopped images may be viewed as examples of “hyper-reality” in that no additional information over and above the scene captured in the photograph is introduced, but elements of the scene may be highlighted, or “tidied up”.

As you might expect, the intention behind many advertising campaigns is to present a product in the best possible light whilst at the same time not misrepresenting it, which makes the prospect of using photo-manipulation attractive. The following promotional video from the marketers at McDonald’s Canada – Behind the scenes at a McDonald’s photo shoot – shows how set dressing and post-production photo manipulation are used to present the product in the best possible light, whilst still maintaining some claims about the “truthfulness” of the final image.

Undoubtedly, many food photographers manipulate reality whilst at the same time being able to argue that the final photograph is a “fair” representation of it. In a similar way, fashion photography relies on the manipulation of “real” models prior to the digital manipulation of their captured likenesses. A behind the scenes video – Dove Evolution – from beauty product manufacturer, Dove, shows just how much transformation of the human model is applied prior to a photo-shoot, as well as how much digital manipulation is applied after it.

Let’s see in a little more detail how the images can be transformed. One common way of retouching photos is to map a photographed object, which may be a person, onto a two dimensional mesh. Nodes in the mesh map onto points of interest in the image. As the mesh is transformed by dragging its nodes around, so too is the image: points of interest track the repositioned nodes, and the image content in each cell, or grid element, of the mesh is transformed appropriately.
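
To see the principle in code, the following is a minimal sketch using scikit-image's piecewise affine transform: an image is covered with a regular mesh of control points, the points are "dragged", and the content of each mesh cell is warped to follow its nodes. The test image, the 10x10 grid and the smooth sinusoidal drag are illustrative stand-ins for a retoucher moving points of interest by hand.

```python
import numpy as np
from skimage import data
from skimage.transform import PiecewiseAffineTransform, warp

image = data.astronaut()                      # any RGB test image
rows, cols = image.shape[:2]

# Lay a regular mesh of control points over the image.
grid_rows, grid_cols = np.meshgrid(np.linspace(0, rows, 10),
                                   np.linspace(0, cols, 10))
src = np.dstack([grid_cols.flat, grid_rows.flat])[0]   # (x, y) node positions

# "Drag" the nodes: shift each node horizontally by a smoothly varying amount.
dst = src.copy()
dst[:, 0] += 15 * np.sin(np.linspace(0, 3 * np.pi, src.shape[0]))

# Estimate a piecewise affine transform from the original mesh to the dragged
# mesh; the image content in each mesh cell is transformed to follow its nodes.
tform = PiecewiseAffineTransform()
tform.estimate(src, dst)
warped = warp(image, tform, output_shape=(rows, cols))
```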

As well as transforming body shapes, faces can be retouched in a similar way.

Surrounded as we are each day by commercially produced images, it’s important to consider the range of ways in which reality might be manipulated before it is presented to us.

EXERCISE: even if we may be a little sceptical around claims of truth in advertising, we typically expect factual documentaries to present a true view of the world, for some definition of truth. In 2015, the BBC were called to account over a documentary that appeared to depict a particular volcanic eruption, but that was actually a sequence composited from more than one eruption. See if you can track down one or two reports of the episode. What charges did the critics lay, and how did the documentary makers respond? What sort of editorial guidelines does the BBC follow in the production of natural history film-making?

EXAMPLE RESOURCES:
http://www.independent.co.uk/arts-entertainment/tv/features/bbc-feels-the-commercial-chill-of-fake-documentary-6276155.html
http://www.bbc.co.uk/editorialguidelines/guidance/natural-world/guidance-full
http://www.theguardian.com/media/2015/oct/03/bbc-lightning-erupting-volcano-fake-patagonia
http://www.bbc.co.uk/earth/story/20151002-how-to-capture-a-volcano

EXERCISE: As well as advertising and documentary making, news journalists may also be tempted to photoshop images for narrative or impact effect. Watch the following video, pausing it where appropriate to note the different ways in which the photographs reviewed have been manipulated. To what extent, if any, do you think manipulations of that sort would be justifiable in a news reporting context?

A post on the Hacker Factor blog – Body By Victoria – describes how a photographic image may be looked at forensically in order to find evidence of photoshopping.

As the Photoshop tools demonstrate, by mapping a photograph onto a mesh, transforming the mesh and then stretching pixel values or in-painting in a context sensitive way, photographic images can be reshaped whilst still maintaining some sort of integrity in the background, at least to the casual observer.

Increasingly, there are similarities between the tools used to create digital objects from scratch, and the tools used to manipulate real world objects captured into digital form. The computer uses a similar form of representation in each case – a mesh – and supports similar sorts of manipulation on each sort of object. As you will see in several other posts, the ability to create photorealistic objects as digital objects from scratch on the one hand, and the ability to capture the form, likeness and behaviour of a physical object into a digital form means that we are truly starting to blur the edges of reality around any image viewed through a screen.

Hyper-reality Offline – Creating Videos from Photos

In Mediating the Background and the Foreground – From Green Screen and Chroma-Key Effects to Virtual Sets we saw how green screen/chroma key effects could be used to mask out part of one image so that it could be composited with another. In this post, you’ll see how we can also generate animation effects from a single image.

Many of you will recognise the following effect from television documentaries, as well as screen savers or photo-stories:

Known as the Ken Burns effect, named after the documentary maker who made extensive use of the technique, it allows a moving image to be generated from a still photograph by panning and zooming across the image.
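
The effect is simple to reproduce: crop a window that slowly moves and shrinks across the still image, and resize each crop to the output frame size. The following Python sketch, assuming OpenCV is available, does just that; the file name, output size, zoom range and duration are illustrative choices.

```python
import cv2
import numpy as np

def ken_burns(image, out_size=(640, 360), n_frames=150,
              start_zoom=1.0, end_zoom=1.4):
    """Yield frames that slowly zoom into `image` while panning from the
    left edge towards the right."""
    h, w = image.shape[:2]
    for i in range(n_frames):
        t = i / (n_frames - 1)
        zoom = start_zoom + t * (end_zoom - start_zoom)
        crop_w, crop_h = int(w / zoom), int(h / zoom)
        x0 = int(t * (w - crop_w))          # pan: slide the crop window right
        y0 = (h - crop_h) // 2
        crop = image[y0:y0 + crop_h, x0:x0 + crop_w]
        yield cv2.resize(crop, out_size, interpolation=cv2.INTER_LINEAR)

image = cv2.imread("photo.jpg")             # any still photograph (illustrative name)
writer = cv2.VideoWriter("ken_burns.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"), 30, (640, 360))
for frame in ken_burns(image):
    writer.write(frame)
writer.release()
```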

But what happens if you take a flat, static image, separate out the foreground and background elements, and then apply the effect, panning and zooming foreground and background elements differentially to create a “2.5D” parallax effect?

These views can be created from a single, flat image by cutting the foreground component out into its own layer, and then inpainting the background layer; when the foreground component moves relative to the background, the inpainted area hides the fact that that part of the original image was taken up by the foreground component.

The inpainting effect can be achieved by applying an image processing technique that works from the edge of a cropped area inwards, trying to predict what value each missing neighbouring pixel should be based on the actual values of the surrounding pixels. More elaborate techniques allow for “content aware” fills, in which patterns generated from the surrounding texture are used to fill in the missing area. The following video shows how to apply such a content aware effect in a popular photo-editing tool.
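
As a concrete, minimal sketch of the edge-inwards approach, OpenCV's inpainting function (here the Telea algorithm) fills a masked region from its boundary pixels; it is a simple stand-in for the more elaborate content aware fills discussed above. The file names and the rectangular mask are illustrative.

```python
import cv2
import numpy as np

image = cv2.imread("scene.jpg")             # illustrative file name

# A white region in the mask marks the pixels to reconstruct, e.g. the hole
# left behind after cutting a foreground figure out into its own layer.
mask = np.zeros(image.shape[:2], dtype=np.uint8)
cv2.rectangle(mask, (200, 150), (300, 400), 255, thickness=-1)

# Fill the masked area by working inwards from its edge.
filled = cv2.inpaint(image, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
cv2.imwrite("scene_inpainted.jpg", filled)
```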

An extension of the technique – content aware crop – automatically inpaints the whitespace left around the edge of an image when its aspect ratio or framing is changed, for example after straightening the horizon.

Developing algorithms for improved content aware fills is an active area of academic, as well as commercial, research (e.g. Pathak, Deepak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, and Alexei A. Efros. “Context Encoders: Feature Learning by Inpainting.” arXiv preprint arXiv:1604.07379 (2016)).

Related techniques can be used to improve the quality of images, as demonstrated by the Magic Pony Technology company (MIT Technology Review – Artificial Intelligence Can Now Design Realistic Video and Game Imagery).

Additional 2.5D effects can be created by animating both the foreground and background elements. Alternatively, by associating a mesh with particular points in a photo, translating those points appropriately results in the animation of the meshed element.

These effects are all based on the manipulation of pixels within a static image. But as you’ll see in another post, flat images can also be used as the basis for generating three dimensional models.

Tuning the colour palette of an image is another technique that can be used to make it feel hyper-real, or somehow sharper than the captured reality. Similar techniques can also be applied to video to create a stylised hyper-real video effect.
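
A minimal sketch of one such grade, assuming OpenCV: boost saturation and stretch contrast in HSV space so colours appear more vivid than the captured scene. The boost factors and file names are illustrative.

```python
import cv2
import numpy as np

image = cv2.imread("photo.jpg")                                   # illustrative file name
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV).astype(np.float32)
hsv[..., 1] = np.clip(hsv[..., 1] * 1.4, 0, 255)                  # saturation boost
hsv[..., 2] = np.clip((hsv[..., 2] - 128) * 1.2 + 128, 0, 255)    # contrast stretch
graded = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
cv2.imwrite("photo_hyperreal.jpg", graded)
```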

As you are perhaps beginning to realise, many mediated reality effects rely on a whole stack of technologies, techniques or other effects being available first. But this in turn means that many of today’s yet-to-be invented techniques are likely to be built from a novel combination of techniques that already exist, or that can be built on; and once those new techniques are identified, and tools built to implement them efficiently, they in turn will provide the basis for yet more techniques.

Mediating the Background and the Foreground – From Green Screen and Chroma-Key Effects to Virtual Sets

It may be hard to remember now, but the first digital cameras only started to appear on the shelves in 1990, to be displaced for many just a decade or so later by camera-equipped smartphones. Prior to that, cameras were film based, or produced “self-developing” polaroid photos, printed by the camera itself. Many film based cameras required the film to be manually “wound on” between taking one photograph and the next. Failure to do this could result in a particular piece of film being double exposed, with the result that two photographs could be superimposed. Such tricks were well known to photographers and film makers alike, and multiple exposure techniques, along with other tricks of the photographer’s trade, were widely used for creating otherwise impossible to record scenes.

In Behind the Scenes of Sports Broadcasting – Virtual Sets, Virtual Signage and Virtual Advertising, we saw how sports broadcasters could make use of virtual sets to enhance outside, on-location settings. Virtual sets are increasingly used by broadcasters for a wide range of other live television formats, such as news and politics, with digital objects often appearing in front of the presenters. This contrasts with a traditional green screen effect where the background behind the presenter is replaced. In film studios, virtual “backlots” may make extensive use of green screens to replace the need for unwieldy physical sets with digital ones that are rendered in post production.

So how do green screen, or “chroma key” effects work?

Green Screen Effects

Green screen style effects have a long history in film and television and were available long before digital green screen effects became available. The effect relies on producing a matte, or travelling matte, from an image that allows elements from two separate images to be combined in a single image, a process referred to as compositing. By making part of one image transparent, it can be layered on top of another background image.
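
In digital terms the matte is just a per-pixel transparency map. The following Python sketch, assuming OpenCV, keys out pixels whose hue falls in a “green” range, softens the resulting matte, and composites the foreground over a new background. The hue thresholds and file names are illustrative.

```python
import cv2
import numpy as np

foreground = cv2.imread("presenter_green_screen.jpg")   # illustrative file names
background = cv2.imread("virtual_backdrop.jpg")
background = cv2.resize(background, foreground.shape[1::-1])

# Build the matte: mark pixels whose hue sits in the "green screen" range.
hsv = cv2.cvtColor(foreground, cv2.COLOR_BGR2HSV)
green_mask = cv2.inRange(hsv, (40, 60, 60), (80, 255, 255))
matte = cv2.GaussianBlur(green_mask, (5, 5), 0) / 255.0  # soften the edges
matte = matte[..., None]                                  # broadcast over colour channels

# Composite: where the matte is 1 (green screen), take the background instead.
composite = (foreground * (1 - matte) + background * matte).astype(np.uint8)
cv2.imwrite("composite.jpg", composite)
```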

So what happened next?

(The earlier parts of the above video also retrace the history of the green screen effect.)

Even without digital technologies and the introduction of virtual digital objects, compositing multiple takes of a few human actors can be used to generate a visual scene that appears to include a cast of thousands:

The following showreel provides some examples of how the green screen effect has been put to use as a virtual backlot for movies:

Exercise: see if you can find behind the scenes footage of the visual effects – VFX – used to create one or two of your recent favourite films.

One problem with the chroma key approach is that much of the magic is done in post-production, rather than in real time. But as you will know from watching TV weather reports, chroma-key effects can be used for real time mediation of video imagery. And increasingly, green screen techniques can be used to produce a virtual studio or virtual set in real time, with no post production required:

SAQ: What are the similarities and differences between virtual studio or virtual set and chroma key techniques?

The virtual set itself is a 3D digital model that is rendered around the human presenter(s).

An important part of the system is the ability to track the location of the camera, as well as physical objects within the set.

Replacing part of the visual scene using a chroma-key effect is a tried and trusted technique which, as work on virtual sets shows, can be used to support real-time mediated reality effects. Tracking objects within the set allows digital objects to be overlaid on those tracked physical objects. But object tracking in the form of motion capture, and the even more refined performance capture, can be used as the basis for far more elaborate visual effects, as we’ll see in another post.

But first, let’s step aside for a moment, and see how the notion of image layers can be used to transform a single photograph into a short video…

Behind the Scenes of Sports Broadcasting – Virtual Sets, Virtual Signage and Virtual Advertising

In the post Augmented TV Sports Coverage & Live TV Graphics, we saw how live TV graphics could be used to overlay sports events in order to highlight particular elements of the sports action.

One of the things you may have noticed in some of the broadcasts was that as well as live “telestrator” style effects, such as highlighting the trajectory of a ball, or participant tracking effects, many of the scenes also included on-pitch advertising. So was the pitch really painted with large adverts, or were they digital effects? The following showreel from Namadgi Systems (which in its full form demonstrates many of the effects shown in the previously mentioned post) suggests that the on-pitch adverts are, in fact, digital creations. Other vendors of similar services include Broadcast Virtual and BrandMagic.

So-called virtual advertising allows digitally rendered adverts to be embedded into live broadcast feeds in a way that makes the adverts appear as if they are situated on or near the field of play. As such, to the viewer of the broadcast, it may appear as if the advert would be visible to the spectators present at the event. In fact, it may be the case that the insert is an entirely digital creation, an overlay on top of some sort of distinguished marker or location (determined relative to an easily detected pitch boundary, for example), or a replacement of a static, easily recognised and masked local advert.
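
One simple way to place such an insert is a perspective warp: the advert image is mapped onto four known corner points of a pitch region in the broadcast frame. The following Python sketch, assuming OpenCV, shows the idea; the corner coordinates and file names are illustrative, and a real system would obtain the destination points from camera calibration or pitch-line tracking rather than hard-coding them.

```python
import cv2
import numpy as np

frame = cv2.imread("broadcast_frame.jpg")    # illustrative file names
advert = cv2.imread("advert.png")

h, w = advert.shape[:2]
src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
# Where the advert's corners should land on the pitch in this frame.
dst = np.float32([[420, 300], [780, 320], [760, 420], [400, 390]])

# Warp the advert into the frame's perspective.
H = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(advert, H, frame.shape[1::-1])

# Overlay only where the warped advert has content.
mask = cv2.warpPerspective(np.full((h, w), 255, np.uint8), H, frame.shape[1::-1])
frame[mask > 0] = warped[mask > 0]
cv2.imwrite("frame_with_virtual_ad.jpg", frame)
```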

EXERCISE: Watch the following video and see how many different forms of virtual advertising you can detect.

So how many different ways of delivering mediated reality ads did you find?

The following marketing video from Supponor advertises their “digital billboard replacement” (DBRLive) product, which is capable of identifying and tracking track-side or pitch-side advertising hoardings and replacing them with custom adverts.

EXERCISE: what do you think are the advantages of using digital signage over fixed advertising billboards? What further advantages do “replacement” techniques such as DBRLive have over traditional digital signage? To what extent do you think DBRLive is a mediated reality application?

As well as transforming the perimeter, and even the playing area, with digital adverts, sports broadcasters often present a mediated view of the studio set inhabited by the host and selected pundits to provide continuity during breaks in the action, as the following corporate video from vizrt describes:

So how do virtual sets work and how do they compare with the “chroma key” effects used in TV and film production since the 1940s? We’ll need another post for that…

From Sports Tracking to Surveillance Tracking…

In the post Augmented TV Sports Coverage & Live TV Graphics, we saw how sports broadcasters increasingly make use of effects that highlight tracked elements in a sporting event, from the players in a football match to the ball they are playing with. So how else might we apply such tracking technologies?

According to Melvin Kranzberg’s first law of technology, “Technology is neither good nor bad; nor is it neutral”. In the sports context, we may be happy to think that cameras can be used to track – and annotate – each player’s every move. But what if we take such technological capabilities and apply them elsewhere?

EXERCISE: As well as being used to support referees making decisions about boundary line events, such as whether a tennis ball landed “in” or “out”, or whether a football crossed the goal line, how might virtual boundaries be used as part of a video surveillance system? To what extent could image tracking systems also be used as part of a video surveillance system?

One way of using virtual boundaries as part of a video based surveillance system might be to use them as virtual trip wires, where breaches of a virtual boundary or fence can be used to flag a warning about a possible physical security breach and perhaps start a detailed recording of the scene.

ASIDE: The notion of virtual tripwires extends into other domains too. For example, for objects tracked using GPS, “geo-fences” can be defined that raise an alert when a tracked object enters, or leaves, a particular geographic area. The AIS ship identification system used to uniquely identify ships – and their locations – can be used as part of a geofenced application to raise an alert whenever a particular boat, such as a ferry, enters or leaves a port.
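
A geo-fence check of this sort can be very simple: test whether a tracked position has crossed into, or out of, a polygonal area between successive position reports. The following Python sketch shows the idea; the harbour polygon and coordinates are made-up illustrations, and a real AIS application would use reported ship positions instead.

```python
from matplotlib.path import Path

# A simple polygon around a hypothetical harbour, as (lon, lat) pairs.
harbour = Path([(-1.30, 50.75), (-1.25, 50.75), (-1.25, 50.80), (-1.30, 50.80)])

def check_geofence(previous, current, fence=harbour):
    """Return an alert when a tracked position crosses the fence boundary."""
    was_inside = fence.contains_point(previous)
    is_inside = fence.contains_point(current)
    if not was_inside and is_inside:
        return "ENTERED"
    if was_inside and not is_inside:
        return "LEFT"
    return None

# A ferry moving from open water into the harbour raises an "ENTERED" alert.
print(check_geofence((-1.35, 50.77), (-1.27, 50.77)))
```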

Video surveillance might also be used to track individuals through a videoed scene. For example, if a person of interest has been detected in a particular piece of footage, they might be automatically tracked through that scene. If multiple cameras cover the same area, persons of interest may be tracked across multiple video feeds, as described by Khan, Sohaib, Omar Javed, Zeeshan Rasheed, and Mubarak Shah. “Human tracking in multiple cameras.” In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, vol. 1, pp. 331-336. IEEE, 2001.

Where the environment is rather more constrained, such as an office block, tools such as the FXPAL DOTS Video Surveillance System allow for individuals to be tracked throughout the building. Optional filters also allow tracking or identification based on the colour of clothing, which may be meaningful in an environment where different colour uniforms or protective clothing are used to identify people by role – and perhaps by different access permission levels.

Tracking is one thing – but identification of tracked entities is another. In some situations, however, tracked entities may carry clearly seen identifiers – such as car number plates. Automatic Number Plate Recognition (ANPR) is now a mature technology and is widely deployed against moving, as well as stationary, vehicles.

With technology firmly in place for tracking objects, and perhaps even identifying them, analysts are now turning their attention to systems that are capable of automatically identifying different events, or behaviours, within a visual scene, a step up from the simple “threshold crossing” behaviours used to implement virtual tripwires.

Once behaviours have been automatically identified, the visual scene may be overlaid with a statement of, or interpretation of, those behaviours.

Many technologies are developed for a particular purpose, but that does not prevent them being adopted for other purposes. When new technologies emerge, there are often many opportunities for businesses and entrepreneurs to find ways of using those technologies either on their own or in combination with other technologies. However, there are also risks, not least that the technology is used for a harmful purpose, or for one that we do not approve of. It is more difficult still to predict what the consequences of using such technologies widely may be. As technologists, it’s our job to try to think critically about how emerging technologies may be used, whether for good or evil, and to contribute to debates about whether we want to approve the use of such technologies, or limit them in some way.

Augmented TV Sports Coverage & Live TV Graphics

In the post From Magic Lenses to Magic Mirrors and Back Again we saw how magic lenses allow users to look through a screen at a mediated view of the scene in front of them, and magic mirrors allow users to look at a mediated view of themselves. In this post, we will look at how a scene might be captured, mediated in some way, and then presented to a remote viewer in near real time. In particular, we will consider how live televised sporting events may be augmented to enhance the viewer’s understanding or appreciation of the event.

Ever since the early days of television, TV graphics have been used to overlay information – often in the “lower third” of the screen – to provide a mediated view of the scene being displayed. For example, one of the most commonly seen lower third effects is a banner giving the name and affiliation of a “talking head”, such as a politician being interviewed in a news programme.
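
At its simplest, such an overlay is just graphics drawn onto each video frame before broadcast. The following Python sketch, assuming OpenCV, draws a semi-transparent lower-third banner with a name and affiliation onto a single frame; the file name, text and layout values are illustrative.

```python
import cv2

frame = cv2.imread("interview_frame.jpg")     # illustrative file name
h, w = frame.shape[:2]

# Semi-transparent banner across the lower third of the frame.
overlay = frame.copy()
cv2.rectangle(overlay, (0, int(h * 0.78)), (w, int(h * 0.92)), (40, 40, 40), -1)
frame = cv2.addWeighted(overlay, 0.6, frame, 0.4, 0)

# Name and affiliation of the "talking head" (illustrative text).
cv2.putText(frame, "Jane Doe", (40, int(h * 0.84)),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 255), 2)
cv2.putText(frame, "Member of Parliament", (40, int(h * 0.90)),
            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (200, 200, 200), 2)
cv2.imwrite("frame_lower_third.jpg", frame)
```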

But in recent years, realtime annotation of elements within the visual scene has become possible, providing the producers of sports television in particular with a very rich and powerful way of enhancing the coverage of a particular event with live TV graphics.

EXERCISE: from your own experience, try to recall two or three examples of how “augmented reality” style effects can be used to enhance televised sporting events in a real-time or near-realtime way.

Educators often use questions to focus the learner’s attention on a particular matter. For example, an educator reading an academic paper may identify things of interest (to them) that they want the learner to pick up on, and then needs to find a way of directing the learner’s attention to those points of interest. This is often what motivates the questions set around a resource: their purpose is to help students learn how to focus their attention on the resource and immediately reflect back why something in it might be interesting, by casting a question to which the item in the paper is the answer. When addressing a question, the learner also needs to appreciate that they are expected to answer it in an academic way. More generally, when you read something, read it with a set of questions in mind, perhaaps raised by reading the abstract. You can also annotate the reading with the questions that each part of it answers. Another trick is to spot when part of the reading answers a question or addresses a topic you didn’t fully understand: “Ah, so that means if this, then that…”. This is a simple trick, but a really powerful one nonetheless, and it can help you develop your own self-learning skills.

EXERCISE: Read through the following abstract taken from a BBC R&D department white paper written in 2012 (Sports TV Applications of Computer Vision, originally published in ‘Visual Analysis of Humans: Looking at People’, Moeslund, T. B.; Hilton, A.; Krüger, V.; Sigal, L. (Eds.), Springer 2011):

This chapter focuses on applications of Computer Vision that help the sports broadcaster illustrate, analyse and explain sporting events, by the generation of images and graphics that can be incorporated in the broadcast, providing visual support to the commentators and pundits. After a discussion of simple graphics overlay on static images, systems are described that rely on calibrated cameras to insert graphics or to overlay content from other images. Approaches are then discussed that use computer vision to provide more advanced effects, for tasks such as segmenting people from the background, and inferring the 3D position of people and balls. As camera calibration is a key component for all but the simplest applications, an approach to real-time calibration of broadcast cameras is then presented. The chapter concludes with a discussion of some current challenges.

How might the techniques described be relevant to / relate to AR?

Now read through the rest of the paper, and try to answer the following questions as you do so:

  • what is a “free viewpoint”?
  • what is a “telestrator” – to what extent might you claim this is an example of AR?
  • what approaches were taken to providing “Graphics overlay on a calibrated camera image”? How does this compare with AR techniques? Is this AR?
  • what is Foxtrax and how does it work?
  • what effects are possible once you “segment people or other moving objects from the background”? What practical difficulties must be overcome when creating such an effect?
  • how might prior knowledge help when constructing tracking systems? What additional difficulties arise when tracking people?
  • how can environmental features/signals be used to help calibrate camera settings? what does it even mean to calibrate a camera?
  • what difficulties are associated with segmentation, identification and tracking?

The white paper also identifies the following challenges to “successfully applying computer vision techniques to applications in TV sports coverage”:

  • The environment in which the system is to be used is generally out of the control of the system developer, including aspects such as lighting, appearance of the background, clothing of the players, and the size and location of the area of interest. For many applications, it is either essential or highly desirable to use video feeds from existing broadcast cameras, meaning that the location and motion of the cameras is also outside the control of the system designer.

  • The system needs to fit in with existing production workflows, often needing to be used live or with a short turn-around time, or being able to be applied to a recording from a single camera.
  • The system must also give good value-for-money or offer new things compared to other ways of enhancing sports coverage. There are many approaches that may be less technically interesting than applying computer vision techniques, but nevertheless give significant added value, such as miniature cameras or microphones placed in a cricket stump, a ‘flying’ camera suspended on wires above a football pitch, or high frame-rate cameras for super-slow-motion.

To what extent do you think those sorts of issues apply more generally to augmented and mediated reality systems?

In the rest of this post, you will see some examples of how computer vision driven television graphics have been used in recent years. As you watch the videos, try to relate the techniques demonstrated to the issues raised in the white paper.

From 2004 to 2010, the BBC R&D department, in association with Red Bee Media, worked on a system known as Piero, now owned by Ericsson, that explored a wide range of augmentation techniques. Watch the following videos and see how many different sorts of “augmentation” effect you can identify. In each case, what sorts of enabling technology do you think are required in order to put together a system capable of generating such an effect?

In the US, SportVision provide a range of real-time enhancements for televised sports coverage. The following video demonstrates car and player tracking in motor-racing and football respectively, ball tracking in baseball and football (soccer), and a range of other “event” related enhancements, such as offside lines or player highlighting in football (soccer).

EXERCISE: watch the SportVision 2012 showreel on the SportVision website. How many different augmented reality style effects did you see demonstrated in the showreel?

For further examples, see the case studies published by vizrt.

Watching the videos, there are several examples of how items tracked in realtime can be visualised, either to highlight a particular object or feature (such as tracking a player, or highlighting the position of a ball, puck, or car), or to trace out the trajectory followed by the object (for example, highlighting in realtime the path followed by a ball).

Having seen some examples of the techniques in action, and perhaps started to ask yourself “how did they do that?”, skim back over the BBC white paper to see if any of the sections jump out at you in answer to your self-posed questions.

In the UK, Hawk-Eye Innovations is one of the most well known providers of such services to UK TV sports viewers.

The following video describes in a little more detail how the Hawk-Eye system can be used to enhance snooker coverage.

And how Hawk-Eye is used in tennis:

In much the same way as sportsmen compete on the field of play, so too do rival technology companies. In the 2010 Ashes series, Hawk-Eye founder Paul Hawkins suggested that a system provided by rivals VirtualEye could lead to inaccurate adjudications due to human operator error compared to the (at the time) more completely automated Hawk-Eye system (The Ashes 2010: Hawk-Eye founder claims rival system is not being so eagle-eyed).

The following video demonstrates how the Virtual Eye ball tracking software worked to highlight the path of a cricket ball as it is being bowled:

EXERCISE: what are the benefits to sports producers from using augmented reality style, realtime television graphics as part of their production?

The following video demonstrates how the SportVision Liveline effect can be used to help illustrate what’s actually happening in an America’s Cup yacht race, which can often be hard to follow for the casual viewer:

EXERCISE: To what extent might such effects be possible in a magic lens style application that could be used by a spectator actually witnessing a live sporting event?

EXERCISE: review some of the video graphics effects projects undertaken in recent years by the BBC R&D department. To what extent do the projects require: a) the modeling of the world with a virtual representation of it; b) the tracking of objects within the visual scene; c) the compositing of multiple video elements, or the introduction of digital objects within the visual scene?

As a quick review of the BBC R&D projects in this area suggests, the development of on-screen graphics that can track objects in real time may be complemented by the development of 3D models of the televised view so that it can be inspected from virtual camera positions, providing a view of the scene that is reconstructed from a model built up from the real camera positions.

Once again, though, there may be a blurring of reality – because is the view actually taken from a virtual camera, or a real one such as in the form of a Spidercam?

As well as overlaying actual footage with digital effects, sports producers are also starting to introduce virtual digital objects into the studio to provide an augmented reality style view of the studio to the viewer at home.

3D graphics are also increasingly being used to dress other elements of TV studio sets. In addition, graphics are being used to enhance TV sports through the use of virtual advertising. Both these approaches will be discussed in another post.

More generally, digital visual effects are used widely across film and television, as we shall also explore in a later post…

