In the post “Taxonomies for Describing Mixed and Alternate Reality Systems” we introduced several schemes for categorising and classifying the components of mixed and augmented reality systems. In this post, we will see how one particular class of display – see-through displays – can be put to practical purpose.
Using a phone or tablet with a rear-mounted camera that faces away from the user as a see-through video display, you can relay the camera view to the screen in real time and manipulate the displayed scene. This approach has been referred to as a “magic lens, … a see-through interface/metaphor that affords the user a modified view of the scene behind the lens” (D. Baričević, C. Lee, M. Turk, T. Höllerer and D. A. Bowman, “A hand-held AR magic lens with user-perspective rendering,” Mixed and Augmented Reality (ISMAR), 2012 IEEE International Symposium on, Atlanta, GA, 2012, pp. 197-206, doi: 10.1109/ISMAR.2012.6402557). (See also M. Rohs and A. Oulasvirta, “Target acquisition with camera phones when used as magic lenses”, Proceedings of the 26th international conference on Human Factors in Computing Systems, CHI ’08, pages 1409–1418. ACM, 2008, who define a magic lens as an “augmented reality interface that consist[s] of a camera-equipped mobile device being used as a see-through tool. It augments the user’s view of real world objects by graphical and textual overlays”).
However, as Baričević et al. noted at the time:
Many existing concept images of AR magic lenses show that the magic lens displays a scene from the user’s perspective, as if the display were a smart transparent frame allowing for perspective-correct overlays. This is arguably the most intuitive view. However, the actual magic lens shows the augmented scene from the point of view of the camera on the hand-held device. The perspective of that camera can be very different from the perspective of the user, so what the user sees does not align with the real world. … We define the user-perspective view as the geometrically correct view of a scene from the point-of-view of the user, in the direction of the user’s view, and with the exact view the user should have in that direction.
Whilst head-up displays are also examples of see-through displays, many head-up displays do not situate virtual digital objects as direct augmentations of the perceived physical world. Rather, they frequently act as pop-up style dashboards that open as “desktop windows” or pop-up menus within the visual scene, rather than as overlays on the physical objects perceived within it.
The first wave of consumer augmented reality applications relied on printed registration images or QR codes that could act as fiducial markers: these could be easily recognised using image recognition software and then overlaid with a 3D animation.
If an image could reliably be detected, it could be used as part of an augmented reality system, resulting in some innovative marketing campaigns.
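The recognise-then-overlay step described above can be illustrated with a small sketch. It assumes, hypothetically, that an image recognition stage has already returned the four corner positions of a square marker in the camera image. A planar homography then maps points defined on the marker onto the screen, so that a flat overlay can be drawn in register with it. The function names and corner coordinates are illustrative only and are not taken from any particular toolkit; a real application would more likely rely on a library routine and a full 3D pose.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 3x3 homography that maps four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 8 unknowns.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]  # fix the bottom-right entry to 1
    return [h[0:3], h[3:6], h[6:9]]


def project(h, point):
    """Map a point on the marker plane to image (pixel) coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Marker corners in marker units (a unit square), listed clockwise.
marker_corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
# Hypothetical pixel positions of those corners reported by a detector.
detected_corners = [(100, 100), (200, 110), (190, 220), (90, 200)]

h = homography(marker_corners, detected_corners)
# Where on screen should an overlay anchored at the marker centre be drawn?
print(project(h, (0.5, 0.5)))
```

Because the mapping is recomputed from the detected corners in every frame, the overlay follows the marker as the device or the printed page moves.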
The same idea can be used to enhance two-dimensional print publications. With a suitable device and the appropriate app installed, you can recognise a particular page of print and “unlock” additional content, an approach taken by the Layar augmented reality app which, among other things, allows you to create your own augmented reality enhanced content.
More confident programmers can turn to toolkits such as the open source ARToolkit, one of the earliest widely available augmented reality programming toolkits (which is still being developed today and is distributed for free at ARToolkit.org), and the Wikitude SDK (software development kit), which allow professional and hobbyist programmers alike to create their own augmented reality demonstrations. (See also commercial services such as Catchoom: CraftAR.)
Within all these applications, we see how there is a need for “enabling technologies, … advances in the basic technologies needed to build compelling AR environments. Examples of these technologies include displays, tracking, registration, and calibration” (Azuma, Ronald, Yohan Baillot, Reinhold Behringer, Steven Feiner, Simon Julier, and Blair MacIntyre. “Recent advances in augmented reality.” IEEE computer graphics and applications 21, no. 6 (2001): 34-47) that make the development of such systems possible by developers outside of advanced research and development labs.
One popular category of ARToolkit demonstration, and an approach that hints at a particular category of potential augmented reality applications, was the development of interactive Lego model assembly manuals. These could recognise a registration image associated with a particular model and then step through the sequence of steps required to build it, overlaying the next piece to be added at each step. The known size of the marker, the fixed geometry of the model, and the availability of open source Lego CAD tools based around LDraw meant that many of the physical and computational building blocks required for creating such applications were already in place.
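To see why a marker of known size is such a useful building block, consider the simple pinhole camera relationship: a marker of real width W that appears w pixels wide, seen through a camera with focal length f (in pixels), is roughly f × W / w away. The sketch below applies that relationship under the assumptions that the marker is roughly facing the camera and that the focal length is known from calibration. The function names and the numbers are hypothetical, and toolkits generally go further and recover a full position and orientation rather than a single distance.

```python
import math


def apparent_width(corners):
    """Average pixel length of the top and bottom edges of the marker.

    corners are (x, y) pixel positions in the order
    top-left, top-right, bottom-right, bottom-left.
    """
    tl, tr, br, bl = corners
    return (math.dist(tl, tr) + math.dist(bl, br)) / 2


def marker_distance(corners, marker_width_mm, focal_length_px):
    """Estimate camera-to-marker distance in mm using the pinhole model."""
    width_px = apparent_width(corners)
    if width_px <= 0:
        raise ValueError("marker has no visible width")
    return focal_length_px * marker_width_mm / width_px


# Hypothetical values: an 80 mm marker seen 100 px wide by a camera
# whose calibrated focal length is 800 px.
corners = [(200, 150), (300, 150), (300, 250), (200, 250)]
print(marker_distance(corners, marker_width_mm=80, focal_length_px=800))
```

With the scale of the scene fixed in this way, a virtual brick defined in real-world units can be drawn at the right size relative to the physical model.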
SAQ: What’s wrong with the demonstration shown in the video above?
Answer: The virtual block does not appear to be placed in the correct position on the model, but is offset slightly. This might arise from a combination of issues, including the placement of the registration image, or the positioning of the see-through device or of the camera used to record the video.
An earlier demonstration of a Lego construction model instruction manual includes some additional humour in the form of an animated Lego figure mechanic who fetches an appropriate piece at each step and then demonstrates where to attach it to a model based around the original Lego Mindstorms Robot Invention System.
The demonstration also shows how augmented reality can be used to test the operation of the completed assembly, stepping the user through a test sequence and virtually animating the expected behaviour. The ability of the RCX computer brick at the heart of the model to communicate back to the computer hosting the manual also allowed information captured by the brick (the light sensor readings) to be displayed in the augmented reality layer.
SAQ: How might advances in 3D image recognition technology be used to further improve the functionality of the manual, for example, in terms of checking the correct assembly of the model? What other enabling technologies may also help in this endeavour?
Answer: As the ability to recognise, identify and orientate 3D objects improves, generating digital overlays on three-dimensional objects becomes more tractable. This means that it may be possible for the interactive manual to recognise pieces picked up by the person building the model and check them against the expected part. Erroneous parts could be highlighted with a warning sign. Additionally, the state of the model after each step is completed could be visually checked to see that the correct piece appears to have been placed in the correct position, although this is likely to present a more complex task and may not be possible. If a piece could be identified as incorrectly placed (for example, in a “likely possible” misplaced position), the instruction manual might show how to move the part to the correct place, or rewind to show where the piece should have been placed.
The availability of programming building blocks capable of recognising individual Lego bricks and associating them with a part number also used by CAD tools such as LDraw could be seen as an enabling technology for the further development of such diagnostic AR tools.
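The kind of check suggested above might look something like the following sketch. It assumes a hypothetical recogniser that reports a part number and, optionally, a position on the model's stud grid; the step list, the part numbers and the coordinates are made up for illustration and are not drawn from any real manual or recognition library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    part_number: str   # part identifier, in the style used by CAD tools
    position: tuple    # (x, y, z) on an integer stud grid (hypothetical)


def check_step(step, recognised_part, recognised_position=None):
    """Compare what the recogniser saw against what the step expects.

    Returns "wrong part", "misplaced" or "ok". The position is only
    checked when the recogniser was able to report one.
    """
    if recognised_part != step.part_number:
        return "wrong part"
    if recognised_position is not None and recognised_position != step.position:
        return "misplaced"
    return "ok"


# A made-up two-step manual.
manual = [
    Step(part_number="3001", position=(0, 0, 0)),
    Step(part_number="3003", position=(2, 0, 1)),
]

print(check_step(manual[0], "3001"))             # part picked up, not yet placed
print(check_step(manual[1], "3001"))             # builder picked up the wrong part
print(check_step(manual[1], "3003", (0, 0, 1)))  # right part, wrong place
```

A "wrong part" result could trigger the warning sign, while a "misplaced" result could prompt the manual to show where the piece should have gone. A practical system would also need to tolerate uncertainty in both the recognised part and its estimated position, which this exact-match comparison ignores.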
As a proof-of-concept idea, augmented reality Lego construction manuals provide a realistic, if toy, example (in several senses of the word!) of how such techniques might be used in a practical setting. So it’s not surprising that augmented reality instruction manuals were among the first application areas described when the possibility of AR first began to emerge.
Optional Reading: two relatively early descriptions of augmented reality instruction and assembly manuals can be found in: Caudell, Thomas P., and David W. Mizell. “Augmented reality: An application of heads-up display technology to manual manufacturing processes”, Proceedings of the Twenty-Fifth Hawaii International Conference on System Sciences, vol. 2, pp. 659-669. IEEE, 1992; and Feiner, Steven, Blair MacIntyre, and Dorée Seligmann, “Knowledge-based augmented reality.” Communications of the ACM 36, no. 7 (1993): 53-62.
A similar approach is also being used to develop service manuals for use in industry, via magic lens displays, and presumably also in smart helmet displays:
Tablet or phone based magic lens apps are also being used to support car maintenance in the form of augmented reality car manuals that are capable of recognising particular features of a car dashboard, or engine, and then interactively annotating them with a virtual overlay, as described in this press release from Hyundai about their augmented reality car owner’s manual:
On the other hand, maybe such applications are just so much hype and not actually of interest to a wider public? For example, in 2013, one company attempted to crowdsource funding for a general purpose app that could act as an AR style guide for a range of car models:
At the time of writing, in mid-2016, the site appears to have raised almost $500 of the $90,000 goal. Maybe augmented reality is not that compelling for the mass market?!
However, it seems as if the development of augmented reality technical documentation is still an area of academic research, at least.
In the next post on this theme, we’ll see how augmented reality can be used to implement “magic mirrors”, in contrast to “magic lenses”. But first – what “lenses” can we apply to another modality: sound?