In the post Introducing Augmented Reality Apparatus – From Victorian Stage Effects to Head-Up Displays, we saw how a Victorian illusion could be repurposed as the basis of a modern-day augmented reality application. In this post, we’ll start to pick apart the various ways in which mixed and alternate reality systems can be put together, and explore how we can distinguish such systems from each other.
When coming to a new topic, it can often be hard to know how people who work in that area, or who are experts in it, make sense of it. If presented with a photograph of a bird and asked to identify it, an ornithologist (bird watcher) would almost certainly see different, and distinctive, things in the image than I would! So as we embark on our journey into augmented reality, what sort of things do we need to be looking out for to help us get our bearings?
A taxonomy describes a classification scheme that allows you to categorise related items within a particular frame of reference in a meaningful way. Milgram and Kishino’s “A taxonomy of mixed reality visual displays” helps us to identify a range of methods for displaying mixed reality scenes for viewing by individuals in a non-immersive way (I am using “non-immersive” in the sense that the participant can still see the physical world around them). Their classification includes the following:
- “Monitor based … video displays – i.e. ‘window-on-the-world’ (WoW) displays – upon which computer generated images are electronically or digitally overlaid”; there is no implication of being able to “see through” these displays. Rather, the viewed scene may be remote in terms of time and/or space and the focus is on the manipulation of an already captured video scene. A window-on-the-world view might be something as simple as a television broadcast of a swimming race with a virtual line overlaid on the scene, showing where the race leader would have to be at that point in time if they were setting a world record pace.
- Displays, such as HMDs (Head Mounted Displays), “equipped with a[n optical] see-through [ST] capability, with which computer generated graphics can be optically superimposed, using half-silvered mirrors, onto directly viewed real-world scenes”. A head-up display on a smart helmet is a good example of an optical see-through display.
- Displays that use “video, rather than optical, viewing of the ‘outside’ world. … the displayed world should correspond orthoscopically [that is, size, shape and perspective should be maintained] with the immediate outside real world, thereby creating a ‘video see-through’ system, analogous with the optical see-through [approach]”. Someone viewing the world through a camera view on their smartphone would be looking at a video see-through system.
A second paper from the same lab (Milgram P, Takemura H, Utsumi A, Kishino F, “Augmented reality: A class of displays on the reality-virtuality continuum”, Photonics for Industrial Applications, 1995, pp. 282-292, International Society for Optics and Photonics) further classified these display types along three dimensions: whether the principal depicted world (the substratum world) was real or computer generated (CG), providing a basis for distinguishing augmented reality systems from virtual reality ones; whether the substrate was “scanned” or directly viewed (that is, directly perceived without mediation through a video screen or projection); and whether the view was a first person, egocentric view (that is, from the viewer’s perspective) or an exocentric view (from some other perspective).
| Class of MR System | Real (R) or CG world? | Direct (D) or Scanned (S) view of substrate? | Exocentric (EX) or Egocentric (EG) Reference? |
| --- | --- | --- | --- |
| Monitor-based video, with CG overlays | R | S | EX |
| HMD-based optical ST, with CG overlays | R | D | EG |
| HMD-based video ST, with CG overlays | R | S | EG |
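To make the table above concrete, here is a minimal Python sketch encoding its three columns as enumerations; the class and member names are my own illustrative choices, not anything drawn from the paper itself:

```python
from enum import Enum

# Illustrative encodings of the three classification dimensions
# from Milgram et al.'s table (names are my own, not the paper's).
class World(Enum):
    REAL = "R"
    COMPUTER_GENERATED = "CG"

class SubstrateView(Enum):
    DIRECT = "D"
    SCANNED = "S"

class Reference(Enum):
    EXOCENTRIC = "EX"
    EGOCENTRIC = "EG"

# The three MR display classes from the table, as (world, view, reference)
mr_classes = {
    "Monitor-based video, with CG overlays":
        (World.REAL, SubstrateView.SCANNED, Reference.EXOCENTRIC),
    "HMD-based optical ST, with CG overlays":
        (World.REAL, SubstrateView.DIRECT, Reference.EGOCENTRIC),
    "HMD-based video ST, with CG overlays":
        (World.REAL, SubstrateView.SCANNED, Reference.EGOCENTRIC),
}

for name, (world, view, ref) in mr_classes.items():
    print(f"{name}: {world.value}/{view.value}/{ref.value}")
```

Note that all three classes share a real substratum world; it is the view of the substrate and the reference frame that distinguish them.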
We might further refine the exocentric notion into 2nd and 3rd person views, where we imagine the second person view is capable of including the presence of the viewer, and the third person view is completely remote from them.
A later paper by Bimber, Oliver, and Ramesh Raskar, “Modern approaches to augmented reality”, ACM SIGGRAPH 2006 Courses, p. 1, ACM, 2006, also considered the sort of physical system, or apparatus, required to augment a visual scene with digital imagery. (The idea is not that all of these methods are employed at the same time – only one of them is!)
- retinal display;
- head-mounted display;
- hand-held display;
- spatial optical see-through display;
- projected display on object.
(We might also add contact lens mounted displays between retinal and head mounted displays.)
A related classification is used by Van Krevelen, D. W. F., & Poelman, R. (2010). A survey of augmented reality technologies, applications and limitations. International Journal of Virtual Reality, 9(2), 1, which groups the approaches as retinal, optical see-through, video see-through, and projective.
Drawing on all these ideas, the following classification allows us to talk about a range of visual displays capable of rendering mixed and augmented realities, whether locally or remotely situated with respect to the reality being augmented, and whether presented to individuals or groups:
- proximity dimension:
  - proximal: retinal and head mounted displays, which may be grouped together as augmented visual field devices (AVFDs)
  - hand-held: hand-held devices such as phones or tablets
  - distal: free standing displays (e.g. monitors or projected displays)
- optical dimension:
  - video screen based window-on-the-world displays, which overlay a given video image
  - see-through displays that augment the visual scene perceived through the display, which may be video based, and as such provide a “scanned” (or indirect) view of the substrate, or optically based, where the substrate is directly perceived
  - projected displays, which directly enhance the environment
- viewpoint dimension:
  - first degree (first person?): first person view
  - second degree (bystander?): colocated with the viewer and capable of presenting them in the visual scene
  - third degree (third party? remote?): representing a non-local visual scene.
On the one hand, the classification allows us to refer to an augmented reality phone app as a hand-held, see-through, video screen based display used to indirectly perceive the visual scene from a first degree viewpoint. On the other, it allows us to refer to a mixed reality scene such as a televised sporting event with overlaid graphics as an indirect view of the scene from a third degree viewpoint using a hand-held or distal window-on-the-world video display.
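The two worked examples above can be captured in a short Python sketch of the three-dimension classification; the enum and field names are my own hypothetical labels for the dimensions proposed here, not terms from any of the cited papers:

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical encoding of the three-dimension classification
# (proximity, optics, viewpoint); names are illustrative only.
class Proximity(Enum):
    PROXIMAL = "proximal"    # retinal / head mounted (AVFD)
    HAND_HELD = "hand-held"  # phone or tablet
    DISTAL = "distal"        # monitor or projected display

class Optics(Enum):
    WINDOW_ON_WORLD = "window-on-the-world"
    VIDEO_SEE_THROUGH = "video see-through"
    OPTICAL_SEE_THROUGH = "optical see-through"
    PROJECTED = "projected"

class Viewpoint(Enum):
    FIRST_DEGREE = 1   # first person view
    SECOND_DEGREE = 2  # bystander, colocated with the viewer
    THIRD_DEGREE = 3   # remote, third party view

@dataclass
class MRDisplay:
    name: str
    proximity: Proximity
    optics: Optics
    viewpoint: Viewpoint

# The two worked examples from the closing paragraph
ar_phone_app = MRDisplay(
    "AR phone app",
    Proximity.HAND_HELD, Optics.VIDEO_SEE_THROUGH, Viewpoint.FIRST_DEGREE)
tv_sport_overlay = MRDisplay(
    "televised sport with overlaid graphics",
    Proximity.DISTAL, Optics.WINDOW_ON_WORLD, Viewpoint.THIRD_DEGREE)
```

Encoding the classification this way makes the point of a taxonomy explicit: any candidate system should be describable as one value along each dimension, and two systems can be compared dimension by dimension.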