Steve Mann
University of Toronto
Department of Electrical and Computer Engineering
Toronto, Ontario, M5S 3G4
mann@eecg.toronto.edu
Mixed Reality exists in many forms along a continuum from Augmented Reality (reality enhanced by graphics, such as Sutherland's work from more than 30 years ago [Sutherland 68]) to more recent efforts at Augmented Virtuality (graphics enhanced by reality, graphics enhanced by video, etc.). Mixed Reality provides numerous ways to add together (mix) various proportions of real and virtual worlds. However, Mediated Reality is an even older tradition, introduced by Stratton more than 100 years ago. Stratton presented two important ideas: first, the use of eyewear to deliberately modify (mediate) the wearer's visual perception of reality; and second, the practice of wearing such apparatus continuously in everyday life, rather than only in a laboratory setting.
Keywords: Virtual Reality, Mediated Reality, Mixed Reality, Modulated Reality, Modified Reality, Wearable Computing, Personal Imaging, Personal Technologies, Humanistic Intelligence, Intelligent Image Processing
Steven Feiner defines augmented reality as follows:
``Augmented reality refers to computer displays that add virtual information to a user's sensory perceptions.'' [Feiner 02]

One way to add virtual information to our visual field of view is to display the information on some kind of beamsplitter or other similar device, and thus use the human eye itself as the adder that achieves the superposition of the real and virtual information. Alternatively, the superposition can take place outside the eye, where the real and virtual signals (added together by a video mixer) are combined before being presented to the eye. The choice between the two depends on various factors, such as whether or not the signals from the real world are available in electronic form. (The signals from the virtual world are assumed to already exist in electronic form, since they are usually computationally generated.)
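As a concrete illustration of the second arrangement, the video mixer reduces to a weighted per-pixel sum. The following is a minimal sketch; the function name and the single mixing weight alpha are assumptions of this sketch, not a prescribed design:

    import numpy as np

    def video_mix(real, virtual, alpha=0.5):
        # Video see-through superposition: the camera image (`real`) and
        # the rendered overlay (`virtual`), both float arrays in [0, 1],
        # are combined before being presented to the eye.
        return np.clip(alpha * real + (1.0 - alpha) * virtual, 0.0, 1.0)

In the optical (beamsplitter) arrangement no such code is needed; the superposition of light happens in the beamsplitter and the eye itself.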
Figure 1: Illusory transparency. A television camera is connected to a television, the two arranged so that the television displays that which it occludes.
Illusory transparency is an older concept than, as well as a generalization of, the notion of video see-through: it applies to media and systems that do not necessarily involve video. Examples of illusory transparency include computational see-through by way of laser EyeTap devices, as well as optical computing that produces a similar illusory transparency under computer program control. Even with optical computing, the concept of illusory transparency provides a distinction between devices that can modify (not just add to) our visual perception and devices that cannot. Not all reality-modifying devices involve video, but many still involve the concept of illusory transparency.
It has been shown that the addition of video signals in augmented reality is fundamentally flawed, because cameras respond approximately logarithmically while displays (including most head mounted displays) respond anti-logarithmically (``Comparametric Equations'', IEEE Trans. Image Proc., 9(8), August 2000, pp. 1389-1406). Thus video mixers effectively perform something closer to multiplication than addition, because they operate on approximately the logarithms of the underlying true physical quantities of light. Accordingly, antihomomorphic filters have been proposed for augmented or mediated reality, wherein this undesirable effect is cancelled (Intelligent Image Processing, John Wiley and Sons, 2001) [Mann 01].
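A minimal sketch of such an antihomomorphic mixer follows, under the simplifying assumption that the camera's response is a power law (gamma) rather than the true response function recovered by comparametric analysis; the names here are invented for this illustration. Pixel values are first mapped back to estimates of the underlying photoquantities, mixed there, and then re-encoded:

    import numpy as np

    def antihomomorphic_mix(f_real, f_virtual, alpha=0.5, gamma=2.2):
        # Undo the (assumed power-law) camera response to estimate the
        # true light quantities q, so that addition acts on light itself.
        q_real = np.clip(f_real, 0.0, 1.0) ** gamma
        q_virtual = np.clip(f_virtual, 0.0, 1.0) ** gamma
        # Mix in the photoquantity domain, where superposition is physical.
        q_mixed = alpha * q_real + (1.0 - alpha) * q_virtual
        # Re-encode for display through the (assumed) inverse response.
        return q_mixed ** (1.0 / gamma)

Unlike the naive video_mix sketch above, which adds approximately logarithmic pixel values (and thus effectively multiplies light quantities), this version performs the addition on the estimated quantities of light themselves.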
The concept of mixing real and virtual worlds arises in a wide variety of situations in the broadcast, entertainment, audiovisual, and computer graphics industries, and beyond. Therefore a taxonomy of the different mixtures of real and virtual information content [Milgram 94, Drascic 96], with particular emphasis on a continuum from augmented reality to augmented virtuality, is useful. This taxonomy is especially useful in considering the way in which real and virtual are combined in head mounted displays [Milgram 99].
One of the earliest and most important examples of such perception-altering devices was the eyeglasses George Stratton built and wore. His eyewear, made from two lenses of equal focal length spaced two focal lengths apart, was essentially an inverting telescope of unity magnification, so that he saw the world upside-down [Stratton 96, Stratton 97].
These eyeglasses in many ways actually diminished his perception of reality. This deliberately diminished perception was neither graphics enhanced by video, nor video enhanced by graphics, nor any linear combination of the two. (Moreover, it was an example of optical see-through that is not an example of registered illusory transparency; it problematizes the optical see-through concept because both the mediation zone and the space around it involve optical-only processing.)
Others followed in Stratton's footsteps by living their day-to-day lives (eating, swimming, cycling, etc.) through left-right reversing eyeglasses, prisms, and other optics [Kohler 64, Dolezal 82] that were neither examples of augmented reality nor augmented virtuality (and for which the optical see-through versus video see-through distinction fails to properly address important similarities and differences among various such devices).
Thus a more general concept than that of mixed reality is needed.
Because of the existence of a broad range of devices that modify human perception, mediated reality has been proposed as a more general framework: one that includes the reality-virtuality continuum, as well as devices for modifying, not merely mixing, the various aspects of reality and virtuality [Mann 94, Mann 01].
Mediated Reality, therefore, refers to a general framework for artificial modification of human perception by way of devices for augmenting [Starner 97], deliberately diminishing, and, more generally, otherwise altering sensory input. This gives rise to the Reality, Virtuality, Mediality continuum depicted in Figure 2:
Figure 2: (at left) Taxonomy of Reality, Virtuality, Mediality. The origin R denotes unmodified reality. A continuum across the virtuality axis V includes reality augmented with graphics (Augmented Reality), as well as graphics augmented by reality (Augmented Virtuality). However, the taxonomy also includes modification of reality or virtuality or any combination of these. The modification is denoted by moving up the mediality axis. Further up this axis, for example, we find mediated reality, mediated virtuality, or any combination of these. Further up and to the right we have virtual worlds that are responsive to a severely modified version of reality. (at right) Mediated reality generalizes the concepts of mixed reality, etc. It includes the virtuality-reality continuum (mixing) but, in addition to additive effects, also includes multiplicative effects (modulation) of (sometimes deliberately) diminished reality. Moreover, it considers, more generally, that reality may be modified in various ways.
The mediated reality framework describes devices that deliberately
modify reality, as well as devices that accidentally modify it.
Indeed, mediated reality has been shown to be useful for deliberately diminishing reality, such as by filtering out advertisements [Mann and Fung 02]. The HOLZER (Homographically Obliterating Labels by Zeroing, Enhancement or Replacement) system is a diminished reality system that filters out billboards and other advertising material, replacing them with blank spaces or other, more useful material [Mann and Niedzviecki 01]. Mediated reality extends Feiner's distinction between virtual reality and augmented reality: ``whereas virtual reality brashly aims to replace the real world, augmented reality respectfully supplements it'' [Feiner 02], whereas mediated reality modifies it.
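The core replacement step in such a system can be sketched as follows. This is a minimal illustration, not the HOLZER implementation itself: it assumes the four corners of the advertisement have already been tracked upstream, and the function and parameter names are invented for this sketch. A homography is fitted from the replacement image to the tracked patch, and the warped replacement overwrites the billboard pixels:

    import cv2
    import numpy as np

    def replace_planar_patch(frame, patch_corners, replacement):
        # patch_corners: 4x2 array of the billboard's corners in the frame,
        # ordered top-left, top-right, bottom-right, bottom-left (assumed
        # to come from an upstream tracker, which is not shown here).
        h, w = replacement.shape[:2]
        src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
        # Fit the homography mapping the replacement image onto the patch.
        H, _ = cv2.findHomography(src, np.float32(patch_corners))
        size = (frame.shape[1], frame.shape[0])
        warped = cv2.warpPerspective(replacement, H, size)
        # Warp an all-white mask the same way to find which pixels to overwrite.
        mask = cv2.warpPerspective(np.full((h, w), 255, np.uint8), H, size)
        out = frame.copy()
        out[mask > 0] = warped[mask > 0]
        return out

Zeroing (blanking) the advertisement is then simply the special case in which the replacement image is a uniform blank patch.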
Figure 3: There was no doubt that Mediated Reality had practical uses. (at left) Author (looking down at the mop he is holding) wearing a thermal EyeTap wearable computer system for seeing heat. This device modified the author's visual perception of the world, and also allowed others to communicate with the author by modifying his visual perception. A bucket of 500 degree asphalt is present in the foreground. (at right) Thermal EyeTap principle of operation: rays of thermal energy that would otherwise pass through the center of projection of the eye (EYE) are diverted, by a specially made 45 degree ``hot mirror'' (DIVERTER) that reflects heat, into a heat sensor. This effectively locates the heat sensor at the center of projection of the eye (EYETAP POINT). A computer controlled light synthesizer (AREMAC), driven by a wearable computer, reconstructs the rays of heat as rays of visible light, each collinear with the corresponding ray of heat. The principal point on the diverter is equidistant from the center of the iris of the eye and the center of projection of the sensor (HEAT SENSOR). (This distance, denoted ``d'', is called the eyetap distance.) The light synthesizer (AREMAC) is also used to draw on the wearer's retina, under computer program control, to facilitate communication with (including annotation by) a remote roofing expert.
Figure 4: Practical application of collaborative Computer Mediated Reality. (leftmost) First person perspective as captured by the EyeTap device; the author's hands are visible grasping the mop. (left of middle) Mop and hot asphalt as viewed through the wearer's right eye. (middle) After mopping hot asphalt onto the roof surface, a base sheet is rolled down (the bucket of hot asphalt shows as white in the upper right area of the frame). (right of middle) The thermal EyeTap is useful for ``seeing through'' the top layers of felt or fiberglass, to determine heat flow underneath. (rightmost) The kettle (upper right of frame) shows up as white (approximately 500 degrees), whereas the propane cylinder (bottom of frame) and the propane hose supplying it show up as black, because the cylinder and hose are cold due to the expansion of the propane gas (Joule-Thomson effect). The thermal EyeTap, with its ability to see through smoke, was also useful when the kettle caught fire: kettle fires are easy to extinguish (simply by slamming the lid shut) if the kettle can be seen through the thick black smoke given off by the burning asphalt.
Such devices can be used to modify the visual perception of reality within certain mediation zones (e.g. only one eye rather than both eyes, or only a portion of that eye), giving rise to partially mediated reality [Mann 01]. Moreover, these devices can also be worn with prescription eyeglass lenses, or even have prescription eyeglass lenses incorporated into the design. Prescription eyeglasses are themselves reality mediators of a sort, since they modify the wearer's visual perception of the world.
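A partially mediated reality pipeline can be sketched in a few lines (an illustration only; zone_mask and the mediation function are placeholders introduced here, not names from the source). The mediation is applied solely within the mediation zone, and the rest of the visual field passes through unmodified:

    import numpy as np

    def partially_mediate(frame, zone_mask, mediate):
        # zone_mask: boolean array marking the mediation zone (e.g. the
        # portion of the visual field covered by one eye's EyeTap).
        # mediate: any image-to-image transformation.
        out = frame.copy()
        out[zone_mask] = mediate(frame)[zone_mask]
        return out

For example, partially_mediate(img, mask, lambda f: 1.0 - f) would deliberately diminish (here, negate) only the masked portion of the visual field.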
Unlike traditional scientific experiments that take place in a controlled lab-like setting (and therefore do not always translate well into the real world), Stratton's approach required a continuous rather than intermittent commitment. For example, he would remove the eyewear only to bathe or sleep, and he even kept his eyes closed while bathing, to ensure that no unmediated light from the outside world could reach his eyes directly [Stratton 96, Stratton 97]. This work involved a commitment on his part to devote his very existence, his personal life, to science.
Stratton captured a certain important human element in his broad seminal work, which laid the foundation for others to later do carefully controlled lab experiments within narrower academic disciplines. Moreover, his approach was one that broke down the boundaries between work and leisure activity, as well as the boundaries between the laboratory and the real world.
Similarly, it is desirable to integrate computer mediated reality into daily life. The past 22 years of wearing computerized reality mediators in everyday life have provided the author with some insight into the sociological factors involved, such as how others react to such devices [Mann and Niedzviecki 01]. In particular, this has given rise to a desire to design and build reality mediators that do not have an unusual appearance.
Indeed, it is preferable that commonly used reality mediators, such as hearing aids and personal eyeglasses, either have an unobtrusive (or hidden) appearance or be designed to be sleek, slender, and fashionable. Figure 5 shows how miniaturization of components makes this evolution possible.
Figure 5: (at left) EyeTap devices, such as this infrared night vision computer system, tend to have a frightening appearance when the components are visible, owing to the ``glass eye'' effect, in which optics, projected to the center of projection of a lens of the tapped eye, are visible to others. Here we see the image of the infrared camera located in the center of the author's right eye. (at right) The author's more recent (1996) reality mediator design, with systems built into dark glasses, tends to mitigate this undesirable social effect. The eyeglass lenses are also transparent in the infrared, allowing the night vision portion of the apparatus to take over in low light without loss of gain. In the visible portion of the spectrum, a 10 dB loss of sensitivity is incurred to conceal the color sensor elements and optics.
The author's wearable computer reality mediators have evolved from the headsets of the 1970s, to eyeglasses with optics outside the glasses in the 1980s, to eyeglasses with the optics built inside the glasses in the 1990s [Mann 01], to eyeglasses with mediation zones built into the frames, lens edges, or the cut lines of bifocal lenses in the year 2000 (e.g. exit pupil and associated optics concealed by the transition regions). Reality mediators that have the capability to measure and resynthesize electromagnetic energy that would otherwise pass through the center of projection of a lens of an eye of the user, such as those shown in Figures 3, 4, and 5, are referred to as EyeTap devices [Mann 01]. These devices divert at least a portion of eyeward bound light into a measurement system that measures how much light would have entered the eye in the absence of the device. Some EyeTap devices use a focus control to reconstruct light in a depth plane that moves to follow subject matter of interest. Others reconstruct light in a wide range of depth planes, in some cases having infinite or near infinite depth of field.
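The geometry of this condition can be stated compactly (the symbols E, S, and P are introduced here for illustration; they are not notation from the source). With E the eye's center of projection, S the sensor's center of projection, and P the principal point on the 45 degree diverter, the sensor is virtually located at the eye when

\[
  \lVert P - E \rVert \;=\; \lVert P - S \rVert \;=\; d ,
\]

where d is the eyetap distance. Under this condition, each diverted ray measured by the sensor is collinear with a ray that would otherwise have entered the eye, and the aremac can resynthesize visible light along those same ray paths.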
Therefore, the author has proposed that any intermediary elements to be installed in an eyeglass lens be installed in the transition zones, e.g. as transfer functions between peripheral vision and the eyeglass lens itself (e.g. in the frames), or as the transfer functions between different portions of a multifocal (bifocal, trifocal, etc.) eyeglass lens.
Consider first the implementation of a reality mediator as part of the transfer function of the glass-to-glass transition region of bifocal eyeglasses.
When a monocular version of the apparatus is used, the apparatus is built into one lens, and a dummy version of the diverter portion of the apparatus is positioned in the other lens for visual symmetry. It was found that such an arrangement tended to call less attention to itself than an arrangement with only one diverter.
These diverters may be integrated into the lenses in such a manner as to have the appearance of the lenses in ordinary bifocal eyeglasses.
Where the wearer does not require bifocal lenses, the cut line in the lens can still be made, such that the transfer function simply defines a transition between two lenses of equal focal length.
The peripheral transfer function (e.g. at the edges of the glass, or the eyeglass frame boundaries) provides a more ideal location for a partial reality mediator, because the device can be concealed directly within the frames of the eyeglasses, as shown in Figure 6 below.
Figure 6: Reality mediator incorporated into the eyeglass frames. Upon close inspection with the unaided eye, the eyeglasses have a normal appearance. Using a magnifier, macro lens, etc., the diverter can be seen in an extreme close-up. A side view shows the sleek and slender design; the apparatus can be hidden in the hair at the back of the wearer's head.
In view of such a concealment opportunity, the author envisioned a new kind of eyeglass design in which the frames would come right through the center of the visual field. With materials and assistance provided by Rapp Optical, eyeglass frames were assembled using standard photochromic prescription lenses, drilled in two places on the left lens and four places on the right lens to accommodate a break in the eyeglass frame along the right eye (the right lens being held on with two miniature bolts on either side of the break). The author then bonded fiber optic bundles, concealed by the frames, to relocate the camera and aremac to the back of the device.
This research prototype demonstrates the viability of using eyeglass frames as a mediating element. The frames, being slender (e.g. two millimeters wide) and close enough to the eye to be out of focus, do not appreciably interfere with normal vision.
But even if the frames were wider, they could be made of a see-through material, or be seen through by way of the illusory transparency afforded by the EyeTap principle [Mann 01]. Therefore, there is definite merit in seeing the world through eyeglass frames.
Admittedly, the eyeglasses of Figure 6 were crude and simple. A more sophisticated design could use a plastic coating to completely conceal all the elements, so that even when examined under a microscope, no evidence of the EyeTap would be visible.
The author would like to thank Mel Rapp, of Toronto-based Rapp Optical, for assistance in recovering from an unfortunate incident that had involved the damage of the author's eyeglasses, and for providing the author much in the way of motivation to persevere through these difficult times.
Caudell, T. and Mizell, D. (1992), "Augmented Reality: An Application of Heads-Up Display Technology to Manual Manufacturing Processes." Proc. Hawaii International Conf. on Systems Science, Vol. 2, pp. 659-669.
Dolezal, H. (1982), "Living in a World Transformed." Academic Press Series in Cognition and Perception, Academic Press, Chicago, Illinois.
Drascic, D. and Milgram, P. (1996), "Perceptual Issues in Augmented Reality." SPIE Vol. 2653: Stereoscopic Displays and Virtual Reality Systems III, San Jose, February, pp. 123-134.
Feiner, S. (2002), "Augmented Reality: A New Way of Seeing." Scientific American, April 2002, pp. 52-62.
Kohler, I. (1964), "The Formation and Transformation of the Perceptual World." Psychological Issues, International Universities Press, 3(4), monograph 12.
Mann, S. (1994), "Mediated Reality." M.I.T. Media Lab Technical Report 260, Cambridge, Massachusetts.
Mann, S. (2001), "Intelligent Image Processing." John Wiley and Sons, 384 pp., ISBN 0-471-40637-6.
Mann, S. and Fung, J. (2002), "EyeTap Devices for Augmented, Deliberately Diminished, or Otherwise Altered Visual Perception of Rigid Planar Patches of Real World Scenes." PRESENCE, 11(2), pp. 158-175, MIT Press.
Mann, S. with Niedzviecki, H. (2001), "Cyborg: Digital Destiny and Human Possibility in the Age of the Wearable Computer." Random House (Doubleday), 288 pp., ISBN 0-385-65825-7.
Milgram, P. (1994), "Augmented Reality: A Class of Displays on the Reality-Virtuality Continuum." SPIE Vol. 2351: Telemanipulator and Telepresence Technologies.
Milgram, P. and Colquhoun, H. (1999), "A Framework for Relating Head Mounted Displays to Mixed Reality Displays." Proceedings of the Human Factors and Ergonomics Society 43rd Annual Meeting, San Jose, pp. 123-134.
Starner, T., Mann, S., Rhodes, B., Levine, J., Healey, J., Kirsch, D., Picard, R., and Pentland, A. (1997), "Augmented Reality Through Wearable Computing." PRESENCE, 6(4), MIT Press.
Steuer, J. (1992), "Defining Virtual Reality: Dimensions Determining Telepresence." Journal of Communication, 42(4), pp. 73-93.
Stratton, G. (1896), "Some Preliminary Experiments on Vision." Psychological Review.
Stratton, G. (1897), "Vision Without Inversion of the Retinal Image." Psychological Review.
Sutherland, I. (1968), "A Head-Mounted Three Dimensional Display." Proc. Fall Joint Computer Conference, Thompson Books, Washington, D.C., pp. 757-764.
Tamura, H., Yamamoto, H., and Katayama, A. (2002), "Role of Vision and Graphics in Building a Mixed Reality Space." MR Systems Laboratory, Canon Inc., Proc. Int. Workshop on Pattern Recognition and Image Understanding for Visual Information, pp. 1-8.