From Wheatstone to the Autostereogram Craze
Stereopsis, or 3D vision, is an important binocular ability of people and of those animals that have it. It is the remarkable power of the visual sense to give an immediate perception of depth on the basis of the difference in points of view of the two eyes. It exists in those animals with overlapping optical fields, acting as a range finder for objects within reach. There are many clues to depth, but stereopsis is the most reliable and overrides all others. The sensation can be excited by presenting a different, properly prepared, view to each eye. The pair of views is called a stereopair or stereogram, and many different ways have been devised to present them to the eye. The appearance of depth in miniature views has fascinated the public since the 1840's, and still reappears now and then. There was a brief, but strong, revival in the 1990's with the invention of the autostereogram. Stereopsis also has technical applications, having been used in aerial photograph interpretation and the study of earth movements, where it makes small or slow changes visible.
The word stereopsis was coined from the Greek stereos, solid or firm, and opsis, look or appearance. Since terms derived from Greek are often used in this field, it may be useful to have a brief discussion. Single and double vision are called haplopia and diplopia, respectively, from haplous and diplous, which mean "single" and "double". Haplopia is the happy case; with diplopia we are seeing double. The use of a Greek term removes the connotations that may attach to a common English word, and sounds much more scientific. Note that the "opia" part of these words refers to "appearance", and does not come from a word for "eye". The -s- has been dropped for euphony. Otherwise, the closest Greek to "opia" means a cheese from milk curdled with fig juice. "Ops", for that matter, is more usually associated with cooked meat or evenings. In fact, words like "optic" come from optikos, meaning "thing seen", from the future opsomai of horao, "to see", not from a reference to the eye. The Latin oculus does mean "eye" and is used in many technical terms, like binocular, from the Latin bini, "two together".
Charles Wheatstone, F.R.S. (1802-1875), published his discovery of stereograms in a remarkable paper to the Royal Society in 1838. After discussing the method of preparing and viewing the stereograms in the first few pages, he went on to the important matter of what his experiments showed about binocular vision. In fact, they overturned all existing theories and led to a better appreciation of the nature of the visual sense. Looking back, we can see that he had an exceptionally clear and penetrating appreciation of the facts he had uncovered about binocular vision. The following details are largely from his paper.
It is obvious that we have two eyes but perceive a single picture of our surroundings. Also, people with one eye do not perceive a different picture. When we fixate on a nearby object, the optic axes of the two eyes converge to intersect near the object, and this convergence is greater, the closer the object. These facts led to no contradictions when viewed in the light of early theories of vision, where vision was essentially touch, and naturally the eyes touched the same object, so there was only one impression of a unitary reality.
Later, some uncertainty grew about how light excited images, and the seat of visual perception was assigned to the lens, to the retina, and even to the aqueous humour. Galen apparently concluded that the eyes were wired to the brain through the nerves, and there was a one-to-one correspondence between the sensations in the two eyes. Later, when experiments showed that an image of the view was projected on the retina, it was concluded that the retina was the site of visual reception.
These ideas crystallized in the theory of Aguilonius, that points in the retinas were in one-to-one correspondence, and single vision occurred when the two images on the retina were located in corresponding locations. The convergence of the optic axes defined a point, and a line through this point parallel to the line joining the eyes was called the horopter (the "see-image"). Points in a plane containing the horopter and perpendicular to the plane of the eyes were seen singly, other points not. The sense of depth was provided by the convergence of the optic axes.
Theories of this kind were elaborated using complicated ray tracing procedures and geometrical constructions, with much jargon and coining of words from Greek. There was argument over just what ray connected the object seen with the corresponding retinal point. Strange theories were advanced that had three-dimensional images appearing in the aqueous humour, sensed by nerves there, an alternative to the corresponding-points hypothesis. It should be remarked that it was impossible to see nerves in the retina at that time, though everyone assumed they were there. Not until much later were staining procedures developed to make the structure of the transparent retina visible. Ignorance gives great scope for speculation. In the psychological literature, such theories have always been popular, even down to the present. Wheatstone demonstrated their futility, and Helmholtz completed the job about 50 years on, but they have shown remarkable durability.
Wheatstone showed that the retinal images from two independent sources could be fused into a single image giving the immediate impression of depth, and the effect was due to the difference of the images, not their similarity. He immediately suspected that stereoscopy, and binocular vision in general, was not due to any corresponding points or convergence of optic axes, and set out to prove this. In the most conclusive demonstration, the two views of a stereogram were of the same object, but not to the same scale. That is, they were different in size. Not extravagantly different, but different enough to be easily recognized when the images were viewed separately. When viewed in the stereoscope, the images were successfully fused, and the fused image was intermediate in size between the two separate images. There is no way, quite obviously, that the two images could ever have fallen on corresponding points of the retinas.
Random-dot stereograms, the first of which was created by Aschenbrenner in 1954, have shown that a recognizable image is not necessary. If the parallactic displacement of the dots is the same over a surface, the surface is perceived with the dots painted on it. Even single dots can excite stereopsis, if there is some way for the mind to deduce correspondence. Single-image autostereograms, in which one dot serves as part of both monocular views, but represents different points, give further fascinating insight. The mind does its best to make sense out of what it sees.
When the images were too different, they could not be fused. What happened then was quite unexpected. They were not seen simultaneously, as if the information from each eye was presented as seen, but alternately, as if the mind could not decide which was true. There are cases when two superimposed images are seen, called diplopia, when one eye is displaced mechanically, or alcohol causes a relaxation of muscular control, but these are not the usual case. In haplopia, single vision, the two images are antagonistic. Experiments with color are also adduced. When yellow and blue are presented separately to the two eyes, green is not seen. When red and blue are presented, purple is not seen. Wheatstone observes that some authors had stated the contrary, led by reason rather than experiment.
If one eye consistently offers poor or erroneous information, the mind will neglect its input. In this case there is no stereopsis, and usually no diplopia or antagonism, so the subject is unaware of the condition, called amblyopia, "dim-sight". A slight tendency in this direction, which is quite common, may require the person to make a small effort to produce fusion. If fusion does not take place, diplopia usually results.
Wheatstone also notes the observations of Necker on ambiguous figures, that by an effort of mind can be made to look one way or another. Here, he is very close to the realization that the most important part of the visual sense is mental, in the recognition of objects from past experience combined with current sensory stimuli. Earlier investigators, and psychologists at any time, looked for mechanical, physiological explanations of phenomena, often supported by the poorest knowledge of the physics involved. Stereoscopic fusion is an obviously mental process, unrelated to any retinal registration. It is an aspect of the normal fusion of the inputs from the two eyes to construct the unitary illusion of reality that surrounds us.
Wheatstone did not go on to show that stereopsis is independent of the convergence of visual axes, though he could have done so very quickly. The fusion of a stereopair does not require optical aid. If one looks at a stereopair, and consciously diverges the optic axes to parallelism, the fused image will appear between the two monocular images, especially when concentrated upon mentally. This free fusion can be learned with little difficulty. It is, in fact, required for viewing the autostereograms of the 1990's, when many millions of people acquired the skill. Adding this to Wheatstone's results, it is clear that stereopsis has nothing to do either with an image falling on corresponding points of the retinas, or with convergence of optic axes.
Curiously, Wheatstone says he first observed stereo fusion when viewing a candle reflected in a disc that had been turned on a lathe. The two eyes saw slightly different lines of light reflected from the fine circular grooves, and these fused to a single line out of the plane of the disk. This is not free fusion, but a case of the eyes each preparing their own stereogram.
The stereoscopic ability is not present at birth, since the visual system does not yet register objects at that time. Recent research shows that objects are not recognized before the 7th month of life. After that, the ability may begin to develop, but it may not be perfected for several years. It seems that the visual centers in the brain communicate through the corpus callosum, the thick cord connecting the hemispheres, but information from both eyes goes to the centers in each hemisphere. It would be interesting to know if one or the other connection was the essential one in stereopsis, or that they both play a role.
The visual scene that surrounds us is, of course, produced in the mind, and is to that extent an illusion, but its correspondence with reality is very close, so close that the difference is seldom perceived. The scene is always three-dimensional, even with one eye. Closing one eye does not change the perception. Many other distance clues are used by the mind to establish distance and shape, such as apparent size, shading, screening, and parallax. With one eye, moving the head gives the eye a new perspective, and this is a valuable clue. We never see the surroundings like a painting, but we see paintings in three dimensions. Leonardo da Vinci, Wheatstone reports, noticed that the views from the two eyes were different, hiding some objects and revealing others, so that a flat painting could never be an exact representation, but missed discovering stereopsis.
For additional information on the eye and visual perception, see Vision and Colour.
When Wheatstone wrote his paper, the only method easily available for preparing a stereopair was drawing. With the aid of a camera obscura, a careful artist could have made an acceptable stereopair. It would require the precision of a Seurat to do this; modern artists would have too little drafting skill and patience to achieve success. The differences in the views are small, but must be precisely rendered. Soon, however, a method was available that was ideally suited to making stereopairs with little effort: photography. Two cameras can be used, or a camera with two lenses and filmholders, or one camera moved between exposures. The interpupillary distance is about 65mm, so if the camera is moved laterally by this amount between exposures, the photographs will be a normal stereopair. By increasing the distance, the effect can be heightened. This is one way of making small changes between exposures at different times more visible. If the slides or prints are scanned, and loaded into a graphics program, a stereogram can be edited and produced quite readily.
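The effect of the camera separation can be estimated with a little trigonometry. The sketch below is only an illustration, not anything taken from Wheatstone: the function names are my own, the object is assumed to lie on the midline between the two viewpoints, and all lengths are in millimetres. It computes the angle that the two viewpoints subtend at an object, and the difference in that angle between a near and a far object, which is the quantity that stereopsis responds to.

```python
import math


def convergence_angle(baseline_mm, distance_mm):
    """Angle in radians subtended at a midline object by the two viewpoints."""
    return 2.0 * math.atan(baseline_mm / (2.0 * distance_mm))


def relative_parallax(baseline_mm, near_mm, far_mm):
    """Difference in convergence angle between a near and a far object."""
    return (convergence_angle(baseline_mm, near_mm)
            - convergence_angle(baseline_mm, far_mm))


if __name__ == "__main__":
    normal = relative_parallax(65.0, 1000.0, 2000.0)
    doubled = relative_parallax(130.0, 1000.0, 2000.0)
    print("convergence at 1 m:", convergence_angle(65.0, 1000.0), "rad")
    print("ratio of relative parallax, 130 mm to 65 mm:", doubled / normal)
```

For a 65 mm separation and an object 1 m away the angle is about 0.065 radian. Doubling the separation very nearly doubles the relative parallax between objects at 1 m and 2 m, which is the heightening of the effect mentioned above.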
For the usual observer, a means is necessary for presenting each image to its appropriate eye without interference from the other, with the eyes accommodated as for normal vision of a nearby object. That is, the optic axes are converged slightly and the eyes are focused on the planes of the images. Any such aid is called a stereoscope. Wheatstone's stereoscope used two mirrors to separate the optical paths. Each eye saw only the image in its mirror, and focused upon it. The angle between the mirrors could be changed to allow for convergence of the optic axes, and the distances of the pictures could be adjusted. This stereoscope was better suited to laboratory investigations than to the enjoyment of stereograms in the sitting room, largely because of the possible adjustments, which made it hard to set up, and its unwieldy size. Also, the stereograms had to be produced as mirror images, so they would be viewed correctly.
Brewster's stereoscope, invented a few years later, was much more convenient, and became the standard throughout the age of popularity of stereograms. The images were permanently mounted side by side, and had only to be inserted into the instrument to be viewed. One looked through prisms that diverged the optic axes so that each image was seen by its eye as if at the point of convergence. The prisms could also be lenses, which would give some magnification and provide an easier view. There were no adjustments with this instrument, so it was very easy to use. It is possible to put the prisms in eyeglass frames, so that stereopairs can be held up and viewed as, for example, printed in a book.
A recent stereoviewing apparatus used a disc with transparencies mounted at its periphery, stereopairs at the ends of diameters. The disc was placed in a viewing apparatus held up to the eyes. Magnifying systems presented the views, each to its appropriate eye. Different scenes could be seen by rotating the disc.
As was mentioned above, it is also possible to view stereopairs with free fusion, without any apparatus whatsoever. The images should be placed 65mm apart, not overlapping, and of a size to be viewed at normal reading distance. The observer then, by conscious effort, diverges the optic axes as if viewing a distant object, while still focusing on the images. This requires no strain; in fact, the feeling is one of relaxation when the axes become parallel. One sees a third image between the two presented, and this is the fused image, seen solid. All three images can be perceived, but mental attention should be concentrated on the middle one, and the others will be less distinct. You can practice on the stereopair on the right, which is like those drawn by Wheatstone.
It is also possible to fuse the images by crossing the eyes. This separates the images because of the unusual eye positions, but the mind is astonished to find that the resulting images can be fused, and so it does this. I do not recommend this unnatural position because of the strain, and tend to ask people what they would think if their eyes stayed crossed. Each eye sees the image intended for the other, so the perspective is reversed. The converse image, as explained below, is seen rather than the intended one.
Some textbooks, such as Pauling's General Chemistry, included stereopairs to clarify spatial relationships, such as in crystal structure, that were intended to be viewed by free fusion. This excellent idea, easily implemented, was seldom used, and I have seen no recent example.
Another way to view a stereopair is to have them occupy the same space, but distinguished so that each is visible only to its proper eye, and not to the other. This stereogram is called an anaglyph, which was invented quite early in the history of stereoscopy, but only became popular much later. The two images can be separated by color or polarization. Two slide projectors with polarizers at right angles can be used to project the images on a screen so that they line up approximately. The viewers have spectacles with analyzers oriented so that each eye sees its own image. This is an excellent way to present stereograms to an audience. In the 1950's, motion pictures were projected in this way. It was quite successful, but the 3D movies had only a brief vogue.
A somewhat easier way is to print the images in two colors, and use colored spectacles to separate them. The usual spectacles have red and green lenses, and the image colors are red and green hues, selected so that each image vanishes when seen through a filter of the same color. The red is invisible through the red lens, and the green is invisible through the green lens. Colors are generally distinct only near edges; away from edges, a mixed grayish hue is the result. When the images are viewed, neither red nor green is seen, but an indistinct hue that does not seem distinctly colored. The impression is of a black and white scene. This is quite different from the antagonism of complementary hues, and one of the properties that makes this method popular. The greatest advantage, however, is that it requires only the spectacles, and no special effort on the part of the viewer. Pearce's text used anaglyphs to illustrate descriptive geometry, an excellent idea. Pearce called them "analglyphs," with an engineer's gift for words. The anaglyph at the right was made by a simple computer program, and is to be viewed with the red lens over the left eye.
An anaglyph of a hexagonal prism, shaded to show one face solid, is shown at the left. The colors are rather garish, but when viewed through red-green spectacles, one face of the prism appears in 3D. Blue is the absence of both red and green, so this area appears black in either color. The green area appears black in the red filter, and vice-versa, so the face seems black as a whole when the anaglyph is fused. Red-green anaglyphs can show only black and white, of course. Polarized anaglyphs are more flexible in this regard, but cannot be represented on the printed page. Persons with anomalous color vision may have difficulty with red-green anaglyphs.
A computer program that can generate images as seen from arbitrary points of view can easily generate stereograms and anaglyphs. The images can be placed side-by-side, or superimposed in two colors. The computer is a powerful tool for producing stereograms, but it should be remembered that in this field, "3D" generally means only perspective drawing, and shading, not stereoscopy, unless specifically noted otherwise. Drawing an anaglyph on the computer screen is very convenient and easy, because of the additive color mixing. The pixels are formed from red, green and blue dots. Red and green is yellow, neither red nor green is blue. With a white background, the full intensities are used.
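The color bookkeeping for a red-green anaglyph on a white background can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the program used for the figures here: the two views are given as equal-sized grids of booleans (True where a view has ink), the red lens is taken to be over the left eye, the filters are treated as ideal, and the names make_anaglyph and seen_through are invented for the example. The left-eye view is drawn in green so that it looks dark through the red lens, the right-eye view is drawn in red, and pixels inked in both views become blue, which has neither red nor green.

```python
WHITE = (255, 255, 255)
GREEN = (0, 255, 0)    # no red: dark through the red lens (left eye)
RED = (255, 0, 0)      # no green: dark through the green lens (right eye)
BLUE = (0, 0, 255)     # neither red nor green: dark through both lenses

PALETTE = {
    (False, False): WHITE,
    (True, False): GREEN,
    (False, True): RED,
    (True, True): BLUE,
}


def make_anaglyph(left, right):
    """Combine two boolean ink grids into one grid of (r, g, b) pixels."""
    return [[PALETTE[(bool(l), bool(r))] for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]


def seen_through(anaglyph, channel):
    """Ideal filter passing one channel (0 = red, 1 = green).

    Returns a boolean grid that is True where the pixel looks dark.
    """
    return [[pixel[channel] == 0 for pixel in row] for row in anaglyph]


if __name__ == "__main__":
    left = [[True, False, True, False]]
    right = [[True, True, False, False]]
    picture = make_anaglyph(left, right)
    print(seen_through(picture, 0) == left)    # red lens recovers the left view
    print(seen_through(picture, 1) == right)   # green lens recovers the right view
```

Reversing the spectacles amounts to swapping the two channels, which presents each view to the wrong eye.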
If the stereoimages for the right and left eyes are interchanged, fusion can still be achieved as easily, but the depth is inverted. Wheatstone called this the converse image. It is easily seen with anaglyphs by reversing the spectacles. Converse images show how much the mind trusts stereo displacements, believing them even when the resulting figure is impossible. Wheatstone compared this with the illusion seen in cameos and intaglios, when one is taken for the other when illuminated in unusual ways. There is probably no relation between the two phenomena, but the comparison is interesting.
It was believed by some that recognition of an object was necessary for the mental fusion of stereoscopic views of it. This view was demolished by Aschenbrenner in 1954, who created the first random-dot stereogram. I do not know how Aschenbrenner did it, exactly, but I would do it like this. I would make two identical copies of a pattern of random dots. By random, I mean that no special outline or pattern is visible in the dots. The dots do not have to be especially densely spaced. I would cut out a rectangle from the center of one copy, slice a narrow ribbon from one end and move it to the other end, then paste it all down as carefully as possible. The modified copy should look just like the unmodified copy, to casual inspection. In fact, the two copies might not even look very similar. However, when viewed in a stereoscope, a rectangular area would seem to float above the background, quite boldly and unmistakably. His work, perhaps the most seminal since Wheatstone, seems to have been neglected by psychologists, probably because it conflicted with their prejudices.
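The scissors-and-paste procedure just described is easy to imitate on a grid of dots. The following is a rough sketch of that procedure only, with no claim that it is how Aschenbrenner worked: the names random_dots and shift_rectangle, the grid size, and the dot density are all my own choices. Moving a ribbon from one end of the rectangle to the other is the same as shifting the contents of the rectangle cyclically by the width of the ribbon.

```python
import random


def random_dots(width, height, density=0.5, seed=0):
    """Grid of booleans, True where there is a dot."""
    rng = random.Random(seed)
    return [[rng.random() < density for _ in range(width)]
            for _ in range(height)]


def shift_rectangle(dots, top, left, bottom, right, shift):
    """Copy the grid, shifting the rectangle's contents `shift` columns leftward.

    The rectangle covers rows top..bottom-1 and columns left..right-1.
    The ribbon that falls off the left edge is pasted back at the right edge.
    """
    out = [row[:] for row in dots]
    for y in range(top, bottom):
        segment = dots[y][left:right]
        k = shift % len(segment)
        out[y][left:right] = segment[k:] + segment[:k]
    return out


if __name__ == "__main__":
    left_view = random_dots(60, 30, seed=1)
    right_view = shift_rectangle(left_view, 8, 20, 22, 40, 3)
    for a, b in zip(left_view, right_view):
        line = "".join("#" if d else "." for d in a)
        line += "   " + "".join("#" if d else "." for d in b)
        print(line)
```

If the unmodified grid is presented to the left eye and the modified one to the right eye, the leftward shift gives the rectangle the displacement of a nearer surface, so it should appear in front of the background; interchanging the views gives the converse image.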
It is extremely difficult to construct random-dot stereograms by hand, except for such simple cases as just mentioned. In 1960, before Aschenbrenner's work was generally recognized, Julesz showed how to use the computer to do the job. This is really very simple. One puts random dots on the surface to be depicted, perhaps in a computer model. Then coordinates (x,y,z) are known for each dot, and their perspective projections can easily be calculated and plotted for two different points of view. Now the two images really do look random and uncorrelated, but when viewed in a stereoscope, the surface is vividly revealed.
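The calculation can be sketched as follows. This is an illustration under stated assumptions rather than a reconstruction of Julesz's program: the eyes are placed at (-e/2, 0, 0) and (+e/2, 0, 0) looking along z, the picture plane is at z = plane_z, lengths are in millimetres, and the function names and default values are invented for the example. Each dot is projected along the straight line from the eye to the dot, onto the picture plane.

```python
import random


def project(point, eye_x, plane_z):
    """Perspective projection of (x, y, z) onto the plane z = plane_z,
    as seen from an eye at (eye_x, 0, 0)."""
    x, y, z = point
    scale = plane_z / z
    return (eye_x + (x - eye_x) * scale, y * scale)


def stereo_views(points, eye_sep=65.0, plane_z=400.0):
    """Return the left-eye and right-eye projections of a list of points."""
    half = eye_sep / 2.0
    left = [project(p, -half, plane_z) for p in points]
    right = [project(p, +half, plane_z) for p in points]
    return left, right


def surface_points(n, depth, seed=0):
    """Scatter n random dots on the surface z = depth(x, y)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.uniform(-100.0, 100.0)
        y = rng.uniform(-100.0, 100.0)
        points.append((x, y, depth(x, y)))
    return points


if __name__ == "__main__":
    # A central strip 50 mm nearer than the background plane.
    dots = surface_points(5, lambda x, y: 350.0 if abs(x) < 50.0 else 400.0)
    left, right = stereo_views(dots)
    for p, l, r in zip(dots, left, right):
        print(p, l, r)
```

With this geometry the two projections of a dot have the same height and differ horizontally by eye_sep * (plane_z / z - 1), which is zero for dots lying in the picture plane and grows as the dot comes nearer to the eyes.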
Another surprising advance occurred in 1990. This did not have as deep an effect on theory, but was to lead to public awareness of stereograms that for a few years amounted to a mania. Tyler and Clarke announced single-image random dot stereograms, or autostereograms. When a single image was viewed without any equipment whatsoever, stereoscopic fusion was achieved and a hidden picture was suddenly revealed. The one difficulty was that the observer had to achieve free fusion, and this was regarded as making it a kind of puzzle. Autostereograms appeared in newspapers, hard- and soft-cover books, on buses and billboards, and everywhere else possible. Millions of people learned free fusion, and most did not know what they had acquired.
The key to how an autostereogram works is in a much older illusion, called the wallpaper illusion, when a certain motif is regularly repeated along a line. In a range of spacings, different individual motifs are taken as representing the same object in the two eyes. The mind identifies them as the same, and fuses the views. Depending on the spacing, greater or less than the interpupillary distance, the motifs appear closer or farther away than the surface they are printed on. Free fusion is required, but it is very easy in this case, and possibly a good first step to learning it. This illusion is shown below. The blue dots are closest, the crosses in the middle, and the fleurs-de-lis the most distant.
In an autostereogram, the same feature serves as part of two different images, one for each eye, and must be very cunningly placed for this to be possible. Computer aid is essential, and many programs have been written to do the job, with various colorful motifs as well as boring random dots. An autostereogram of my own is shown below. It is designed to be seen on an 800x600 screen, and should be about 4 inches wide. If it is not the right size, you can download it and adjust the size in your own imaging program. The two dots at the bottom are to aid free fusion. When you see three dots, you are properly set to fuse. The stereogram shows the letters IEE raised above the background. It is not an easy stereogram, but I have fused it on my monitor.
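How one dot can serve both eyes is perhaps clearest from a small program. The sketch below uses a common simplified scheme, which I offer as an assumption and not as the method used for any particular published autostereogram: each pixel in a row is made a copy of the pixel a certain distance to its left, and that distance is reduced where the hidden surface is nearer, just as closer-spaced motifs appear nearer in the wallpaper illusion when viewed with diverged axes. The names and the default values of period and max_shift are my own, and the depth map is a grid of numbers from 0 (background) to 1 (nearest).

```python
import random


def separation(depth_value, period, max_shift):
    """Repeat distance in pixels; smaller for nearer points."""
    return period - int(round(max_shift * depth_value))


def autostereogram(depth, period=40, max_shift=10, seed=0):
    """Single-image random-dot stereogram as a grid of booleans."""
    rng = random.Random(seed)
    image = []
    for depth_row in depth:
        row = []
        for x, d in enumerate(depth_row):
            sep = separation(d, period, max_shift)
            if x < sep:
                row.append(rng.random() < 0.5)   # free choice: a random dot
            else:
                row.append(row[x - sep])         # forced: copy the matching dot
        image.append(row)
    return image


if __name__ == "__main__":
    width, height = 120, 40
    # Hidden picture: a raised rectangle in the middle of the field.
    depth = [[1.0 if 45 <= x < 85 and 10 <= y < 30 else 0.0
              for x in range(width)] for y in range(height)]
    for row in autostereogram(depth):
        print("".join("#" if dot else "." for dot in row))
```

This simple left-to-right copying ignores hidden surfaces and tends to leave faint echoes beside steep edges, which more careful programs take pains to avoid.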
There is a time lag in seeing. Dimmer images require longer to register than bright ones. If a pendulum swinging in a transverse plane is viewed through spectacles in which the lenses are of different optical densities, the two eyes register images at slightly different times, and the mind interprets the difference as a stereoscopic displacement. The pendulum seems to be swinging in an elliptical path, moving in and out as it moves. This is known as the Pulfrich pendulum and effect. Remarkably, Pulfrich had vision in only one eye, and had to have others verify his predictions. In 1994, the BBC filmed some short skits designed to use the Pulfrich effect to produce a 3D effect. The movement had to be carefully choreographed to give appropriate displacements. Of course, this is not suitable for true 3D pictures, but the venture was successful and certainly demonstrated the Pulfrich effect excellently.
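The geometry of the Pulfrich pendulum can be put in numbers. The following sketch rests on assumptions of my own: the darker lens is over the left eye, its image lags by a fixed delay (the 10 ms default is only a plausible illustrative figure, not a measured one), the bob swings sinusoidally in a plane at a fixed distance, and lengths are in millimetres. The apparent distance is found by intersecting the two lines of sight, from each eye through the position of the bob that the eye currently registers.

```python
import math


def apparent_distance(x_left, x_right, distance, eye_sep=65.0):
    """Distance at which the two lines of sight intersect.

    x_left and x_right are the lateral positions, in the plane at `distance`,
    at which the left and right eyes register the object.
    """
    return distance * eye_sep / (eye_sep + x_left - x_right)


def pulfrich_distance(t, amplitude=200.0, period=2.0, delay=0.01,
                      distance=2000.0, eye_sep=65.0):
    """Apparent distance of the bob at time t when the left eye's image lags."""
    omega = 2.0 * math.pi / period
    x_right = amplitude * math.sin(omega * t)
    x_left = amplitude * math.sin(omega * (t - delay))
    return apparent_distance(x_left, x_right, distance, eye_sep)


if __name__ == "__main__":
    for i in range(9):
        t = i * 0.25
        print(t, round(pulfrich_distance(t), 1))
```

With these assumptions the bob appears beyond its true plane while it moves to the right and in front of it while it moves to the left, and lies almost in the true plane at the ends of the swing where it is momentarily at rest, so the apparent path is a closed loop seen from above.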
Less successful have been efforts to achieve 3D on a screen without requiring special equipment for the viewer. Some attempt to use a prismatic screen that emits the two images so that they are seen separately by the two eyes. Of course, location is very critical, and this renders the attempts fruitless.
A hologram, on the other hand, acts just like a real object, so that different views are seen by each eye and fused as in normal vision. This is often seen in the white-light holograms on credit cards, used to discourage fraud.
More successful are goggles that can project the two images separately into the eyes. This is really just the usual stereogram with a novel viewing method, but it is adapted to computer image creation, as in virtual reality.
An interesting demonstration of the importance of perspective in three-dimensional perception appears in recent (Dec 2004) TV ads for Daimler-Chrysler motors. These ads show legends rendered in accurate perspective in connection with views of motor vehicles. When the legends and the vehicles move concordantly, the strong perception that they are rigidly connected appears almost immediately, and the legends appear to move in three-dimensional space. When the vehicles and legends move discordantly, however, the illusion quickly disappears. The strongest stereoscopic clue is binocular stereopsis, which overrides all other evidence. The second strongest is clearly perspective, which allows the sense to model the external world accurately. This may be considered to include relative size. As we have just remarked, consistency in representation is very important. Aerial perspective (haze, blueness) is then the residual stimulus.
An article in IEEE Spectrum explains current efforts to create a totally different type of 3-D display. Instead of generating images for the two eyes, these displays actually create a miniature luminous 3-D object. When this simulacrum is viewed, stereopsis arises naturally. One must realize how very special this is, and that it is by no means a general way to present 3-D views. It is simply a way to create a model electronically, and has the same applications as a model. One can walk around and view the display from different sides, instead of having the computer generate a rotated image, as in the normal 3-D display. One can imagine that such a display would be very attractive for things like molecular models.
There are two current approaches. In one, a moving screen sweeps out a volume, and pixels are displayed when the screen is in the appropriate direction. In the other, layered screens occupy a volume. Each screen displays pixels at a point (x,y) while the different screens correspond to different values of z. The means of creating these displays is explained in the article. Problems with opacity are obvious, and must be overcome by computing. The latter approach seems the more practical.
The article prompts some comments on the statements made there. The author states "We perceive three dimensions because our brain combines the slightly different images seen by each of our eyes." Just close one eye and you will see that this statement is false. Binocular vision is only one clue used in stereopsis, and by no means its cause. He also states that gyroscopic precession is "where energy from an applied force, say, a wave hitting ship, is transmitted at 90 degrees to the direction of rotation." A less enlightening definition could hardly be constructed. However, it is apparently sufficient to demolish the rotating screen display competing with the layered display being developed by his company.
I have not found a source for glasses for viewing red-green or polarized anaglyphs, but they are easy to make. You will need some stiff cardboard, red and green acetate or polarizing film, and cement or paste suitable for sticking cardboard. Cut out two frames about like that shown in the figure below (which is roughly full size), and two pieces of colored acetate or polarizing film, as appropriate, about 30x40mm. Tack the filters to one frame, taking care with the direction of polarization in the case of polarizing films. One eye should be polarized horizontally, the other vertically, if that is the way you are projecting the images. At any rate, the directions should be perpendicular. Now paste the second frame piece over the first, registering it neatly. The glasses can be held before the eyes either way as required.
Composed by J. B. Calvert
Created 28 November 2000
Last revised 24 April 2005