© BK media systems 1996-2004
All Rights Reserved.
No part of these pages may be reproduced, transmitted, or translated in any form, by any means electronic, mechanical, manual,
optical, or otherwise, without prior written consent of Bernhard Kockoth
White Paper for Digital Stereo Vision System
© by Bernhard Kockoth media systems 1996-2010
- Basics of Human Environment Apprehension
- Machine Vision Emulates Human Visual System
- History of 3D Vision shows ups and downs
- Personal Computer enables Digital Imaging
- Traditional Print Media Distribution
Over time, many engineers and artists have worked on improving visual communication. Renaissance artists took great steps towards a better reproduction of reality and kept innovating in spatial painting and sculpture. Here I outline a vision of digital immersion media for the 3rd millennium. To put it another way, and to cite a Hollywood movie: have you been intrigued by the possibilities of media development shown in the movie 'Brainstorm'? Then this article is for you!
What is the goal? Deeper immersion means fewer artefacts and an experience closer to reality. From a neurofunctional view, deep immersion may be compared to stepping through life, but in a dream.
Like in dreams, the stepping is virtual and the theme park is the experience. During sleep, neurotransmitters inhibit the movements of our muscles, yet we have the impression of having 'lived' our dreams. Dreaming is a virtual (memorized) reality experience, and most of us have it every night. The approach in this article is to replace memorized reality with recorded reality. The sensory experience needs to be close to total. Most of our input comes through eyes and ears. If these two information channels receive full information thanks to advanced digital technology, then the immersion experience in our brains is close to real reality. We live the recorded reality like in a dream!
In the following sections I explain the foundations of the Digital Personal Stereo Vision System and how it is based on natural parameters of human physical and logical functioning. The first chapter gives two examples of how the environment determines why we humans are the way we are. The second chapter takes a closer look at human and machine vision and explains the necessity of stereo vision, which is nothing new in image recording. History has tried it again and again, but until the arrival of powerful digital technologies stereo imaging remained a fringe technology. Chapter 4 shows the way to public awareness and acceptance. Digital Personal Stereo Vision will become part of our modern entertainment culture, much as personal stereo and video games established themselves at the end of the 20th century.
Basics of Human Environment Apprehension
Since the human brain can be seen as an embedded control center for the individual, it has to obtain all the information necessary to deliver the best service for itself and the flesh that surrounds it. Direct interaction with the environment is restricted by the length of the arms and legs and by running speed. Humans are not fast: about eight meters per second is the practical maximum. These limb-based limits have been overcome recently in human history by body-enhancement tools such as an ax or an automobile. But the human brain was hardwired to the limb-based limits long ago, and nature needs time (count in generations) to adapt to the new technology-based limits. Meanwhile, technology evolves at its own speed.
To illustrate this point, two examples:
a) The speed at which the human brain and body can act on incoming information limits the speed at which we can deal with the natural environment. This speed is close to 30 km/h, i.e. about 8 meters per second. That is the speed of a fast runner; some sprinters are faster, but in an artificial environment. When running we can still deal in real time with changes in the environment, like people crossing our way. Think of the ultimate playground as an accelerated pedestrian zone: everyone in it can avoid all obstacles in real time as long as they concentrate on making their way, like our ancestors running away from wild animals or hostile humans. Those who were not able to react fast enough were eaten or trampled.
b) The three-dimensional viewing capacity of the eyes and brain is limited by the base distance between the eyes, which is about 64 mm in adults. Again, the immediate environment had a prime influence on the construction: our stereoscopic view works best at arm's length. Objects placed much further away gain their three-dimensional appearance only through motion. The brain compensates for physical limitations (i.e. limited eye distance) by computation, much the same way that birds move their heads quickly to get three-dimensional information.
These two examples underline the fact that the way we are is determined by functional necessities. Through 100,000 years of evolution there has been no necessity to see in three dimensions faster than we can run. Far objects gain their appearance through computation or may remain flat, because the human head has no room to space the eyes further apart than the 64 mm with which we cope best at arm's length.
Machine Vision Emulates Human Visual System
With advances in digital computing technology, machines become more intelligent and we can add vision at little cost. Either machines need to see where they move, or they see things for us that are tedious to watch, invisible to the human eye, or moving too fast, and they do so at distances and speeds different from what humans are used to.
Early experiments in artificial intelligence showed that it is easier and more rewarding to emulate solutions found by nature. In the case of vision this means that two eyes are needed to get a clue about depth in scenes without motion. Take a robot that handles parts in different arrangements and needs vision to grab them correctly. If the working distance is similar to an arm's length, the robot's vision has a geometry similar to human eyes. If the distance is shorter and measured in millimeters or centimeters, the two lenses are spaced closer together; if it is measured in tens of meters or more, the cameras need to be spaced further apart to give good depth information to the image processing system.
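The relation between camera spacing and working distance described above can be illustrated with plain triangle geometry. The following is only a sketch: the 64 mm base comes from the text, while the 0.7 m figure for "arm's length" and both function names are assumptions made for this illustration.

```python
import math

EYE_BASE_M = 0.064    # 64 mm eye spacing, as stated in the text
ARMS_LENGTH_M = 0.7   # assumption: a typical arm's length, not from the text


def convergence_angle_deg(base_m: float, distance_m: float) -> float:
    """Angle between the two lines of sight to a point straight ahead."""
    return math.degrees(2.0 * math.atan(base_m / (2.0 * distance_m)))


def base_for_same_angle(reference_base_m: float,
                        reference_distance_m: float,
                        distance_m: float) -> float:
    """Camera spacing that keeps the convergence angle of the reference setup."""
    return reference_base_m * distance_m / reference_distance_m


if __name__ == "__main__":
    # Human eyes: the angle shrinks quickly as the object moves away.
    for distance in (0.05, ARMS_LENGTH_M, 10.0, 50.0):
        angle = convergence_angle_deg(EYE_BASE_M, distance)
        print(f"{distance:6.2f} m -> {angle:7.3f} degrees")

    # Spacing a machine would need to see 50 m like we see arm's length.
    print(base_for_same_angle(EYE_BASE_M, ARMS_LENGTH_M, 50.0), "m")
```

Scaling the base in proportion to the distance keeps the angle constant, which is one simple way to read the statement that close-up lenses sit closer together and long-range cameras further apart.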
Other systems, like insects' compound (facet) eyes or arrangements with moving cameras, make sense as well; both have their counterparts in nature. Insects have survived a large part of Life's history on Earth with a lot of very simple eyes. And we enjoy movies more when the one-eyed camera moves through a scene, which adds a high degree of depth and reality to the otherwise flat images.
As an aside, most of the above also applies to stereo audio. Thanks to our two ears and clever processing we can determine the direction and distance of a sound very accurately. Hence the necessity of stereo sound to achieve a natural hearing experience.
History of 3-D Stereoscopy shows ups and downs
With the appearance of photography in the mid-19th century, the two-lens camera quickly became popular. But stereoscopic images printed from full-size glass plates are too difficult to handle for our convenience-oriented 21st century. From time to time print magazines try a revival of anaglyph (red/green or red/blue) images, but the thrill is short-lived. Black-and-white photos followed by headaches, or false color vision at best, are just an experience to thrill youngsters. Anyone serious about 3-D vision invests in real stereoscopic viewing. The early-90s hype around computer-generated stereogram pictures underlines this point: not easy to see and with a limited degree of realism, they quickly lost the favor of the interested public.
To come back to the color of reality: until recently this meant complex projection systems and polarizing glasses for spectators. With the arrival of advanced digital media, other solutions to the old problem of distributing separate pictures to separate eyes have become possible. After many blurred images, the time has come for the Digital Personal Stereo Vision System.
The key ingredient in the current media mix is a pair of tiny LCD screens, one for each eye, much as we have known it from personal stereo audio ('walkman') for over twenty years now. Conventional headphones were too clumsy and limited in range as long as the storage system did not keep up. The same holds for the Digital Personal Stereo Vision System: a CD-like player delivers the information, LCD goggles deliver the images, and tiny loudspeakers deliver the audio. Currently the tiny screens inside the goggles are still expensive, but with rising demand and better production technology this obstacle will soon be history. Early samples offered a resolution of only 320x240, which is insufficient for realistic 3D video.
Digital Stereo Vision Goggles
Personal Computer Brings Digital Images to Life
How could this work? Digital images used to put a heavy load on computer systems, not to speak of digital video. But again, technology is advancing fast towards vision: just remember what 40 megabytes of hard disk looked like in the late 70's, more like a washing machine or a file cabinet, and now it fits into a small memory chip. To be more technical, a common CD-ROM holds only 700 megabytes of information, which does not last long when it comes to storing images. The Digital Versatile Disc (DVD) hovers around 10 gigabytes of storage capacity and is designed for digital movies.
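To get a feeling for how quickly 700 megabytes are used up, here is a rough back-of-the-envelope sketch. The 768x480 frame size is the figure quoted later in this paper; uncompressed 24-bit color, binary megabytes and 30 frames per second are assumptions chosen only for this illustration.

```python
CD_ROM_BYTES = 700 * 1024 * 1024  # 700 megabytes from the text; binary MB assumed
WIDTH, HEIGHT = 768, 480          # frame size quoted later in this paper
BYTES_PER_PIXEL = 3               # assumption: 24-bit RGB, no compression
FRAMES_PER_SECOND = 30            # assumption for this estimate


def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def stereo_pairs_that_fit(capacity_bytes: int) -> int:
    """Whole left/right frame pairs that fit on the medium."""
    pair_bytes = 2 * frame_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)
    return capacity_bytes // pair_bytes


def seconds_of_stereo_video(capacity_bytes: int) -> float:
    """Playing time if every stereo pair is stored uncompressed."""
    return stereo_pairs_that_fit(capacity_bytes) / FRAMES_PER_SECOND


if __name__ == "__main__":
    print(stereo_pairs_that_fit(CD_ROM_BYTES), "stereo pairs")
    print(round(seconds_of_stereo_video(CD_ROM_BYTES), 1), "seconds")
```

Under these assumptions a CD-ROM holds only a few hundred uncompressed stereo pairs, which is why compression matters so much for stereo video.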
The trick to delivering 3-D movies on a sequential medium is not difficult to understand. It mostly works like an audio CD: one signal (right channel) follows the other (left channel). The first is stored until the second is available as well, and both are presented together. Another way to deliver the two images at once would be to interlace them; this keeps the required memory to a minimum, and the two images appear with an unnoticeable time difference. Popular DVD works with MPEG-2; the second, nearly identical picture frame, which holds the three-dimensional information, requires only a little differential encoding.
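The two delivery schemes described above can be sketched in a few lines. This is an illustration only: frames are modelled as plain lists of pixel rows, the channel tags "L" and "R" and all function names are hypothetical, and interlacing is done row by row, which is one possible reading of "interlace".

```python
from typing import Iterable, Iterator, List, Tuple

Frame = List[List[int]]  # a frame as rows of pixel values


def pair_sequential(stream: Iterable[Tuple[str, Frame]]) -> Iterator[Tuple[Frame, Frame]]:
    """Hold each left frame until its right frame arrives, then emit both."""
    held_left = None
    for channel, frame in stream:
        if channel == "L":
            held_left = frame
        elif channel == "R" and held_left is not None:
            yield held_left, frame
            held_left = None


def interlace_rows(left: Frame, right: Frame) -> Frame:
    """Build one frame with even rows from the left view, odd rows from the right."""
    if len(left) != len(right):
        raise ValueError("left and right frames must have the same height")
    return [left[y] if y % 2 == 0 else right[y] for y in range(len(left))]


def deinterlace_rows(frame: Frame) -> Tuple[Frame, Frame]:
    """Split an interlaced frame back into half-height left and right views."""
    return frame[0::2], frame[1::2]


if __name__ == "__main__":
    left = [[1, 1], [2, 2], [3, 3], [4, 4]]
    right = [[9, 9], [8, 8], [7, 7], [6, 6]]
    print(list(pair_sequential([("L", left), ("R", right)])))
    print(deinterlace_rows(interlace_rows(left, right)))
```

The sequential scheme needs a buffer for one full frame, while the row-interlaced scheme needs almost none but gives each eye only half the rows.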
Currently employed compression technologies (MPEG ...) work best on larger chunks of images with motion estimation. An easy start would be to simulate the simultaneous encoding of two linked images; as memory is no longer the main cost factor in computer systems, it could be used generously to build a prototype. This also permits future improvements by software updates. Once a viable solution has been found, part of the algorithms may be transformed into specialized hardware.
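A software simulation of encoding two linked images could start as small as the following sketch. It is not MPEG-2 and not the method proposed in this paper: it simply predicts the right view from the left view shifted by one global disparity, stores the difference, and uses zlib as a stand-in compressor. The function names, the single-disparity model and the synthetic test images are all assumptions for illustration.

```python
import random
import zlib
from typing import List, Tuple

Image = List[List[int]]  # rows of 8-bit grey values


def shift_row(row: List[int], disparity: int) -> List[int]:
    """Shift a row right by `disparity` pixels, repeating the left edge pixel."""
    if not 0 <= disparity < len(row):
        raise ValueError("disparity must be in range(0, width)")
    return [row[0]] * disparity + row[: len(row) - disparity]


def encode_pair(left: Image, right: Image, disparity: int) -> Tuple[bytes, bytes]:
    """Compress the left view and the right view's residual against it."""
    left_bytes = bytes(v for row in left for v in row)
    residual = bytes(
        (r - p) % 256
        for lrow, rrow in zip(left, right)
        for r, p in zip(rrow, shift_row(lrow, disparity))
    )
    return zlib.compress(left_bytes), zlib.compress(residual)


def decode_pair(left_blob: bytes, residual_blob: bytes,
                width: int, disparity: int) -> Tuple[Image, Image]:
    """Rebuild both views exactly from the two compressed blobs."""
    flat_left = zlib.decompress(left_blob)
    flat_res = zlib.decompress(residual_blob)
    left = [list(flat_left[i:i + width]) for i in range(0, len(flat_left), width)]
    right = []
    for y, lrow in enumerate(left):
        predicted = shift_row(lrow, disparity)
        res = flat_res[y * width:(y + 1) * width]
        right.append([(p + r) % 256 for p, r in zip(predicted, res)])
    return left, right


def synthetic_pair(width: int, height: int, disparity: int, seed: int = 0) -> Tuple[Image, Image]:
    """Test data: random texture as left view, shifted copy as right view."""
    rng = random.Random(seed)
    left = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]
    return left, [shift_row(row, disparity) for row in left]


if __name__ == "__main__":
    left, right = synthetic_pair(64, 32, disparity=3)
    left_blob, residual_blob = encode_pair(left, right, disparity=3)
    print(len(left_blob), "bytes for left,", len(residual_blob), "bytes for residual")
```

Real scenes have a different disparity at every depth, so a real encoder would need block-wise estimation; the sketch only shows why a nearly identical second image can be cheap to store.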
After so much talk about viewing and playback technology, a few words on the recording side: I have experimented with stereo photography since 1995. My approach (see Stereo Photo page) was simple but very efficient.
The problem of moving subjects was solved by electronic synchronization, which gets as good as 20 ms, i.e. motion is 'frozen' by both cameras at the same time. The effect of seeing frozen human movement in 3-D is awesome! I went one better in 1997: a friend of mine built me a water housing for my stereo camera. I went into the waves of the Atlantic coast and Hawaii, and wow, the water appears frozen, or rather like glass, in the stereo views.
In 1999 I worked with first-generation stereo video. The cameras of my choice were Panasonic's entry-level DV cameras. They deliver a brilliant image and, most important, have the IEEE 1394 interface for direct data transfer to a computer.
On my travels I started a collection of powerful images to further advance the quest for total immersion media. As both channels work in real time, the full information can be transferred to a powerful computer system, which can put the video out in whatever format is imaginable (DVD, dual DV, side-by-side VHS, stereo stills). The problem of synchronization does not exist to the same extent as with traditional stills. With computer-based post-production it is easy to match the left and right images exactly, with occasional cropping at the beginnings or ends of scenes.
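Matching the left and right recordings in post-production can be pictured as finding the frame offset between the two streams and then cropping the unmatched frames. The sketch below is an assumption about how this might be done, not a description of the workflow actually used: each frame is reduced to one number (for example its mean brightness), and the offset with the smallest mean squared difference wins. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def best_offset(left_sig: Sequence[float], right_sig: Sequence[float],
                max_shift: int) -> int:
    """Return the shift s for which left_sig[i + s] best matches right_sig[i]."""
    best_shift, best_cost = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        pairs = [
            (left_sig[i + shift], right_sig[i])
            for i in range(len(right_sig))
            if 0 <= i + shift < len(left_sig)
        ]
        if not pairs:
            continue
        cost = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


def crop_to_overlap(left_frames: Sequence, right_frames: Sequence,
                    shift: int) -> Tuple[List, List]:
    """Drop unmatched frames at the start and end so both lists line up."""
    if shift >= 0:
        left_frames = left_frames[shift:]
    else:
        right_frames = right_frames[-shift:]
    n = min(len(left_frames), len(right_frames))
    return list(left_frames[:n]), list(right_frames[:n])


if __name__ == "__main__":
    # Per-frame brightness of the left camera; the right camera started 2 frames later.
    left_sig = [0, 0, 1, 5, 2, 8, 3, 7, 4, 6]
    right_sig = left_sig[2:]
    shift = best_offset(left_sig, right_sig, max_shift=3)
    print(shift, crop_to_overlap(left_sig, right_sig, shift))
```

A single brightness value per frame is a crude signature; scenes with little change would need a richer one, or a visible or audible sync mark at the start of the take.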
Digital Stereo Vision System Cameras from 1999
Once high quality digital video is in the computer it can be used in every imaginable way. Whether you need viewmaster-slides, interlaced computer animation or side-by-side video or stereo-DVD - the choice is yours!
Traditional Print Media Viewing
Today's digital video has a resolution of 768x480 pixels, roughly a small picture of 8 cm x 5 cm when printed at 240 dpi. Printed slightly smaller, this fits nicely into the 64 mm eye distance required for lenticular stereo vision or parallel viewing! With the plenitude of small pictures delivered by digital video, one can imagine books or magazines full of screen shots. Viewing is done like in the old days with glasses - no big deal, think merchandising.
The Digital Personal Stereo Vision System for individual picture and movie distribution has a great future and, compared to the bulkiness of traditional home theatre, a sure market with the young. Although the exclusion of real reality in favor of recorded reality seems antisocial, the experience with older individual media like personal stereo (Sony Walkman, since 1979) and videogames (Atari, since 1974) has proved otherwise. There are more than enough moments in life when we would rather be immersed in a dream world than deal with the real thing. This creates a new cultural paradigm for social interaction. Videogames and cinemas are doing better than ever before.
The Digital Personal Stereo Vision System is reality from a technical point of view, and its cultural implications were anticipated by personal stereo and videogames, to cite two popular individual entertainment technologies of the late 20th century.
Digital Stereo Vision HD-System Cameras 2009
Note: A more detailed version of this text is available, please ask bkmedia add exanova.com for your professional copy.