The Capture of Reality

When creating experiences for immersive Virtual Reality, there are essentially two approaches. The first is manual construction through Computer-Generated Imagery (CGI), which is how most games and VR experiences are made. The second is far more automatic and attempts to ‘capture reality’ instead of actively generating it. It is this approach that we will discuss in this entry. In addition to presenting the technicalities of the capture methods, we will discuss their limitations and present an innovative example of how these might be overcome in the future, drawn from a student project at the University of Bergen.

An early 360° camera (horizontally, at least), probably the first with a synchronised shutter.

360° Video

In a previous article on Virtual Reality Journalism, we discussed how 360° 3D cameras can be used to present an immersive experience to a user. This approach has several unique benefits. First of all, it is far less time-consuming to capture and re-use existing physical environments than to create them through 3D modelling. This is perhaps especially true when the environment involves human actors, as it is easier to avoid the uncanny valley effect while maintaining a high standard of realism with image capture equipment than with 3D animation.

How does it work?

360° cameras usually comprise two or more (ultra-)wide-angle lenses. In cameras with just two lenses, such as the Gear 360 or Ricoh Theta V, each lens has to capture at least 180° horizontally and vertically. The raw recordings from these lenses are separate and need to be stitched together in software into, for instance, an equirectangular image in order to compose a full 360° spherical view (see Illustration 2). Illustration 1 shows how the equirectangular format works using a world map, perhaps our most familiar example of a spherical shape presented as a rectangle.

Illustration 1: A relatable example of the equirectangular format. The furthest point west sits next to the furthest point east, so we are dealing with a ‘sphere’, or more correctly a globe, stretched out into a rectangle. The closer we get to the poles, such as Antarctica, the more the image is stretched, as the circles of latitude become shorter towards the poles while each row of the map keeps the same width.
Illustration 2: In this equirectangular photo, captured with a Ricoh Theta V, we see the same effect as in Illustration 1. My hands, which enclose the bottom of the camera, are stretched in the same way as Antarctica on the map. The stairs, which appear to be curved, are in fact straight; their bending by the lenses is especially clear when the image is viewed ‘equirectangularly’.
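
To make the world-map analogy a little more concrete, here is a minimal sketch of the equirectangular mapping, assuming the common convention that longitude (yaw) runs linearly across the width of the image and latitude (pitch) linearly down its height. The function names and the 3840 × 1920 example size are our own assumptions, not taken from any particular camera or stitching program.

```python
import math

def direction_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to (x, y) in an equirectangular image.

    yaw_deg:   longitude, -180 (left edge) to +180 (right edge)
    pitch_deg: latitude,  +90 (top row)    to -90  (bottom row)
    """
    x = (yaw_deg + 180.0) / 360.0 * width
    y = (90.0 - pitch_deg) / 180.0 * height
    return x, y

def horizontal_stretch(pitch_deg):
    """How much wider a feature is drawn at this latitude than at the equator.

    A circle of latitude has circumference proportional to cos(latitude),
    yet every row of the image has the same width, so the stretch is 1/cos.
    """
    return 1.0 / math.cos(math.radians(pitch_deg))

# Looking straight ahead lands in the middle of the image.
print(direction_to_pixel(0, 0, 3840, 1920))   # (1920.0, 960.0)

# The stretch grows without bound towards the poles ('Antarctica').
for latitude in (0, 30, 60, 80):
    print(latitude, round(horizontal_stretch(latitude), 2))
```

The second function is the reason both Antarctica and the hands under the camera look so wide: close to the bottom row, a very small circle on the sphere is spread across the full width of the image.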

When an equirectangular image is viewed through an HMD or a smartphone, the software displays only about 110° of the full 360° at a time, relying on the orientation sensors in the HMD or phone to decide which part of the image to present.
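
As a rough sketch of this selection, the snippet below computes which pixel columns of an equirectangular image fall inside the horizontal field of view for a given head yaw. It is deliberately simplified: real viewers reproject the sphere through a perspective camera rather than cropping columns, and the function name and the convention that yaw 0° points at the centre of the image are our own assumptions.

```python
def visible_columns(yaw_deg, fov_deg, width):
    """Return the column range(s) covered by the horizontal field of view.

    The image wraps around horizontally, so a view that crosses the
    left/right seam is returned as two ranges.
    """
    centre = ((yaw_deg + 180.0) % 360.0) / 360.0 * width
    half = fov_deg / 360.0 * width / 2.0
    left = (centre - half) % width
    right = (centre + half) % width
    if left <= right:
        return [(left, right)]
    return [(left, float(width)), (0.0, right)]

# Looking straight ahead with a 90° field of view on a 3840-pixel-wide image.
print(visible_columns(0, 90, 3840))     # [(1440.0, 2400.0)]

# Looking directly backwards: the view straddles the seam of the image.
print(visible_columns(180, 90, 3840))   # [(3360.0, 3840.0), (0.0, 480.0)]
```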

3D Images

Although regular 360° cameras (Gear 360, Ricoh Theta V) to a large extent cover the world as we see it in all its 360°, their images are still monoscopic. Essentially, this means that the same image is presented to each eye when viewed in an HMD, and this is not the way we ordinarily see reality. As our eyes are separated by roughly six centimetres, the visual feed from each eye varies slightly in its capture of reality. It is this that enables us to perceive the depth of the world, at least when our eyes are not fooled by illusions exploiting the effect, such as VR itself. We discuss this in more detail in our entry on the History of VR, which covers the invention of the stereoscope, but a short introduction is also given here. Essentially, 3D 360° cameras use the same principle as human vision to capture depth, by separating the lenses in a way similar to the human eyes. Such cameras are, however, more cumbersome and costly to produce: to capture stereoscopic images one needs to double the number of lenses, leading to a minimum of four, two for each eye, one for each 180° of capture.

Unlike the 4K 360° monoscopic cameras available rather cheaply on the consumer market (from $200 and up), stereoscopic cameras have not yet reached the market at very reasonable prices. There is hope, however, and I can personally recommend the Vuze+, a 360° 3D camera that delivers 4K resolution per eye and comes with well-designed accompanying stitching and editing software. At $1200 the price is still a bit stiff for most non-professional use, but it suggests that such cameras could soon become more affordable. We have used the Vuze+ in a research project at the University of Bergen, with good results. Its quality is comparable to that of a Ricoh Theta V, except that it delivers stereoscopic rather than monoscopic images.
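
The depth cue that stereoscopic cameras rely on, two viewpoints a small distance apart, can be illustrated with a little trigonometry. The sketch below computes the angle between the two lines of sight when both eyes (or lenses) converge on a point straight ahead. The 0.063 m baseline is an assumed typical eye separation, not a specification of any camera mentioned here, and the function name is our own.

```python
import math

def vergence_angle_deg(baseline_m, distance_m):
    """Angle between the two lines of sight converging on a point straight ahead."""
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * distance_m)))

EYE_SEPARATION_M = 0.063  # assumed typical value, for illustration only

# The angle shrinks quickly with distance, which is why stereoscopy
# contributes most to depth perception for nearby objects.
for distance in (0.5, 1.0, 5.0, 20.0):
    print(distance, round(vergence_angle_deg(EYE_SEPARATION_M, distance), 3))
```

A monoscopic 360° image corresponds to a baseline of zero: the angle is zero at every distance, so this particular depth cue is absent.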

Regarding Resolution

A resolution of 4K per eye sounds great, yet many are disappointed when they view recordings from a camera such as the Gear 360, Ricoh Theta or Vuze+. They may recall the image on their 4K TV as incredibly sharp, and yet their recorded videos appear somewhat blurry and pixelated. The reason is quite simple. The 360° images do indeed have a 4K resolution, but we are unable to view all the pixels at once, as they are stretched out over a sphere. To keep matters simple, let’s say that your Head-Mounted Display has a Field of View of 90° (although most have about 110°). In this case, just a quarter of the width of the 4K image is seen at any given time, so we have to divide the horizontal pixel count by four. This is somewhat simplified because of the stretching, but it should be enough to make the point. To get an effective resolution of 4K, or even something akin to the 3K that the HTC Vive Pro and Samsung Odyssey(+) can display, the cameras would need a far higher resolution.
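
The arithmetic can be spelled out in a few lines. This is a back-of-the-envelope sketch that looks at the horizontal direction only and ignores the stretching mentioned above; it assumes ‘4K’ means 3840 pixels across the full 360°, and the function names are our own.

```python
def pixels_in_view(image_width_px, fov_deg):
    """Horizontal pixels of a 360° image that fall inside the field of view."""
    return image_width_px * fov_deg / 360.0

def required_width(target_px_in_view, fov_deg):
    """Image width needed so that the field of view alone holds the target pixel count."""
    return target_px_in_view * 360.0 / fov_deg

# A '4K' 360° video seen through a 90° field of view.
print(pixels_in_view(3840, 90))    # 960.0

# Width needed for the visible part alone to be 3840 pixels wide.
print(required_width(3840, 90))    # 15360.0
```

In other words, under these assumptions the visible slice of a 4K 360° video is only about 960 pixels wide, and the recording would need to be roughly four times as wide for the slice itself to reach 4K.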

Another Step in Fidelity: Volumetric Video

At first thought, it may be hard to imagine how immersive 360° 3D recordings could gain more detail except by increasing the resolution. As we briefly noted, however, stereoscopy in 3D movies at the cinema or in 360° 3D recordings merely provides an illusion of depth, not actual depth. The same goes for our eyes: although they mostly perceive depth correctly, they are easily fooled. 360° 3D cameras are an example of this. Although it seems that there is depth, we cannot really move within the image, as there is no actual depth to it. Here volumetric video behaves differently and affords positional interaction. Volumetric video places the recorded imagery in a 3D (x, y, z) space, in addition to delivering stereoscopy so that we can perceive it. Unfortunately, volumetric video is very hard to create while retaining high quality, and plug-and-play solutions still seem far off. To get an idea of how volumetric video works, we recommend looking into the concepts of photogrammetry, and perhaps even creating a 3D model yourself from images captured with your smartphone. This YouTube tutorial shows you how to do this in Agisoft Photoscan Pro, which has a free trial available.

Limitations

Developed in an undergraduate course at the University of Bergen, the short 360° movie “Schizophrenia” experimented with interactive 360° video.

Despite these great innovations in the capture of reality, CGI has some benefits that neither 360° 3D nor volumetric video can really achieve. The most important of these is interactivity. As 360° videos are linear (that is, they play out along a predetermined path from beginning to end), the user cannot really affect what happens in the video, except by choosing which part of it to look at.

In our course in VR Journalism at the University of Bergen, where I taught students VR programming, 360° video and photogrammetry, we faced this exact limitation. A group working on an experience of the reality-shattering disorder of schizophrenia wanted hallucinations to occur when the user looked at certain areas. The students solved this by placing transparent GIFs, edited from the real footage, over the video in A-Frame, and attaching gaze event listeners that trigger playback of the GIF. The results were extraordinary, and could well offer a simple way of adding interaction on top of 360° videos. The experience, whose voices are in Norwegian, can be viewed here (a WebVR browser such as Chrome is necessary).
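
The idea behind such a gaze trigger can be sketched independently of any framework. The snippet below is not the students’ A-Frame code; it is a plain illustration, under our own assumptions, of the underlying test: check whether the viewing direction lies within a small angular radius of a hotspot, and only then start the overlay. All names and the example angles are hypothetical.

```python
import math

def direction(yaw_deg, pitch_deg):
    """Unit vector for a viewing direction given as yaw and pitch in degrees."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.cos(pitch) * math.sin(yaw),
            math.sin(pitch),
            math.cos(pitch) * math.cos(yaw))

def angle_between_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))

def gaze_hits(gaze, hotspot, radius_deg):
    """True if the gaze (yaw, pitch) falls within radius_deg of the hotspot (yaw, pitch)."""
    return angle_between_deg(direction(*gaze), direction(*hotspot)) <= radius_deg

# Hypothetical hotspot placed at yaw 30°, pitch 0°, with a 5° trigger radius.
hotspot = (30.0, 0.0)
print(gaze_hits((32.0, 1.0), hotspot, 5.0))   # True: close enough to trigger the overlay
print(gaze_hits((90.0, 0.0), hotspot, 5.0))   # False: looking elsewhere
```

In a framework such as A-Frame, a test like this is handled by the framework’s own gaze or cursor mechanism; the sketch only shows what such a mechanism has to decide.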

Odyssey+: an alternative route towards hi-res VR

Samsung Odyssey+: going beyond the nasty grid of the SDE, and into something nicer — apparently.

N.B: This blog entry is in Matrise’s category “Lights”, which holds more technical, often smaller posts, that concern actual and recent events. These entries stand out from other entries at Matrise, which are often more conceptual, ideal and philosophical.  You can read about Matrise here.


This week, Samsung announced its new Windows Mixed Reality (WMR) headset, the Samsung Odyssey+. Priced very reasonably at $500, like its predecessor the Samsung Odyssey, the Head-Mounted Display (HMD) is a very attractive option for those who value high resolution in HMDs (and who doesn’t? Greater fidelity of the virtual world is obviously desirable). The market has also shown its hunger for high-resolution displays, as the Kickstarter campaign for the Pimax 8K and 5K HMDs has demonstrated. While on the topic of hi-res displays, resolution enthusiasts should also check out StarVR, a high-resolution HMD with a 210-degree Field of View and integrated eye tracking for foveated rendering, which can be especially fruitful with such a wide FOV. Digression aside, the Odyssey+ is already on sale in the US, and in this entry we will discuss why it can be an alternative way to experience a higher resolution.

New Features

The Odyssey+ features the same high resolution of 1440 × 1600 per eye as its predecessor. For reference, this is the same resolution as found in the HTC Vive Pro, which costs far more (priced from $1098 to $1399 with two controllers and base stations). Unlike the Vive Pro, however, the Odyssey+ features inside-out tracking similar to that used in other WMR HMDs and in the upcoming Oculus Quest. None of this is news, however, as all of it could equally be said of the original Odyssey. The new feature, which makes this a particularly interesting HMD, is a technology Samsung calls ‘Anti-SDE’: a technology that seeks to eliminate the ‘Screen Door Effect’ experienced in most HMDs today.

An illustration by Samsung that attempts to show the difference between the Odyssey and the Odyssey+.

Screen Door Effect

The screen door effect occurs when a user is able to perceive the physical gaps between the pixels themselves. This is of course not ideal for realism, as it becomes apparent that what you are viewing is a screen. The new Odyssey+ features a technology that diffuses the light from each pixel into the gaps between pixels in order to eliminate the SDE. Samsung’s press release stated:

“Samsung Anti-SDE AMOLED Display solves SDE by applying a grid that diffuses light coming from each pixel and replicating the picture to areas around each pixel. This makes the spaces between pixels near impossible to see. In result, your eyes perceive the diffused light as part of the visual content, with a perceived PPI of 1,233PPI, double that of the already high 616PPI of the previous generation Samsung HMD Odyssey+ [sic].”

Road to VR reports that it suspects this is the same technology that PlayStation VR uses in its HMD. The PlayStation VR, with a resolution of only 1080p for both eyes combined, has surprisingly little SDE, which has made me prefer its display to that of the regular Vive or the Oculus, even though its tracking and computing power are vastly inferior. I’m therefore eager to see how this works on an HMD with far more pixels.

Conclusion

Technologies such as low pixel persistence modes, asynchronous timewarp and foveated rendering are all ingenious technologies that enable higher perceived refresh rates than our computers are actually capable of, and some of them are indispensable, especially for mobile VR. Anti-SDE technology may be yet another such technology, one that makes 16K displays or the like less necessary for VR to be perceived as very close to real human sight. That being said, although Samsung claims that its new HMD has a perceived PPI (pixels per inch) of 1233, it will naturally not offer the same sharpness or clarity as an actual 1233 PPI display would. The claimed doubling of ‘perceived PPI’ only replicates, or diffuses, the already-existing pixels. Still, the tech is very welcome, and HTC also has something to learn from the way Samsung prices its products. Customers may now find a whole lot more value for their money with Samsung, and this comes from someone who already owns a Vive Pro. For those considering a purchase, it should be noted that the tracking is not as good as on the HTC Vive (Pro), but depending on your needs it may be more than good enough.