RICOH THETA TECHNOLOGY

Bringing innovation and amazement to the imaging experience.
RICOH technology is about to pioneer uncharted territory.

Optical technology that makes a slim body possible

Ultra-compact, twin-lens folded optics technology achieves an amazingly slim body.

RICOH THETA spherical images are created by an innovative design that uses ultra-compact, ultra-wide-angle lenses at the front and rear of the camera. Each lens collects light over an angle of view that far exceeds the 180° of a typical fisheye lens, and two 90° prisms then fold that light onto image sensors positioned on the left and right.

Even with twin lenses, the THETA boasts a slim body a mere 17.9 mm thick (THETA V, excluding the lens section), thanks to the symmetrical arrangement of the folded fisheye optics.
The design is not only simple and stylish; the short distance between the two lenses also keeps parallax error*1 to a minimum.

*1 The difference in viewpoint between the two lenses.
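
The relationship between the lens field of view, the stitching overlap, and parallax can be sketched numerically. The figures below (a 200° field of view, an 18 mm lens separation, a 1 m subject distance) are illustrative assumptions, not RICOH specifications:

```python
# Sketch: how much two >180-degree lenses overlap, and how the inter-lens
# distance translates into parallax error. All numeric inputs here are
# illustrative assumptions, not RICOH specifications.
import math

def overlap_band_deg(fov_deg: float) -> float:
    """Total angular width of the zone seen by BOTH lenses, in degrees."""
    return fov_deg - 180.0

def parallax_angle_deg(baseline_m: float, distance_m: float) -> float:
    """Angle subtended at the subject by the two viewpoints (small-angle)."""
    return math.degrees(baseline_m / distance_m)

print(overlap_band_deg(200.0))                    # -> 20.0 (degrees of overlap)
print(round(parallax_angle_deg(0.018, 1.0), 2))   # -> 1.03 (degrees of parallax)
```

The shorter the baseline between the lenses, the smaller this parallax angle, which is why the folded-optics layout helps stitching.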

The ultra-compact, ultra-wide-angle lenses are designed and tuned specifically for spherical imaging.

Stitching two images with minimal parallax difference into a completely natural-looking result requires raising the resolution of the entire image to a uniform level. To achieve this, a RICOH-original projection method was adopted. The lens coating covers not only the visible light spectrum but also the infrared spectrum, which typical camera lens coatings do not. Fisheye lenses are often difficult to tune because the wavelength characteristics and angle dependency differ between the center and the periphery. Despite this difficulty, we have succeeded in achieving uniform coloring from center to edge, making it possible to obtain beautiful, natural image quality.

Image processing technology that produces 360º images

Kiyotaka Kitajima

Process of creating spherical images with no visible stitch

1. Image processing

First, basic image processing is applied to the image data from the two image sensors. Then, in addition to the processing carried out on general digital cameras, the images are adjusted to obtain matching brightness and coloring from the two sensors. Specifically, the individual sensitivity variation between the two image sensors is corrected, and exposure compensation is applied to each sensor based on a comprehensive decision derived from the brightness detected in the data of both images.
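
The brightness-matching step above can be sketched as follows. This is a minimal sketch assuming each sensor's image is a simple list of luminance values and that the "comprehensive decision" is an average of the two sensors' mean brightness; real processing works on raw sensor data with a more sophisticated decision:

```python
# Sketch of matching exposure between two sensors: pick one common target
# brightness from BOTH images, then apply a per-sensor gain toward it.
# The gain model and averaging decision are illustrative assumptions.
def mean(vals):
    return sum(vals) / len(vals)

def match_exposure(img_a, img_b):
    """Scale both images toward a common target brightness."""
    target = (mean(img_a) + mean(img_b)) / 2.0   # decision from both sensors
    gain_a = target / mean(img_a)                # per-sensor compensation
    gain_b = target / mean(img_b)
    return ([p * gain_a for p in img_a], [p * gain_b for p in img_b])

a, b = match_exposure([100, 120, 110], [80, 90, 85])
print(round(mean(a), 1), round(mean(b), 1))  # -> 97.5 97.5
```

After correction, both halves of the scene share the same average brightness, so the seam between them does not show an exposure jump.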

2. Image stitching

Next, the two images are stitched. Pattern matching calculates the offset between the reference image and the comparison image in each area to detect the stitching position. Then, using the detected stitching position and the characteristics of each lens system, the two images are converted to a spherical image format. Blending the two converted images produces the final, single spherical image. Because the stitching position detected by pattern matching is fed into the conversion parameters for the spherical image format, the stitching process is dynamic and the two images can be stitched in real time.
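
The offset-detection step can be illustrated with a minimal pattern-matching sketch: slide a patch from the comparison image across the reference image and pick the shift with the smallest sum of absolute differences (SAD). One-dimensional signals stand in for image rows here; actual stitching searches two-dimensional areas around the seam, and the SAD cost is one common choice among several:

```python
# Minimal 1-D pattern matching by sum of absolute differences (SAD).
def best_offset(reference, patch, max_shift):
    """Return the shift (in samples) that best aligns patch with reference."""
    best, best_cost = 0, float("inf")
    for shift in range(max_shift + 1):
        window = reference[shift:shift + len(patch)]
        if len(window) < len(patch):
            break
        cost = sum(abs(r - p) for r, p in zip(window, patch))
        if cost < best_cost:
            best, best_cost = shift, cost
    return best

ref = [0, 0, 5, 9, 5, 0, 0, 0]
patch = [5, 9, 5]                  # this pattern sits at index 2 of ref
print(best_offset(ref, patch, 5))  # -> 2
```

The detected offset is what feeds the conversion parameters, so the seam position can be updated frame by frame.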

Spherical image

In a spherical image, the Mercator projection is used to assign coordinates to each pixel position on a spherical surface: just as the Earth's surface can be laid out in two dimensions with latitude and longitude as the axes, the sphere is unrolled into a flat image. With the dedicated app, users can change the point of view by dragging a finger up, down, left, or right on the image, and can zoom in and out to view the entire spherical image. The app maps the spherical image as a texture onto a sphere object; by specifying a viewing direction and angle of view, the image is displayed as if attached to the sphere.
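
The texture-mapping idea can be sketched by converting a pixel position in the flat image into a point on a unit sphere, reading the two axes as longitude and latitude. This is a simple equidistant reading of the mapping described above; the image dimensions and axis conventions below are illustrative assumptions:

```python
# Sketch: map pixel (u, v) of the flat spherical-format image to a 3-D
# point on the unit sphere, treating u as longitude and v as latitude.
import math

def pixel_to_sphere(u, v, width, height):
    lon = (u / width) * 2.0 * math.pi - math.pi    # -pi .. +pi
    lat = math.pi / 2.0 - (v / height) * math.pi   # +pi/2 .. -pi/2
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return (x, y, z)

# The image center looks straight ahead along +x:
print([round(c, 3) for c in pixel_to_sphere(1920, 960, 3840, 1920)])
# -> [1.0, 0.0, 0.0]
```

The viewer then renders whatever part of the sphere lies inside the specified direction and angle of view.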

360° spatial audio with linked video and audio

360° spatial audio links video and audio when content is played back on a VR viewer with headphones.
The RICOH THETA V uses two technologies to recreate changes in the acoustic field according to the direction the viewer is looking. This delivers a more realistic, immersive VR experience.

1. THETA V spatial audio recording - Ambisonics

Ambisonics is a three-dimensional audio technology that records spatial audio in 360° and recreates natural directionality in a recording/playback format. It converts audio sources into four signals: the base signal (W), the front/back signal (X), the left/right signal (Y), and the up/down signal (Z), which together recreate a directional sound field. The RICOH THETA V features multiple built-in omnidirectional microphones whose recorded audio is combined to compose the four WXYZ signals.
The TA-1 is an Ambisonics-compatible microphone that converts recorded audio sources into the four WXYZ signals.
Because this method makes it possible to rotate the entire sound field after recording, the audio tracks the front/back, left/right, and up/down movement of the image even when the viewpoint of the 360° image changes, making it possible to record and play back audio that is very close to the actual scene.

2. THETA V spatial audio playback - HRTF (Head-related transfer function)

People identify the direction a sound arrives from by the difference in volume between the two ears and the time lag until the sound reaches each ear. By modeling these two differences, it is possible to make an audio source appear to move in any direction during playback.
An HRTF mathematically expresses these difference characteristics as a filter. When the filter is applied to the audio recorded with the three-dimensional method above, the sound source can be moved front/back, left/right, and up/down just as the 360° image tracks the movement of the user's head.
The RICOH THETA V uses this technology: convolving the HRTF filter with the recorded audio makes it possible to perceive a sense of direction and distance similar to the real scene.
* Because HRTF data varies with the reverberation of an individual's head, body, and ear shape, the same sensation may not be reproduced for everyone.
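
The two cues an HRTF encodes can be illustrated with a toy model: the interaural time difference (ITD) and the interaural level difference (ILD). A real HRTF is a measured filter convolved with the audio; the head radius, speed of sound, and simple level rule below are illustrative assumptions (the ITD formula is a Woodworth-style approximation):

```python
# Toy model of the two interaural cues for a source at a given azimuth.
import math

SPEED_OF_SOUND = 343.0  # m/s
HEAD_RADIUS = 0.0875    # m, a commonly used average (assumption)

def interaural_cues(azimuth_deg):
    """Return (time difference in ms, (left gain, right gain))."""
    az = math.radians(azimuth_deg)
    itd_s = (HEAD_RADIUS / SPEED_OF_SOUND) * (az + math.sin(az))
    # Simple ILD rule: a source to the right is louder in the right ear.
    gain_right = 0.5 * (1.0 + math.sin(az))
    gain_left = 1.0 - gain_right
    return itd_s * 1000.0, (gain_left, gain_right)

itd_ms, gains = interaural_cues(90.0)  # source directly to the right
print(round(itd_ms, 3), gains)
```

For a source directly to the right, the model yields roughly a 0.66 ms arrival-time lead and full level in the right ear, which is the kind of direction information the convolved HRTF filter restores during playback.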

Built on these two technologies, a playback environment compatible with the RICOH THETA V spatial audio file format can be paired with an HMD, or with a VR viewer plus headphones and head tracking, to link the 360° images viewers are seeing with 360° audio, delivering a you-are-there feeling and a realistic visual and audio VR experience.