Vol.4
RICOH THETA Developer Interview
4K spherical videos,
Ask the developer.
“Users can fully enjoy the benefits of 4K videos. That is what the 4K spherical video means.”
First, we’d like to ask about the difference between regular 4K video and 4K spherical video.
Matsumoto: With regular 4K video, the entire screen is shown at 4K resolution (3840 × 2160 pixels). With 4K spherical (360-degree) video, on the other hand, only a cropped part of the 4K frame is displayed, so resolution matters more than it does for regular 4K video.
To illustrate why resolution matters: when a 2K video is used as a 2K spherical video, the area actually in view has a resolution of only about 960 × 540 pixels. When a 4K video is used as a 4K spherical video, the area in view rises to roughly Full HD resolution (1920 × 1080 pixels).
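The figures above can be reproduced with a little arithmetic. The following is a minimal sketch, not RICOH's own calculation. It assumes equirectangular frames of 1920 × 960 pixels (2K) and 3840 × 1920 pixels (4K), and a hypothetical viewing angle of 180 × 101.25 degrees chosen only because it matches the numbers quoted in the interview.

```python
# Sketch: how many source pixels fall inside the visible part of a spherical video.
# Assumptions: equirectangular frames; the field of view is hypothetical and
# chosen so that the result matches the figures quoted in the interview.

def pixels_in_view(frame_w, frame_h, fov_h_deg, fov_v_deg):
    """Approximate pixel size of the visible region of an equirectangular frame."""
    px_per_deg_h = frame_w / 360.0  # the frame width spans 360 degrees
    px_per_deg_v = frame_h / 180.0  # the frame height spans 180 degrees
    return round(px_per_deg_h * fov_h_deg), round(px_per_deg_v * fov_v_deg)

FOV_H, FOV_V = 180.0, 101.25  # hypothetical viewing angles, not a THETA specification

print(pixels_in_view(1920, 960, FOV_H, FOV_V))   # 2K spherical: (960, 540)
print(pixels_in_view(3840, 1920, FOV_H, FOV_V))  # 4K spherical: (1920, 1080)
```

A narrower viewing angle would leave even fewer pixels in view, which is why the source resolution matters so much for spherical video.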
I think regular 4K video is, perhaps surprisingly, somewhat over-specified for general users, in that its resolution is higher than they need. By contrast, I believe 4K spherical video lets users enjoy the benefits of 4K to the fullest.
THETA V records 360-degree video in 4K resolution. Why was this feature adopted now?
Shohara: Previous THETA models recorded video at 2K resolution (1920 × 960 pixels). Moving from there to 4K is a natural progression. When we released THETA S in 2015, we also received requests from users asking why 4K was not supported.
At the time, however, considering how few playback environments were available, we judged that it was too early to support 4K. Many THETA users prefer to connect the camera to a smartphone to view images, but smartphones themselves did not yet offer sufficient support for 4K playback.
Matsumoto: The specifications needed to play back 4K video lagged behind. They seem to have become widespread only recently.
“The first impression was just amazing, and amazing was the only word I could say.”
What difficulties did you face in implementing 4K spherical video recording?
Matsumoto: The specifications of both the image sensors (imaging elements) and the processors that handle captured video and images have improved this time. Even so, the current processing speed is barely enough to handle 4K video. For this reason, putting too much emphasis on improved image processing could result in jerky motion when a video is played back.
THETA V achieves 4K at 29.97 fps (fps is the number of frames recorded per second). We paid close attention to balancing the processing so that image quality improves while the frame rate needed for smooth playback is maintained.
Still images and videos are processed differently inside the device. We designed THETA V so that high-speed processing is possible specifically for 4K video. With previous THETA models we emphasized clear still images; from now on, we hope users will also be satisfied with THETA as a video camera.
We assume the hardware changes marked a turning point.
Shohara: Because our original image processing runs mainly on the GPU (Graphics Processing Unit), we considered GPU performance essential. Since we had to process 30 large 4K frames every second, we also needed to improve our GPU programs. In addition, we looked at various ways to make signal processing efficient across the entire system, reviewing memory transfers, processes, and threads.
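To give a sense of the scale of that task, the per-frame time budget and pixel throughput can be estimated as below. This is only a rough sketch: the 3840 × 1920 frame size is an assumption for the example, the frame rate is the 29.97 fps quoted in the interview, and the function names are illustrative.

```python
# Sketch: time and pixel budget for real-time 4K processing.
# Assumptions: 3840 x 1920 frames (assumed frame size for this example)
# and the 29.97 fps frame rate quoted in the interview.

FRAME_W, FRAME_H = 3840, 1920
FPS = 29.97

def frame_budget_ms(fps):
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps

def pixel_rate(width, height, fps):
    """Number of pixels that must be processed per second."""
    return width * height * fps

print(f"{frame_budget_ms(FPS):.1f} ms per frame")
print(f"{pixel_rate(FRAME_W, FRAME_H, FPS) / 1e6:.0f} million pixels per second")
```

Every stage of the pipeline has to fit inside that per-frame budget, which is why memory transfers and threading matter as much as raw GPU speed.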
Based on this experience, do you intend to achieve 60 frames per second in the future so that even smoother video can be produced?
Matsumoto: Yes, I do, though I know the processing will become even harder (laughs).
Shohara: I also want to improve resolution. I hope to achieve 8K video.
What was your first impression when you saw a 4K video recorded on an actual THETA V during development?
Matsumoto: We were very excited. After all, the videos we had to compare the 4K footage with were those recorded on previous models. We were so impressed that it felt as if the picture quality of a CRT (Braun tube) television had suddenly changed to that of terrestrial digital broadcasting.
Shohara: Because 4K resolution reproduces fine detail, I felt as if still images were moving. My impression was that the realism of the images was on a completely different level.
Matsumoto: It really was more than we had imagined, and “Amazing!” was all our colleagues could say at first. Since then we have kept tuning the image quality, and I now feel we have created images with even greater impact.
“We have succeeded in supporting the live streaming of 4K videos for the first time in the world.”
We understand that the live streaming function now supports 4K as well.
Shohara: This time we added support for 4K UVC (USB Video Class, a standard for webcams and other video devices that connect over USB). Specifically, we support UVC 1.5, because a video compression codec (H.264) is used. Previous models were also compatible with UVC 1.5, but only up to Full HD, so supporting 4K UVC was a first for us. In fact, once development started, our team was at a loss: there were no 4K UVC 1.5 devices we could use as a reference. That was when we realized that “THETA V might be the first device that enables 4K UVC 1.5 output” (laughs).
Shohara: As a result, various problems arose. Standards and software that worked without trouble at Full HD or below produced incompatibilities and malfunctions above Full HD, and we had to deal with them. I think it was April of this year (2017) that Windows® and Mac drivers started to support 4K UVC 1.5. Because the development of THETA V was ahead of that trend, what happened was....
Matsumoto: When UVC-based live streaming did not work properly, we simply could not tell whether the fault lay with our THETA or with the Windows® computer receiving the data (laughs).
Shohara: In the end, we carried out rigorous verification while developing the computer drivers, right up until THETA V was released.
As a 4K camera capable of live streaming, THETA V is likely to become a reference product for the future, isn’t it?
Shohara: Going forward, I hope we will continue to be “the first to provide what can be done”.
A long-standing challenge for computers handling spherical video was the processing load of stitching fish-eye images into a 360-degree spherical image. For THETA S and earlier models, software embedded in the driver stitched the captured images on the computer. Because 4K spherical video is data-heavy, the load would be enormous if the computer alone performed this processing. We therefore changed the design so that THETA V itself does this job, reducing the load on the computer.
Matsumoto: In terms of the numbers, 4K is twice as large as 2K in each dimension, but the amount of data in a 4K video is four times that of a 2K video. If a computer had to process all of that data, the load would be considerable.
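The step from "twice" to "four times" follows from multiplying width by height. A minimal check, assuming frame sizes of 1920 × 960 pixels for 2K and 3840 × 1920 pixels for 4K, and counting uncompressed pixels only:

```python
# Sketch: doubling both dimensions quadruples the pixel count.
# Assumption: frame sizes of 1920 x 960 (2K) and 3840 x 1920 (4K).

def pixel_count(width, height):
    """Total number of pixels in one frame."""
    return width * height

two_k = pixel_count(1920, 960)
four_k = pixel_count(3840, 1920)

print(3840 / 1920)     # ratio of widths: 2.0
print(four_k / two_k)  # ratio of pixel counts: 4.0
```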
Shohara: That said, the requirements are still fairly high if you actually live stream at 4K 29.97 fps. For instance, a high-spec computer is needed. Even so, we cannot move forward unless a device like THETA V exists. I think we have provided material for new challenges.
“For further evolution, there is no finish line in the pursuit of image quality.”
What was the key to leading the development team and managing the development of 4K and other image processing?
Shohara: This time we adopted an Android™-based system, a first for RICOH camera production. To create a robust product, we formed a development team of experienced specialists. RICOH has long-standing expertise in image quality design and image processing, cultivated through “PENTAX”, “GR”, and other SLR and conventional cameras. We designed the system so that we could draw on that expertise. Although Android™ is the base, the contents of the system were modified exclusively for THETA.
Matsumoto: As this was our first attempt, we also ran into challenges. Although the parts produced by each site were excellent, assembling them into a complete system sometimes failed to produce the expected result. We made adjustments over and over to improve the final image quality.
How did you overcome those challenges?
Matsumoto: We did test shooting over and over. We often traced problems back from final images in which we noticed something strange. Having been in charge of image quality for a long time, I first get an intuitive sense of a problem and then dig into it.
There is no finish line in the pursuit of image quality. The more involved you are, the more eager you become, because the level of the issues you point out rises as the output from each site improves step by step.... If there were no deadline, namely the product's release date, I would want to keep at it endlessly (laughs). Image quality is what I was particular about to the very end.
Shohara: We were lucky that Mr. Matsumoto was so careful about details. He worked on image quality design, constantly comparing THETA V's image quality with that of THETA S. He kept adjusting the quality until the very end to improve THETA V's image quality even a little. Without Mr. Matsumoto, we could not have created a product that surpasses THETA S.
Matsumoto: That is an exaggeration, isn't it? (laughs)
“We were often told that THETA is not a camera.”
Mr. Shohara, having just spoken about Mr. Matsumoto, what development have you yourself been engaged in this time other than 4K video?
Shohara: I have been involved in THETA V since the project began, starting from what kind of product to make, with what technology, and with which members on the development team. I have worked on a broad range of activities, from planning through to development and design.
In this model, we have incorporated new features such as 4K video, 360° Spatial Audio, and Remote Playback. To tell the truth, I have been proposing these ideas since we were developing the prototype of the first THETA. Some of them have been adopted this time. There is still a list of unrealized ideas I have been thinking about since then.
Matsumoto: Mr. Shohara often says "THETA is not a camera". He means it in a good sense, of course.
Shohara: If we start from the viewpoint of “creating a camera”, we get hung up on the “parts that are unlike those of conventional cameras”. We end up building a wall ourselves and cannot move forward. THETA is a product that goes beyond existing preconceptions. We aim to create a product that gives users new experiences, and THETA V has taken shape as an accumulation of those experiences.
First of all, we want THETA V to be used as an ordinary spherical camera. In addition, to provide new experiences such as Remote Playback, we found Android™ to be an ideal system.
THETA's world is widening further thanks to THETA V's proprietary operating system based on Android™.
Matsumoto: From a THETA V developer's point of view, basically anything Android™ can do can also be done on THETA V. Many members of the THETA V development team had never used Android™, and they say "THETA V is more of a smartphone than we imagined". Before the main unit existed, we used a large smartphone to work out the software applications, running them on it. The applications we developed that way are now installed on THETA V.
Although THETA V has no display, there is a familiar screen inside the device that users cannot see. In the early stages of development, we could even watch YouTube using an application normally used on a smartphone.
“360-degree Spatial Audio is a good match with 4K spherical videos.”
How will visual expression change when 4K spherical video is combined with 360-degree Spatial Audio?
Shohara: With a stronger sense of presence, viewers will feel more than ever as if they were actually on the spot. To achieve sound localization in spherical images on THETA V, we adopted an Ambisonics-based spatial audio recording system. Thanks to this system, the direction a sound comes from can be identified. Most users look at only part of a spherical image at a time, so the region in view is limited, yet we still hear sounds from regions we cannot see. And when we turn toward the direction a sound comes from, we see a high-resolution image of the object making that sound. That is why I think 360-degree Spatial Audio is a good match for 4K spherical video.
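As a general illustration of how Ambisonics carries direction, the sketch below encodes a mono sample into the four first-order channels (W, X, Y, Z) and recovers the horizontal direction from them. It uses the textbook B-format equations with a 1/√2 gain on W. The interview does not state which channel convention THETA V uses, so this is an assumption, and the function names are illustrative.

```python
import math

# Sketch: first-order Ambisonics encoding of a single mono sample.
# Assumption: traditional B-format convention (W scaled by 1/sqrt(2));
# other conventions scale and order the channels differently.

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Return (W, X, Y, Z) for a source at the given direction."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back component
    y = sample * math.sin(az) * math.cos(el)  # left-right component
    z = sample * math.sin(el)                 # up-down component
    return w, x, y, z

def estimate_azimuth_deg(w, x, y):
    """Estimate the horizontal direction of a single source from W, X, Y."""
    # Multiplying by W removes the sign of the sample itself.
    return math.degrees(math.atan2(w * y, w * x))

w, x, y, z = encode_first_order(0.8, azimuth_deg=90.0, elevation_deg=0.0)
print(round(estimate_azimuth_deg(w, x, y), 1))  # direction of the source, in degrees
```

Because direction is stored in the signal itself, a player can rotate the sound field to follow wherever the viewer turns in the spherical video.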
The resolution of THETA V is equivalent to that of THETA S, at 14M. Has the image quality changed?
Matsumoto: THETA V's ISP (Image Signal Processor) is more advanced than that of THETA S, which benefits image quality. After achieving 4K for video, we also revisited the sense of resolution, that is, how sharp a still image looks. As for video, a Full HD output mode is available, and in that same Full HD mode the sense of resolution is greatly improved on THETA V compared with THETA S.
Although the sensor in THETA V has the same pixel count as the sensor in THETA S, the interface connecting the sensor to the rest of the system has been improved to enable high-speed data transfer. This transfer speed is what makes 4K output possible.
These features were made possible by the performance of the Snapdragon main processor. In addition, by applying the image processing expertise we have built up through SLR camera production, we improved THETA V's basic capabilities as a camera.
“Particular about the image quality of still images, we have also enabled video filming in dark places.”
We have more questions about image quality. An ISO 3200 high-sensitivity mode is now available for still images, and ISO 6400 can be specified for videos, can't it?
Matsumoto: Comparing still images and video, you would not normally offer higher sensitivity for video, because there is no time to spare for image processing. Only lighter processing can be applied to video than to still images. With still images, moreover, you can shoot at low sensitivity with reduced noise in dark places by lengthening the exposure time. Naturally, that cannot be done for THETA V's video because of the 29.97 fps constraint.
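The frame-rate constraint can be put into numbers. In the sketch below, the 1-second still exposure is a hypothetical value chosen only for comparison, and the function names are illustrative. The point is that a video frame can never be exposed for longer than one frame interval.

```python
import math

# Sketch: why video cannot use long exposures in dark places.
# Assumption: the 1-second still exposure is a hypothetical example value.

def max_exposure_s(fps):
    """Longest possible exposure per video frame, in seconds."""
    return 1.0 / fps

def stops_between(long_exposure_s, short_exposure_s):
    """Difference in gathered light, in stops (one stop is a factor of two)."""
    return math.log2(long_exposure_s / short_exposure_s)

video_limit = max_exposure_s(29.97)
still_exposure = 1.0  # hypothetical long exposure for a still image

print(f"{video_limit * 1000:.1f} ms maximum exposure per video frame")
print(f"{stops_between(still_exposure, video_limit):.1f} stops less light than the still")
```

Raising the sensitivity from ISO 3200 to ISO 6400 recovers one stop (a factor of two), which is why higher ISO is the remaining lever for video in dark places.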
But we decided to “manage it somehow”. It is rather unreasonable... (laughs).
Shohara: It must be a topic Mr. Matsumoto does not like, given how much he cares about image quality (laughs).
Have you received many requests from users for “shooting images in dark places”?
Matsumoto: Yes, we have, and we decided to work hard on it. We set ISO 3200 as the upper limit for still images because we want to maintain a certain level of image quality. Allowing ISO 6400 for video is not about image quality; it is the result of prioritizing the benefit of “seeing what we could not see until now”.
So for video, you put the emphasis on “being recorded and visible no matter what”.
Shohara: Even video recorded in a considerably dark place looks like fairly normal video.
Matsumoto: Recording on a street at night with scattered street lamps poses no problem at all. Recording is reasonably possible even in a bedroom lit only by a small night-light bulb. Of course, the resulting video will not be very bright.
Shohara: Video recorded this way may be closer to how the scene naturally looks.
Looking at the specifications, we find H.265 is also supported.
Shohara: H.265 (HEVC) is a video compression standard that was created to support resolutions up to 8K. With it, we can achieve roughly twice the compression ratio without reducing image quality, halving the amount of data.
However, this standard has only just started to spread among general users, and only a few devices can play back content compressed with H.265. Even so, we made THETA V compatible with H.265 in the expectation that it will become mainstream. If the amount of data is halved, less storage capacity is consumed, so using H.265 brings significant benefits for users.
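The storage benefit is easy to estimate. In this rough sketch the bitrate and storage size are hypothetical example values, not THETA V specifications. The only figure taken from the interview is that H.265 roughly halves the data for the same quality.

```python
# Sketch: effect of halving the bitrate on file size and recording time.
# Assumptions: the bitrate and storage size are hypothetical example values;
# audio and container overhead are ignored.

def file_size_mb(bitrate_mbps, duration_s):
    """Approximate video size in megabytes (8 megabits per megabyte)."""
    return bitrate_mbps * duration_s / 8.0

def recording_minutes(storage_gb, bitrate_mbps):
    """Approximate recording time that fits in the given storage."""
    storage_megabits = storage_gb * 1000.0 * 8.0
    return storage_megabits / bitrate_mbps / 60.0

H264_MBPS = 56.0              # hypothetical H.264 bitrate
H265_MBPS = H264_MBPS / 2.0   # "about double" the compression, per the interview
STORAGE_GB = 16.0             # hypothetical free storage

for name, rate in (("H.264", H264_MBPS), ("H.265", H265_MBPS)):
    size = file_size_mb(rate, 60)
    minutes = recording_minutes(STORAGE_GB, rate)
    print(f"{name}: {size:.0f} MB per minute, about {minutes:.0f} minutes in {STORAGE_GB:.0f} GB")
```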
“THETA is a selfie maker, and it can be a tool to store our daily lives entirely.”
Finally, please tell users how you recommend using THETA V.
Shohara: THETA V supports BLE (Bluetooth® Low Energy). With it, location information obtained from a smartphone's GPS can be embedded automatically in image data, even while the smartphone stays in your pocket. Because BLE uses little power, battery consumption is reduced as well.
With THETA S, location information was already included in image data when shooting was triggered from a smartphone, but we wanted it to be included just by pressing the button on the camera itself. If an image was taken somewhere we visit often, we know where it was taken; for a place we visited for the first time, this feature helps.
Matsumoto: This is the opinion of someone who has used conventional cameras for a long time. One of THETA's advantages is “taking a photograph of the subject together with the photographer”. Unless you deliberately hide while shooting, you are always captured along with the subject when you use THETA normally.
I have taken many photographs in my private life too, but the only ones I appear in are group photographs, and in some cases I am not even in those, because I was the one taking them (laughs). I feel glad when I look at a photograph later and find myself in it.
Shohara: THETA's advantage is that it is easy to use. I hope users will record as much of their daily lives as possible. If you do, there will surely be something you notice later, such as “so that was happening” or “we had furniture like this in those days”. I think it is very meaningful that things the photographer never intended to shoot are recorded as well.
Profile
Kazuhiro Matsumoto
Ricoh Company, Ltd.
Smart Vision Business Group,
Product Development Center,
Device Development Department
Joined Ricoh Imaging Company, Ltd. in 2002.
At Ricoh Imaging Company, he is in charge of image quality design for DSLR and compact digital cameras.
Makoto Shohara
Ricoh Company, Ltd.
Smart Vision Business Group,
Product Development Center,
Device Development Department
Joined Ricoh Company, Ltd. in 2011.
Has been working on spherical cameras since joining Ricoh. Ph.D. in Information Science.
*The content of this interview is based on information as of Sep. 15th, 2017.