Front-line Researchers

Vol. 22, No. 3, pp. 6–11, Mar. 2024. https://doi.org/10.53829/ntr202403fr1

Estimating the True Brightness of Each Pixel by Using Noise in a Video Image

Seishi Takamura
Visiting Senior Distinguished Researcher,
NTT Computer and Data Science Laboratories

Abstract

The key to decoding a coded video in such a manner that the decoded video is as clear as possible is to remove noise contained in the original video, which also improves coding efficiency. Many researchers studying imaging have investigated this noise-removal problem; as a result, it is now said that random noise can be reduced to the limit at which only shot noise, a type of noise resulting from the random arrival of photons, remains. Although noise is generally an undesirable phenomenon, research that exploits it to achieve various effects is attracting attention. Seishi Takamura, visiting senior distinguished researcher at NTT Computer and Data Science Laboratories, developed a technology that uses fluctuations (noise) in the brightness of light to estimate brightness beyond the upper limit of digital values. We asked him about this technology and his stance as a researcher: “question everything,” “don’t be content,” “never say no,” and “let it lie.”

Keywords: image/video coding, noise, 3D-point-cloud image, limit-break sensing


To simultaneously achieve two contradictory goals concerning image/video coding: high reproducibility and improved compression ratio

—Could you tell us about the research you are currently conducting?

I am mainly researching three technologies. The first is “omni-ambient data-organizing technology,” which enables data to be stored and distributed without being discarded and to be compressed 100 to 1000 times more than is possible with general technologies while maintaining higher quality. The second, “real-entity mining technology,” removes disturbances such as noise, distortion, defocus, and missing information from captured images, infers the true appearance of the subject, and encodes the images on the basis of that information. I talked about these two technologies in the previous interview (April 2021 issue). The third technology uses fluctuations (noise) in the brightness of light to estimate brightness beyond the upper limit of digital values and to accurately estimate the brightness of dark areas. Since April 2022, I have also been a member of a university laboratory. I research the omni-ambient data-organizing and real-entity mining technologies both at NTT and with students at the university, and I research the technology for estimating brightness beyond the upper limit of digital values at the university.

Let me talk about my ongoing research at the university on a method for automatically acquiring rules hidden in natural pattern images and compressing those images at ultra-high ratios. With this method, a pattern similar to an original pattern in nature, such as the pattern on a shell, can be reproduced in either of two ways: the pattern is converted into a mathematical expression and a manually created algorithm (evolutionary computation engine) is used to automatically generate the pattern (Turing model), or an algorithm is automatically created and used to automatically generate the pattern (fractal model) (Fig. 1). For example, when a fractal model is used to reproduce a photograph of a fern, the learned perceptual image patch similarity (LPIPS), a scale close to human perception, shows an accuracy of 0.7173 (1.0 is a perfect match) achieved in 33 bytes. A JPEG* image of the same LPIPS level would require 1580 bytes, or roughly 48 times as many, which indicates that this method achieves very high compression. I believe that by applying this method, it may be possible to simulate to some extent what is happening inside living organisms.


Fig. 1. Automatic acquisition of rules hidden in natural pattern images and ultra-high compression of those images.

As I mentioned in the previous interview, by adding one frame of image to the original video and compressing it, it is possible to achieve the same level of video quality with a higher compression rate. I also confirmed that even-higher compression rates can be achieved by using an infrared image for the additional frame.

With Video-based Point Cloud Compression (V-PCC), which is the international standard for three-dimensional (3D)-point-cloud coding, 3D-point-cloud information is decomposed into 2D images for compression. As shown in Fig. 2, the black areas in the 2D images are unnecessary areas that are not used for final display, although they are compressed and transmitted. Since the unnecessary areas can be filled in arbitrarily (so-called “padding”), reducing the difference, in color for example, between a padded area and the adjacent area makes it possible to increase compression efficiency. I am investigating this padding method and aiming to use it for actual 3D-point-cloud encoding. I confirmed the effect on all four point clouds that I tested.
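To make the idea of padding concrete, the following is a minimal sketch in Python. It is not the padding method under investigation, nor the one used in the V-PCC reference software; it is a generic illustration that assumes a grayscale image stored as a list of rows and a hypothetical occupancy mask marking which pixels are actually used. Unused pixels are filled, one layer at a time, with the mean of their already-filled neighbors, so that padded areas differ little from the adjacent used areas.

```python
def pad_unoccupied(image, occupied):
    """Fill unused pixels with the mean of already-filled 4-neighbours.

    image:    list of rows of pixel values (grayscale, illustrative only)
    occupied: list of rows of booleans, True where the pixel is really used
    Returns a new image; the inputs are not modified.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    filled = [row[:] for row in occupied]
    while True:
        updates = []
        for y in range(h):
            for x in range(w):
                if filled[y][x]:
                    continue
                neighbours = [
                    out[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w and filled[ny][nx]
                ]
                if neighbours:
                    updates.append((y, x, sum(neighbours) / len(neighbours)))
        if not updates:
            break  # everything reachable has been filled
        # Apply one whole layer at once so the fill grows outward evenly.
        for y, x, value in updates:
            out[y][x] = value
            filled[y][x] = True
    return out


# The unused middle pixel takes the mean of its two used neighbours.
print(pad_unoccupied([[10, 0, 30]], [[True, False, True]]))  # [[10, 20.0, 30]]
```

A smooth fill of this kind avoids the sharp edges that a black background would create at the boundary of the used area, and sharp edges are generally costly for a video codec to encode.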


Fig. 2. Converting a 3D-point-cloud image to 2D images.

* JPEG: Standards for still image coding developed by the Joint Photographic Experts Group, a working group of International Organization for Standardization/International Electrotechnical Commission (ISO/IEC).

—Could you explain the technology for estimating brightness beyond the upper limit of digital values by using fluctuations in the brightness of light?

An image always contains noise and distortion: shot noise caused by the random arrival of photons; random noise, such as thermal noise derived from irregular thermal vibrations of free electrons in a conductor; quantization distortion from the digitization process; and clipping distortion, in which the top of the output-signal waveform is distorted because the input signal is larger than the allowed level. Images contain much more noise than audio: images have a signal-to-noise ratio (SNR) of around 40 dB, whereas audio signals have an SNR of around 120 dB, which is about three times higher in decibel terms (a lower SNR indicates greater noise). Video that contains a large amount of noise is coded with that noise, just as it was captured. Therefore, to obtain a “clear” image, it is necessary to remove the noise, and since removing noise also improves coding efficiency, noise removal has been extensively investigated.
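Because decibels are logarithmic, the gap between 40 dB and 120 dB is far larger than the figures suggest at first glance. The short Python sketch below converts the two SNR values to linear ratios. It assumes the amplitude convention (SNR in dB = 20 log10 of the signal-to-noise amplitude ratio); the interview does not state which convention is meant, and under the power convention (10 log10) the numbers would differ.

```python
def snr_db_to_amplitude_ratio(snr_db):
    """Convert an SNR in decibels to a linear signal-to-noise amplitude ratio.

    Assumes the 20*log10 (amplitude) convention.
    """
    return 10 ** (snr_db / 20)


image_ratio = snr_db_to_amplitude_ratio(40)   # signal is about 100 times the noise
audio_ratio = snr_db_to_amplitude_ratio(120)  # signal is about 1,000,000 times the noise
print(image_ratio, audio_ratio, audio_ratio / image_ratio)
```

Under this assumption, the 80 dB difference corresponds to noise that is, relative to the signal, about 10,000 times larger in images than in audio.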

Technology that exploits the properties of noise has also been investigated and has made it possible, for example, to detect falsified areas in images. One effective use of noise is what we call “limit-break sensing,” which makes it possible to infer the “true brightness” of an image beyond the theoretical limit of the image-display system.

Each pixel in an image captured with a camera is digitized as a numerical value (pixel value) that ranges from 0 to 4095 in the case of 12-bit digitization. The highest (brightest) pixel value is 4095, so if the actual brightness of a pixel exceeds 4095, the pixel would be expected to always be output as 4095 (the saturation value). However, since the image contains random noise, over multiple shots the pixel value will fall below 4095 at a rate determined by the actual brightness.
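This effect can be illustrated with a small simulation. The Python sketch below is not the experimental procedure described in this interview; it assumes, purely for illustration, that the noise on a pixel is Gaussian with a hypothetical standard deviation of 50, and it counts how often a pixel whose true brightness exceeds 4095 is nevertheless output below 4095.

```python
import random

SATURATION = 4095  # maximum 12-bit pixel value


def capture(true_brightness, noise_sigma, rng):
    """Simulate one shot of one pixel: add noise, quantize, clip to 12 bits."""
    noisy = rng.gauss(true_brightness, noise_sigma)  # Gaussian noise is an assumption
    return min(SATURATION, max(0, round(noisy)))


def fraction_below_saturation(true_brightness, noise_sigma=50.0, shots=10_000, seed=0):
    """Fraction of shots in which the output pixel value falls below 4095."""
    rng = random.Random(seed)
    below = sum(
        capture(true_brightness, noise_sigma, rng) < SATURATION for _ in range(shots)
    )
    return below / shots


for brightness in (4100, 4150, 4200):
    print(brightness, fraction_below_saturation(brightness))
```

The brighter the true value, the less often the output drops below the saturation value, so the observed rate carries information about brightness that the clipped values alone do not show.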

In an experiment to verify our limit-break sensing, the pixel values of the row of pixels in the green box on a piece of blank paper in the photo in Fig. 3 were plotted with horizontal pixel position on the horizontal axis and pixel value on the vertical axis, giving the profile shown in the graph. The brightness of the paper was then adjusted to cause partial overexposure (saturation), 10,000 images were captured with a separate 12-bit monochrome camera, and the average pixel values were plotted, giving the profile shown in Fig. 4(a). The overexposed area is indicated with the red line. By comparing the profile in Fig. 4(a) with that in Fig. 3, it can be estimated that the true pixel values in the red-line area follow the dashed purple line in Fig. 4(a). When the ideal pixel values (true brightness) without clipping or quantization are plotted on the horizontal axis and the expected values of the actual output pixel values with clipping and quantization are plotted on the vertical axis, the relationship shown as the purple curve in Fig. 4(b) is obtained. If this curve is applied to the average pixel values near the saturation value in Fig. 4(a) to recover the pixel values of true brightness (namely, average pixel value A is converted to B in line with the purple curve in Fig. 4(b)), the green points encircled by the ovals in Fig. 4(c), which exceed the maximum camera output pixel value (4095), are obtained. This indicates that the pixel values of true brightness can be estimated. Note that the fully saturated area (the area in which the pixel value never falls below 4095 in 10,000 shots) cannot be recovered, so its pixel value is set to 4300.
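The conversion from an observed average (A) to a true brightness (B) can be sketched as the inversion of a monotonic curve. The Python example below is only an illustration under stated assumptions, not the method used in the experiment: it assumes Gaussian noise with a known standard deviation (a hypothetical value of 50 is used), ignores quantization and clipping at zero, and uses the closed-form expected value of a Gaussian clipped from above in place of the purple curve in Fig. 4(b). The function names are hypothetical.

```python
import math

SATURATION = 4095.0  # maximum 12-bit pixel value


def expected_clipped_mean(mu, sigma, sat=SATURATION):
    """Expected value of min(X, sat) for X ~ Normal(mu, sigma)."""
    a = (sat - mu) / sigma
    cdf = 0.5 * math.erfc(-a / math.sqrt(2.0))               # P(X < sat)
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)  # standard normal density at a
    return sat + (mu - sat) * cdf - sigma * pdf


def recover_true_brightness(observed_mean, sigma, sat=SATURATION, margin=6.0):
    """Invert expected_clipped_mean by bisection (it increases with mu).

    Returns None when the observed mean is indistinguishable from full
    saturation, in which case the true brightness cannot be recovered.
    """
    lo, hi = 0.0, sat + margin * sigma
    if observed_mean >= expected_clipped_mean(hi, sigma, sat):
        return None
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_clipped_mean(mid, sigma, sat) < observed_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Round trip: a true brightness of 4150 yields an average below 4095,
# and inverting the curve brings back a value above the camera's maximum.
average = expected_clipped_mean(4150.0, 50.0)
print(average, recover_true_brightness(average, 50.0))
```

In this sketch a fully saturated pixel returns None rather than a fixed placeholder value; as noted above, such pixels carry no information from which the true brightness could be estimated.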


Fig. 3. Experimental scene.


Fig. 4. Average pixel values of captured image and results of recovering pixel values of true brightness.

—You have received many awards for these achievements in a short period, haven’t you?

I am grateful that I have received or will receive the following 11 awards and have given six invited lectures since May 2022:

  • Standardization Achievement Award from Information Processing Society of Japan (IPSJ)/Information Technology Standards Commission of Japan (May 2022)
  • IPSJ Fellow (June 2022)
  • FY2022 Excellent Patent Award (1st Class) from NTT Corporation (November 2022)
  • Best Paper Award at Picture Coding Symposium of Japan and Image Media Processing Symposium (PCSJ/IMPS) (jointly won by S. Kudo, Y. Bando, S. Takamura, and M. Kitahara) (December 2022)
  • Best Poster Award at PCSJ/IMPS (December 2022)
  • IE Award from the Institute of Electronics, Information and Communication Engineers (IEICE) Technical Committee on Image Engineering (jointly won by S. Kudo, Y. Bando, S. Takamura, and M. Kitahara) (December 2022)
  • Certificate of Appreciation from IEEE Region 10 (December 2022)
  • IE Award from the IEICE Technical Committee on Image Engineering (March 2023)
  • Fellow from Asia-Pacific Artificial Intelligence Association (AAIA) (July 2023)
  • Certificate of Appreciation from Asia-Pacific Signal and Information Processing Association (APSIPA) (November 2023)
  • Achievement Award from IEICE (jointly won by S. Matsuo, Y. Bando, and S. Takamura) (June 2024, to be awarded)

Researchers usually receive awards after several years of research, so ten awards within a year and a half of changing my job seems like a lot to me; my timing must have been good. However, among those awards, the PCSJ/IMPS Best Poster Award in December 2022 and the IE Award in March 2023 were given for research I started after April 2022, and I am surprised that they came in the first year of that research. These two cases are good examples of how world-first findings can be obtained without spending a lot of money; for example, the experiment for estimating true brightness that I mentioned used a camera costing no more than 100,000 yen. Because I did not spend much money, I devised experimental methods and conducted experiments with a certain idea of what kind of results I would get, and those experiments turned out exactly as I had presumed. Therefore, I feel a great sense of accomplishment.

Acquiring research opportunities at universities, conducting research in a wide range of fields, and contributing to the increase in the number of IEEE Fellows from Japan

—What kind of research activities will you be focusing on in the future?

I have been researching at a university for a year and a half now, and I have had two experiences that I could not have had at NTT laboratories: obtaining research funds, such as Grants-in-Aid for Scientific Research and other competitive research funds, and supervising students, including setting their research themes. As well as applying for research funds myself, I also collaborate with researchers outside the university to apply for them. Although it is very difficult to obtain funding, I finally obtained a research fund in 2023. The number of students I am mentoring has increased from 2 in 2022 to 16 in 2023, and my research-guidance workload has increased considerably.

At NTT, I have been delving into research themes within the framework of my laboratory or research projects; in contrast, at the university, research themes are not subject to such a framework, so I can set my themes freely. However, funds are scarce, so in addition to continuing the research I have been doing, I want to gain knowledge of things that no one else has done in a wide range of fields through experimenting (i.e., desktop theoretical research and thought experiments) while spending as little money as possible. The students I supervise are not professional researchers like those at NTT laboratories but apprentice researchers. The number of such students is increasing, and a good environment for expanding the scope of their research has been created. I also want to increase joint research with companies.

I recently attended the Fellow Committee of the Institute of Electrical and Electronics Engineers (IEEE), which is a meeting of 50 members selected from IEEE Fellows around the world to make the final selection of new IEEE Fellows who will become Fellows on January 1 of the following year. I was greatly inspired by the committee members and applicants who had made outstanding achievements. IEEE Fellow is a prestigious honor; however, I noticed recently that the number of Fellows selected from Japan has been significantly decreasing. As I am fortunate to be one of them, I want to contribute—even to a small extent—to increasing the number of Fellows selected from Japan.

Give it a try, let it lie, and think about it from a different perspective

—Please tell us what you keep in mind as a researcher.

I keep in mind four maxims that form my stance as a researcher: “question everything,” “don’t be content,” “never say no,” and “let it lie.”

Many researchers are aware of the importance of questioning everything. Questioning experimental results and phenomena, rather than accepting them at face value, can reveal whether the results are true or false and lead to further results and new discoveries. Questioning is truly an attitude of pursuit.

The “don’t be content” maxim has its origins in my father’s greeting speech at my wedding reception, when he said, “My son has done too much (things have gone too well).” I took his words to mean that I had to be careful because I never knew when I might fail, get sick, or lose my footing. For example, if you become complacent when you receive an award, you will stop there. The moment you receive an award, it is a thing of the past. The next step has already begun. If you follow a predetermined route, you can only see the road ahead, but if you look aside a little, you will see a different view.

As I mentioned in the previous interview, “never say no” means to give without expecting anything in return. By continuing to give, you will be not only helped by many people but also able to make important contacts. Since you will be able to understand the feelings of those who ask you to do something, you will be able to make a request while considering the other person when you are in the position of the one who is asking. I have served as a committee member, officer, chairperson, and seminar organizer for several academic societies, and through these activities, I have come to realize that experience is the best teacher. I believe that taking on a variety of positions will lead to one’s growth.

I will give you two examples of “let it lie.” When I applied for the IEICE 100-Year Memorial Paper Award Competition, I first gathered a lot of information and kept gathering it without writing, even though I wanted to write. I wrote the paper only after “letting it lie” (leaving it as is and giving it time), and I received the Best Paper Award. In my research on uniform color space, I had been trying a variety of methods to solve a problem without success. While I was relaxing at the beach on holiday, it suddenly occurred to me to apply a method used in structural analysis. By applying that method, I was able to solve the problem, which led to my receiving the Niwa-Takayanagi Award (Best Paper Award) from the Institute of Image Information and Television Engineers. I think both of these awards were the result of keeping some distance from the research and being able to look at it from different perspectives.

—Please give a message to future researchers.

Some researchers carry out research in accordance with rigid plans, and others do so in a more flexible way; I think it is good to have both types of researchers. Although I tend more toward the flexible type, I have still managed to proceed with my research. In other words, since research often changes direction because of unexpected results, I think that it has been better to be flexible and repeat trial and error. That may seem contradictory to the above-mentioned “let it lie” maxim, but what I do is give it a try, let it lie, then change my perspective and think about it, and sometimes that kind of “give it a try” attitude is necessary. Being young has two advantages: you have relatively more free time, and you don’t have to be afraid of losing something since you don’t have many achievements. I believe that these advantages lead to freedom of ideas; therefore, I think it is good to think freely and give things a try.

Interviewee profile

Seishi Takamura received a B.E., M.E., and Ph.D. from the Department of Electronic Engineering, Faculty of Engineering, the University of Tokyo, in 1991, 1993, and 1996, respectively. His current research interests include efficient video coding and ultrahigh-quality video processing. He has served as associate editor of IEEE Transactions on Circuits and Systems for Video Technology (2006–2014), editor-in-chief of the Institute of Image Information and Television Engineers (ITE), executive committee member of the IEEE Region 10 and Japan Council, and director-general of ITE affairs. He has also served as chair of ISO/IEC Joint Technical Committee (JTC) 1/Subcommittee (SC) 29 Japan National Body, Japan head of delegation of ISO/IEC JTC 1/SC 29, and an international steering committee member of the Picture Coding Symposium. From 2005 to 2006, he was a visiting scientist at Stanford University, CA, USA.

He has received numerous academic awards, including the ITE Niwa-Takayanagi Awards (Best Paper in 2002, Achievement in 2017); the IPSJ Nagao Special Researcher Award in 2006; PCSJ Frontier Awards in 2004, 2008, 2015, and 2018; the ITE Fujio Frontier Award in 2014; the Telecommunications Advancement Foundation (TAF) Telecom System Technology Awards in 2004, 2008, and in 2015 with highest honors; the IEICE 100-Year Memorial Best Paper Award in 2017; the Kenjiro Takayanagi Achievement Award in 2019; the Industrial Standardization Merit Award from the Ministry of Economy, Trade and Industry of Japan in 2019 (as an individual) and in 2020 (as NTT team); the PCSJ/IMPS Best Paper Award and Best Poster Award in 2022; a Certificate of Appreciation from IEEE Region 10 in 2022; IE Awards in 2022 and 2023; and a Certificate of Appreciation from APSIPA in 2023.

He is an IEEE Fellow, IEICE Fellow, ITE Fellow, IPSJ Fellow, AAIA Fellow, and member of Japan Mensa, the Society for Information Display, and APSIPA.
