Imaging and Sensing Technology

CMOS image sensor development at Sony began in 1996 and led to the launch of our first CMOS image sensor (IMX001) in 2000. At the time, CMOS image sensors produced noisy images under low light and were also inferior to CCD image sensors in pixel count. However, the lower readout speed of CCD image sensors convinced us that they would be unable to support high-resolution data as the industry moved from SD to HD video. In 2004, we therefore made a major change of course, shifting our focus from CCD to CMOS image sensor development. It was a bold decision: despite holding the world’s number one share in CCD image sensors, we would be starting from a negligible market share in CMOS image sensors.
Later, in 2007, we commercialized CMOS image sensors with an original column A/D conversion circuit for fast, low-noise performance. This was followed in 2009 by back-illuminated CMOS image sensors with twice the sensitivity of conventional image sensors, exceeding that of the human eye.
Further examples of the technical innovation that has enabled us to consistently lead the industry include stacked CMOS image sensors in 2012, which deliver higher image quality and multiple functions in a smaller package by layering the pixel and signal-processing sections, and, in 2015, the world’s first image sensors with a Cu-Cu connection, enabling smaller packages, higher performance, and greater manufacturing productivity.

Advances in key image sensor technologies

The CMOS image sensor with a column A/D converter achieves fast, low-noise performance through its column-parallel A/D conversion circuit (an advance in peripheral circuit technology). The back-illuminated CMOS image sensor achieves higher sensitivity through its back-illuminated structure (an advance in pixel technology). The stacked CMOS image sensor achieves high image quality and multiple functions in a smaller package (an advance in layering technology). The stacked CMOS image sensor with Cu-Cu connections realizes smaller packages, higher performance, and greater productivity through the direct connection of copper pads (an advance in chip-to-chip connection).

Column-Parallel A/D conversion circuit-equipped CMOS image sensors (commercialized in 2007)

These sensors use original column-parallel A/D conversion technology, with an A/D converter for each vertical column of pixels, arranged in a parallel configuration.
In this arrangement, analog signals read from the vertical signal lines travel only a minimal distance to each column’s A/D converter, which reduces the image-quality loss caused by noise entering the signal during analog transmission and accelerates signal readout. Noise is further reduced through dual noise cancellation, with high-precision cancellation applied in both the analog and digital circuits.
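As an illustrative sketch (the array size and conversion time below are assumptions, not Sony specifications), the speed advantage of column-parallel conversion can be seen by comparing the readout time of a single shared ADC against one ADC per column:

```python
# Illustrative (not Sony-specific) comparison of per-frame readout time
# for a single-ADC sensor vs. a column-parallel ADC design.
# Assumed numbers: a 3000 x 4000 pixel array, 1 microsecond per conversion.

ROWS, COLS = 3000, 4000
T_CONV_US = 1.0  # assumed time for one A/D conversion, in microseconds

# Single shared ADC: every pixel is converted one after another.
serial_readout_ms = ROWS * COLS * T_CONV_US / 1000

# Column-parallel ADCs: all columns convert simultaneously,
# so only one conversion step is needed per row.
parallel_readout_ms = ROWS * T_CONV_US / 1000

print(f"single ADC:      {serial_readout_ms:.0f} ms per frame")
print(f"column-parallel: {parallel_readout_ms:.0f} ms per frame")
```

Under these assumed numbers the parallel design is faster by a factor equal to the column count, which is why per-column conversion also allows each ADC to run slower (and hence quieter) for the same frame rate.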

Back-illuminated CMOS image sensors (commercialized in 2009)

These image sensors adopt an innovative back-illuminated structure that offers lower noise and nearly double the sensitivity of conventional front-illuminated CMOS image sensors.
Light is received through the back side of the silicon substrate, free from interference by wiring and transistors, which increases the amount of light entering each pixel and reduces the loss of sensitivity for light arriving at various angles. This enables smooth, clear images even at night or in other low-light conditions.

Stacked CMOS image sensors (commercialized in 2012)

In the stacked structure adopted by these image sensors, the pixel section, where back-illuminated pixels are formed, is layered on top of a chip containing the signal-processing circuits, instead of the supporting substrate used in conventional back-illuminated CMOS image sensors.
One advantage is that large-scale circuits can be mounted on a small chip. Because each section is formed on a separate chip, specialized manufacturing processes can be used to produce a pixel section with high image quality and a circuit section with high performance, enabling higher resolution, multifunctionality, and a compact size.

Stacked CMOS image sensor with Cu-Cu connections (commercialized in 2015)

Cu-Cu connections directly join the copper pads formed on the layering surfaces of the pixel chip and the logic circuit chip. Because there is no need to route electrical connections through the pixel chip or to reserve special areas for connections, manufacturers can make smaller image sensors with higher productivity. Offering greater freedom in pin layout and higher density, the technology will help enable stacked CMOS image sensors with expanded functionality.

Sony’s technologies, expanding into sensing applications

In the image sensor market, we are also focusing on the sensing field, where applications are expected to expand. We will combine the imaging technology we have honed for viewing captured images with sensing technology for acquiring and using various types of information. In this way, we are cultivating new applications and markets for image sensors.

Sensing technology expanded by image quality, information, and higher speed

ToF Image Sensors: Expanding Possibilities in 3D Object Recognition

Time-of-flight (ToF) image sensors determine the distance to objects from the time it takes emitted light to reflect off the objects and return to the sensor. Pixel technology based on Sony’s back-illuminated structure produces depth maps as accurate as those of conventional sensors even at 1.5 times the distance. Gesture recognition, object recognition, and obstacle detection are expanding applications for these sensors, which are used in augmented reality (AR) and virtual reality (VR) scenarios, and in robots and drones that require autonomous operation.

  • Image captured with a CMOS image sensor
  • A back-illuminated ToF image sensor
  • 3D model image combining a depth map and monochrome image
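The time-of-flight principle described above reduces to a simple relation: distance equals the speed of light multiplied by the round-trip time, divided by two. A minimal sketch (the 20 ns example time is an assumption for illustration):

```python
# Hedged sketch of the basic ToF distance relation: emitted light travels
# to the object and back, so distance d = c * t / 2.

C = 299_792_458.0  # speed of light in a vacuum, m/s

def tof_distance_m(round_trip_time_s: float) -> float:
    """Distance to the object from the measured round-trip time."""
    return C * round_trip_time_s / 2.0

# Example: a 20-nanosecond round trip corresponds to roughly 3 meters.
d = tof_distance_m(20e-9)
```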

Automotive image sensors: Better recognition in many environments, for safety and security

Sony has commercialized the IMX490 CMOS image sensor for automotive cameras, with 5.4 effective megapixels, the industry’s highest*1 for a sensor that both mitigates LED flicker (from LED signs and traffic signals) and offers a wide dynamic range. Sony has also released the IMX324, a stacked CMOS image sensor with 7.42 effective megapixels, the industry’s highest resolution*2 for forward-sensing cameras. We expect these sensors to be used in more cars than ever, in applications including advanced driver-assistance systems (ADAS) and camera monitoring systems (CMS) that replace rearview mirrors.

  • *1 As of announcement on December 18, 2018.
  • *2 As of announcement on October 23, 2017.

Sample comparison movies of image sensors

  1. Industry’s highest image sensor resolution for automotive cameras

    Comparison of distant sample images

    IMX324 shooting example
    IMX324 (7.42 million pixels) enlarged image
    IMX224 (1.27 million pixels) enlarged image
  2. High sensitivity of 2,666 mV (standard value F5.6, in pixel binning mode)

    Comparison under low-light (0.1 lux) in sample images

    IMX324 (pixel addition mode) shooting example
    IMX224 shooting example

Automotive initiatives: Sensor Fusion

One forward-looking initiative at Sony is called Sensor Fusion. This technology, currently under development, integrates raw data from camera feeds, LiDAR, and millimeter-wave radar to identify vehicles and other objects. As a sensor manufacturer, Sony is uniquely positioned to combine the signal processing, noise reduction, and data optimization this technology requires. The examples below show how effective Sensor Fusion can be: even under challenging conditions for object recognition, such as fog, glare, or rain at night, it enables accurate identification sooner than other systems.

Explanatory drawing showing Sensor Fusion, which recognizes vehicles and other objects by fusing camera images with LiDAR data and raw millimeter-wave radar data
  • Integrates large amounts of raw data from camera images, LiDAR, and millimeter-wave radar
  • Highly accurate object identification even in difficult conditions
  • Scenario 1: Fog

    Current Technology

    Sony's Sensor Fusion

    Identifies automobiles ahead obscured by fog
  • Scenario 2: Backlight

    Current Technology

    Sony's Sensor Fusion

    Identifies automobiles ahead obscured by backlight
  • Scenario 3: Night & Rain

    Current Technology

    Sony's Sensor Fusion

    Identifies automobiles ahead obscured by headlights and rain

Automotive initiatives: Solid State LiDAR

Leveraging Sony’s semiconductor technologies, Solid State LiDAR uses highly accurate distance measurement to build a precise 3D understanding of real-world spaces. It also improves recognition of distant objects.

Industry’s first* Stacked Depth Sensor for Automotive LiDAR with SPAD Pixels

This is an industry-first* stacked direct time-of-flight (dToF) depth sensor for automotive LiDAR that uses single-photon avalanche diode (SPAD) pixels.
By stacking the SPAD pixels and the distance-measuring processing circuits into a single chip, distances of up to 300 meters can be measured at 15 cm resolution with high precision and high speed.
* Among stacked depth sensors for automotive LiDAR. According to Sony research (as of announcement on February 18, 2021).

What is a SPAD pixel?

A SPAD is a pixel structure that uses avalanche multiplication to amplify the electrons generated by a single incident photon, triggering a cascade like an avalanche, so that even very weak light can be detected.
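To illustrate why single-photon sensitivity matters, a standard photon-counting model (textbook physics, not from the source) gives the probability that a detector fires when an average of mu photons arrives per measurement gate and the photon detection efficiency is eta:

```python
import math

# Standard Poisson photon-arrival model (illustrative, not Sony data):
# if mu photons arrive on average per gate and the detector's photon
# detection efficiency is eta, the chance of registering at least one
# photon is 1 - exp(-eta * mu).

def detection_probability(mu: float, pde: float) -> float:
    """Probability that a single-photon detector fires during one gate."""
    return 1.0 - math.exp(-pde * mu)

# Even at an average of just one photon per gate, a detector with an
# assumed 20% PDE still fires on a meaningful fraction of gates.
p = detection_probability(mu=1.0, pde=0.2)
```

Because a single photon suffices to trigger the avalanche, accumulating many such gates lets a dToF system build reliable distance histograms from extremely faint return signals.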

Illustration of avalanche multiplication

Direct time-of-flight (dToF) method

dToF is a distance measurement method that detects the flight time (time difference) between light being emitted from the light source and that light, reflected by the object, reaching the sensor, and uses it to calculate the distance to the object.
Distance measurement sensors using the dToF method employ SPAD pixels that can detect a single photon, enabling long-distance, highly precise distance measurement.
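Using the 300 m range and 15 cm resolution figures quoted above, the standard dToF relations show what timing performance this implies: a round trip to 300 m takes about 2 microseconds, while resolving 15 cm requires resolving roughly 1 nanosecond of flight time. A sketch:

```python
# Standard dToF timing relations applied to the figures in the text
# (300 m range, 15 cm resolution). The relations themselves are generic
# physics, not a description of Sony's internal implementation.

C = 299_792_458.0  # speed of light in a vacuum, m/s

def round_trip_time_s(distance_m: float) -> float:
    """Time for emitted light to reach the object and return."""
    return 2.0 * distance_m / C

def timing_resolution_s(range_resolution_m: float) -> float:
    """Round-trip time difference corresponding to one range step."""
    return 2.0 * range_resolution_m / C

t_max = round_trip_time_s(300.0)    # about 2.0 microseconds
dt = timing_resolution_s(0.15)      # about 1.0 nanosecond
```

The nanosecond-scale timing requirement is why SPAD pixels, which produce a sharp electrical pulse from a single photon, pair naturally with the dToF method.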

Illustration of dToF method

Features of a SPAD Depth Sensor

Sony's SPAD depth sensor employs a back-illuminated SPAD pixel structure that uses a Cu-Cu connection to provide conduction for each pixel between the pixel chip (top) and the logic chip equipped with distance-measuring processing circuits (bottom).
This original pixel structure makes the sensor compact yet high-resolution, and enables high-precision, high-speed measurement at a range resolution of 15 centimeters up to a distance of 300 meters.*

* When measuring an object with a height of 1 meter and reflectance of 10%, using the additive mode of 6 x 6 pixels (H x V), under cloudy daylight conditions.

  • Structural diagram and features
  • Point cloud

Polarization image sensors: Revealing the unseen

In conventional polarization cameras, the polarizer was a separate component, but innovative CMOS image sensors from Sony incorporate the polarizer into a back-illuminated CMOS image sensor. As a one-chip solution with the polarizer on the photodiode, it enables more compact polarization cameras. Potential applications are not limited to the automotive field; they include a variety of other uses that involve capturing subjects obscured by glare or capturing details of surface unevenness.
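The physics behind glare suppression is standard optics rather than anything Sony-specific: a linear polarizer transmits intensity according to Malus's law, I = I0 * cos²(theta), so reflected glare polarized perpendicular to a pixel's transmission axis is almost entirely blocked. A minimal sketch:

```python
import math

# Malus's law (standard optics, used here only as an illustration of
# the principle behind polarization pixels): a linear polarizer at
# angle theta to the light's polarization transmits I = I0 * cos^2(theta).

def transmitted_intensity(i0: float, theta_deg: float) -> float:
    """Intensity passed by a linear polarizer at angle theta_deg."""
    return i0 * math.cos(math.radians(theta_deg)) ** 2

# Glare polarized at 90 degrees to the pixel's axis is blocked almost
# entirely, while fully aligned light passes unattenuated.
blocked = transmitted_intensity(1.0, 90.0)
aligned = transmitted_intensity(1.0, 0.0)
```

A sensor with pixels at several polarizer orientations can therefore separate polarized surface reflections (glare) from the unpolarized light carrying the underlying scene.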

When a pond is photographed with a polarization image sensor, surface reflections are removed and the koi fish underwater appear clearly
  • Tablet filling inspection

    An ordinary image sensor

    Image of tablet and aluminum sheet

    A polarization image sensor

    Image of tablet and aluminum sheet
  • Lens distortion inspection

    An ordinary image sensor

    Image of glasses

    A polarization image sensor

    Image of glasses
  • Eliminating glare from sunlight

    An ordinary image sensor

    Image of a car

    A polarization image sensor

    Image of a car

World's First Intelligent Vision Sensor with AI Processing Functionality

The Intelligent Vision Sensor is the first image sensor in the world to be equipped with AI processing functionality.* Including AI processing functionality on the image sensor itself enables high-speed edge AI processing and the extraction of only the necessary data, which, when cloud services are used, reduces data transmission latency, addresses privacy concerns, and lowers power consumption and communication costs.

* Among image sensors. According to Sony research (as of announcement on May 14, 2020).

What is Edge AI?

The spread of IoT has connected all types of devices to the cloud, making commonplace information processing systems in which information obtained from such devices is processed by AI on the cloud. At the same time, the increasing volume of information handled in the cloud poses various problems: increased data transmission latency that hinders real-time processing; users' security concerns about storing personally identifiable data in the cloud; and other issues, such as the increased power consumption and communication costs that cloud services entail.
Against this background, the need for edge AI, which enables AI processing within edge devices, is growing.

Method of Edge AI

To realize edge AI processing, edge devices need to be equipped with AI processing functionality, but configurations differ depending on where within the edge device the AI processing takes place. One option is to equip the camera system with an AI processor separate from the image sensor. The Intelligent Vision Sensor, by contrast, has AI processing functionality built in, enabling edge AI processing within the image sensor itself.

World’s first image sensor equipped with AI processing functionality

The Intelligent Vision Sensor features a stacked configuration consisting of a pixel chip and a logic chip. In addition to the conventional image sensor operation circuit, the logic chip is equipped with Sony's original DSP (digital signal processor) dedicated to AI signal processing, and with memory for the AI model. This configuration eliminates the need for high-performance processors or external memory, making it ideal for edge AI systems.

Metadata output

The Intelligent Vision Sensor can output metadata (semantic information derived from the image data), reducing data volume. Ensuring that image information is not output helps to reduce security risks and address privacy concerns. In addition to the images recorded by a conventional image sensor, users can select the data output format according to their needs and uses, including ISP (image signal processor) format output images (YUV/RGB) and ROI (region of interest) specific area extract images.

Data output format selectable to meet various needs

* Because AI functionality uses statistical and probabilistic methods, metadata that is erroneously recognized may be added.
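To illustrate the data-volume argument (the frame size, object fields, and numbers below are hypothetical, not Sony's actual output format), compare the size of a raw frame with the size of per-frame metadata:

```python
import json

# Hypothetical illustration of why metadata-only output cuts data
# volume: a full image frame is millions of bytes, while the semantic
# result of AI processing can be a few dozen.

# An assumed 12-megapixel RGB frame at 8 bits per channel:
image_bytes = 4000 * 3000 * 3

# Assumed metadata for the same frame: one detected object described by
# a class label, a confidence score, and a bounding box.
metadata = {"class": "person", "confidence": 0.93,
            "box": [412, 180, 96, 220]}
metadata_bytes = len(json.dumps(metadata).encode("utf-8"))

ratio = image_bytes / metadata_bytes
```

Under these assumptions the metadata is several orders of magnitude smaller than the frame it summarizes, and, since no pixels leave the sensor, no personally identifiable imagery is transmitted.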

High-speed AI processing

When video is recorded using a conventional image sensor, data for each individual output image frame must be sent for AI processing, increasing data transmission and making it difficult to deliver real-time performance. The Intelligent Vision Sensor performs ISP processing and high-speed AI processing (3.1 milliseconds for MobileNet V1*) on the logic chip, completing the entire process within a single video frame. This design makes it possible to deliver high-precision, real-time tracking of objects while recording video.

* MobileNet V1: an image analysis AI model for object recognition on mobile devices.
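A quick sanity check of the timing claim above: at a typical 30 frames per second (an assumed frame rate, not stated in the source), the quoted 3.1 ms MobileNet V1 processing time fits well within one frame period:

```python
# Sanity check: does the 3.1 ms AI processing figure from the text fit
# within a single video frame? Assumes a typical 30 fps frame rate,
# which is not stated in the source.

FPS = 30
frame_budget_ms = 1000 / FPS   # about 33.3 ms per frame at 30 fps
ai_processing_ms = 3.1         # MobileNet V1 figure from the text

fits_in_one_frame = ai_processing_ms < frame_budget_ms
```

Even allowing time for exposure, readout, and ISP processing in the same frame period, the AI step uses less than a tenth of the budget, consistent with the claim of real-time tracking while recording.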

Example of real-time tracking of products and tasks at a register

Selectable AI model

Users can write the AI models of their choice to the embedded memory, and can rewrite and update them according to their requirements or the conditions of the location where the system is used. For example, when multiple cameras employing this product are installed in a retail location, a single type of camera can serve different locations, circumstances, times, and purposes. Installed at the entrance, it can count visitors entering the facility; installed on a store shelf, it can detect stock shortages; mounted on the ceiling, it can generate heat maps of store visitors (detecting locations where many people gather), and so on. Furthermore, the AI model in a given camera can be rewritten, for example from one used for heat mapping to one for identifying consumer behavior.

Example of camera usages in a facility

Demonstration video of Worker Monitoring
* Realized in combination with various sensors
