Low Bit-Depth ADCs for Multi-bit Quanta Image Sensors

Zhaoyang Yin, Graduate Student Member, IEEE, Yibing M. Wang, Senior Member, IEEE, and Eric R. Fossum, Fellow, IEEE

Abstract—A 1024 × 896 test chip is presented in this article to explore a low-power readout circuit for a multi-bit quanta image sensor (QIS). Five well-known analog-to-digital converter (ADC) approaches (flash, pipeline, successive approximation register (SAR), cyclic, and single-slope (SS)) are studied, and two types of ADCs, namely, SAR ADCs and SS ADCs, are implemented in the sensor. The ADC power dissipations are compared under the condition of constant imaging throughput in counting photoelectrons. In QIS devices, one LSB corresponds to one photoelectron. Measurement results show that the power consumption of SAR ADC is better than that of the SS ADC and decreases by a factor of 17 when the resolution is changed from 1 b to 6 bits. By contrast, the power consumption of the SS ADCs decreases by a factor of 2.5. Thus, SAR ADCs seem more suitable for use in low-power multi-bit QIS.

Index Terms—Analog-to-digital converter (ADC), column ADC, low power, multi-bit quanta image sensor (QIS), readout circuit.

I. INTRODUCTION

PIXEL shrinkage to subdiffraction limit (SDL) pitch brings the potential benefit of lower costs for CMOS image sensors (CIS) as the resolution increases. However, with reduced photon flux on each pixel, the signal-to-noise ratio (SNR) and the dynamic range are lower, resulting in reduced image quality. Photon-counting quanta image sensors (QIS) have been proposed to address this problem [1]. The QIS concept features spatial and temporal oversampling that requires high pixel count, very small pixels, and high frame-rate readout, such as a billion 0.5-μm pixels read out at 1000 frames/s (fps).

The specialized small pixels in a QIS with deep SDL pitch, small full-well capacity (FWC), deep subelectron read noise (e.g., < 0.6-μm, 100 e-, and < 0.3 e-r.m.s., respectively) and binary output upon readout are called jots. For QIS operation, the single bit [or least significant bit (LSB)] normally corresponds to a single photoelectron. Much effort has been put into the exploration of such jot devices focused on high conversion gain (CG) of 300–500 μV/e-. A pump-gate (PG) jot [2], a tapered reset PG jot [3], and a JFET jot with punch-through reset [4] have been explored. When the shared readout (SRO) technique is applied in a QIS [5], the fill factor of a pixel can be further improved or pixel size is decreased, both at the expense of CG reduction. The parasitic capacitance on a column bus can also be reduced to improve readout speed.

A prior exploration of readout circuits for the future realization of lower-power gigapixel QIS has been demonstrated by designing a 1-Mpixel single-bit image sensor with a conventional 3T pixel operating at 1000 frames/s. The single-bit column-parallel analog-to-digital converters (ADCs) were implemented based on charge-transfer amplifiers (CTAs) [6]. The power consumption of the whole chip was only 20 mW including pads. However, the CG in that implementation was too low and read noise too high to permit single photoelectron detection, and the emphasis was on low power, high frame rate demonstration of the readout signal chain.

A 1-Mjot photon-counting QIS using a 3D-stacked structure and a new cluster-parallel architecture was introduced [7]. This allows the column bus to be changed to a shorter cluster bus, reducing the parasitic capacitance. The cluster-parallel architecture also eases the tall-thin layout requirements of column-parallel architectures, possibly giving more freedom in ADC choices. The 1.1-μm SRO jot had a CG of 368 μV/e- with an average read noise of 0.21 e-r.m.s. and well-defined photodetector quantization allowing photon-number resolution.

In scaling QIS to gigajot resolution, the cluster-parallel readout architecture is easily repeated across the larger array. However, power dissipation becomes important since the 20 mW demonstrated for the 1-Mjot sensor of [7] would be amplified 1000-fold. Reduction in power can be achieved through the use of more advanced technology nodes with lower parasitic capacitance and reduced power supply voltages. For example, we estimate that changing from a 65- to 45-nm process for the readout circuits might result in 5 W readout power dissipation for a gigajot array.

A multi-bit QIS has also been considered alongside single-bit QIS [8]. In many ways, a multi-bit QIS with photon number resolution, counting to perhaps less than 100 e- per readout, is conceptually midway between single-bit QIS and conventional CIS. Compared to single-bit QIS, a multi-bit QIS with variable bit depth has several advantages [9], [10]. First, by selecting the bit depth of the digital signal of the signal path, the linearity or compression between average input photon flux and output signal can be adjusted. Second, increasing the bit depth (counting more photoelectrons per readout cycle) allows the field readout rate to be reduced while maintaining constant photon imaging throughput. Photon flux capacity (FC) is a measure of the photon-counting capability of the QIS and is the product of the QIS pixel or jot FWC (including readout signal chain limitations) and frame rate, divided by quantum efficiency. The QIS output data rate (b/s) can be effectively decreased by using ADCs with higher (albeit still low) bit resolution. The required frame rate for constant FC is reduced by a factor of $2^n-1$ for an n-bit QIS. For example, when the bit depth of the readout ADC in a QIS is increased from $n = 1$ to $n = 6$ bits, the frame rate can be decreased by a factor of 63. The photon FC of a 1000-fps single-bit QIS is equal to that of a 1000-fps/($2^n-1$) n-bit QIS. Multi-bit readout helps to address the issue of high output data rates in QIS [4]. It is meaningful to explore the design of a multi-bit QIS not only by circuit simulation, but by reducing it to practice.
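The frame-rate scaling at constant FC can be illustrated with a short sketch. The function name and the 1000-fps starting point are chosen here for illustration only; the sketch simply applies the $2^n-1$ factor stated above.

```python
def frame_rate_for_constant_fc(single_bit_fps: float, n_bits: int) -> float:
    """Frame rate of an n-bit QIS with the same flux capacity as a single-bit QIS."""
    if n_bits < 1:
        raise ValueError("bit depth must be at least 1")
    max_count = 2 ** n_bits - 1  # photoelectrons counted per readout at n bits
    return single_bit_fps / max_count


# Illustrative sweep over the bit depths considered in this article.
for n in range(1, 7):
    print(f"{n} b: {frame_rate_for_constant_fc(1000.0, n):.2f} fps")
```

At 6 bits the factor is 63, so the 1000-fps single-bit rate drops to roughly 16 fps.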

To lower the power consumption of the readout circuit in a multi-bit QIS, it is necessary to choose relatively low-power ADCs from among the various available types of ADCs. Further, we are interested in algorithmic ADCs where the bit depth is readily changeable during sensor operation, unlike, say, a flash ADC. Layout area of the ADC is also highly constrained in an image sensor, especially for column-parallel or cluster-parallel architectures. Variable-bit-depth also disfavors pipeline ADC architectures for layout area reasons. While a review letter has been published that compares various topologies for tall, thin, column-parallel ADCs [11], there is often a trade-off between power and layout area that has not been well explored for low-bit-depth/variable-bit depth ADC design.

In this exploratory work, we were also constrained by sponsor interests and funding, and options for image sensor fabrication. The sponsor desired a global shutter (GS) pixel, and practical foundry choices limited us to TPSCo. This resulted in a larger, conventional GS pixel with CG too low, and read noise too high for photon counting, and a column-parallel architecture rather than 3D-stacked with a cluster-parallel option. Nevertheless, aside from some layout differences, we believe the ADC implementations and their performance results allow us to draw important conclusions for future multi-bit QIS development activity.

In this article, not only are different types of ADCs compared but also the relationship between power consumption and bit depth is explored to determine whether any power benefit can be obtained by increasing the bit depth at constant FC. Compared to the preliminary multi-bit QIS ADC studied in [9] and [10], in this article, the ADC design is improved and its bit depth is increased. The LSB of the ADC is reduced to 500 μV to be consistent with anticipated future high-CG QIS implementations. In the present device, 1 LSB corresponds to 4.2 e-. An image sensor prototype is implemented with SS ADCs. Ten tall, thin successive approximation register (SAR) ADCs are also implemented in the test chip for the purpose of comparing different ADCs. The SS ADC design is based on a conventional architecture for the sake of reliability. The SAR ADC design is based on a CTA. To verify whether the energy consumption can be reduced at a constant FC for these ADCs, a detailed analysis of power consumption and area occupancy is performed to select the best type for the readout signal path of a multi-bit QIS. SAR ADCs are found to have better low-power characteristics. The experimental results also show that, excluding area requirements, SAR ADCs are superior to SS ADCs and their power consumption decreases greatly with increasing bit depth for constant FC. The results of this comparison can provide meaningful guidance for the future design of low-power, multi-bit, gigajot QIS devices.

The rest of this article is organized as follows. The architecture of the test chip is presented in Section II. The comparison of various ADC architectures is also presented in Section II. Section III explains the detailed design of each block in the sensor. Section IV reports the experimental results. Section V compares the power consumption of two kinds of ADCs. Finally, the conclusion is given in Section VI.

II. SENSOR ARCHITECTURE

The floor plan of the sensor is depicted in Fig. 1. The chip consists of a complete imager and several column-parallel SAR ADCs. The size of the pixel array is 1024 columns × 896 rows. Two row-logic blocks are placed on the left and right sides of the pixel array for layout convenience. The pixel output is connected to an analog column multiplexer (MUX). The output of the MUX is fed to a programmable gain amplifier (PGA), which drives the SS ADCs. There is a 6-bit register in each SS ADC to store the conversion results; thus, the previous digital codes can be read out through the digital MUX during the conversion process. A global ramp generator and a global counter for the SS ADCs are located on the left side of the ADC array. In addition to these main blocks, ten tall, thin column-parallel SAR ADCs are included in the upper-left corner of the chip to enable the experimental exploration of a second type of ADC without compromising the image quality of the full array. The signals related to the SAR ADCs are shared with the signals related to the imager due to pad number limitations.

Fig. 2. (a) Diagram of the analog signal chain. (b) Timing diagram of the PGA.

A. SS ADC Analog Signal Chain

Fig. 2 shows a diagram and the simplified timing of a column in the imager. Each pixel is a standard GS pixel produced by TPSCo foundry; its specifications will be introduced in the following section. $V_{bias}$ denotes the source-follower (SF) load of the pixel. The pixel output is fed to the analog MUX, which has three inputs, one output, and two select lines. The other two input lines are connected to $V_{cali1}$ and $V_{cali2}$, which are known signals from outside of the chip and facilitate the calibration of the readout circuit. The output of the MUX is connected to the PGA block. S1 is used for sampling the pixel output. S2 is used for resetting the PGA. When the pixel outputs the reset voltage, denoted by $V_{rst}$, S2 and S1 are turned on simultaneously to reset the PGA and sample $V_{rst}$. The comparator is also reset at the same time. In this way, a cascaded noise cancellation process is applied to the signal chain to remove the reset noise [12]. After the reset, S2 and S1 are both turned off. When the pixel outputs the exposure level, denoted by $V_{sig}$, the difference between $V_{sig}$ and $V_{rst}$ is amplified, so that $V_{out}$ is this difference multiplied by the gain, where the gain is determined by the ratio of $C_S$ and $C_H$. A correlated double sampling (CDS) operation is thereby realized at the same time. The gain is designed to be 4 or 5. $C_L$ is used to sample and hold the PGA output. $V_{pcm}$ is the reference voltage for the PGA. Dummy transistors are added to switches S1 and S2 to reduce charge injection and clock feedthrough [13]. The power supply of the PGA is 3.3 V, while the power supply for the subsequent blocks is 1.2 V to reduce power consumption [14].

B. Comparison of ADCs

The most common types of ADCs are flash ADCs, pipeline ADCs, SAR ADCs, cyclic ADCs, and SS ADCs. To choose the best option among these ADCs for QIS, only their traditional structures are considered [14]. Table I lists, for each ADC type, the number of components, with passive elements expressed in unit resistors ($R_u$) or unit capacitors ($C_u$), and the number of operations per sample. The number of operations per sample reported in the table is the number of clock periods needed for a full conversion multiplied by the total number of comparators and amplifiers. Generally, the more comparisons that are needed for the conversion, the more energy (or power) the ADC will consume. The power consumption is therefore taken to be linearly proportional to the number of operations per sample. While it is possible to design the comparator in an SS ADC as a continuous-time amplifier with continuous power dissipation traded for just a single comparison, in this article a D latch is used as the comparator so the number of operations/sample of the SS ADC is $2^n$. It is preferred to use the ADC that requires the fewest operations per sample and consumes the least power. The SAR ADC is the most desirable in this respect, while the SS ADC is the least desirable.

However, one also needs to consider the areas of the different ADC types. The layout of the column-parallel ADCs in an image sensor is usually of great concern, as it is for cluster-parallel readout. It is assumed that the area of a comparator is equal to that of an amplifier and that the area of $C_u$ is the same as that of $R_u$, whereas the comparator area is assumed to be 20 times larger than the $C_u$ area. It is also assumed that the $C_u$ in pipeline and cyclic is $10 \times$ that of the other ADCs. Additionally, it is assumed that the power consumption of the amplifiers is $10 \times$ larger than that of the comparators.

<table>
<thead>
<tr>
<th>ADC type</th>
<th># of comparators</th>
<th># of amp’s in ADC</th>
<th>capacitor or resistor</th>
<th>operation/sample</th>
</tr>
</thead>
<tbody>
<tr>
<td>Flash</td>
<td>$2^n - 1$</td>
<td>0</td>
<td>$(2^n - 1) R_u$</td>
<td>$2^n - 1$</td>
</tr>
<tr>
<td>Pipeline</td>
<td>$n$</td>
<td>$n$</td>
<td>$(20n) C_u$</td>
<td>$2n$</td>
</tr>
<tr>
<td>Cyclic</td>
<td>1</td>
<td>1</td>
<td>$20 C_u$</td>
<td>$2n$</td>
</tr>
<tr>
<td>SAR</td>
<td>1</td>
<td>0</td>
<td>$2^n C_u$</td>
<td>$n$</td>
</tr>
<tr>
<td>SS</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>$2^n$</td>
</tr>
</tbody>
</table>
TABLE II
PRODUCT OF POWER AND AREA FOR VARIOUS RESOLUTIONS IN DIFFERENT TYPES OF ADCS

<table>
<thead>
<tr>
<th rowspan="2"># of bits</th>
<th colspan="5">Power-area product (operation/sample) × (number of components)</th>
</tr>
<tr>
<th>Flash</th>
<th>Pipeline</th>
<th>Cyclic</th>
<th>SAR</th>
<th>SS</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>21</td>
<td>660</td>
<td>660</td>
<td>22</td>
<td>40</td>
</tr>
<tr>
<td>2</td>
<td>189</td>
<td>2640</td>
<td>1320</td>
<td>48</td>
<td>80</td>
</tr>
<tr>
<td>3</td>
<td>1029</td>
<td>5940</td>
<td>1980</td>
<td>84</td>
<td>160</td>
</tr>
<tr>
<td>4</td>
<td>4725</td>
<td>10560</td>
<td>2640</td>
<td>144</td>
<td>320</td>
</tr>
<tr>
<td>5</td>
<td>20181</td>
<td>16500</td>
<td>3300</td>
<td>260</td>
<td>640</td>
</tr>
<tr>
<td>6</td>
<td>83349</td>
<td>23760</td>
<td>3960</td>
<td>504</td>
<td>1280</td>
</tr>
<tr>
<td>$n$</td>
<td>$21 \times (2^n - 1)^2$</td>
<td>$660n^2$</td>
<td>$660n$</td>
<td>$(20 + 2^n) \times n$</td>
<td>$20 \times 2^n$</td>
</tr>
</tbody>
</table>

Fig. 3. GS pixel used in the imager.

In Table II, the product of power and area is shown for various resolutions of these ADCs. This product is a guide to identify the most promising ADC choices, and an ADC with a smaller power-area product may be more suitable for use in an image sensor. As seen from Table II, when the bit depth is small, the power-area products of the different ADCs are similar to each other. The design of multi-bit QIS mainly focuses on low-bit-depth ADCs [9], specifically those with bit depths of 1 b to 6 bits. When the bit depth is in the range of 1–6 bits, the SAR and SS ADCs have the lowest power-area products. However, the power consumption of logic circuits is not included in Table II. The digital power consumption of the SS ADCs is equal to their analog power consumption due to the high-speed counter. The digital power consumption of the other ADCs is only a small portion of their analog power consumption because they only have a few digital blocks. Including the digital power would therefore not change the conclusion that the SAR ADC and SS ADC have the better metric values and are the preferred options for use in a multi-bit QIS. To verify and explore their performance in silicon, a prototype with two of these ADC types was designed based on the 65-nm TPSCo process. ADCs with 6-bit nominal depth were designed so that a variable bit depth could be chosen through timing and control of the ADC.
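The power-area metric can be sketched as a small cost model. This is one interpretation of the stated assumptions, not a definitive statement of how Table II was produced: the area is counted in unit-element areas (a comparator or amplifier counts as 20), the per-sample operations of Table I are split into comparator and amplifier operations, and each amplifier operation is weighted by 10. The names `COMPARATOR_AREA`, `AMP_POWER_WEIGHT`, and `power_area_product` are introduced here for illustration.

```python
COMPARATOR_AREA = 20   # assumed area of one comparator or amplifier, in unit-element areas
AMP_POWER_WEIGHT = 10  # assumed cost of one amplifier operation, in comparator operations


def power_area_product(adc: str, n: int) -> int:
    """Power-area product of an n-bit ADC under the simplified cost model."""
    levels = 2 ** n
    # (comparators, amplifiers, unit elements, comparator ops, amplifier ops)
    models = {
        "flash": (levels - 1, 0, levels - 1, levels - 1, 0),
        "pipeline": (n, n, 20 * n, n, n),
        "cyclic": (1, 1, 20, n, n),
        "sar": (1, 0, levels, n, 0),
        "ss": (1, 0, 0, levels, 0),
    }
    comps, amps, units, comp_ops, amp_ops = models[adc]
    area = COMPARATOR_AREA * (comps + amps) + units
    power = comp_ops + AMP_POWER_WEIGHT * amp_ops
    return area * power


for n in range(1, 7):
    print(n, [power_area_product(adc, n) for adc in ("flash", "pipeline", "cyclic", "sar", "ss")])
```

Under this model the SAR and SS products grow as $(20 + 2^n) \times n$ and $20 \times 2^n$, respectively, which is why they stay the smallest over the 1-6 bit range.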

III. SENSOR DESIGN

A. Pixel

Since this multi-bit QIS is both a test chip for the readout circuits as well as intended for the sponsor’s research activities, a foundry-provided 2.5-μm GS pixel from the 65-nm TPSCo foundry [15], [16] was used. The schematic of a pixel is shown in Fig. 3. The GRST signal is a global reset plus antiblooming gate, and TX1 can also be operated for all pixels to realize the GS operation. The memory node (MN) in the pixel stores the electrons transferred from the photodiode (PD). There is no row select transistor in the pixel. The row select function is moved from the pixel to the row driver to reduce pixel size. When a row is selected to be read out, the VDDC related to that row is activated, while the VDDCs of the rest of the rows are grounded, effectively deselecting those rows. $V_{pixel}$ is connected to the column bus. The RS, SF, and FD are also shared by two adjacent pixels to increase the fill factor of the pixels.

B. Design of the PGA

The PGA is used to reduce the input-referred noise introduced by the readout circuit. Fig. 4 shows the schematic and the specifications of the amplifier used in the PGA. Here, vb1, vb2, and vb3 are three bias signals of the amplifier; vin+ and vin− are the positive and negative input nodes, respectively; and vout is the output node. The PGA drives the SS ADC, and the full scale of the PGA is only 31.5 mV. Therefore, a simple cascode amplifier was used to realize a high open-loop gain while sacrificing some of the output swing. The closed-loop gain of the PGA is determined by the ratio of $C_S$ and $C_H$, and is set to be 4 or 5 so that the equivalent input-referred LSB of the analog signal chain is 125 μV or less. Assuming a CG of 350 μV/e- or more for a real jot [7], the equivalent LSB would be less than 0.5 e- (although LSB = 1 e- is the target for photon counting in a multi-bit QIS).

The input-referred noise contributed by the ADC and PGA, $V_n^2$, is expressed as follows:

$$V_n^2 = \frac{V_{n,\text{pga}}^2}{\text{gain}_\text{pga}^2} + \frac{V_{n,\text{ss}}^2}{\text{gain}_\text{pga}^2} \hspace{1cm} (1)$$

where the gain is the PGA closed-loop gain, $V_{n,\text{pga}}^2$ is the noise contributed by the PGA itself, and $V_{n,\text{ss}}^2$ is the noise from the SS ADC. It is found through simulation that $V_n$ is approximately 116 μV when the gain is 5, corresponding to input-referred noise of 0.96 e-r.m.s. in this test device, and 0.33 e-r.m.s. if implemented with a demonstrated pump-gate jot [4]. This PGA design provides a useful starting point for the implementation of a future deep-sub-electron-read-noise multi-bit QIS.
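The conversions between voltage and electrons used in this subsection can be checked with a few lines of arithmetic. The sketch below assumes the 500-μV ADC LSB, the measured test-pixel CG of 120 μV/e-, and a 350 μV/e- CG for a high-CG jot; the helper name `to_electrons` is introduced here for illustration.

```python
def to_electrons(voltage_uv: float, cg_uv_per_e: float) -> float:
    """Convert a voltage at the pixel output (in microvolts) to electrons."""
    return voltage_uv / cg_uv_per_e


ADC_LSB_UV = 500.0      # LSB of the SS and SAR ADCs
NOISE_UV = 116.0        # simulated input-referred noise at a PGA gain of 5
CG_TEST_PIXEL = 120.0   # conversion gain of the GS test pixel, uV/e-
CG_HIGH_CG_JOT = 350.0  # assumed conversion gain of a future jot, uV/e-

for gain in (4, 5):
    lsb_uv = ADC_LSB_UV / gain  # input-referred LSB of the signal chain
    print(f"gain {gain}: LSB = {lsb_uv:.0f} uV = "
          f"{to_electrons(lsb_uv, CG_HIGH_CG_JOT):.2f} e- for a high-CG jot")

print(f"noise: {to_electrons(NOISE_UV, CG_TEST_PIXEL):.2f} e- r.m.s. (test pixel), "
      f"{to_electrons(NOISE_UV, CG_HIGH_CG_JOT):.2f} e- r.m.s. (high-CG jot)")
```

The input-referred LSB stays below 0.5 e- for both gain settings under the assumed 350 μV/e- CG.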

C. Design of the SS ADCs

The structure of the 6-bit SS ADC is quite traditional to guarantee its reliability. The resolution of each SS ADC is 0.5 mV. Fig. 5 shows a diagram of the SS ADC. An SS ADC consists of a two-stage preamplifier, a dynamic comparator, two latches, a chip-level ramp generator, and a chip-level Gray counter. Two 5-transistor (5T) amplifiers are used as the preamp because of their robustness. $V_{\text{out}}$ is the output of the PGA and $V_{\text{ramp}}$ is the chip-level ramp signal. $C_1$, $C_2$, $C_3$, and $C_4$ are all ac coupling capacitors and help cancel the offset of the preamplifier during the reset operation as well as the reset noise. TR1 and TR1d are the preamplifier reset switches. The reset of the preamplifier is started by turning on TR1d and TR1 simultaneously. The reset is completed by turning off TR1 and TR1d consecutively. After preamp reset, the ramp generator and the Gray counter start operating. A sel_ramp control signal is designed to select between an internal ramp signal and an external ramp input for ADC calibration. As the ramp output ramps up, as soon as $V_{\text{ramp}}$ exceeds $V_{\text{out}}$, the dynamic comparator output toggles from “0” to “1.” At the same moment, the value of the Gray counter is latched into the cell “latch1.” After the ADC conversion is finished, clk_lat will be set to “1” for a short time to transfer the saved Gray code from latch1 into latch2. Then, latch2 outputs a 6-bit code, $D_i$ ($i = 5, 4, 3, 2, 1, 0$), to the digital MUX.
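The conversion sequence described above can be summarized by a behavioral sketch. It is an idealized model, not the circuit: the ramp is assumed to start at zero and rise by one LSB per counter step, voltages are integers in microvolts, and a conversion whose comparator never toggles is assumed to saturate at the maximum code. All function names are introduced here for illustration.

```python
def binary_to_gray(value: int) -> int:
    """Gray code of a binary counter value (adjacent values differ in one bit)."""
    return value ^ (value >> 1)


def gray_to_binary(gray: int) -> int:
    """Inverse of binary_to_gray."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def ss_adc_convert(v_out_uv: int, lsb_uv: int = 500, n_bits: int = 6) -> int:
    """Return the Gray code that latch2 presents after one conversion."""
    last_count = 2 ** n_bits - 1
    latch1 = binary_to_gray(last_count)  # assumed saturation value
    for count in range(last_count + 1):
        v_ramp_uv = count * lsb_uv       # idealized chip-level ramp
        if v_ramp_uv > v_out_uv:         # comparator toggles from "0" to "1"
            latch1 = binary_to_gray(count)
            break
    latch2 = latch1                      # clk_lat pulse transfers latch1 to latch2
    return latch2


gray_code = ss_adc_convert(5200)
print(f"{gray_code:06b}", gray_to_binary(gray_code))
```

Because the comparator is strobed once per counter step, the number of comparisons per conversion grows as $2^n$ in this model.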

A diagram of the ramp generator is shown in Fig. 6. It is a current-steering digital-to-analog converter (DAC). The ramp generator is controlled by an off-chip binary code input. A two-step decoder converts the binary code into thermometer code, which controls the unit current cells. RD and RDbar denote the converted thermometer code from the decoder. A cascode current source is used for increased accuracy, and two bias voltages, vbr1 and vbr2, are needed. The unit current cell has a differential architecture with two output nodes, vp and vn. One of them connects to column ADCs. The other connects to a dummy load which is equivalent to that of the ADCs. One 6-bit column-parallel SS ADC occupies $660\ \mu\text{m} \times 2.5\ \mu\text{m}$ of silicon area.

D. Design of the SAR ADCs

A 6-bit column-parallel SAR ADC design, with passive CDS and shared architecture, is implemented on this integrated circuit. The resolution of the SAR ADC, $V_{\text{LSB}}$, is the same as that of the SS ADC. A diagram of two adjacent SAR ADCs is shown in Fig. 7. This structure consists of three main blocks: two capacitive DACs, two comparators, and a shared logic block. In Fig. 7, $m$ denotes the column number. The CDS operation is incorporated into the sampling operation of the DAC array in the SAR ADC. The logic module is split into two parts: a shift register and a code register. The shift register is shared by two adjacent columns to minimize the power consumption and layout area.

Diagrams of a DAC and the CDS operation are shown in Fig. 8. Capacitive DACs, which are a very common type of DAC, are used here. $V_{\text{pixel, in}}$ is a synthetic pixel output mimicked by an off-chip DAC and $V_{\text{cm}}$ is the common-mode voltage. $V_{\text{FS}}$ is the full scale of the ADC, which is 31.5 mV. $C_U$ is the unit capacitor. The bi ($i = 0, \ldots, 5$) represents the binary code from the logic module. $V_{\text{dac1}}$ and $V_{\text{dac2}}$ represent the output voltages of the top and bottom capacitor arrays, respectively, in the DAC. To reduce the layout area needed for the DAC in the column, four reference voltages are used.
Fig. 8(c) shows a simplified diagram of the DAC during the CDS operation. The DAC array is simplified as two capacitors because the bottom plates of all unit capacitors in each array are connected. Fig. 8(b) shows the timing for the CDS operation. The rse signal is turned off ahead of the rs signal, and the same pattern is followed for the sse and ss signals. When the rst signal is “1,” the input terminals of the rst switches are connected to the in1 and in2 nodes. The CDS operation based on the DAC array is divided into three phases: the reset (rs) phase, the signal sampling (ss) phase, and the column (col) phase. First, during the rs phase, the pixel reset level ($V_{rst}$) is sampled to the bottom plate of the capacitor array (in the green box), while its top plate is connected to $V_{cm}$. Second, during the ss phase, the pixel exposure level ($V_{sig}$) is sampled to the bottom plate of the capacitor array (in the red box), while its top plate is connected to $V_{cm}$. Then, the two sampled signals, the reset level and exposure level, are both sampled to the DAC array. The difference between these two signals is obtained to complete the CDS operation. Finally, during the col phase, the two-capacitor array is cross-connected [17]. The common-mode charge stored in the two arrays is neutralized, and only the charge representing the difference between the reset and exposure levels remains in the capacitor array. After this, the bottom plates of all unit capacitors are connected to $V_{cm}$. One obtains $V_{dac1}$ as $V_{cm} + (V_{rst} - V_{sig})/2$ and $V_{dac2}$ as $V_{cm} - (V_{rst} - V_{sig})/2$. Aside from this, the switching scheme of the DAC is the same as the conventional switching scheme.
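The result of the cross-connection can be checked numerically. The sketch below only evaluates the two expressions given above for $V_{dac1}$ and $V_{dac2}$; the voltage values are arbitrary illustrative numbers in millivolts, and the function name is introduced here.

```python
def cds_dac_outputs(v_rst: float, v_sig: float, v_cm: float) -> tuple[float, float]:
    """DAC output voltages after the col phase of the CDS operation."""
    half_diff = (v_rst - v_sig) / 2
    v_dac1 = v_cm + half_diff
    v_dac2 = v_cm - half_diff
    return v_dac1, v_dac2


# Illustrative values only (mV): a 10.5-mV signal on a 600-mV common mode.
v_dac1, v_dac2 = cds_dac_outputs(v_rst=1000.0, v_sig=989.5, v_cm=600.0)
print(v_dac1, v_dac2, v_dac1 - v_dac2)

# An offset common to the reset and exposure levels cancels out.
print(cds_dac_outputs(v_rst=1100.0, v_sig=1089.5, v_cm=600.0))
```

The differential output equals $V_{rst} - V_{sig}$, and the common-mode level is set by $V_{cm}$ alone, which is the property the SAR conversion relies on.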

The comparator in the SAR ADC is designed based on a low-power differential CTA [18], and is almost the same as the previous comparator design in [6] except for the more advanced process. These characteristics can help decrease the power consumption and reach state-of-the-art performance.

The offset of the comparator is designed to be less than $\pm V_{LSB}/2$, which is $\pm 0.25$ mV.

A common logic module for an SAR ADC is shown in Fig. 9(a). D flip-flops (DFFs) are used to implement the logic module. The logic module is divided into two parts: a shift register and a code register. The shift register, shown on the top, generates a pulse that is shifted from left to right. The code register saves codes from the shift register in accordance with the comparator results, and these codes are fed back to the ADC to control the switches in the DAC. Therefore, all of the shift registers in the SAR ADCs generate the same output, while the outputs of the code registers are unique. Consequently, a shift register can be shared by multiple columns of SAR ADCs to reduce both the layout area and power consumption required. A diagram of a logic module implemented with shared architecture is shown in Fig. 9(b). This logic module is shared by two columns. Simulation results show that the power consumption of a logic module with a 1-MHz clock and a 1.2-V supply voltage is 212 nW without shared architecture and 135 nW with shared architecture. Hence, the power consumption is decreased by 36%. A conventional CMOS DFF is used in this design to ensure its reliability. In the future, dynamic DFFs will be used to reduce the power consumption and layout area of the logic modules of the SAR ADCs.

Fig. 10. Column MUX structure for the SAR ADCs and SS ADCs.
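The reason the shift register can be shared is that its one-hot pulse sequence does not depend on the input, whereas each code register does. A behavioral sketch, under the simplifying assumptions of an ideal comparator and inputs expressed as integer microvolts, is given below; the function name is introduced here and the model is not the DFF-level implementation.

```python
def shared_sar_convert(column_inputs_uv: list[int], lsb_uv: int = 500,
                       n_bits: int = 6) -> list[int]:
    """SAR conversion of several columns driven by one shared shift register."""
    codes = [0] * len(column_inputs_uv)  # one code register per column
    pulse = 1 << (n_bits - 1)            # shared shift-register pulse, MSB first
    while pulse:
        for col, v_in_uv in enumerate(column_inputs_uv):
            trial = codes[col] | pulse   # tentatively set the current bit
            if v_in_uv >= trial * lsb_uv:  # ideal comparator decision
                codes[col] = trial       # keep the bit, otherwise it stays cleared
        pulse >>= 1                      # the pulse shifts identically for every column
    return codes


print(shared_sar_convert([10300, 31500]))

# Simulated logic-module power from the text: 212 nW unshared, 135 nW shared.
saving = (212 - 135) / 212
print(f"{saving:.0%}")
```

Each column needs only $n$ comparator decisions per conversion, and the pulse generation cost is paid once per shared group.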

For the SAR ADCs in this article, CDS operation is realized during the sampling operation of the DAC array. A sharing technique is applied to the logic module to save layout area and power consumption. The layout area of one column-parallel 6-bit SAR ADC is $1980\ \mu\text{m} \times 2.5\ \mu\text{m}$. The column width of the SAR ADC is 2.5 μm because the pixel pitch is 2.5 μm and a fair comparison to the SS ADC was desired. If implemented in a cluster-parallel architecture, the layout area for the SAR ADC can be rectangular instead of a tall, thin stripe, although the area will not be significantly impacted.

E. Digital Blocks

Due to reliability considerations, standard cell logic is used for the row addressing circuits and the ADCs' digital outputs. A 10-bit-to-896-row decoder is implemented in this test imager to address the pixel array. In the digital output part of the readout circuit, every 64 columns share one I/O pad. Since these 64 columns have 384 $(6 \times 64)$ bits to be output, a 9-to-384 decoder is required for serial data transfer. A Gray code decoder is used because only one bit flips for each count, which can reduce the power consumption. However, the decoder on the edge of the array needs to be designed as a 9-to-394 decoder because there are 10 extra SAR ADC outputs. Fig. 10 shows the column MUX structure for the SS ADCs and SAR ADCs.
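The decoder widths quoted above follow from the number of outputs to be addressed, and the one-bit-per-count property of the Gray code can be verified directly. The helper names in this sketch are introduced for illustration.

```python
def address_bits(n_outputs: int) -> int:
    """Minimum number of address bits needed to select one of n_outputs lines."""
    return (n_outputs - 1).bit_length()


def bits_flipped(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


print(address_bits(896))     # row decoder
print(address_bits(6 * 64))  # 64 columns of 6-bit codes sharing one pad
print(address_bits(394))     # edge decoder with the extra SAR ADC outputs

# Gray-coded addresses change one bit per count; binary addresses can change many.
gray = [k ^ (k >> 1) for k in range(512)]
print(max(bits_flipped(gray[k], gray[k + 1]) for k in range(511)))
print(max(bits_flipped(k, k + 1) for k in range(511)))
```

Fewer toggling address lines per count means less switched capacitance in the output decoder, which is the power argument made in the text.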

Fig. 11. Die micrograph of the test chip.

IV. EXPERIMENTAL RESULTS

The sensor was fabricated using the TPSCo 65-nm 1P4M process. A die microphotograph of the chip is shown in Fig. 11. The size of the chip is 5 mm × 5 mm. Fig. 12 shows a block diagram of the test setup. On-board low-dropout regulators (LDOs) provide power for the sensor. The 2-to-1 MUX defines which output (the digital output of the device under test (DUT) or the on-board 18-bit ADC output) should be connected to a data grabber [19]. A Genesys FPGA development board [20] is used to provide synchronized control signals for the sensor, on-board DAC, on-board ADC, and data grabber. MATLAB running on a PC finally acquires data from the data grabber for processing.

The pixels used in the chip are GS pixels from TPSCo, and their characterization results have already been reported [15], [16]. The characteristics of the pixels were independently measured using conventional methods and found to be in reasonable agreement with the referenced characteristics, including a CG of 120 μV/e- and an input-referred read noise of 1.9 e-r.m.s. Since this article focuses on ADCs, the measurement results of ADCs are primarily reported and discussed.

The performance of the SAR and SS ADCs was first verified by measuring their linearity. A known ramp signal was fed to the 50-kS/s 6-bit ADCs and used to calculate their differential nonlinearity (DNL) and integral nonlinearity (INL). The LSB of both the SAR ADC and SS ADC is 0.5 mV. The experimental results are shown in Figs. 13 and 14. For the SS ADCs, the peak DNL error was $+0.43/-0.7$ LSB, and the peak INL error was $+0.2/-1.15$ LSB. For the SAR ADCs, the peak DNL error was $+0.88/-0.84$ LSB, and the peak INL error was $+1.3/-2.5$ LSB.

Fig. 13. DNL of the SS and SAR ADCs.
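One common way to obtain DNL and INL from a ramp input is the code-density (histogram) method, sketched below. The article does not state which procedure was used, so this is an assumption: the histogram of output codes from a slow linear ramp is compared with the ideal uniform count, the two end codes are excluded because they collect out-of-range samples, and INL is taken as the running sum of DNL.

```python
def dnl_inl_from_histogram(hist: list[int]) -> tuple[list[float], list[float]]:
    """DNL and INL (in LSB) from a code histogram taken with a linear ramp input."""
    inner = hist[1:-1]               # drop the first and last codes
    ideal = sum(inner) / len(inner)  # expected hits per code for an ideal ADC
    dnl = [h / ideal - 1.0 for h in inner]
    inl = []
    running = 0.0
    for d in dnl:
        running += d                 # INL accumulates the code-width errors
        inl.append(running)
    return dnl, inl


# Toy 5-code histogram: the middle code is twice as wide as its neighbours.
dnl, inl = dnl_inl_from_histogram([50, 10, 20, 10, 50])
print(dnl, inl)
```

A code that never appears in the histogram gives DNL = -1 LSB (a missing code), so the peak DNL values reported above indicate that no codes are missing in either ADC.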

Since there are many columns of SS ADCs in the imager, the column fixed-pattern noise (FPN) was also measured, as shown in Fig. 15. Ten images ($896 \times 960$ pixels) were acquired under uniform light using the SS ADCs operating at 6-bit resolution. Sixty-four of the 1024 columns were dark pixels; therefore, only 960 columns were used here. The column FPN is derived from the average code for each column minus the mean value of the whole image. The measured column FPN is the standard deviation of the average column code divided by 63, which is 0.2384/63 = 0.34%. A sample image, shown in Fig. 16, was also acquired to demonstrate the performance of the imager. This image was obtained by subtracting an image acquired under dark conditions to remove the offset contributed by the readout circuit. One can observe a dark stripe on the right side of the image that corresponds to the dark pixels in the sensor.
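The column FPN calculation described above can be written out as follows. This is a minimal sketch that assumes frames are stored as nested lists of codes and that the population standard deviation is used; the article does not specify either detail, and the function name is introduced here.

```python
from statistics import mean, pstdev


def column_fpn(frames: list[list[list[int]]], full_scale: int = 63) -> float:
    """Column FPN as a fraction of full scale, from frames[frame][row][column]."""
    n_cols = len(frames[0][0])
    # Average code of each column over all rows and all frames.
    col_means = [mean(row[c] for frame in frames for row in frame)
                 for c in range(n_cols)]
    # The standard deviation already removes the mean value of the whole image.
    return pstdev(col_means) / full_scale


# Toy example: two 2x2 frames whose columns average 10 and 12 codes.
toy_frames = [[[10, 12], [10, 12]], [[10, 12], [10, 12]]]
print(f"{column_fpn(toy_frames):.2%}")
```

Averaging over rows and frames suppresses temporal and pixel-level noise so that the remaining spread reflects column-to-column readout mismatch.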

V. DISCUSSION

The main goal of designing the multi-bit test chip was to study the power consumption and layout area tradeoff with two different ADC structures and different bit depths (1 b to 6 bits) in the readout signal path under the condition of constant FC. The focus was on low-bit-depth/variable-bit-depth ADCs because the FWC of a QIS pixel or jot is small. The resolution of the ADCs was the same for all bit depths explored.
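
To make the constant-FC condition concrete, the short Python sketch below tabulates the frame rate against bit depth. It assumes that an $n$-bit readout counts up to $2^n - 1$ photoelectrons per jot and that the 1-bit mode runs at 1000 frames/s, so that the frame rate scales as $1000/(2^n - 1)$; the function name and default value are illustrative only.

```python
def frame_rate_for_constant_fc(n_bits, base_fps=1000.0):
    """Frame rate that keeps flux capacity constant for an n-bit readout.

    Assumption: the maximum count per jot per frame is 2**n_bits - 1
    photoelectrons, and base_fps is the frame rate of the 1-bit mode.
    """
    max_electrons = 2 ** n_bits - 1
    return base_fps / max_electrons


for n in range(1, 7):
    print(f"{n} bit(s): {frame_rate_for_constant_fc(n):8.2f} frames/s")
```

Under these assumptions, the 6-bit mode runs roughly 63 times slower than the 1-bit mode, which is why the ADC conversion rate drops as the bit depth increases.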

To compare ADCs for QIS applications, a figure of merit, $FOM_{QIS}$, is defined as follows:

$$FOM_{QIS} = \frac{\text{ADC power}}{N \times \text{frames/s} \times \text{\# of pixels}} \quad \left( \frac{\text{J}}{\text{b}} \right) \qquad (2)$$

where $N$ is the number of comparator strobes per conversion [6] and $N = 1$ for a single-bit QIS. A lower value of FOM is desired. It is noted that this formula is different from the conventional one [21]

$$FOM_{Conv} = \frac{\text{ADC power}}{\text{conversion rate} \times 2^N} \quad \left( \frac{\text{J}}{\text{b}} \right) \qquad (3)$$

where the conversion rate is typically equivalent to $\#$ of pixels $\times$ frames/s, but the power is divided by the total number of possible codes.
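
The two figures of merit in (2) and (3) can be evaluated with the following Python sketch. The function and parameter names are illustrative, and the numbers in the example are placeholders chosen only to exercise the formulas; they are not measured values from this work.

```python
def fom_qis(adc_power_w, n_strobes, frames_per_s, n_pixels):
    """FOM_QIS from (2): ADC power divided by
    (comparator strobes per conversion x frames/s x number of pixels)."""
    return adc_power_w / (n_strobes * frames_per_s * n_pixels)


def fom_conv(adc_power_w, conversion_rate, n):
    """FOM_Conv from (3): ADC power divided by
    (conversion rate x 2**n), with n playing the role of N in (3)."""
    return adc_power_w / (conversion_rate * 2 ** n)


# Placeholder operating point (hypothetical values, not from Table III).
power_w = 1.0e-3              # total ADC power in watts
pixels = 1024 * 896           # array size of the test chip
fps = 1000.0                  # frames per second
strobes = 1                   # N = 1 for a single-bit QIS

print(f"FOM_QIS  = {fom_qis(power_w, strobes, fps, pixels):.3e} J/b")
print(f"FOM_Conv = {fom_conv(power_w, pixels * fps, 1):.3e} J/b")
```

The only difference between the two functions is the normalization: (2) divides by the number of comparator strobes, whereas (3) divides by the number of possible codes.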
Table III shows the measured power consumption and FOM$_{QIS}$ of the ADCs for bit depths from 1 b to 6 bits under constant FC. According to Table III, the power consumption of the SAR ADCs is significantly reduced as the ADC bit depth is increased (while the conversion rate is lowered consistent with constant FC), whereas the power consumption of the SS ADCs decreases less significantly, as also shown in Fig. 17. The FOM$_{QIS}$ of the SAR ADCs generally decreases with increasing bit depth, whereas the FOM$_{QIS}$ of the SS ADCs increases. Hence, SAR ADCs appear more power efficient than SS ADCs for multi-bit QIS.

In terms of layout area, the column-parallel SS ADCs, which occupy $660 \times 2.5\ \mu\text{m}^2$, are smaller than the column-parallel SAR ADCs, which occupy $1980 \times 2.5\ \mu\text{m}^2$. If the SAR ADCs and the SS ADCs are both implemented in cluster-parallel style, the layout area of the SS ADCs would still be smaller than that of the SAR ADCs.

The two factors, power consumption and layout area, can also be taken into consideration by comparing the power-area product of the SAR ADC and the SS ADC. The power-area products of the 6-bit SAR ADC and the 6-bit SS ADC are $2128\ \mu\text{W} \cdot \mu\text{m}^2$ and $5824\ \mu\text{W} \cdot \mu\text{m}^2$, respectively, confirming an edge for SAR ADCs. However, it is noted that the SS ADC has better INL and DNL than the SAR ADC. It is expected that small INL and DNL may have a minor impact on photon-counting accuracy, but detailed analysis is beyond the scope of this article. Calibration techniques might be used to improve INL and DNL if they are found to be important and could change the conclusion of this work if the power and/or layout area is significantly impacted. FPN for SAR ADCs may be higher than that of SS ADCs, and further investigation of FPN is warranted, as is the impact of FPN on photon-counting accuracy.

If the image sensor resolution is scaled to one gigajot at the same frame rate, the design of the column-parallel SS ADC would be more challenging than that of the SAR ADC because of the much higher clock speed the SS ADCs would require. However, the architecture of the SS ADC is simpler than the SAR ADC, because there is no large capacitor array in the SS ADC. Yet, when the bit depth is low, the size of the DAC array in the SAR ADC is small, still making the SAR ADC a candidate for column-parallel ADCs. In a cluster-parallel architecture, scaling to a gigajot primarily impacts only off-chip data transmission.

If more advanced processes, such as 45 and 28 nm, are used, the pixel (jot) pitch can be reduced. Additionally, the layout area and power consumption of the ADCs can be further reduced. Smaller unit capacitors can be used in the DAC of SAR ADCs with more advanced processes due to better matching. The layout area of SAR ADC can gain more benefits from more advanced processes than that of SS ADCs. The frame rate can also be increased at the same power consumption.

This test chip was designed and implemented to explore the relationship between power consumption and bit depth, which is meaningful for multi-bit QIS readout circuits. Table IV shows a power consumption comparison between the 6-bit SAR ADCs and 6-bit SS ADCs considered here and the results of other published works. Our ADCs achieve state-of-the-art performance. While the bit depth of the ADCs that were explored in this work is lower compared to most of the ADCs in Table IV, the FOM$_{QIS}$ normalizes for this difference.

VI. CONCLUSION

A variable-bit-depth ADC image-sensor test chip for exploring design considerations for a future multi-bit QIS was designed and tested. Five well-known ADC architectures (SAR, SS, flash, pipeline, and cyclic) were studied in terms of power consumption and occupied area. Theoretical calculations indicate that SAR and SS ADCs are both good candidates to be implemented in the readout signal path of a multi-bit QIS imager. The power consumption of the ADCs was measured as a function of ADC bit depth. A constant flux capacity was maintained by setting the frame rate to $1000/(2^n-1)$ frames/s, where $n$ is the ADC bit depth. When the bit depth of the ADCs was increased from 1 b to 6 bits, the power consumption of the two types of ADCs did not change in the same way. The power consumption of the SAR ADCs decreased by a factor of 17.2, while that of the SS ADCs decreased by a factor of 2.5. The power-area product of the 6-bit SAR ADCs is about 1/3 of that of the 6-bit SS ADCs. Therefore, SAR ADCs may be better candidates than SS ADCs for implementation in the readout signal path of a multi-bit QIS, for either column-parallel or cluster-parallel architectures.

ACKNOWLEDGMENT

The authors would like to thank S. Han and L. Shi at Samsung, and A. Lahav and I. Mizuno at TowerJazz Panasonic Semiconductor Co., Ltd., for their support, and former colleagues J. Ma and S. Masoodian, currently at Gigajot Technology, Inc., for their guidance. They would also like to acknowledge proofreading by groupmates K. Anagnost and N. Shade, and the professional manuscript services of American Journal Experts.

REFERENCES

A proposal for a gigapixel digital film sensor (DFS),” in Proc. IEEE
Workshop Charge-Coupled Devices Adv. Image Sensors, Karuizawa,
gain for a quanta image sensor,” IEEE J. Electron Devices Soc., vol. 3,
no. 2, pp. 73–77, Mar. 2015.
tion of quanta image sensor pump-gate jots with deep sub-electron
Nov. 2015.
quanta image sensor: Every photon counts,” Sensors, vol. 16, no. 8,
sensor jot device with shared readout,” IEEE J. Electron Devices Soc.,
pJ/b binary image sensor as a pathfinder for quanta image sensors,”
number-resolving megapixel image sensor at room temperature without
quanta image sensors,” IEEE J. Electron Devices Soc., vol. 1, no. 9,
dissertation, Thayer School Eng., Dartmouth College, Hanover, NH,
USA, 2017.
their FoM-based evaluations,” IEICE Trans. Electron., vol. 101, no. 7,
CMOS sensor,” in Proc. IEEE Workshop CCDs AISs, Bavaria, Germany:
Schloss Elmau, May 2003, pp. 1–6.
shutter pixel and near infrared enhancement with light pipe technology,”
[16] T. Yokoyama, M. Tsutui, Y. Nishi, I. Mizuno, V. Dmitry, and A. Lahav,
“High performance 2.5 μm global shutter pixel with new designed light-
pipe structure,” in Proc. IEEE Int. Electron Devices, San Francisco, CA,
[17] C. Banjii et al., “Method and system to differentially enhance sensor
[18] W. J. Marble and D. T. Comer, “Analysis of the dynamic behavior of a
Available: http://edt.com/product/pcie4-cda/
digilentinc.com/genesys
[22] H.-J. Kim et al., “A delta-readout scheme for low-power CMOS image
sensors with multi-column-parallel SAR ADCs,” IEEE J. Solid-State
[23] J. Lee et al., “High frame-rate VGA CMOS image sensor using non-
memory capacitor two-step single-slope ADCs,” IEEE Trans. Circuits
CMOS image sensor with column-parallel two-stage cyclic analog-
power 12-b SAR/single-slope ADC without calibration method for
CMOS image sensors,” IEEE Trans. Electron Devices, vol. 63, no. 9,
640 fully dynamic CMOS image sensor for always-on operation,” IEEE
sensing capability and column zoom ADCs,” IEEE Sensors J., vol. 20,
[28] S. Xie and A. Theuwissen, “A 10 bit 5 MS/s column SAR ADC with
digital error correction for CMOS image sensors,” IEEE Trans. Circuits
sensor applications,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS),
Yibing M. Wang (Senior Member, IEEE) received the B.S. and M.S. degrees in electronic engineering from Tsinghua University, Beijing, China, in 1992 and 1994, respectively, and the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, CA, USA, in 2001.

She joined Photobit Corp. (acquired by Micron Imaging Group), Pasadena, CA, USA, in 1998, working on high dynamic range image sensors. In 2001, she was with Centellax Inc. (acquired by Microsemi Corp.), Santa Rosa, CA, USA, working on high-speed optical transceiver and SERDES designs. In 2002, she joined Forza Silicon Corp. (acquired by AMETEK, Inc.), Pasadena, CA, USA, designing medical, automotive, and cinematography sensors. In 2008, she joined Hynix Semiconductor, San Jose, CA, USA, working on mobile image sensor designs. She joined Samsung Semiconductor, Inc., in 2011, working on advanced image sensors, 3-D sensors, LiDAR, and optical interconnects. She holds more than 40 issued U.S. patents.

Eric R. Fossum (Fellow, IEEE) is currently the Krehbiel Professor for Emerging Technologies with the Thayer School of Engineering at Dartmouth, Hanover, NH, USA. He is also the Primary Inventor of the CMOS image sensor used in smartphones and many other applications. He is currently exploring the quanta image sensor. He is a Queen Elizabeth Prize for Engineering Laureate and a member of the National Academy of Engineering. He was inducted into the National Inventors Hall of Fame. He cofounded the Int. Image Sensor Society and served as its first President. He holds over 170 U.S. patents. He has coauthored over 300 publications.