

# RNS based Programmable Decimation Filter for Multi-Standard Wireless Transceivers

Shahana T. K.<sup>1</sup>, Babita R. Jose<sup>2</sup>, Rekha K. James<sup>3</sup>,  
K. Poulose Jacob<sup>4</sup>, and Sreela Sasi<sup>5</sup>, Non-members

## ABSTRACT

Current research on radio frequency transceivers focuses on multi-standard architectures to attain higher system capacities and data rates. Multiple communication standards are made adaptable by performing channel select filtering on chip at baseband in the digital domain. The computationally intensive decimation filter in a sigma-delta analog-to-digital converter plays an important role in channel selection for multi-mode systems. As these architectures are targeted at portable applications, an area and power efficient reconfigurable implementation is an implicit requirement. To this end, a multi-stage, programmable decimation filter based on the residue number system (RNS) that is adaptable to the WCDMA and WLAN standards is presented in this paper. Multi-stage decimation filter implementation offers low computational complexity and power dissipation. The FIR filters of the multi-stage decimator operating in the RNS domain offer a high data rate because of the carry-free operations on smaller residues in parallel channels. Further power saving is achieved by reconfiguring the hardware architecture and powering down the unused blocks in each mode of operation. For increased programmability, modulo multiplication is performed by index addition, utilizing the arithmetic benefits associated with Galois fields. Finally, the proposed RNS based decimation filter is compared with a traditional binary implementation in terms of area, critical path delay and power dissipation.

**Keywords:** multi-standard transceivers, sigma-delta ADC, decimation filter, residue number system, index addition

## 1. INTRODUCTION

The demand for higher system capacities and data rates has led to the rapid development of wireless communication systems that allow the coexistence of multiple standards. Reduction in cost and power in the implementation of multi-standard wireless transceivers is drawing market and research interest. Multi-standard operation is achieved by using a receiver architecture that performs channel selection on chip at baseband. The adaptability to different communication standards is achieved by processing the received signals in the digital domain. Sigma-delta analog-to-digital converters (SD-ADCs) are widely used in wireless systems because of their superior linearity, robustness to circuit imperfections, inherent resolution-bandwidth trade-off and increased programmability in the digital domain [1]. The SD-ADC consists of a sigma-delta modulator and a decimation filter. The sigma-delta modulator relies on over-sampling to achieve high resolution. It shapes the noise spectrum so that most of the noise energy is pushed to the high frequency end. This allows noise suppression at low frequencies and provides a high signal-to-noise ratio (SNR) in the signal band. The decimation filter removes the out-of-band quantization noise as well as the blocking and interference signals. It also reduces the sampling rate from the over-sampled frequency of the modulator to the Nyquist rate of the channel. A programmable decimation filter is required in a multi-standard transceiver to adapt to the channel bandwidths, sampling rates, carrier to noise (C/N) ratio, and blocking and interference profiles needed for different communication standards [2].

Manuscript received on December 25, 2007; revised on April 2, 2008.

<sup>1,2,3,4</sup> The authors are with Cochin University of Science and Technology, Kochi, Kerala-682022, India, E-mail: shahanatk@cusat.ac.in, babitajose@cusat.ac.in, rekha-james@cusat.ac.in and kpj@cusat.ac.in

<sup>5</sup> The author is with Department of Computer & Information Science Gannon University, Erie, Pennsylvania, USA, E-mail: sasi001@gannon.edu

Several researchers have addressed the design issues of decimation filters for multi-standard wireless transceivers. A fifth order comb decimation filter with programmable decimation ratios and sampling rates for GSM and DECT standards is presented in [3]. The design and implementation of digital filter processors that can be used as downsamplers in wireless transceivers is detailed in [4]. A low complexity decimation filter architecture is presented in [5] by using infinite impulse response (IIR) filters implemented by an all-pass sum that avoids multiplications. A low-power, high linearity variable gain amplifier (VGA) embedded in a multi-standard receiver that meets the standard requirements is reported in [6]. Decimation filter design for GSM, WCDMA, 802.11a, 802.11b, 802.11g and WiMAX standards is given in [7]. A decimation filter structure based on cascaded integrator comb (CIC) filters and polynomial interpolation filters to perform fractional sample rate conversion is presented in [8]. A digital IF down-converter with quadrature sampling based on a polyphase filter, a high rate CIC filter and interpolation filters, and compatible with WCDMA and EDGE, is demonstrated in [9]. Multi-rate digital filters and fractional frequency conversion techniques are adopted to implement the front end of a dual-mode receiver for WCDMA/cdma2000 in [10]. A fast RNS field programmable logic (FPL) based communication receiver design and implementation is presented in [11].

In previous work [12], the authors investigated FIR filter design in RNS and with traditional techniques. The critical path delay and area requirement for implementing filters of different orders were analyzed in that work. The speed-up and area reduction of the RNS implementation are utilized in the proposed programmable decimation filter. In addition, the preliminary section is improved by eliminating the forward converter, as the front end of the proposed architecture is a sigma-delta ADC. This reduces the area by about 10% compared to a traditional RNS filter. Also, the modulo multiplication is performed by the index calculus approach to achieve the increased programmability required for multi-mode operation.

The principal contribution of this paper is the design and implementation of a programmable RNS based multi-stage decimation filter for a dual-mode WCDMA and WLAN receiver. This technique differs fundamentally from the implementation proposed in [11] in two critical respects. Firstly, this technique addresses the problem of multi-standard decimation filtering. Secondly, and more importantly, since the implementation is multi-rate, the subsequent filters operate at lower sampling rates. The specifications of the individual filters are relaxed to reduce the filter order, while the multi-stage structure meets the overall decimator specifications. Thus it reduces power consumption compared to the single stage implementation given in [11]. Furthermore, the forward converter is eliminated in the proposed architecture by suitably selecting the moduli set.

The rest of the paper is organized as follows: Section 2 gives the RNS basics, modulo multiplication by index arithmetic and the general RNS FIR filter architecture. Section 3 describes the receiver architecture and sigma-delta modulator suitable for multi-standard operation. Section 4 presents the RNS based programmable multistage decimation filter structure with design specifications for the WCDMA/WLAN standards. Section 5 presents the simulation results obtained for the dual-mode decimation filter. The area requirement and critical path delay are tabulated and compared with the traditional FIR filter implementation. Finally, Section 6 gives the conclusion.

## 2. BACKGROUND

### 2.1 RNS Basics

RNS is defined by a set of 'r' relatively prime integers ( $m_1, m_2, \dots, m_r$ ) which are called the moduli. Any integer 'X' in the interval of  $[0, M)$  can be represented as a set of 'r' residues  $(x_1, x_2, \dots, x_r)$ , where  $x_i = X \bmod m_i$  and  $M = \prod_{i=1}^r m_i$ . 'M' is defined as the dynamic range of the number system. Negative numbers can be represented by partitioning the dynamic range into two sets. A common assignment is that all numbers  $X, 0 \leq X \leq (M/2)-1$  are considered positive and the rest are considered as negative. In RNS, arithmetic operations are computed by the formula:

$$(x_1, x_2, \dots, x_r) \Theta (y_1, y_2, \dots, y_r) = (z_1, z_2, \dots, z_r) \quad (1)$$

where  $z_i = |x_i \Theta y_i|_{m_i}$  and  $\Theta$  denotes one of the operations of addition, subtraction or multiplication [13]. Thus arithmetic operations on residues can be performed in parallel without any carry propagation among the residue digits. However, before performing any operations on residues the number is converted from binary to residue by performing modulo operations with respect to each modulus in the moduli set. The process of translating a binary integer  $X$ , in the range 0 to  $M-1$ , to the residue representation  $(x_1, x_2, \dots, x_r)$  with respect to a relatively prime moduli set  $(m_1, m_2, \dots, m_r)$  is called forward conversion. Getting back to the weighted representation of 'X' from a given residue representation is referred to as reverse conversion.
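The channel-wise rule in (1) can be illustrated with a minimal Python sketch. The small moduli set (3, 5, 7) is chosen here only for readability and is not the set used in the proposed filter; the function names are illustrative.

```python
from math import prod

MODULI = (3, 5, 7)      # illustrative, pairwise relatively prime moduli
M = prod(MODULI)        # dynamic range of this toy system (105)

def to_rns(x, moduli=MODULI):
    """Forward conversion: integer -> tuple of residues x_i = X mod m_i."""
    return tuple(x % m for m in moduli)

def rns_add(a, b, moduli=MODULI):
    """Channel-wise modular addition; no carry crosses between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

def rns_mul(a, b, moduli=MODULI):
    """Channel-wise modular multiplication."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))

# 17 + 23 = 40 and 8 * 12 = 96 both stay inside [0, M), so the
# residue-domain results match the residues of the true results.
print(rns_add(to_rns(17), to_rns(23)) == to_rns(40))
print(rns_mul(to_rns(8), to_rns(12)) == to_rns(96))
```

Each channel works on a residue of only a few bits, which is why the operations can run in parallel without carry propagation between them.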

The reverse conversion can be done using either Mixed Radix Conversion (MRC) or the Chinese Remainder Theorem (CRT). MRC is usually used for sign determination, magnitude comparison and overflow detection, while CRT is better suited to generating the binary number directly from its residues. CRT is based on the general formula:

$$X = \left| \sum_{i=1}^r a_i x_i \hat{M}_i \right|_M, \text{ where } \hat{M}_i = \frac{M}{m_i} \text{ and } a_i = \left| \frac{1}{\hat{M}_i} \right|_{m_i} \quad (2)$$

The CRT implementation presented in [14] uses 'r' ROMs to store the precomputed values  $|a_i x_i \hat{M}_i|_M$  each being indexed by  $x_i$ . This reduces CRT implementation to a summation of 'r' values followed by modulo correction with 'M' using carry save adder stages and a final carry propagate adder. The overall hardware including the size of ROM stage used in the design is further reduced in [15]. This is achieved by selecting one of the moduli of the form  $2^n$  for the direct availability of the least significant 'n' bits of the binary number.
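A software sketch of (2) may help fix the notation. It computes the sum directly rather than modelling the ROM and carry save adder hardware of [14] or [15]; the moduli set (5, 7, 8) and the helper names are assumptions made for illustration, and the signed mapping assumes an even $M$.

```python
from math import prod

def crt_reverse(residues, moduli):
    """Reverse conversion by the CRT formula (2)."""
    M = prod(moduli)
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_hat = M // m_i
        a_i = pow(M_hat, -1, m_i)   # multiplicative inverse of M_hat mod m_i (Python 3.8+)
        total += a_i * x_i * M_hat
    return total % M

def to_signed(x, M):
    """Map [0, M) to signed values: 0 .. M/2 - 1 positive, the rest negative."""
    return x if x <= M // 2 - 1 else x - M

moduli = (5, 7, 8)                              # illustrative set, M = 280
M = prod(moduli)
residues = tuple(-10 % m for m in moduli)       # residues of the negative number -10
print(to_signed(crt_reverse(residues, moduli), M))
```

Because one modulus is a power of two, the residue for that modulus already equals the least significant bits of the result, which is the property exploited in [15].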

### 2.2 RNS Multiplier using Index Calculus

An algebraic field with a finite number of elements is called a finite field or Galois field. There are two types of Galois fields: prime fields  $GF(p)$  and polynomial fields  $GF(p^m)$ , where  $p$  is a prime number and  $m$  is any positive integer. A prime field has the property that all its non-zero elements can be generated by non-negative integer powers of certain elements, say  $g$ , called primitive roots. This property is exploited to perform multiplication over  $GF(p)$  using the isomorphism between the multiplicative group  $\{q_n\} = \{1, 2, \dots, p-1\}$ , with multiplication modulo  $p$ , and the additive group  $\{i_n\} = \{0, 1, \dots, p-2\}$ , with addition modulo  $(p-1)$  [16]. The relatively prime moduli in an arbitrary moduli set take any of the three forms  $p$ ,  $2^m$ , and  $p^m$ , or a value with any of these as a factor, where  $p$  is prime and  $m$  is a positive integer. A number theoretic approach shows that the sets of integers modulo  $p$ ,  $2^m$ , and  $p^m$  form the Galois field  $GF(p)$  and the integer rings  $Z_{2^m}$  and  $Z_{p^m}$  respectively. For a prime modulus, the normal index mapping in  $GF(p)$  is  $q_n = |g^{i_n}|_p$ . Multiplication of two numbers  $q_j$  and  $q_k$  is performed by adding their indices  $i_j$  and  $i_k$  modulo  $(p-1)$ , and then performing the inverse index operation. Hence, by the index calculus approach, the product is represented as

$$|q_j q_k|_p = \left| g^{|i_j + i_k|_{p-1}} \right|_p \quad (3)$$
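The index mapping and (3) can be sketched with two look-up tables, which in hardware would correspond to index and inverse index ROMs. The sketch uses $p = 29$ with $g = 2$, a pair listed in Table 2; the function names and the explicit zero check are illustrative assumptions.

```python
def build_index_tables(p, g):
    """Return (index, antilog): index[q] = i with g**i = q (mod p), antilog[i] = g**i mod p."""
    antilog = [pow(g, i, p) for i in range(p - 1)]
    index = {q: i for i, q in enumerate(antilog)}
    if len(index) != p - 1:
        raise ValueError("g is not a primitive root modulo p")
    return index, antilog

def index_mul(a, b, p, index, antilog):
    """Multiply modulo p by adding indices modulo (p - 1), as in (3)."""
    a %= p
    b %= p
    if a == 0 or b == 0:        # zero has no index, so it is handled separately
        return 0
    return antilog[(index[a] + index[b]) % (p - 1)]

index, antilog = build_index_tables(29, 2)
print(index_mul(17, 23, 29, index, antilog) == (17 * 23) % 29)
```

The multiplier is thus reduced to two table look-ups and one modulo $(p-1)$ adder, and changing a coefficient only changes the index that is loaded.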

The elements of the integer ring  $Z_{2^m}$  are represented by a triplet index code  $\langle \alpha, \beta, \gamma \rangle$  as detailed in [17, 18]. Any integer  $X \in \{1, 2, \dots, 2^m - 1\}$  can be coded using the triplet index set as

$$X = 2^\alpha |5^\beta (-1)^\gamma|_{2^m} \quad (4)$$

where  $\alpha \in \{0, 1, \dots, m-1\}$ ,  $\beta \in \{0, 1, \dots, (2^{m-2}-1)\}$  and  $\gamma \in \{0, 1\}$ . Multiplication of two integers  $X_1$  and  $X_2$  is carried out as follows:  $X_1, X_2 \in Z_{2^m}$  where  $X_1 \neq 0$ ,  $X_2 \neq 0$ ,  $X_1 = 2^{\alpha_1} |5^{\beta_1} (-1)^{\gamma_1}|_{2^m}$  and  $X_2 = 2^{\alpha_2} |5^{\beta_2} (-1)^{\gamma_2}|_{2^m}$ . Then the product is given by

$$|X_1 X_2|_{2^m} = 2^{\alpha_1 + \alpha_2} |5^{\beta_1 + \beta_2} (-1)^{\gamma_1 + \gamma_2}|_{2^m} \quad (5)$$

The index addition is performed with the following constraints:  $\beta_1$  and  $\beta_2$  are added modulo  $2^{m-2}$ ,  $\gamma_1$  and  $\gamma_2$  are added modulo 2, and  $\alpha_1$  and  $\alpha_2$  are added in normal binary mode. When the sum of ' $\alpha$ ' indices is equal to  $(m-1)$  the corresponding ' $\beta$ ' and ' $\gamma$ ' are made 'zero', and when the sum exceeds  $(m-1)$ , the final product is made 'zero'.
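The triplet index rules can be checked with a short Python sketch for $m = 6$ (modulus 64). The look-up table construction and the function names are assumptions made for illustration; the index addition rules follow (4), (5) and the constraints stated above.

```python
M_BITS = 6                        # modulus 2**6 = 64
MOD = 1 << M_BITS
BETA_MOD = 1 << (M_BITS - 2)      # beta indices are added modulo 2**(m-2)

# Look-up table: odd residue -> (beta, gamma), enumerating 5**beta * (-1)**gamma mod 2**m.
ODD_INDEX = {}
for beta in range(BETA_MOD):
    for gamma in (0, 1):
        ODD_INDEX[(pow(5, beta, MOD) * (-1) ** gamma) % MOD] = (beta, gamma)

def encode(x):
    """Triplet index (alpha, beta, gamma) of a nonzero residue x, as in (4)."""
    alpha = (x & -x).bit_length() - 1      # number of trailing zero bits
    beta, gamma = ODD_INDEX[x >> alpha]    # indices of the odd part
    return alpha, beta, gamma

def decode(alpha, beta, gamma):
    return ((1 << alpha) * pow(5, beta, MOD) * (-1) ** gamma) % MOD

def triplet_mul(x1, x2):
    """Multiply modulo 2**m by index addition, as in (5)."""
    x1 %= MOD
    x2 %= MOD
    if x1 == 0 or x2 == 0:                 # zero has no index code
        return 0
    a1, b1, g1 = encode(x1)
    a2, b2, g2 = encode(x2)
    alpha = a1 + a2                        # normal binary addition
    if alpha > M_BITS - 1:                 # product is a multiple of 2**m
        return 0
    beta = (b1 + b2) % BETA_MOD
    gamma = g1 ^ g2                        # modulo 2 addition (XOR)
    if alpha == M_BITS - 1:                # beta and gamma no longer matter
        beta = gamma = 0
    return decode(alpha, beta, gamma)

print(triplet_mul(12, 23) == (12 * 23) % 64)
```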

Similarly, the elements of the integer ring  $Z_{p^m}$ , where  $p$  is odd, are represented by an index pair  $\langle \alpha, \beta \rangle$  [17, 18] and any integer  $X$  is represented by

$$X = |g^\alpha p^\beta|_{p^m} \quad (6)$$

where  $g$  is the primitive root of  $p$ . The modulo multiplication of  $X_1$  and  $X_2$  in this ring is carried out as follows:  $X_1, X_2 \in Z_{p^m}$  where  $X_1 \neq 0$ ,  $X_2 \neq 0$ ,  $X_1 = |g^{\alpha_1} p^{\beta_1}|_{p^m}$  and  $X_2 = |g^{\alpha_2} p^{\beta_2}|_{p^m}$ . Then the product is given by

$$|X_1 X_2|_{p^m} = |g^{\alpha_1 + \alpha_2} p^{\beta_1 + \beta_2}|_{p^m} \quad (7)$$

The index additions are performed subject to the following constraints:  $\alpha_1$  and  $\alpha_2$  are added modulo  $\phi(p^m)$ , where  $\phi(p^m) = (p-1)p^{m-1}$ , and  $\beta_1$  and  $\beta_2$  are added in normal binary mode. When the sum of ' $\beta$ ' indices exceeds  $(m-1)$ , the final product is made 'zero'.
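A corresponding sketch for the index pair representation uses $p = 5$, $m = 2$ and $g = 2$, the values quoted in Section 4 for modulus 25. The table construction and function names are illustrative assumptions.

```python
P, M_EXP, G = 5, 2, 2
MOD = P ** M_EXP                        # 25
PHI = (P - 1) * P ** (M_EXP - 1)        # phi(p**m) = 20

# Look-up table: unit residue -> alpha, with G**alpha = unit (mod p**m).
UNIT_INDEX = {pow(G, a, MOD): a for a in range(PHI)}

def encode(x):
    """Index pair (alpha, beta) of a nonzero residue x, as in (6)."""
    beta = 0
    while x % P == 0:                   # strip the factors of p
        x //= P
        beta += 1
    return UNIT_INDEX[x], beta

def pair_mul(x1, x2):
    """Multiply modulo p**m by index addition, as in (7)."""
    x1 %= MOD
    x2 %= MOD
    if x1 == 0 or x2 == 0:              # zero has no index pair
        return 0
    a1, b1 = encode(x1)
    a2, b2 = encode(x2)
    beta = b1 + b2                      # normal binary addition
    if beta > M_EXP - 1:                # product is a multiple of p**m
        return 0
    alpha = (a1 + a2) % PHI             # addition modulo phi(p**m)
    return (pow(G, alpha, MOD) * P ** beta) % MOD

print(pair_mul(15, 7) == (15 * 7) % 25)
```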

### 2.3 RNS FIR Filter Architecture

An FIR filter is described by (8), where  $X(n)$  is the input to the filter,  $H(k)$  represents the filter coefficients,  $N$  is the order of the filter and  $Y(n)$  is the output from the filter.

$$Y(n) = \sum_{k=0}^N H(k)X(n-k) \quad (8)$$

For a very large  $N$ , filters implemented in the traditional binary weighted number system suffer from the carry propagation delay in binary adders and multipliers. In high data rate wireless systems such as WCDMA and WLAN, a large number of multiplications and additions must be executed during a short sample interval. The achievable data rate is therefore seriously limited by the speed of the filter.

In RNS a large integer is broken into smaller residues which are independent of each other. Each residue digit is processed in parallel without carry propagation from one to another. This leads to significant speed up of multiply and accumulate (MAC) operations which in turn results in high data rate for RNS based FIR filters [12, 19]. Fig. 1 shows the general block diagram of RNS based FIR filter. For the moduli set  $(m_1, m_2, \dots, m_r)$ , there will be ' $r$ ' parallel filter channels, which process the signals from the forward converter. The forward converter is shown in dotted lines as it is not used in the proposed design by suitably selecting the moduli set. Finally, the reverse converter combines the signals from all parallel channels and puts the output in binary form.
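The structure of Fig. 1 can be mimicked in software as a behavioural sketch: one modular FIR channel per modulus, followed by a CRT based reverse converter. The moduli set is the one used later in the paper; the coefficient and input values, and all function names, are hypothetical and chosen only to keep the result well inside the dynamic range.

```python
from math import prod

MODULI = (25, 29, 31, 37, 43, 47, 59, 64)
M = prod(MODULI)

def fir_channel(h, x, m):
    """Equation (8) evaluated entirely modulo m (one parallel RNS channel)."""
    h_res = [c % m for c in h]
    y = []
    for n in range(len(x)):
        acc = 0
        for k, c in enumerate(h_res):
            if n - k >= 0:
                acc = (acc + c * (x[n - k] % m)) % m   # modular MAC
        y.append(acc)
    return y

def crt_signed(residues):
    """Reverse conversion by CRT, then mapping to a signed value."""
    total = 0
    for r, m in zip(residues, MODULI):
        M_hat = M // m
        total += pow(M_hat, -1, m) * r * M_hat
    total %= M
    return total if total < M // 2 else total - M

def rns_fir(h, x):
    channels = [fir_channel(h, x, m) for m in MODULI]
    return [crt_signed(res) for res in zip(*channels)]

def direct_fir(h, x):
    """Reference: the same filter in ordinary integer arithmetic."""
    return [sum(c * x[n - k] for k, c in enumerate(h) if n - k >= 0)
            for n in range(len(x))]

h = [3, -5, 8, -5, 3]            # hypothetical integer coefficients
x = [1, -2, 7, 0, -8, 4, 5]      # hypothetical signed input samples
print(rns_fir(h, x) == direct_fir(h, x))
```

The sketch reduces the inputs modulo each modulus for generality; in the proposed design this step is unnecessary because the 4-bit modulator output is already smaller than every modulus.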

## 3. RECEIVER ARCHITECTURE FOR MULTI-STANDARD OPERATION

Multi-standard operation is achieved by performing channel select filtering on chip at baseband [20].



**Fig.1:** RNS based FIR filter

The baseband channel selection is performed in the digital domain. This allows programmability to adapt to the channel bandwidths, sampling rates, carrier to noise (C/N) ratio, and blocking and interference profiles needed for multiple communication standards. The direct conversion homodyne receiver architecture shown in Fig. 2 is suitable for high integration and multi-standard operation [2]. It is capable of multi-standard operation because channel select filtering is done at baseband. The noise and DC offset created at the output of the mixer must be reduced to achieve adequate dynamic range. A wideband, high dynamic range sigma-delta ( $\Sigma\Delta$ ) modulator is used to digitize both the desired signal and potentially stronger adjacent channel interferers. Designing a highly linear  $\Sigma\Delta$  modulator for multi-standard operation that can achieve high resolution over a wide variety of bandwidth requirements remains challenging. A reconfigurable ADC [21, 22] is a promising solution to keep the power dissipation as low as possible.

Multi-stage noise shaping (MASH) structures can be adopted for the  $\Sigma\Delta$  modulator in view of their stability and reconfigurability. The theoretical dynamic range has been used in conjunction with the implementation attributes to choose the optimal topology for the different RF standards. The dynamic range (DR) of a  $\Sigma\Delta$  modulator [1] is given by

$$DR = \frac{3}{2} \frac{2L+1}{\pi^{2L}} M^{2L+1} (2^B - 1)^2 \quad (9)$$

where  $L$  is the order of the modulator,  $M$  is the over-sampling ratio (OSR), and  $B$  is the number of bits of the quantizer. For WCDMA and WLAN, the dynamic range requirements are chosen as 79 dB and 69 dB respectively. In order to meet the DR requirement demanded by WCDMA, a fourth order cascaded MASH topology is sufficient with a single-bit quantizer and an OSR of 16. A fifth order topology is a good compromise to achieve the required DR for WLAN, with a 4-bit quantizer and an OSR of 8. The sigma-delta modulator is followed by a programmable decimation filter operating in the digital domain. The proposed work focuses on the design of a programmable multistage decimation filter for the WCDMA/WLAN standards, which is highlighted in Fig. 2.
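As a quick numerical check of (9), the sketch below evaluates the theoretical dynamic range in dB for the two modulator configurations quoted above. The function name is illustrative, and the conversion to dB as $10 \log_{10}(DR)$ is an assumption about how the 79 dB and 69 dB targets are expressed.

```python
import math

def dynamic_range_db(L, M, B):
    """Theoretical dynamic range of (9), expressed in dB."""
    dr = 1.5 * (2 * L + 1) / math.pi ** (2 * L) * M ** (2 * L + 1) * (2 ** B - 1) ** 2
    return 10 * math.log10(dr)

print(f"WCDMA (L=4, OSR=16, B=1): {dynamic_range_db(4, 16, 1):.1f} dB")
print(f"WLAN  (L=5, OSR=8,  B=4): {dynamic_range_db(5, 8, 4):.1f} dB")
```

Under these assumptions the WCDMA configuration evaluates to about 80 dB and the WLAN configuration to about 85 dB, both above the stated requirements.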



**Fig.2:** Direct conversion homodyne receiver architecture

## 4. DESIGN OF RNS BASED DECIMATION FILTER PROGRAMMABLE FOR WCDMA/WLAN

The specifications for WCDMA and WLAN standards and the corresponding decimation filter design parameters are given in Table 1. In order to set the parameters for decimation filter, the receiver specifications and the blocking and interference profiles are defined first. The interference signals are to be limited within a certain range for proper reception of the desired signals. The decimation filter is generally designed to minimize the undesired signals in the desired band of operation. The output C/N ratio is calculated from the bit error rate (BER) of each standard and the modulation scheme used. The passband frequency edge is taken as 80% of the bandwidth. The passband ripples are chosen to minimize signal distortions in the signal band. The stopband attenuations are selected according to the interference profile and C/N ratio for each standard.

The decimation filter consists of a lowpass filter and a down-sampler. The decimator can be implemented by either a single stage or a multi-stage approach. A single stage implementation of the sampling rate converter (SRC) requires an excessively large filter order to meet the specification. The power consumption of the filter depends on the number of taps as well as the rate at which it operates. So the computational complexity is high for a single stage decimation filter, and it consumes much power. This can be overcome by the multistage approach. Implementing the decimation filter in several stages reduces the total number of filter coefficients. The filters operating at higher sampling rates have wider transition bands, and the filters with narrower transition bands operate at reduced sampling frequencies. Consequently, the hardware complexity and computational effort are reduced in the multistage approach, which leads to low power consumption. To prevent aliasing in the overall decimation process, the individual filter of each stage is to be designed within the frequency band of interest. The cutoff frequency for the first stage can be less constraining, but the final stage filter operating at the lower sampling rate is responsible for attaining the overall filter specifications. For stage ' $i$ ', the passband is  $0 \leq F \leq F_{pc}$ , where  $F_{pc}$  is the passband edge. If  $F_{i-1}$  and  $F_i$  are the input and output sampling frequencies for stage ' $i$ ', and  $F_{sc}$  is the stopband edge, the transition band for stage ' $i$ ' is  $F_{pc} \leq F \leq F_i - F_{sc}$  and the stopband is  $(F_i - F_{sc}) \leq F \leq (F_{i-1}/2)$ .

**Table 1:** Standard specifications and design parameters for decimation filter

| Specification                   | WCDMA                           | WLANa                |
|---------------------------------|---------------------------------|----------------------|
| Frequency range(GHz)            | DL: 2.11-2.17<br>UL: 1.92-1.98  | 5.15-5.35            |
| Channel Spacing                 | 5 MHz                           | 20 MHz               |
| Data rate                       | 3.84 Mchips/s                   | 12 Msymbols/s        |
| OSR                             | 16                              | 8                    |
| Input sampling frequency, $F_s$ | 61.44 MHz                       | 96 MHz               |
| Passband edge                   | 2 MHz                           | 8 MHz                |
| Stopband edge                   | 2.5 MHz                         | 10 MHz               |
| Offset frequency (MHz) :        | 5 : -63<br>10 : -56<br>12.5:-44 | 20 : -63<br>40 : -47 |
| C/N ratio                       | 7.2 dB                          | 28 dB                |
| Passband ripple                 | 0.5 dB                          | 0.5 dB               |
| Stopband attenuation            | 55 dB                           | 44 dB                |

The OSR is selected as 16 for WCDMA and 8 for WLAN to meet the DR requirements. The decimation filter is implemented in 3 stages with decimation factors of 4, 2 and 2 for WCDMA, and in 2 stages with decimation factors of 4 and 2 for WLAN. The Remez Parks-McClellan optimal equiripple FIR filter is chosen for implementation. The filter orders obtained for WCDMA are 14, 11 and 37 for the first, second and third stages respectively. For WLAN, the filter orders are 33 and 25 respectively. The block diagram of the programmable decimation filter is shown in Fig. 3, where N1, N2 and N3 denote the filter orders in each mode. The third filter operates only in WCDMA mode and is bypassed in WLAN mode. The first 14 MAC units of the first stage and the first 11 MAC units of the second stage are shared by both modes. The unused hardware in each mode is bypassed to save power.
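The stage-wise sampling rates and band edges implied by these decimation factors can be tabulated with a small sketch, using the band-edge expressions given above for stage $i$ and the Table 1 values. One assumption is made and flagged in the code: for the final stage, where $F_i - F_{sc}$ would fall below the passband edge, the overall stopband edge $F_{sc}$ is used directly, in line with the statement that the final stage attains the overall specification. The function and key names are illustrative.

```python
def stage_plan(f_s, factors, f_pc, f_sc):
    """Per-stage sampling rates and band edges, all in MHz."""
    plan = []
    f_in = f_s
    for idx, d in enumerate(factors):
        f_out = f_in / d
        is_last = idx == len(factors) - 1
        # Assumption: the final stage uses the overall stopband edge directly.
        stop_start = f_sc if is_last else f_out - f_sc
        plan.append({
            "f_in": f_in,
            "f_out": f_out,
            "passband": (0.0, f_pc),
            "transition": (f_pc, stop_start),
            "stopband": (stop_start, f_in / 2),
        })
        f_in = f_out
    return plan

wcdma = stage_plan(61.44, (4, 2, 2), f_pc=2.0, f_sc=2.5)
wlan = stage_plan(96.0, (4, 2), f_pc=8.0, f_sc=10.0)

for name, plan in (("WCDMA", wcdma), ("WLAN", wlan)):
    for i, stage in enumerate(plan, start=1):
        print(name, "stage", i, stage)
```

The early stages get wide transition bands and can therefore use low filter orders at the high input rates, while the narrow transition band is left to the last stage, which runs at the lowest input rate.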

The FIR filters used in all three stages are implemented in a residue number system defined by the moduli set (25, 29, 31, 37, 43, 47, 59, 64), which provides a 43-bit dynamic range. A key point in the design of an RNS filter is the choice of a proper moduli set. The dynamic range required for the RNS is decided based on the values of the filter coefficients and the maximum possible output from the filter. The filter coefficients are taken with 14-bit accuracy. The first stage filter receives a maximum of 4 bits from the sigma-delta modulator as the input. The moduli set is selected to give sufficient dynamic range such that there exists a unique representation for each possible value of the filter output. The moduli set chosen for the RNS affects both the representational efficiency and the complexity of the arithmetic algorithms. Since the magnitude of the largest modulus decides the speed of arithmetic operations, all the remaining moduli can be chosen so that they are comparable with the largest one. The proposed work uses index transform based multipliers to reduce the complexity of the modular multipliers, which are ideally suited for prime and power of prime moduli [16]. In the selected moduli set, the prime moduli are 29, 31, 37, 43, 47 and 59, and the power of prime moduli are 25 and 64. The modulus 64 is of the form  $2^n$ , so including it in the moduli set simplifies the reverse converter [15]. The modulo operations on  $m = 64$  are easily implemented by normal binary operations limited to the least significant 6 bits. Moduli of the form  $2^n-1$  are also desirable, as modulo addition is easily performed by an n-bit binary adder with end-around carry [13]. As the input to the filter has a maximum of 4 bits and the moduli set consists of 5-bit and 6-bit numbers, no forward converter is required in the proposed filter. The reverse converter at the last stage converts the filtered outputs from the parallel channels to binary form. A filter channel corresponding to modulus ' $m_i$ ' of the first stage is shown in Fig. 4, where  $\alpha$  and  $\beta$  represent modulo multiplication and addition respectively. A demultiplexer is used at the input to load the filter coefficients sequentially for each mode or to distribute the input through the register chain, as shown in Fig. 4. The filter structure is made reconfigurable for WCDMA/WLAN using switch 'S' and a multiplexer, leading to power saving. In each stage, the outputs from the multipliers are combined using modulo adder trees. The filtered output corresponding to each mode is selected using a multiplexer.

Modulo multiplication is performed by index calculus. In the selected moduli set (25, 29, 31, 37, 43, 47, 59, 64), the moduli 29, 31, 37, 43, 47 and 59 are prime numbers and perform multiplication by index addition in the corresponding Galois field  $GF(p)$ . The primitive roots used for generating the Galois fields for these numbers are shown in Table 2. The modulus 25 is a power of a prime number, denoted as  $p^m$ , with  $p = 5$ ,  $m = 2$ , primitive root  $g = 2$  and  $\phi(p^m) = 20$ . So any integer  $X$  in this ring is represented as  $X = |2^\alpha 5^\beta|_{25}$ , where  $\alpha \in \{0, 1, \dots, 19\}$  and  $\beta \in \{0, 1\}$ . Multiplication is performed by addition of the  $\alpha$  and  $\beta$  indices in the integer ring  $Z_{p^m}$ . The modulus 64, being a power of 2, forms an integer ring of the form  $Z_{2^m}$  where each number is represented by a triplet index code  $\langle \alpha, \beta, \gamma \rangle$ . Here multiplication is done by normal binary addition for the  $\alpha$  indices, modulo 16 addition for the  $\beta$  indices, and modulo 2 addition with an XOR gate for the  $\gamma$  indices. Since no index can be defined for a zero residue digit, extra logic is incorporated in the design for each modulus to handle this case. As modulus 31 is of the form  $(2^n - 1)$  and 64 is of the form  $2^n$ , modulo multiplication can be performed more efficiently by combinational logic than by index calculus [23]. So, combinational circuits are implemented to perform modulo multiplication for these two channels.



**Fig.3:** Dual-mode programmable decimation filter



**Fig.4:**  $i^{th}$  filter channel of stage 1 programmable for WCDMA/WLAN

**Table 2:** Primitive roots for the selected moduli set

| Prime modulus (p) | Primitive root (g) |
|-------------------|--------------------|
| 29                | 2                  |
| 31                | 3                  |
| 37                | 2                  |
| 43                | 3                  |
| 47                | 5                  |
| 59                | 2                  |
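The entries of Table 2 can be checked by brute force: $g$ is a primitive root of $p$ exactly when its first $p-1$ powers are all distinct modulo $p$. The sketch below is an illustrative check, not part of the original design flow; the dictionary and function names are assumptions.

```python
TABLE_2 = {29: 2, 31: 3, 37: 2, 43: 3, 47: 5, 59: 2}   # prime modulus -> primitive root

def is_primitive_root(g, p):
    """True if g**0 .. g**(p-2) modulo p are all distinct (p prime)."""
    return len({pow(g, i, p) for i in range(p - 1)}) == p - 1

for p, g in TABLE_2.items():
    print(p, g, is_primitive_root(g, p))

# The same idea applies to modulus 25 with g = 2: its powers must generate
# all phi(25) = 20 residues that are not multiples of 5.
print(len({pow(2, i, 25) for i in range(20)}) == 20)
```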

## 5. SIMULATION RESULTS AND ANALYSIS

The input sampling frequency is 61.44 MHz for WCDMA and is down-sampled to the data rate of 3.84 Mchips/s in three stages. The cascaded two stage filter structure down-samples the input sampling frequency of 96 MHz for WLAN to the data rate of 12 Msymbols/s. The overall decimation filter responses obtained for WCDMA and WLAN, satisfying the standard specifications, are shown in Fig. 5 and Fig. 6 respectively.

The programmable decimation filter architecture is described in VHDL and the functional verification is performed using ModelSim. The hardware synthesis is done with Synopsys Design Compiler. The area requirement and critical path delay of each block of the RNS decimation filter are shown in Table 3. The critical path delay and area of each block of the filter are normalized with respect to a full adder critical path delay of 0.33 ns and area of  $76\,\mu\text{m}^2$ . The percentage area requirement for each block of the RNS decimation filter is shown in Fig. 7. The area requirement of the decimation filter in a single mode WCDMA receiver and the additional area required for making it adaptable for dual-mode operation are given in Table 4. It is observed that programmability is achieved at the expense of additional area amounting to 34.3% of the total dual-mode filter area (Table 4).



**Fig.5:** Decimation filter response for WCDMA



**Fig.6:** Decimation filter response for WLAN

**Table 3:** Area, critical path delay and dynamic power dissipation for RNS decimation filter

| Block                     | Area        | Critical path delay |
|---------------------------|-------------|---------------------|
| Filter 1                  | 15983.5     | 58.95               |
| Filter 2                  | 12200.1     | 52.41               |
| Filter 3                  | 17875.2     | 58.95               |
| Reverse converter         | 852.18      | 79.12               |
| Total area                | 46910.9     |                     |
| Dynamic power dissipation | 479.1387 mW |                     |

**Table 4:** Area requirement for programmability

| Type of transceiver                 | Block             | Area     | Percentage area (%) |
|-------------------------------------|-------------------|----------|---------------------|
| Single mode WCDMA                   | Filter 1          | 6780.8   |                     |
|                                     | Filter 2          | 5368     |                     |
|                                     | Filter 3          | 17823.04 |                     |
|                                     | Reverse converter | 852.18   |                     |
|                                     | Total             | 30824.02 |                     |
| Dual-mode transceiver               |                   | 46910.9  | 100                 |
| Additional area for programmability |                   | 16086.88 | 34.3                |



**Fig.7:** Area requirement for RNS decimation filter

In order to operate the first stage of the RNS filter at 96 MHz, two-stage pipelining is used to meet the critical path delay. The second and third stages operate at down-sampled frequencies of 24 MHz and 7.68 MHz respectively. They do not require pipelining because of the fast MAC operations in the RNS domain. Table 5 reports the characteristics of the decimation filter implemented in the traditional binary number system performing signed multiplication and addition. The RNS filter implementation requires only 87% of the area of the traditional implementation. The dynamic power dissipation of the RNS based dual-mode decimation filter is 28.4% less than that of the traditional case. The inherent delay of each stage of the traditional implementation is larger than that of the RNS implementation. The first stage of the RNS decimation filter operates 2.6 times faster than that of the traditional implementation. Similarly, the second and third stages of the RNS decimation filter operate 5 and 7.4 times faster than those of the traditional filter. The pipelining used for the RNS filter will not meet the critical path delay in the traditional case. Hence, pipelining is required in the multipliers as well as in the adder chain of all the stages of the traditional implementation to meet the critical path delay.

**Table 5:** Area, critical path delay and dynamic power dissipation for traditional decimation filter

| Block                     | Area       | Critical path delay |
|---------------------------|------------|---------------------|
| Filter 1                  | 2966.59    | 153.68              |
| Filter 2                  | 14504.41   | 259.58              |
| Filter 3                  | 36165.2    | 437.91              |
| Total area                | 53636.2    |                     |
| Dynamic power dissipation | 669.621 mW |                     |
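The area figures quoted in the text can be re-derived from the tabulated values. The short Python sketch below only repeats that arithmetic; the variable names are illustrative and not part of the original design flow, and the numbers are copied from the tables in this section in whatever units they are reported there.

```python
# Area figures copied from the tables in this section (units as reported there).
traditional_area = {
    "Filter 1": 2966.59,
    "Filter 2": 14504.41,
    "Filter 3": 36165.2,
}
rns_dual_mode_area = 46910.9      # dual-mode RNS decimation filter
programmability_area = 16086.88   # additional area for programmability

# Total area of the traditional binary implementation (sum of the three stages).
traditional_total = sum(traditional_area.values())

# RNS area expressed as a fraction of the traditional area.
area_ratio = rns_dual_mode_area / traditional_total

# Programmability area expressed as a fraction of the dual-mode RNS area.
overhead_share = programmability_area / rns_dual_mode_area

print(f"traditional total area      : {traditional_total:.1f}")
print(f"RNS / traditional area      : {area_ratio:.1%}")
print(f"programmability / dual-mode : {overhead_share:.1%}")
```

The stage areas sum to the tabulated total, the RNS filter comes out at roughly 87% of the traditional area, and the programmability overhead is about 34.3% of the dual-mode area.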

To evaluate the design techniques, the proposed architecture is implemented in RTL-synthesizable VHDL code. The design is synthesized with Artisan™ 0.18 µm technology at VDD = 1.8 V using Synopsys Design Compiler tools. The back-end process, place and route, is done using the Cadence Encounter™ tool set. The placed cell structure and the routed design for the RNS decimation filter are shown in Fig. 8 and Fig. 9 respectively.



**Fig.8:** Placed cell structure for RNS based decimation filter



**Fig.9:** Routed view of RNS decimation filter

## 6. CONCLUSION

A dual-mode programmable RNS based decimation filter that meets the performance requirements of the WCDMA and WLAN standards is presented in this paper. The preliminary section is improved by eliminating the forward converter, as the front end of the proposed architecture is a sigma-delta ADC. This reduces the area by about 10% compared to a traditional RNS filter. Also, modulo multiplication is performed by the index calculus approach to achieve the increased programmability required for multi-mode operation. Multi-stage sampling rate conversion results in reduced hardware complexity and power consumption for the decimator. Powering down or bypassing the unused hardware in each mode of operation leads to further power saving. Since all the filter stages are implemented in RNS and operate with the same moduli set, a reverse converter is needed only at the output of the last stage. The performance comparison shows that the proposed RNS implementation requires only 87% of the total area and operates faster than the traditional FIR filter implementation. Also, the total dynamic power dissipation for the RNS based dual-mode decimation filter is 28.4% less than that for the traditional implementation. The programmability of the dual-mode architecture, which can handle both WCDMA and WLAN, is achieved with additional area amounting to about 34% of the total dual-mode area.
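As an illustration of the index-addition idea summarized above, the following minimal Python sketch multiplies two residues modulo a prime by adding their indices (discrete logarithms) modulo p − 1. The modulus, the primitive root and the function names are assumptions chosen for the example; they are not the moduli set or the hardware look-up tables of the proposed filter.

```python
def build_index_tables(p, g):
    """Build index (log) and inverse-index (antilog) tables for a prime
    modulus p and an assumed primitive root g of GF(p)."""
    index = {}
    inverse = {}
    value = 1
    for k in range(p - 1):
        index[value] = k      # value == g**k mod p
        inverse[k] = value
        value = (value * g) % p
    if len(index) != p - 1:
        raise ValueError(f"{g} is not a primitive root modulo {p}")
    return index, inverse


def index_multiply(a, b, p, index, inverse):
    """Multiply a and b modulo p by adding indices modulo p - 1.
    Zero has no index, so it is handled as a special case."""
    a %= p
    b %= p
    if a == 0 or b == 0:
        return 0
    return inverse[(index[a] + index[b]) % (p - 1)]


# Illustrative modulus and primitive root (not taken from the paper).
p, g = 7, 3
index, inverse = build_index_tables(p, g)
product = index_multiply(5, 6, p, index, inverse)
print(product)  # 5 * 6 = 30, and 30 mod 7 = 2
```

Because a changed coefficient only changes the index fed to the adder, the same adder and inverse-index table serve any coefficient set. This is one way to read the programmability benefit claimed for the index calculus approach.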

## References

- [1] S. R. Norsworthy, R. Schreier and G. C. Temes, *Delta-Sigma Data Converters, Theory, Design, and Simulation*, Piscataway, NJ: IEEE Press, 1997.
- [2] S. Jagannathan, "Discrete-Time Adaptive Control of Feedback Linearizable Nonlinear Systems," *IEEE Proceedings of the 35th Conference on Decision and Control*, Kobe, Japan, pp. 4747–4752, 1996.
- [3] C. J. Barrett, "Low-power decimation filter design for multi-standard transceiver applications", Master of Science in Electrical Engineering, University of California, Berkeley.
- [4] A. Ghazel, L. Naviner and K. Grati, "Design of down-sampling processors for radio communications", *Analog Integrated Circuits and Signal Processing*, 36, Kluwer academic publishers, pp. 31–38, 2003.
- [5] J. Luis Tecpanecatl, Ashok Kumar and M. A. Bayoumi, "Low complexity decimation filter for multistandard digital receivers", *IEEE International Symposium on Circuits and Systems (ISCAS 2005)*, Vol. 1, pp. 552–555, May 2005.
- [6] S. D'Amico, M. De Matteis and A. Baschirotto, "A 6.4mW, 4.9nV/√Hz, 24dBm IIP3 VGA for a multi-standard (WLAN, UMTS, GSM and Bluetooth) receiver", *32nd European Solid-State Circuits Conference*, pp. 82–85, September 2006.
- [7] Ze Tao and S. Signell, "Multi-standard delta-sigma decimation filter design", *IEEE Asia Pacific Conference on Circuits and Systems (APC-CAS 2006)*, Singapore, pp. 1212–1215, Dec. 2006.
- [8] F. Sheikh and S. Masud, "Efficient sample rate conversion for multi-standard software defined radios", *IEEE Int. Conf. on Acoustics, Speech and Signal Processing*, HI, pp. II-329 - II-332, Apr. 2007.
- [9] W. Li, J. Liu, J. Wang, C. Zhang and W. Guo, "An efficient digital IF down-converter for dual-mode WCDMA/EDGE receiver based on software radio", *IEEE 6th CAS Symp. on Emerging Technologies: Mobile and Wireless Comm.*, China, pp. 713–716, May 31–June 2, 2004.
- [10] M. Kim and S. Lee, "Design of dual-mode digital down converter for WCDMA and cdma2000", *ETRI Journal*, Vol. 26, No. 6, pp. 555–559, Dec. 2004.
- [11] J. Ramírez, A. García, U. M-Baese and A. Lloris, "Fast RNS FPL-based communications receiver design and implementation", *FPL 2002*, LNCS 2438, pp. 472–481, Sept. 2002.
- [12] Shahana T.K., R.K. James, B.R. Jose, K.P. Jacob and S. Sasi, "Performance analysis of FIR digital filter design: RNS versus traditional", *7th IEEE International Symp. on Communications and Information Technologies (ISCIT 2007)*, Australia, pp. 1–5, October 2007.
- [13] Soderstrand M.A., Jenkins W.K., Jullien G.A., and Taylor F.J., *Residue number system arithmetic: modern applications in digital signal processing*, IEEE Press, New York, 1986.
- [14] B. Parhami and C.Y. Huang, "Optimal look up schemes for VLSI implementation of input/output conversions and other residue number operations," in *VLSI Signal Processing VII*, J. Rabaey, P. M. Chau and Eldon, eds., IEEE Press, New York, 1994.
- [15] D. Radhakrishnan, T. Srikanthan and J. Mathew, "Using the $2^n$ property to implement an efficient general purpose residue-to-binary converter", *Proceedings SCS '99*, Iasi, Romania, July 1999, pp. 183–186.
- [16] D. Radhakrishnan and Y. Yuan, "A fast RNS Galois field multiplier", *IEEE International Symp. on Circuits and Systems*, LA, USA, Vol. 4, pp. 2909–2912, May 1990.
- [17] D. Radhakrishnan, "Modulo multipliers using polynomial rings", *IEE Proc. Circuits Devices Syst.*, Vol. 145, No. 6, December 1998.
- [18] A.P. Preethy, D. Radhakrishnan and A. Omondi, "A high performance RNS multiply-accumulate unit", *11th Great Lakes Symposium on VLSI*, USA, pp. 145–148, March 2001.
- [19] G. L. Bernocchi, G.C. Cardarilli, A. D. Re, A. Nannarelli and M. Re, "Low-power adaptive filter based on RNS components", *IEEE Int. Symp. on Circuits and Systems*, pp. 3211–3214, May 2007.
- [20] P. Gray and R. Meyer, "Future directions in silicon ICs for RF personal communications", *Proc. of Custom Integrated Circuits Conference*, pp. 83–90, May 1995.
- [21] A. Xotta, A. Gerosa and A. Neviani, "A multi-mode $\Sigma\Delta$ analog-to-digital converter for GSM, UMTS and WLAN," *IEEE Int. Symp. on Circuits and Systems*, Vol. 3, pp. 2551–2554, May 2005.
- [22] L. Zhang, V. Nadig and M. Ismail, "A high order multi-bit $\Sigma\Delta$ modulator for multi-standard wireless receiver", *IEEE Int. Midwest Symp. on Circuits and Systems*, pp. III-379–III-382, 2004.
- [23] Z. Wang, G.A. Jullien and W.C. Miller, "An algorithm for multiplication modulo $(2^N - 1)$", *Proc. 39<sup>th</sup> IEEE Midwest Symp. on Circuits Syst.*, pp. 1301–1304, 1996.



**Shahana T.K.** received her Bachelors Degree in Electronics and Communication Engineering from Mahatma Gandhi University, Kerala, India in 1997 and Masters Degree in Digital Electronics from Cochin University of Science and Technology, Kerala, India in 1999. She is a Lecturer in School of Engineering, and is also working towards her Ph. D. degree in Department of Computer Science, Cochin University of Science and Technology. Her research interests primarily focus on the design of RNS-based arithmetic circuits, Multi-standard Wireless Transceivers, Digital Filters and Low-power design.



**Babita R. Jose** received the B. Tech degree in Electronics and Communication Engineering from Mahatma Gandhi University, Kerala, India in 1997 and the Masters Degree in Digital Electronics from Karnataka University, India in 1999. She also holds an M.S. degree in System on Chip design from the Royal Institute of Technology (KTH), Stockholm, Sweden. Currently, she is serving as a Lecturer in School of Engineering and is working part time towards her Ph. D. degree at School of Engineering, Cochin University of Science and Technology. Her interests are focused on the development of System on Chip architectures, Multi-standard Wireless Transceivers and Low-power design of sigma-delta modulators.



**Rekha K. James** received her Bachelors Degree in Electronics and Communication Engineering from Kerala University, Kerala, India in 1989 and Masters Degree in Digital Electronics from Cochin University of Science and Technology, Kerala, India in 2002. She is a Reader in School of Engineering, and is also working towards her Ph. D. degree in Department of Computer Science, Cochin University of Science and Technology. Her research interests include the design of RNS-based arithmetic circuits, Decimal arithmetic, Reversible logic and Low-power design.



**K. Poulose Jacob**, a National Merit Scholar all through, received his degree in Electrical Engineering in 1976 from University of Kerala, followed by his M.Tech. in Digital Electronics and Ph. D. in Computer Engineering from Cochin University of Science and Technology (CUSAT), Kochi, India. He has been teaching at CUSAT since 1980 and currently occupies the position of Professor and Head of the Department of Computer Science. He has served as a Member of the Standing Committee of the UGC on Computer Education and Development. He is on the editorial board of two international journals and has more than 70 papers in various international journals and conferences to his credit. His research interests are in Information Systems Engineering, Intelligent Architectures and Networks, Wireless Communication and Low-power design.



**Sreela Sasi** is an Associate Professor at Gannon University, Erie, Pennsylvania, USA. She received her Ph.D. in Computer Engineering from Wayne State University, Michigan, USA, her M.S. in Electrical Engineering from University of Idaho, Idaho, USA, and her B.S. in Electronics and Communication Engineering from University of Kerala, India. Her research interests include VLSI Design, Computer Vision and Intelligent System Design. She is a Senior Member of IEEE, member of Eta Kappa Nu, IEEE Computer Society, IEEE Women in Engineering, ISTE (L)(India), and Fellow IETE (India).