SPECIAL ISSUE (ASCB) Nirmala and Sekar

**ARTICLE** **OPEN ACCESS**

#### DESIGN OF SUB SAMPLING FILTER ARCHITECTURE FOR WAVELET TRANSFORM

Nirmala R<sup>1</sup> and Sathiya Sekar K<sup>2</sup>

<sup>1</sup>Dept. of ECE, Vivekanandha College of Engineering for Women, Ellayampalayam, Tiruchengode, INDIA

<sup>2</sup>Dept. of EEE, S A Engineering College, Chennai, INDIA

### **ABSTRACT**

Reducing power dissipation and area is the main constraint in present-day VLSI circuits. The power dissipation of a circuit is mainly due to static (leakage) power, which contributes about 50% of the power dissipation in the devices used in day-to-day life. The Discrete Wavelet Transform (DWT) has advantages over the FFT and DCT and has many applications, so the design of a VLSI architecture for the DWT is important. To obtain better efficiency and throughput, a DWT architecture is proposed whose sub sampling filter uses an array multiplier and a ripple carry adder. In the existing system, a carry save adder and a Baugh-Wooley multiplier were used to design the sub sampling filter; as a result, more transistors are required and more power is dissipated, which increases complexity. In the proposed method, the number of transistors required by the sub sampling filter and its power dissipation, measured using the LTspice IV tool, are lower than those of the existing filter.

Published on: 08th August 2016

**KEY WORDS** VLSI circuits; Discrete Wavelet Transform; Baugh-Wooley Multiplier; Transistors; LTspice IV tool.

\*Corresponding author: Email: nirdha06@gmail.com, ksathiyasekar@gmail.com; Tel.: +91 9688864777

# INTRODUCTION

In signal and image processing, the wavelet transform is now used in place of the Fourier Transform (FT), the Discrete Cosine Transform (DCT) and the Discrete Sine Transform (DST).
The Discrete Wavelet Transform (DWT) is an efficient platform for multi-resolution analysis: it decomposes a signal into sub-bands with good localization in both the time and frequency domains. Compared with the DCT, the DWT offers better coding efficiency and better image restoration quality at high compression ratios.

The multiplier is one of the basic functional units of a digital signal processor. Most high-speed DSP systems use a minimal number of multiplication units while maintaining high data throughput. Most hardware implementations address one or two essential design optimizations to improve performance in terms of area, throughput or power dissipation. High throughput and low power are considered two of the essential optimization axes for VLSI implementations, especially for portable and real-time DSP applications. The hardware implementation of a DWT design can be significantly improved by designing application-specific FIR filters. However, the effort required to design the different filter blocks prolongs the design time and reduces overall design productivity.

The DWT architecture in [1] combines several optimizations that improve the performance of the hardware design in terms of throughput and power dissipation. Its authors designed and analyzed numerous DWT architectures using metrics and cost functions that assess the impact of the design optimizations. In [2], the conventional array multiplier is implemented using a 16T full adder cell, which involves a tradeoff between power and area. When the array multiplier is instead synthesized using a 10T full adder cell, it uses 96 fewer transistors, reduces total power by 2.82%, increases speed by 13.24% and has a 15.69% lower power-delay product than the conventional array multiplier. Paper [3] implements a 1-D Discrete Wavelet Transform architecture using a polyphase structure.
This structure is highly scalable and has a very low hardware requirement. In [4], a convolution-based 1-D Discrete Wavelet Transform combines a polyphase decimated FIR filter with a pipelined computation structure to obtain higher throughput and a smaller chip area. Paper [5] implements a 1-bit full adder with complementary metal-oxide-semiconductor (CMOS) logic and transmission gate logic, and explores the resulting power and area improvement.

In our work, we have designed the sub sampling filter using a multiplier and an adder. The multipliers are implemented using the array and Baugh-Wooley multiplier concepts, and the adder structures used in the MAC are the ripple carry and carry save adders.

The paper is organized as follows. The 2<sup>nd</sup> subdivision explains the DWT and its blocks. The 3<sup>rd</sup> subdivision explains the different types of multipliers and adders. The 4<sup>th</sup> subdivision presents the sub sampling filter designed with the proposed method. The 5<sup>th</sup> subdivision gives the results and discussion, followed by the conclusion.

## DISCRETE WAVELET TRANSFORM

In digital image processing, compression is one of the most effective techniques. The wavelet transform decomposes a signal into a set of basis functions known as wavelets. Wavelets are obtained from a mother wavelet, also known as the single prototype wavelet, by dilation and shifting.

Plenty of VLSI architectures are available and have been proposed for the hardware implementation of the DWT. The architectures for the 1-D DWT are classified into two types: convolution-based and lifting-based. Pipelining reduces the critical path, but it requires a large number of registers in a 1-D structure.
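
To make the convolution-based approach concrete, the following Python sketch compares a direct analysis stage (filter at full rate, then keep every second output) with the Type-I polyphase form discussed in this section. It is only an illustrative model under stated assumptions: the function names, the zero initial state and the unnormalized Haar-like coefficients `[1, 1]` and `[1, -1]` are hypothetical choices for the example and are not taken from the paper.

```python
def fir(x, h):
    """Causal FIR filter with zero initial state; output has len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def analysis_direct(x, h):
    """Filter at the full rate, then keep the even-indexed outputs."""
    return fir(x, h)[::2]


def analysis_polyphase(x, h):
    """Type-I polyphase form: H(z) = He(z^2) + z^-1 * Ho(z^2).

    Both sub-filters run at half the input rate, so no output
    sample is computed and then thrown away.
    """
    he, ho = h[0::2], h[1::2]          # even and odd filter taps
    xe = x[0::2]                       # x[2m]
    xo = ([0] + x[1::2])[:len(xe)]     # x[2m - 1], i.e. delayed odd samples
    ye = fir(xe, he)
    yo = fir(xo, ho)
    return [a + b for a, b in zip(ye, yo)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    low, high = [1, 1], [1, -1]        # assumed example coefficients
    print(analysis_direct(x, low), analysis_polyphase(x, low))
    print(analysis_direct(x, high), analysis_polyphase(x, high))
```

Both forms produce the same sub-band samples; the polyphase form simply avoids computing the outputs that downsampling would discard, which is the motivation for using it in hardware.
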
Polyphase decomposition is used to improve hardware utilization. It is classified into two types.

Type-I decomposition:

$$H(z) = H_{e}(z^{2}) + z^{-1}H_{o}(z^{2}), \qquad G(z) = G_{e}(z^{2}) + z^{-1}G_{o}(z^{2})$$

$$\widetilde{H}(z) = \widetilde{H}_{e}(z^{2}) + z^{-1}\widetilde{H}_{o}(z^{2}), \qquad \widetilde{G}(z) = \widetilde{G}_{e}(z^{2}) + z^{-1}\widetilde{G}_{o}(z^{2})$$

Type-II decomposition:

$$H(z) = H_{e}(z^{2}) + zH_{o}(z^{2}), \qquad G(z) = G_{e}(z^{2}) + zG_{o}(z^{2})$$

$$\widetilde{H}(z) = \widetilde{H}_{e}(z^{2}) + z\widetilde{H}_{o}(z^{2}), \qquad \widetilde{G}(z) = \widetilde{G}_{e}(z^{2}) + z\widetilde{G}_{o}(z^{2})$$

Here the subscripts $e$ and $o$ denote the even-indexed and odd-indexed filter coefficients. For the DWT and IDWT, convolution-based architectures can be constructed using the sub sampling and down sampling filters.

### ADDER ARCHITECTURES

Adders are used in digital signal processing applications and in processing elements. The adder unit is used not only for addition and subtraction but also to perform multiplication and division. The two basic adders are the half adder and the full adder. The adder structures considered in this work are as follows:

# Carry Save Adder (CSA)

A carry save adder is used to add multiple operands. Each stage generates sum and carry bits as in the basic adder structure; the difference is that the carry is not propagated within the stage but is added to the sum at the next stage. An RCA is used to produce the final-stage result. **Figure-1** shows the transistor level diagram of the carry save adder.

Fig:1. Transistor level diagram of carry save adder

# Ripple Carry Adder (RCA):

An N-bit RCA consists of N full adders, where N is the total number of bits. The carry signal propagates from the LSB to the MSB through N stages; this longest path is the worst-case delay path. The propagation time is given by the equation below, and **Figure-2,3** show the implementation.

$$t_{adder} = (N - 1)\,t_{carry} + t_{sum}$$

where $t_{carry}$ is the time taken to propagate the carry and $t_{sum}$ is the time taken to produce the sum.

Fig:2. Block diagram of ripple carry adder

Fig: 3. Transistor level diagram for full adder

Conventional CSA using ripple carry adder: in the final stage of the conventional carry save adder, an RCA is used.

### **BAUGH WOOLEY MULTIPLIER**

Baugh-Wooley multiplication is an effective method for handling sign bits. The approach was developed to design a regular multiplier suited to 2's complement numbers, as shown in **Figure- 4**,5.

Fig: 4. Block diagram of Baugh-Wooley Multiplier

**Figure- 4** describes multiplication based on the Baugh-Wooley algorithm. The algorithm first creates the AND terms and then sends them through half adders and full adders, with the carry outs chained to the next MSB at each level of addition. It can also multiply negative operands. Although the algorithm is not able to work with unsigned inputs, the goal of the project was a signed multiplier.

Fig: 5. Transistor level diagram of Baugh-Wooley multiplier

#### ARRAY MULTIPLIER

The array multiplier is well known because of its regular structure. It works on the principle of repeated addition and shifting. Multiplying the multiplicand by each multiplier digit generates a partial product. The partial products are shifted according to their bit positions and then added. A normal carry propagation adder is used to perform the summation. For an array multiplier with N multiplier bits, N-1 adders are required for the implementation, as shown in **Figure -6,7**.

Fig: 6. Block diagram of array multiplier

Fig: 7. Transistor Level Diagram of Array Multiplier

The advantage of the array multiplier is its regular structure: the layout is simple and, because of its small size, it occupies less area. In VLSI, regular structures can be tiled one over another, which reduces the number of mistakes and the layout design effort.

### PROPOSED SUBSAMPLING FILTER

In the DWT structure, filters called sub sampling filters are used.
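
The ripple carry adder and array multiplier described in the sections above can be modeled at bit level. The following Python sketch is a behavioral illustration only, not the transistor-level LTspice design: the function names (`full_adder`, `ripple_carry_add`, `array_multiply`) and the LSB-first bit lists are assumptions made for this example. For simplicity, the multiplier accumulates each shifted row of AND terms with a 2N-bit ripple carry addition instead of mirroring the exact N-1 adder rows of the hardware array.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def to_bits(value, n):
    """Unsigned integer to a list of n bits, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (LSB first) back to an unsigned integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, cin=0):
    """N-bit ripple carry adder: the carry ripples from LSB to MSB."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry


def array_multiply(a, b, n=4):
    """Unsigned n x n multiply by shift-and-add of AND partial products."""
    a_bits, b_bits = to_bits(a, n), to_bits(b, n)
    acc = [0] * (2 * n)
    for j, bj in enumerate(b_bits):
        # Row j: multiplicand ANDed with multiplier bit j, shifted left by j.
        row = [0] * j + [ai & bj for ai in a_bits] + [0] * (n - j)
        acc, _ = ripple_carry_add(acc, row)
    return from_bits(acc)


if __name__ == "__main__":
    total, carry = ripple_carry_add(to_bits(9, 4), to_bits(7, 4))
    print(from_bits(total) + (carry << 4))   # 4-bit add with carry out
    print(array_multiply(13, 11))            # 4-bit by 4-bit product
```

A model of this kind is useful for checking the arithmetic of a 4-bit datapath before moving to schematic entry; it says nothing about transistor count or power, which depend on the circuit-level implementation.
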
The proposed sub sampling filter is built from several modules: 2:1 multiplexers, an array multiplier, a ripple carry adder and D flip-flops. The filter inputs are designed and implemented with 4 bits, as shown in figure 8,9. We implement the design with PTM 90 nm technology.

The 4-bit binary data is given as the input to the sub sampling filter. Each 2:1 multiplexer has two 4-bit inputs, A and B, and a select line (0 or 1). The input chosen by the select line is multiplied with the input data using the array multiplier. This product is added to the output of another multiplexer by the ripple carry adder. The resulting data bits are given to a D flip-flop, which introduces a delay. During that delay period, the next module performs the same function and produces an output that is given to another multiplexer. This process is repeated for three modules. The ripple carry adder and the array multiplier are used here to reduce power dissipation and area.

Fig: 8. Proposed Sub sampling filter

Fig: 9. Schematic View of Sub sampling Filter

# **RESULTS AND DISCUSSION**

In the 90nm technology, the sub sampling filter designed with the Baugh-Wooley multiplier and carry save adder requires 3816 transistors and dissipates 3.858 nW. The proposed system with the ripple carry adder and array multiplier requires 3486 transistors and dissipates 3.4787 nW, as shown in **Table-1**. **Table-2** lists the power dissipation of the individual adder and multiplier designs. From these results, the proposed design uses 330 fewer transistors (about 8.6% fewer) and dissipates about 9.8% less power than the existing design.

Table: 1. Comparison between existing and proposed system

| Description | Subsampling filter with Baugh-Wooley multiplier and carry save adder | Subsampling filter with ripple carry adder and array multiplier |
|----------------------------|----------------------------------------------------------------------|------------------------------------------------------------|
| Technology | 90nm | 90nm |
| No. of transistors used | 3816 | 3486 |
| Power dissipation | 3.858 nW | 3.4787 nW |

Table: 2. Power dissipation for Adders and Multipliers

| DESCRIPTION | POWER DISSIPATION |
|-------------------------|-----------------|
| Baugh-Wooley multiplier | 574.34 pW |
| Array multiplier | 476.45 pW |
| Carry save adder | 470.23 pW |
| Ripple carry adder | 151.77 pW |

# CONCLUSION

In this project, the sub sampling filter of the discrete wavelet transform is designed using the LTspice IV software with PTM (Predictive Technology Model) 90nm technology. The existing sub sampling filter with the Baugh-Wooley multiplier and carry save adder has 3816 transistors and a power dissipation of 3.858 nW. The proposed system with the ripple carry adder and array multiplier has 3486 transistors and a power dissipation of 3.4787 nW. From these results, the proposed sub sampling filter design for the DWT has better efficiency and lower power consumption.

#### **ACKNOWLEDGEMENT**

None

### **CONFLICT OF INTEREST**

No conflict of interest

### FINANCIAL DISCLOSURE

No financial support was received to carry out this project.

# **REFERENCES**

- [1] R Hourani, I Dalal, W Davis, C Doss, and W Alexander. [2008] An efficient VLSI implementation for the 1D convolutional discrete wavelet transform,
- [2] Kripa Mathew, S Asha Latha, T Ravi and E Logashanmugam. [2013] Design and Analysis of an Array Multiplier Using an Area Efficient Full Adder Cell in 32nm CMOS Technology.
- [3] J Chilo and T Lindblad. [2008] Hardware implementation of 1D wavelet transform on an FPGA for infrasound signal classification, *IEEE Transactions on Nuclear Science*, 55(1): 9–13.
- [4] R Hourani, I Dalal, W Davis, C Doss, and W Alexander. [2008] An efficient VLSI implementation for the 1D convolutional discrete wavelet transform, in IEEE International Conference on Midwest Symposium on Circuits and Systems (MWSCAS), Knoxville, Tennessee, U.S.A., August 2008, pp. 870–873.
- [5] Partha Bhattacharyya, Bijoy Kundu, Sovan Ghosh and Vinay Kumar. Performance Analysis of a Low-Power High-Speed Hybrid 1-bit Full Adder Circuit, *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, 23(10).