Abstract
This paper presents a 2-D DCT/IDCT processor chip for high data rate image processing and video coding. It uses a fully pipelined row–column decomposition method based on two 1-D DCT processors and a transpose buffer built from D-type flip-flops with a double serial input/output data flow. The proposed architecture allows the main processing elements and arithmetic units to operate in parallel at half the frequency of the data input rate. Its main characteristics are high throughput, parallel processing, reduced internal storage, and maximum efficiency in the computational elements. The processor has been implemented using a standard cell design methodology in 0.35 μm CMOS technology. It measures 6.25 mm² (the core is 3 mm²) and contains a total of 11.7 k gates. The maximum frequency is 300 MHz, with a latency of 172 cycles for the 2-D DCT and 178 cycles for the 2-D IDCT. The computing time of a block is close to 580 ns. It has been designed to meet the demands of IEEE Std. 1180–1990, which is used in different video codecs. Its computing speed and hardware cost indicate that this processor is suitable for HDTV applications.
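The row–column decomposition named in the abstract can be illustrated in software. The following is a minimal sketch, not the chip's implementation: it assumes an 8 × 8 block (the size covered by IEEE Std. 1180–1990), uses the direct orthonormal DCT-II formula in floating point instead of the fast fixed-point 1-D algorithm the hardware would use, and models the transpose buffer as a plain matrix transpose. All function names are hypothetical.

```python
import math

N = 8  # assumed block size (8 x 8, as in IEEE Std. 1180-1990)


def _scale(k, n):
    """Orthonormal DCT-II scale factor for coefficient index k."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(x):
    """Direct O(n^2) orthonormal 1-D DCT-II (illustrative, not a fast algorithm)."""
    n = len(x)
    return [
        _scale(k, n)
        * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(
            _scale(k, n) * c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


def transpose(block):
    """Software stand-in for the transpose buffer between the two 1-D passes."""
    return [list(col) for col in zip(*block)]


def _row_column(block, transform_1d):
    """Apply a 1-D transform to the rows, transpose, apply it again, transpose back."""
    first_pass = [transform_1d(row) for row in block]
    second_pass = [transform_1d(row) for row in transpose(first_pass)]
    return transpose(second_pass)


def dct_2d(block):
    """2-D DCT computed as two 1-D DCT passes (row-column decomposition)."""
    return _row_column(block, dct_1d)


def idct_2d(coeffs):
    """2-D IDCT computed as two 1-D IDCT passes."""
    return _row_column(coeffs, idct_1d)


if __name__ == "__main__":
    # A simple ramp block: sample value depends on row and column index.
    block = [[float(r * N + c) for c in range(N)] for r in range(N)]
    restored = idct_2d(dct_2d(block))
    max_err = max(abs(restored[r][c] - block[r][c]) for r in range(N) for c in range(N))
    print("round-trip error below 1e-9:", max_err < 1e-9)
```

Because the 2-D transform is separable, the same 1-D routine serves both passes; only the intermediate transpose changes which dimension is being transformed. In the processor described above, the two passes are carried out by two 1-D processors working concurrently instead of one routine called twice.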
Additional information
This work was supported by the Spanish Ministry of Science and Technology (TIC2000-1289).
About this article
Cite this article
Ruiz, G.A., Michell, J.A. & Burón, A. High Throughput Parallel-Pipeline 2-D DCT/IDCT Processor Chip. J VLSI Sign Process Syst Sign Image Video Technol 45, 161–175 (2006). https://doi.org/10.1007/s11265-006-9764-7
DOI: https://doi.org/10.1007/s11265-006-9764-7