[Paper Explained] On-Chip Implementation of Pipeline Digit-Slicing Multiplier-Less Butterfly for Fast Fourier Transform Architecture
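This paper proposes a pipelined, digit-slicing, multiplier-less butterfly architecture for the radix-2 DIT FFT that reduces computational complexity and raises speed. By replacing conventional multipliers with digit-slicing single-constant multiplication and adding pipelining, the design reaches a maximum clock frequency of 549.75 MHz on a Virtex-II FPGA, about 2.76 times the 198.987 MHz of the conventional butterfly (reported in the paper as a 276.28% increase).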
Demand for wireless communication has pushed communication systems toward higher performance. The main bottleneck limiting communication capability, however, is the Fast Fourier Transform (FFT), which is the core of most modulators. This study presents an on-chip implementation of a pipeline digit-slicing multiplier-less butterfly for the FFT structure. To reduce computational complexity in the butterfly, a digit-slicing multiplier-less single-constant technique was applied in the critical path of the Radix-2 Decimation In Time (DIT) FFT structure. The design focuses on the trade-off between speed and active silicon area for chip implementation. The new architecture was investigated and simulated in MATLAB. Verilog HDL code describing the FFT butterfly functionality was developed in the Xilinx ISE environment and downloaded to a Virtex-II FPGA board; the Virtex-II FG456 Proto board was used to implement and test the design on real hardware. The synthesis report indicates a maximum clock frequency of 549.75 MHz with a total equivalent gate count of 31,159, a marked improvement over the conventional Radix-2 FFT butterfly. By comparison, the conventional butterfly architecture runs at a maximum clock frequency of only 198.987 MHz, and the conventional multiplier at only 220.160 MHz. The maximum clock frequency is reported to increase by about 276.28% for the FFT butterfly and about 277.06% for the multiplier. The authors conclude that on-chip implementation of the pipeline digit-slicing multiplier-less butterfly for the FFT structure helps address the problems that limit communication capability in FFT-based systems and has considerable potential for future related work.
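For orientation, the radix-2 DIT butterfly named in the abstract combines two inputs a and b with a twiddle factor w as a + w·b and a − w·b. The Python sketch below is a floating-point reference model only, not the paper's hardware design; the function names `butterfly` and `fft_dit` are illustrative choices.

```python
import cmath


def butterfly(a, b, w):
    """Radix-2 DIT butterfly: returns (a + w*b, a - w*b)."""
    t = w * b  # the one complex multiplication in each butterfly
    return a + t, a - t


def fft_dit(x):
    """Recursive radix-2 DIT FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return list(x)
    even = fft_dit(x[0::2])  # transform of even-indexed samples
    odd = fft_dit(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out
```

The product `w * b` is the step that a multiplier-less design has to replace, which is why it sits on the critical path discussed in the abstract.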
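Motivation and Objectives
- Address the performance bottleneck in FFT-based communication systems caused by high-complexity butterfly computation.
- Reduce hardware complexity and power consumption by eliminating multipliers from the FFT butterfly stage.
- Improve speed and silicon-area efficiency by applying a pipelined digit-slicing technique to the radix-2 DIT FFT architecture.
- Demonstrate that real-time hardware implementation is feasible through Verilog HDL design and prototyping on a Xilinx Virtex-II FPGA.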
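Proposed Method
- Use digit slicing to decompose twiddle-factor multiplication into parallel shift and add operations.
- Implement single-constant multiplication from precomputed bit-slice patterns, eliminating hardware multipliers.
- Pipeline the butterfly unit to raise throughput and clock frequency.
- Use a Kogge-Stone parallel-prefix adder in the critical path for high-speed addition.
- Describe the butterfly in Verilog HDL and synthesize it with XST in Xilinx ISE, targeting a Virtex-II FPGA.
- Verify functional correctness through MATLAB simulation and hardware testing on a Virtex-II FG456 prototype board.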
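The shift-and-add idea and the Kogge-Stone adder from the list above can be modelled in software. The Python sketch below is only an illustration under stated assumptions: a 16-bit unsigned input sliced into four 4-bit digits, a non-negative integer constant standing in for a quantized twiddle coefficient, a 32-bit accumulator, and hypothetical function names. The paper's actual slice width, constant encoding, and signed-number handling are not given in this summary.

```python
def kogge_stone_add(a, b, width=32):
    """Add two unsigned ints modulo 2**width with a Kogge-Stone prefix network."""
    mask = (1 << width) - 1
    g = a & b & mask       # per-bit generate
    p = (a ^ b) & mask     # per-bit propagate
    half_sum = p           # sum bits before carries are applied
    dist = 1
    while dist < width:    # log2(width) prefix stages for power-of-two widths
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist <<= 1
    carries = (g << 1) & mask  # carry into bit i is the group generate of bits 0..i-1
    return half_sum ^ carries


def slice_digits(x, digit_bits=4, num_digits=4):
    """Split an unsigned int into num_digits digits, least significant first."""
    if not 0 <= x < (1 << (digit_bits * num_digits)):
        raise ValueError("input does not fit in the sliced word")
    mask = (1 << digit_bits) - 1
    return [(x >> (i * digit_bits)) & mask for i in range(num_digits)]


def const_mul_shift_add(digit, const, width=32):
    """digit * const using one shifted copy of digit per set bit of const.

    In hardware the set-bit positions of a fixed constant are known at
    design time, so each shift is wiring rather than logic.
    """
    if const < 0:
        raise ValueError("this sketch handles non-negative constants only")
    acc = 0
    shift = 0
    while const >> shift:
        if (const >> shift) & 1:
            acc = kogge_stone_add(acc, digit << shift, width)
        shift += 1
    return acc


def digit_sliced_const_mul(x, const, digit_bits=4, num_digits=4, width=32):
    """Multiply x by const slice by slice; the result is modulo 2**width."""
    acc = 0
    for i, digit in enumerate(slice_digits(x, digit_bits, num_digits)):
        partial = const_mul_shift_add(digit, const, width)  # slices are independent
        acc = kogge_stone_add(acc, partial << (i * digit_bits), width)
    return acc
```

For example, `digit_sliced_const_mul(0x1234, 181)` equals `0x1234 * 181`; the constant 181 is merely an illustrative 8-bit approximation of cos(π/4) scaled by 256. Because each slice is multiplied independently, the per-slice partial products could be computed in parallel and registered between pipeline stages.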
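Experimental Results
Research Questions
- RQ1: Can digit slicing replace the conventional multiplier in an FFT butterfly unit without sacrificing accuracy or performance?
- RQ2: How does pipelining affect the maximum achievable clock frequency of a multiplier-less FFT butterfly?
- RQ3: What trade-off between area and speed does the proposed digit-slicing multiplier-less architecture involve?
- RQ4: How does the design perform compared with the conventional butterfly and multiplier-based implementations?
- RQ5: Can the digit-slicing approach be implemented effectively on an FPGA platform to support real-time signal processing?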
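Key Findings
- The proposed pipelined digit-slicing multiplier-less butterfly reaches a maximum clock frequency of 549.75 MHz on a Virtex-II FPGA.
- That is about 2.76 times the 198.987 MHz of the conventional butterfly, which the paper reports as a 276.28% increase.
- The digit-slicing single-constant multiplier-less multiplier on its own reaches 609.980 MHz, against 220.160 MHz for the conventional multiplier.
- The total equivalent gate count is 31,159, which the summary characterizes as a modest area overhead for the speed gained.
- Hardware synthesis and simulation results confirm functional correctness and show high performance in real-time FFT processing.
- The design shows strong potential for high-speed, low-power applications in wireless communication systems.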
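The reported percentages can be checked against the raw clock frequencies. The short Python sketch below uses only the figures quoted in this summary; the helper names are illustrative. It computes both the ratio of new to baseline frequency and the relative increase, since the two are easy to confuse.

```python
def ratio_percent(new_mhz, base_mhz):
    """New frequency expressed as a percentage of the baseline."""
    return 100.0 * new_mhz / base_mhz


def increase_percent(new_mhz, base_mhz):
    """Relative increase over the baseline, in percent."""
    return 100.0 * (new_mhz - base_mhz) / base_mhz


# Maximum clock frequencies quoted in this summary (MHz).
cases = [
    ("FFT butterfly", 549.75, 198.987),
    ("multiplier", 609.980, 220.160),
]

for name, new, base in cases:
    print(
        f"{name}: ratio {ratio_percent(new, base):.2f}%, "
        f"increase {increase_percent(new, base):.2f}%"
    )
```

On these numbers, the quoted 276.28% and 277.06% match the ratio of the two frequencies to within rounding, so the relative increase is closer to 176% and 177% respectively.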
This explanation was generated by AI and reviewed by a human editor.