April 24, 2009

FFT256 DIF

by Oleg Dzhimiev

1. [Done] Wrote code for the FFT256 DIF.

  • Resources

The FFT is performed in 8 pipelined stages. Each stage is similar to the one described here. So far each stage uses 2 BRAM ports (A for the 16-bit Re part and B for the 16-bit Im part) and 1 MULT18X18 (for the “butterfly”), for a total of 4 BRAMs + 6 MULT18X18s (MULTs are not used in the last 2 stages). Together with the other logic, 1 FFT256 uses 18% of the FPGA resources.

  • Performance time

Each BRAM is shared by 2 stages for writing and by 2 stages for reading (e.g., write: stages 2 and 3; read: stages 3 and 4). Because the address bus is time-shared among 4 double accesses, each channel performs a double write/read only once every 8 clock cycles. This results in:

Load time + Computation time + Readout time @ 10 ns Clk ≈ 2.5 µs + 8 × 10 µs + 2.5 µs = 85 µs =(
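The estimate above can be reproduced with a few lines of Python. This is only a sketch under assumptions that are not spelled out in the post: one sample is loaded or read out per clock, and each stage runs its 128 butterflies one after another at 8 clocks per butterfly. The constant and function names are made up for illustration.

```python
# Back-of-the-envelope timing for one 256-point FFT.
CLK_NS = 10                # clock period from the post
N = 256                    # FFT size
STAGES = 8                 # log2(256) radix-2 stages
CLOCKS_PER_BUTTERFLY = 8   # one double read/write per channel every 8 clocks

# Assumption: a stage handles its N/2 butterflies strictly sequentially.
BUTTERFLIES_PER_STAGE = N // 2


def fft256_time_us():
    """Return (load, per-stage, readout, total) times in microseconds."""
    load_ns = N * CLK_NS       # assumption: one sample per clock
    stage_ns = BUTTERFLIES_PER_STAGE * CLOCKS_PER_BUTTERFLY * CLK_NS
    readout_ns = N * CLK_NS    # assumption: one sample per clock
    total_ns = load_ns + STAGES * stage_ns + readout_ns
    return load_ns / 1000, stage_ns / 1000, readout_ns / 1000, total_ns / 1000


load_us, stage_us, readout_us, total_us = fft256_time_us()
print(f"load {load_us} us, stage {stage_us} us, "
      f"readout {readout_us} us, total {total_us} us")
```

Without rounding, these assumptions give 2.56 µs + 8 × 10.24 µs + 2.56 µs = 87.04 µs, which matches the rounded 85 µs figure above closely enough to suggest where the time goes: almost all of it is the 8 clocks spent per butterfly in every stage.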

1a. [Done] Removed 2 MULTs from stages 7 & 8: the sine and cosine values there are ±1 or 0, so no multiplication is needed.
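A quick way to check which stages have only trivial twiddle factors is to list them. The sketch below assumes a textbook radix-2 DIF layout with the forward-transform sign convention exp(-j2πk/N) and stages numbered 1 to 8; the actual firmware may index its stages or coefficients differently, and the helper names are illustrative.

```python
import cmath

N = 256
STAGES = 8


def stage_twiddles(stage):
    """Distinct twiddle factors of a radix-2 DIF stage (stages numbered 1..8)."""
    half = N >> stage          # butterfly half-span: 128, 64, ..., 1
    step = 1 << (stage - 1)    # twiddle exponent stride: 1, 2, ..., 128
    return [cmath.exp(-2j * cmath.pi * k * step / N) for k in range(half)]


def is_trivial(w, eps=1e-12):
    """True if both the real and imaginary parts are -1, 0 or +1."""
    return all(
        min(abs(part - v) for v in (-1.0, 0.0, 1.0)) < eps
        for part in (w.real, w.imag)
    )


trivial_stages = [
    s for s in range(1, STAGES + 1)
    if all(is_trivial(w) for w in stage_twiddles(s))
]
print(trivial_stages)  # prints [7, 8]
```

In this model stage 7 only uses the factors 1 and -j, and stage 8 only uses 1, so the "multiplication" reduces to swapping or negating the Re and Im parts.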

2. [In Progress] FFT256 verification: calculated the coefficients in an OOo (OpenOffice.org) spreadsheet. For the tested sequence the results are almost equal. It would be better to move this check into the testbench.
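As an alternative to the spreadsheet, a small software reference model could generate the expected values for the testbench. The following is a minimal sketch, not the firmware's algorithm: a floating-point radix-2 DIF FFT checked against a direct DFT. The test sequence and all names are hypothetical, and a 16-bit fixed-point implementation would only match it up to rounding error, so any comparison needs a tolerance.

```python
import cmath


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; returns output in natural order."""
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    a = [complex(v) for v in x]
    half = n // 2
    while half >= 1:
        step = n // (2 * half)             # twiddle exponent stride
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k * step / n)
                u, v = a[start + k], a[start + k + half]
                a[start + k] = u + v               # upper butterfly output
                a[start + k + half] = (u - v) * w  # lower output, then twiddle
        half //= 2
    # DIF leaves the results in bit-reversed order; undo that here.
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        out[int(format(i, f"0{bits}b")[::-1], 2)] = a[i]
    return out


def dft(x):
    """Direct O(n^2) DFT used as the golden reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Hypothetical integer test sequence: a square wave plus a ramp.
samples = [(1000 if (t // 16) % 2 == 0 else -1000) + t for t in range(256)]
max_err = max(abs(p - q) for p, q in zip(fft_dif(samples), dft(samples)))
print(f"max |FFT - DFT| = {max_err:.3e}")
```

The values returned by `fft_dif` could be written to a file and compared against the simulation output, with the tolerance chosen to cover the rounding of the 16-bit data path.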

TODO (almost the same as before, because the initial plan assumed a DIT algorithm and I have been writing DIF):

  1. Write a correlation computation block.
  2. Make the FFT run faster: get away from full buffering between stages.
  3. Set up the memory controller for frame reads/writes.
  4. Integrate the correlation block into the 10359’s firmware.
