
Explore commonly used hardware architectures, including RAM behavior, synchronous FIFO design, pipeline fundamentals to maximize clock frequency, arbiters, CDC and asynchronous FIFO, and slide window concepts, with exercises.
Understand the behavior of SRAM cells, including single-port, two-port, and dual-port variants, with write and read timing, bypass options, byte enable, and memory compiler considerations.
This lecture covers the handshake interface and the synchronous FIFO, detailing valid and ack signaling, push and pop operations, and implementing depth, full, and empty flags with two pointers.
Calculate FIFO depth to smooth bandwidth mismatch between producer and consumer blocks. A 100 MHz example yields a minimum depth of four.
Explain pipeline fundamentals, define the maximum clock frequency and throughput, and show how inserting pipeline registers increases throughput while maintaining efficiency, with timing examples.
Explore pipeline design through a BIN2BCD converter, turning an 11-bit two's complement input into a 17-bit BCD output, with enumeration-based architecture and pipelined Verilog implementation considerations.
Design a pipelined SAD calculation to compute the sum of absolute differences for 16 by 16 eight-bit pixels, using 256 parallel subtractors and staged accumulation for high clock frequency.
Learn arbiter fundamentals: the function of an arbiter and two common implementations, fixed priority and pseudo round-robin. Then develop a real round-robin arbiter and extend it to arbitrary arbitration algorithms.
Explore an alternative hardware architecture for pseudo round-robin arbitration using a single fixed-priority arbiter with four selectors to realize four priority settings, plus a trace-back mechanism for input IDs.
Explore the real round-robin arbiter hardware that uses a priority register to encode, shift, and select per-request priority for four inputs, with fixed-priority and bandwidth guarantee variants.
Explore architectures that omit an arbiter by scheduling non-overlapping read/write windows for four blocks within a single frame, using a one-hot decoder to generate enables and selects.
Explore clock domain crossing concepts, including synchronous versus asynchronous clocks, metastability, and designing single-bit and multi-bit CDC circuits, such as asynchronous FIFOs, to preserve data integrity.
Demonstrates a 1-bit clock-domain crossing circuit using two cascaded D flip-flops to transfer a rising edge across asynchronous clocks while analyzing metastability, setup and hold violations, and timing budgets.
Master multi-bit CDC circuits by using two-stage and one-stage synchronizers with handshake signals to safely transfer data between clock zero and clock one, addressing metastability and ensuring data validity.
Design a multi-bit synchronizer with a data synchronization kernel to cross clock domains using a handshake interface, choosing between one-pass or two-pass architectures, and validate with automated test patterns.
Explore the reference design for a multi-bit synchronizer handshake using x, y, and z, then design your own, compare to the code, and seek clarification on Udemy if needed.
Explore asynchronous and synchronous FIFO designs with a dual-port RAM, Gray-coded multi-bit pointers, and cross-clock-domain synchronization for accurate empty and full flag generation.
Explore verifying clock domain crossing circuits with RTL and gate-level simulations, highlight metastability limits, and apply setup/hold checks and timing constraints to ensure CDC reliability.
Explore a reference design for an async FIFO within VLSI/FPGA design, highlighting how common hardware architectures enable robust data handling.
Explore clock domain crossing and metastability, and learn how setup time, theta term, and DFF timing affect data integrity across synchronization stages.
Explore how ping-pong architecture uses dual buffers to balance data throughput between producer and consumer blocks, enabling simultaneous operation and smoother matrix transposition.
Engage in practice time with the reference code of matrix_trans to explore commonly used hardware architectures in VLSI/FPGA design.
Design pipelines with flow control using handshake interfaces to ensure flexible data transfer and feedback. Evaluate one-register, global enable, and two-depth FIFO architectures to optimize throughput and break combinational paths.
Discover the reference code for sad_cal_ctrl in the VLSI/FPGA design course, highlighting commonly used hardware architectures and practical design patterns.
Identify pipeline hazards: data dependency, control dependency, and hardware structure hazards. Learn how forward paths resolve data dependencies, reduce bubbles, and leverage branch prediction.
Study reference code for swapping items in an array within the context of common hardware architectures used in VLSI/FPGA design.
Explore slide window architectures for data reuse in hardware, using 1D and 2D cases, shift registers and line buffers with boundaries and padding for a 5x5 Gauss filter.
Design and implement a 2D Gauss filter with a 5x5 kernel for the luma component in YUV 4:2:2 video, bypass UV, and verify with a testbench.
Engage in practice time with reference code for a 2D Gauss filter to empower hands-on learning in VLSI/FPGA architectures.
Design a 2D DMA controller IP block that copies data between system addresses in an FPGA project, showcasing resume-ready skills as the DMA drives a 2D game on a touch screen.
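One idea from the asynchronous FIFO lectures above, Gray-coded pointers, can be illustrated with a small behavioral sketch. It is written in Python rather than Verilog and is not the course's reference code; the function names and the 4-bit pointer width are assumptions chosen for illustration. It shows why Gray coding suits pointers that cross clock domains: each increment changes exactly one bit, so a synchronizer that samples mid-transition sees either the old or the new value, never an unrelated one.

```python
def bin_to_gray(value: int) -> int:
    """Convert a binary count to its Gray-code equivalent."""
    return value ^ (value >> 1)


def gray_to_bin(gray: int) -> int:
    """Convert a Gray-code value back to binary by XOR-ing all right shifts."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a: int, b: int) -> int:
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


PTR_WIDTH = 4            # hypothetical pointer width, chosen for illustration
DEPTH = 1 << PTR_WIDTH   # number of distinct pointer values

# Gray-coded pointer values in counting order.
gray_sequence = [bin_to_gray(i) for i in range(DEPTH)]

if __name__ == "__main__":
    for i, g in enumerate(gray_sequence):
        nxt = gray_sequence[(i + 1) % DEPTH]
        print(f"{i:2d}  gray={g:0{PTR_WIDTH}b}  bits changed to next={bits_changed(g, nxt)}")
```

The single-bit property also holds at the wrap-around from the last value back to zero, which is what lets a full-range pointer be synchronized with a plain two-flop synchronizer per bit.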
Please contact SKY (DM or e-mail to siliconthink@126.com) for a special offer of $12.99 USD.
In this chapter I will introduce commonly used hardware architectures, including:
1: Behavior of SRAM and usage suggestions;
2: Handshake interface and synchronous FIFO;
3: Pipeline to maximize clock frequency;
4: Arbiter;
5: Clock domain crossing (CDC) and asynchronous FIFO;
6: Ping-Pong;
7: Pipeline with control (feedback);
8: Pipeline with hazard and forward path;
9: Slide window;
These are useful architectures that engineers use to deal with complex designs, such as a RISC-V CPU core, an AI accelerator, and so on. To help you master them, I will assign a coding exercise after each section.
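To give a flavor of item 2 in the list above, here is a minimal behavioral sketch of a synchronous FIFO with two pointers. It is a Python model, not Verilog and not the course's reference code; the class name `SyncFifo`, the power-of-two depth, and the extra wrap bit on each pointer are assumptions made for this illustration. One common way to derive the flags is shown: the FIFO is empty when the pointers are equal, and full when they differ only in the wrap bit.

```python
class SyncFifo:
    """Behavioral model of a synchronous FIFO (hypothetical, for illustration)."""

    def __init__(self, depth: int = 4):
        # Assumption: depth is a power of two so the wrap-bit trick works.
        assert depth > 0 and depth & (depth - 1) == 0
        self.depth = depth
        self.mem = [None] * depth
        # Each pointer counts modulo 2*depth: address bits plus one wrap bit.
        self.wr_ptr = 0
        self.rd_ptr = 0

    @property
    def empty(self) -> bool:
        # Same address and same wrap bit: nothing stored.
        return self.wr_ptr == self.rd_ptr

    @property
    def full(self) -> bool:
        # Same address but different wrap bit: writer is one lap ahead.
        return (self.wr_ptr ^ self.rd_ptr) == self.depth

    def push(self, data) -> bool:
        """Write one entry; returns False (and stores nothing) when full."""
        if self.full:
            return False
        self.mem[self.wr_ptr % self.depth] = data
        self.wr_ptr = (self.wr_ptr + 1) % (2 * self.depth)
        return True

    def pop(self):
        """Read one entry; returns None when empty (so do not store None)."""
        if self.empty:
            return None
        data = self.mem[self.rd_ptr % self.depth]
        self.rd_ptr = (self.rd_ptr + 1) % (2 * self.depth)
        return data


if __name__ == "__main__":
    fifo = SyncFifo(depth=4)
    for word in (10, 11, 12, 13):
        fifo.push(word)
    print("full:", fifo.full)
    print("popped:", [fifo.pop() for _ in range(4)])
    print("empty:", fifo.empty)
```

In hardware the same comparison costs one extra pointer bit and avoids a separate occupancy counter; a counter-based design is an equally valid alternative.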
This is chapter 3 of the whole Digital IC and FPGA design course.
In the whole course, I will introduce the fundamentals of digital IC and FPGA design, with 12+ coding exercises and 3 course projects.
Theory part: MOS transistor -> logic cells -> arithmetic data path -> Verilog language -> commonly used HW function blocks and architectures -> STA -> on-chip bus (APB/AHB-Lite/AXI4) -> low power design -> DFT -> SoC (MCU level).
Function blocks and architecture: FSM, pipeline, arbiter, CDC, sync_fifo, async_fifo, ping-pong, pipeline with control, slide window, pipeline hazard and forward path, systolic.
Project: SHA-256 algorithm with simple interface, SHA-256 with APB/AXI interface, 2D DMA controller with APB/AXI interface.
After explaining each HW architecture, I will give you a coding exercise with reference code. Coding difficulty will grow from several lines to fifty lines, then more than 100 lines, then around 200 lines, while the final big project will be 1000+ lines.
I consider these the essential knowledge and skills you need to master to enter this area.
I will try my best to explain what -> how -> why, and encourage you to do it better in this course.
Please browse to my homepage on Udemy to obtain information about each chapter of this course.