VLSI/FPGA Design P3: Common Used Hardware Architectures

Rating: 4.2 (9 reviews)

A Big Step Towards Complex IP Design

Created by SKY SiliconThink

Last updated 3/2025

English

What you'll learn

Behavior of SRAM and usage suggestions
Handshake interface and synchronous FIFO
Pipeline to maximal clock frequency
Arbiter
Cross clock domain (CDC) and asynchronous FIFO
Ping-Pong
Pipeline with control (feedback)
Pipeline with hazard and forward path
Slide window

Course content

10 sections • 30 lectures • 10h 29m total length

Introduction (2:21)
Explore commonly used hardware architectures, including RAM behavior, synchronous FIFO design, pipeline fundamentals to maximize clock frequency, arbiters, CDC and asynchronous FIFO, and slide window concepts with exercises.

Handshake Interface and Sync_FIFO (27:59)
This lecture covers the handshake interface and the synchronous FIFO, detailing valid and ack signaling, push and pop operations, and how to implement the depth, full, and empty flags with two pointers.
Depth Calculation for FIFO (14:04)
Calculate the FIFO depth needed to smooth a bandwidth mismatch between producer and consumer blocks. A 100 MHz example yields a minimum depth of four.
Design and Verification of Sync_FIFO
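
The two-pointer scheme described in this section can be sketched as a small behavioral model. This is a minimal Python illustration, not the course's reference Verilog: the class name `SyncFifo`, its method names, the power-of-two depth, and the extra wrap bit on each pointer (a common way to tell full from empty) are all assumptions made for this example.

```python
class SyncFifo:
    """Behavioral model of a synchronous FIFO with two pointers.

    Assumption: each pointer counts modulo 2 * depth, so it carries one
    extra wrap bit that separates the full case from the empty case.
    """

    def __init__(self, depth):
        assert depth > 0 and depth & (depth - 1) == 0, "depth must be a power of two"
        self.depth = depth
        self.mem = [0] * depth
        self.wr_ptr = 0  # write pointer, modulo 2 * depth
        self.rd_ptr = 0  # read pointer, modulo 2 * depth

    @property
    def empty(self):
        # Pointers are identical, wrap bit included.
        return self.wr_ptr == self.rd_ptr

    @property
    def full(self):
        # Write pointer is exactly one full lap ahead of the read pointer.
        return (self.wr_ptr - self.rd_ptr) % (2 * self.depth) == self.depth

    def push(self, data):
        """Store data; return False (and store nothing) when full."""
        if self.full:
            return False
        self.mem[self.wr_ptr % self.depth] = data
        self.wr_ptr = (self.wr_ptr + 1) % (2 * self.depth)
        return True

    def pop(self):
        """Return the oldest item, or None when empty."""
        if self.empty:
            return None
        data = self.mem[self.rd_ptr % self.depth]
        self.rd_ptr = (self.rd_ptr + 1) % (2 * self.depth)
        return data
```

In this sketch, a rejected `push` plays the role of a producer being stalled by the full flag, and a `pop` returning `None` plays the role of a consumer seeing the empty flag.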

Pipeline Fundamental (19:58)
Explain pipeline fundamentals, define the maximum clock frequency and throughput, and show how inserting pipeline registers increases throughput while maintaining efficiency, with timing examples.
Pipeline Design Example: BIN2BCD (44:47)
Explore pipeline design through a BIN2BCD converter, turning an 11-bit two's complement input into a 17-bit BCD output, with enumeration-based architecture and pipelined Verilog implementation considerations.
BIN2BCD Convertor Design
Pipeline Design Example: SAD_Cal (48:11)
Design a pipelined SAD calculation to compute the sum of absolute differences for 60 by 60 eight-bit pixels, using 256 parallel subtractors and staged accumulation for high clock frequency.
Coding Exercise: SAD_CAL Design
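
To make "parallel subtractors and staged accumulation" concrete, the following Python sketch computes a sum of absolute differences level by level, the way an adder tree would. It is only an illustration under assumptions: the function name `sad_adder_tree` is hypothetical, the block size is whatever the caller passes in, and a real design may group several adder levels into one pipeline stage.

```python
def sad_adder_tree(block_a, block_b):
    """Sum of absolute differences, accumulated level by level.

    Returns (sad, levels), where levels is the number of two-input adder
    levels after the subtractors, i.e. the candidate pipeline cut points.
    """
    assert len(block_a) == len(block_b) and len(block_a) > 0
    # Stage 0: one absolute difference per pixel pair (parallel subtractors).
    values = [abs(a - b) for a, b in zip(block_a, block_b)]
    levels = 0
    # Each loop iteration models one level of two-input adders.
    while len(values) > 1:
        if len(values) % 2:
            values.append(0)  # pad an odd-sized level with zero
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
        levels += 1
    return values[0], levels
```

For 256 inputs the tree has 8 adder levels, which shows why the accumulation is split across stages instead of being done in one long combinational path.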

Basic Arbiter Design (38:28)
Cover arbiter fundamentals: explain the function of an arbiter and two common implementations, fixed priority and pseudo round-robin, then develop real round-robin and extend to arbitrary arbitration algorithms.
Other HW Arch. for Pseudo Round-Robin (7:20)
Explore an alternative hardware architecture for pseudo round-robin arbitration using a single fixed-priority arbiter with four selectors to realize four priority settings, plus a trace-back mechanism for input IDs.
Real Round-Robin Arbiter (20:56)
Explore the real round-robin arbiter hardware that uses a priority register to encode, shift, and select per-request priority for four inputs, with fixed-priority and bandwidth guarantee variants.
Architecture without Arbiter (4:33)
Explore architectures that omit an arbiter by scheduling non-overlapping read/write windows for four blocks within a single frame, using a one-hot decoder to generate enables and selects.
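
The difference between fixed-priority and rotating-priority arbitration can be sketched in a few lines of Python. This is a generic illustration only: the names `fixed_priority` and `RoundRobinArbiter` are hypothetical, and the sketch does not claim to match the course's pseudo or real round-robin hardware.

```python
def fixed_priority(requests):
    """Return the index of the lowest-numbered active request, or None."""
    for i, req in enumerate(requests):
        if req:
            return i
    return None


class RoundRobinArbiter:
    """Rotating-priority arbiter: the search starts just after the last winner."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.start = 0  # input that currently has the highest priority

    def grant(self, requests):
        """Return the granted input index, or None if nothing is requested."""
        assert len(requests) == self.num_inputs
        for offset in range(self.num_inputs):
            i = (self.start + offset) % self.num_inputs
            if requests[i]:
                # The winner gets the lowest priority for the next round.
                self.start = (i + 1) % self.num_inputs
                return i
        return None
```

With all four inputs requesting continuously, `fixed_priority` always grants input 0, while `RoundRobinArbiter` cycles through 0, 1, 2, 3, so no requester starves.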

Phenomenon of Metastability and Goals of CDC Circuit (25:27)
Explore clock domain crossing concepts, including synchronous versus asynchronous clocks, metastability, and designing single-bit and multi-bit CDC circuits, such as asynchronous FIFOs, to preserve data integrity.
1bit CDC Circuit (34:40)
Demonstrates a 1-bit clock-domain crossing circuit using two cascaded D flip-flops to transfer a rising edge across asynchronous clocks, while analyzing metastability, setup and hold violations, and timing budgets.
Multi-Bit CDC Circuits (33:08)
Master multi-bit CDC circuits by using two-stage and one-stage synchronizers with handshake signals to safely transfer data between clock zero and clock one, addressing metastability and ensuring data validity.
Practice time: Multi-Bit Synchronizer Design (3:39)
Design a multi-bit synchronizer with a data synchronization kernel to cross clock domains using a handshake interface, choosing between one-pass or two-pass architectures, and validate with automated test patterns.
Multi-Bit Synchronizer Design
Reference Design for Multi-Bit Synchronizer (6:15)
Explore the reference design for a multi-bit synchronizer handshake using x, y, and z, then design your own, compare to the code, and seek clarification on Udemy if needed.
Async_FIFO Design (36:39)
Explore asynchronous and synchronous FIFO designs with a dual-port RAM, Gray-coded multi-bit pointers, and cross clock domain synchronization for accurate empty and full flag generation.
Verify and Timing Constraint for CDC Circuit (15:26)
Explore verifying clock domain crossing circuits with RTL and gate-level simulations, highlight metastability limits, and apply setup/hold checks and timing constraints to ensure CDC reliability.
Practice Time: Async_FIFO Design
Reference Design for Async_FIFO (10:48)
Explore a reference design for an async FIFO within VLSI/FPGA design, highlighting how common hardware architectures enable robust data handling.
Extra: Why 100% Guarantee of Data Integrity (20:56)
Explore clock domain crossing and metastability, and learn how setup time, theta term, and df timing affect data integrity across synchronization stages.
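
The reason asynchronous FIFOs pass Gray-coded pointers across clock domains is that consecutive Gray codes differ in exactly one bit, so a pointer sampled mid-transition is either the old value or the new one. The Python sketch below shows the standard binary/Gray conversions; the function names are hypothetical and chosen for this example.

```python
def bin_to_gray(value):
    """Convert a non-negative binary count to its Gray code."""
    return value ^ (value >> 1)


def gray_to_bin(gray):
    """Convert a Gray code back to the binary count."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

For a 4-bit pointer, `bits_changed(bin_to_gray(n), bin_to_gray((n + 1) % 16))` is 1 for every `n`, including the wrap from 15 back to 0, whereas plain binary changes all four bits on that wrap.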

Ping-Pong Architecture (40:50)
Explore how ping-pong architecture uses dual buffers to balance data throughput between producer and consumer blocks, enabling simultaneous operation and smoother matrix transposition.
Practice Time: matrix_trans design
Practice Time: Reference Code of matrix_trans (8:21)
Engage in practice time with the reference code of matrix_trans to explore commonly used hardware architectures in VLSI/FPGA design.
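
The dual-buffer idea behind ping-pong can be modeled step by step: while the producer fills one buffer with the next matrix, the consumer drains the other buffer in transposed order. The Python sketch below is an assumption-laden illustration, not the course's matrix_trans reference code; the function name `pingpong_transpose` and the one-matrix-per-step timing are made up for this example.

```python
def pingpong_transpose(matrices):
    """Transpose a stream of equally sized matrices using two buffers."""
    buffers = [None, None]
    outputs = []
    # One extra step is needed to drain the last filled buffer.
    for step in range(len(matrices) + 1):
        write_sel = step % 2      # buffer the producer fills in this step
        read_sel = 1 - write_sel  # buffer the consumer drains in this step
        # Consumer: read the previously filled buffer column by column.
        if step > 0:
            src = buffers[read_sel]
            rows, cols = len(src), len(src[0])
            outputs.append([[src[r][c] for r in range(rows)] for c in range(cols)])
        # Producer: write the next matrix row by row into the other buffer.
        if step < len(matrices):
            buffers[write_sel] = [list(row) for row in matrices[step]]
    return outputs
```

Because reads and writes never target the same buffer within a step, neither side has to wait for the other, which is the throughput benefit the lecture describes.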

Pipeline with Flow Control (31:10)
Design pipelines with flow control using handshake interfaces to ensure flexible data transfer and feedback. Evaluate one-register, global enable, and depth-two FIFO architectures to optimize throughput and break combinational paths.
Practice Time: Modify sad_cal with handshake interface
Reference Code of sad_cal_ctrl (9:10)
Discover the reference code for sad_cal_ctrl in the VLSI/FPGA design course, highlighting commonly used hardware architectures and practical design patterns.
Pipeline Hazard and Forward Path (26:46)
Identify pipeline hazards: data dependency, control dependency, and hardware structure hazards. Learn how forward paths resolve data dependencies, reduce bubbles, and leverage branch prediction.
Practice Time: Swap Items in an Array
Reference Code of Swap Items in An Array (7:34)
Study reference code for swapping items in an array within the context of common hardware architectures used in VLSI/FPGA design.
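
A data hazard and its forward path can be illustrated with a toy two-stage read-modify-write pipeline. In the Python sketch below, stage 1 reads memory and stage 2 writes the result one cycle later, so back-to-back updates to the same address read stale data unless the pending value is forwarded. The function name `run_increments` and the two-stage timing are assumptions for this example; it is not the course's reference design for the array swap exercise.

```python
def run_increments(mem, ops, forward):
    """Apply (addr, delta) read-modify-write ops through a 2-stage pipeline.

    With forward=True, a read that targets the address of the in-flight
    write takes the pending value instead of the stale memory content.
    """
    mem = list(mem)
    pending = None  # (addr, value) held in the write-stage register
    for op in list(ops) + [None]:  # trailing None flushes the pipeline
        next_pending = None
        if op is not None:
            addr, delta = op
            if forward and pending is not None and pending[0] == addr:
                operand = pending[1]  # forward path: bypass stale memory
            else:
                operand = mem[addr]   # normal read from memory
            next_pending = (addr, operand + delta)
        if pending is not None:
            mem[pending[0]] = pending[1]  # write stage commits at end of cycle
        pending = next_pending
    return mem
```

Running two consecutive increments on address 0 gives 1 without forwarding (the second read misses the first write) and the expected 2 with forwarding, without inserting a bubble.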

Principle of Slide Window (19:32)
Explore slide window architectures for data reuse in hardware, using 1D and 2D cases, shift registers and line buffers with boundaries and padding for a 5x5 Gauss filter.
Chapter Exercise: 2D Gauss Filter Design (6:05)
Design and implement a 2D Gauss filter with a 5x5 kernel for the luma component in YUV 4:2:2 video, bypass UV, and verify with a testbench.
Practice Time: Design 2D Gauss Filter
Practice Time: Reference code of 2D Gauss Filter (11:55)
Engage in practice time with reference code for a 2D Gauss Filter to empower hands-on learning in VLSI/FPGA architectures.
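
The 1D case of the slide window can be sketched with a shift register that reuses all but one sample on every step. The Python example below is a minimal illustration under assumptions: the function names, the zero padding at the boundaries, and the 3-tap kernel in the usage note are chosen for this example and are not taken from the course's Gauss filter exercise.

```python
from collections import deque


def sliding_windows(samples, size, pad_value=0):
    """Return one centered window per input sample, as a shift register would."""
    assert size % 2 == 1, "use an odd window so it has a center sample"
    half = size // 2
    # Pad both boundaries so the first and last samples get full windows.
    stream = [pad_value] * half + list(samples) + [pad_value] * half
    shift_reg = deque(maxlen=size)  # oldest sample falls out on each shift
    windows = []
    for sample in stream:
        shift_reg.append(sample)
        if len(shift_reg) == size:
            windows.append(list(shift_reg))
    return windows


def filter_1d(samples, kernel):
    """Weighted sum of each window with the kernel (no normalization)."""
    return [
        sum(w * k for w, k in zip(window, kernel))
        for window in sliding_windows(samples, len(kernel))
    ]
```

For example, `filter_1d([1, 2, 3, 4], [1, 2, 1])` returns `[4, 8, 12, 11]`. The 2D case extends the same idea by adding line buffers, so that each new pixel completes a full column of the window.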

Requirements

Basic knowledge of digital fundamentals
Basic C or C++ programming language
Basic Verilog Language

Description

Please contact SKY (DM or E-mail to siliconthink@126.com) for special offer of $12.99 USD.

In this chapter I will introduce commonly used hardware architectures, including:

1: Behavior of SRAM and usage suggestions;

2: Handshake interface and synchronous FIFO;

3: Pipeline to maximal clock frequency;

4: Arbiter;

5: Cross clock domain (CDC) and asynchronous FIFO;

6: Ping-Pong;

7: Pipeline with control (feedback);

8: Pipeline with hazard and forward path;

9: Slide window;

Engineers use these architectures to handle complex designs such as RISC-V CPU cores and AI accelerators. To help you master them, I will assign a coding exercise after each section.

This is chapter 3 of the whole Digital IC and FPGA design course.

In the whole course, I will introduce fundamentals of digital IC and FPGA design, with 12+ coding exercises and 3 course projects.

Theory part: MOS transistor -> logic cells -> arithmetic data path -> Verilog language -> commonly used HW function blocks and architecture -> STA -> on-chip bus (APB/AHB-Lite/AXI4) -> low power design -> DFT -> SoC (MCU level).

Function blocks and architecture: FSM, pipeline, arbiter, CDC, sync_fifo, async_fifo, ping-pong, pipeline with control, slide window, pipeline hazard and forward path, systolic.

Project: SHA-256 algorithm with simple interface, SHA-256 with APB/AXI interface, 2D DMA controller with APB/AXI interface.

After explaining each HW architecture, I will give you a coding exercise with reference code. Coding difficulty grows from several lines to about fifty lines, then more than 100 lines, then around 200 lines, while the final big project will be 1000+ lines.

I consider these the essential knowledge and skills you need to master to enter this area.

I will try my best to explain what -> how -> why and encourage you to do it better in this course.

Please browse to my homepage on Udemy to obtain information about each chapter of this course.

Who this course is for:

Senior undergraduate students of EE or higher
IC design/verification engineers with 0 to 2 years of experience

Course content

Introduction: 1 lecture • 2min

SRAM: 1 lecture • 51min

Handshake Interface and Synchronous FIFO: 2 lectures • 42min

Pipeline Design: 3 lectures • 1hr 53min

Arbiter Design: 4 lectures • 1hr 11min

Clock Domain Cross Design (CDC): 9 lectures • 3hr 7min

Ping-Pong Architecture: 2 lectures • 49min

Advanced Topics of Pipeline: 4 lectures • 1hr 15min

Slide Window: 3 lectures • 38min

SP: VLSI/FPGA Design Resume Project: 2D DMA Controller with APB+AXI Interface: 1 lecture • 2min