Udemy
30-Day Money-Back Guarantee

CUDA programming Masterclass with C++

Learn parallel programming on GPUs with CUDA, from basic concepts to advanced algorithm implementations.
Bestseller
Rating: 4.4 out of 5 (691 ratings)
4,438 students
Created by Kasun Liyanage
Last updated 9/2020
English

What you'll learn

  • All the basic knowledge about CUDA programming
  • The ability to design and implement optimized parallel algorithms
  • The basic workflow of parallel algorithm design
Curated for the Udemy for Business collection

Course content

8 sections • 83 lectures • 10h 47m total length

  • Preview
    07:48
  • Preview
    08:50
  • Preview
    07:19
  • Let's investigate some background.
    3 questions
  • How to install CUDA toolkit and first look at CUDA program
    06:12
  • Basic elements of CUDA program
    16:50
  • Organization of threads in a CUDA program - threadIdx
    08:38
  • Organization of threads in a CUDA program - blockIdx, blockDim, gridDim
    06:14
  • Programming exercise 1
    00:29
  • Unique index calculation using threadIdx, blockIdx and blockDim
    09:20
  • Unique index calculation for 2D grid 1
    05:53
  • Unique index calculation for 2D grid 2
    05:09
  • Memory transfer between host and device
    11:13
  • Programming exercise 2
    01:04
  • Sum array example with validity check
    09:13
  • Sum array example with error handling
    04:32
  • Sum array example with timing
    08:18
  • Extend sum array implementation to sum up 3 arrays
    1 question
  • Device properties
    05:30
  • Summary
    04:17

  • Preview
    08:46
  • All about warps
    09:43
  • Warp divergence
    12:28
  • Resource partitioning and latency hiding 1
    05:35
  • Resource partitioning and latency hiding 2
    10:41
  • Occupancy
    11:16
  • Profile driven optimization with nvprof
    12:04
  • Parallel reduction as synchronization example
    19:08
  • Parallel reduction as warp divergence example
    10:11
  • Parallel reduction with loop unrolling
    07:03
  • Parallel reduction as warp unrolling
    06:48
  • Reduction with complete unrolling
    04:09
  • Performance comparison of reduction kernels
    05:18
  • CUDA Dynamic parallelism
    10:03
  • Reduction with dynamic parallelism
    05:33
  • Summary
    04:36

  • CUDA memory model
    06:49
  • Different memory types in CUDA
    09:04
  • Memory management and pinned memory
    07:19
  • Zero copy memory
    08:45
  • Unified memory
    04:39
  • Global memory access patterns
    12:55
  • Global memory writes
    03:53
  • AOS vs SOA
    06:03
  • Matrix transpose
    19:34
  • Matrix transpose with unrolling
    06:21
  • Matrix transpose with diagonal coordinate system
    08:36
  • Summary
    03:00

  • Introduction to CUDA shared memory
    09:04
  • Shared memory access modes and memory banks
    09:06
  • Row major and Column major access to shared memory
    08:51
  • Static and Dynamic shared memory
    04:19
  • Shared memory padding
    05:44
  • Parallel reduction with shared memory
    04:44
  • Synchronization in CUDA
    03:38
  • Matrix transpose with shared memory
    11:53
  • CUDA constant memory
    13:10
  • Matrix transpose with Shared memory padding
    05:47
  • CUDA warp shuffle instructions
    14:59
  • Parallel reduction with warp shuffle instructions
    03:50
  • Summary
    02:10

  • Preview
    06:25
  • How to use CUDA asynchronous functions
    07:10
  • How to use CUDA streams
    10:28
  • Overlapping memory transfer and kernel execution
    05:23
  • Stream synchronization and blocking behaviour of the NULL stream
    06:57
  • Explicit and implicit synchronization
    02:31
  • CUDA events and timing with CUDA events
    06:03
  • Creating inter stream dependencies with events
    04:31

  • Preview
    04:01
  • Floating point operations
    06:46
  • Standard and intrinsic functions
    08:29
  • Atomic functions
    08:22

  • Scan algorithm introduction
    05:38
  • Simple parallel scan
    08:24
  • Work efficient parallel exclusive scan
    09:33
  • Work efficient parallel inclusive scan
    07:41
  • Parallel scan for large data sets
    04:52
  • Parallel Compact algorithm
    07:49

  • Introduction part 1
    08:04
  • Introduction part 2
    11:41
  • Digital image processing
    09:39
  • Digital image fundamentals : Human perception
    11:10
  • Digital image fundamentals : Image formation
    15:22
  • OpenCV installation
    06:28

Requirements

  • Basic C or C++ programming knowledge
  • Familiarity with the Visual Studio IDE
  • CUDA toolkit
  • NVIDIA GPU

Description

This course is all about CUDA programming. We start with the basic concepts, including the CUDA programming model, execution model, and memory model. Then we show you how to implement advanced algorithms using CUDA. CUDA programming is all about performance, so throughout this course you will learn multiple optimization techniques and how to use them to implement algorithms. We also discuss profiling techniques extensively, along with some of the tools in the CUDA toolkit, including nvprof, nvvp, CUDA Memcheck, and CUDA-GDB. This course contains the following sections:

  • Introduction to CUDA programming and CUDA programming model
  • CUDA Execution model
  • CUDA memory model - Global memory
  • CUDA memory model - Shared and Constant memory
  • CUDA streams
  • Tuning CUDA instruction level primitives
  • Algorithm implementation with CUDA
  • CUDA tools

This course also includes many programming exercises and quizzes. Working through all of them will help you digest the concepts we discuss here.

This is the first course in the CUDA master class series we are currently working on, so the knowledge you gain here is essential for following the later courses as well.

Who this course is for:

  • Anyone who wants to learn CUDA programming from scratch to an intermediate level

Featured review

Rohit Singh
7 courses
4 reviews
Rating: 5.0 out of 5, a year ago
The course is really well structured and covers the concepts at just the right pace and with the right combination of theoretical background and implementation in practice. This is a good place to start learning parallel programming.

Instructor

Kasun Liyanage
Software engineer and founder of intellect, co-founder at cpphive
  • 4.3 Instructor Rating
  • 1,633 Reviews
  • 12,970 Students
  • 3 Courses

Software engineer with years of industry experience in the C++ and Java programming languages. Entrepreneur and founder of intellect. Creator of the GPU MLIB library, which provides GPU-optimized parallel implementations of machine learning algorithms. My current projects include a fashion design framework that gives users a live fitting-room experience. I am a graduate in electrical and information engineering, and I am currently reading for a master's degree in artificial intelligence.

© 2021 Udemy, Inc.