
Illustrate a race in kernel programming where a function returns a pointer: two processes both see NULL and allocate memory, one overwrites the other's pointer, and the overwritten allocation is leaked. The session discusses remedies.
Explore concurrency and context switching, distinguishing the illusion of parallelism on a single core from true parallelism on multi-core systems, and learn to identify cores and processor usage.
Explain how multiprocessors evolved from private per-CPU operating systems to a single symmetric multiprocessing kernel with per-region locks, solving system-call bottlenecks and avoiding the big kernel lock.
Preemption forcibly switches out the running process. User programs are always preemptible; with the CONFIG_PREEMPT option, code running in kernel space is preemptible as well, which avoids lockups when a loop runs in kernel space.
Trigger kernel preemption when returning to kernel space from an interrupt handler, or when a kernel task calls schedule directly or blocks (which in turn calls schedule), causing a context switch.
Explore the kernel control path, system calls and interrupts, and how the Linux kernel remains re-entrant to support concurrent kernel mode execution on a uniprocessor, using locking for shared data.
Learn how synchronization prevents race conditions in the Linux kernel by protecting global data in critical regions, with examples of non re-entrant functions and preemption.
Identify the main causes of concurrency in the Linux kernel, including interrupts, softirq and tasklets, preemption, sleeping, and symmetric multiprocessing.
Determine the maximum number of CPUs the SMP kernel can support using NR_CPUS and the kernel configuration, override it with a kernel parameter, and check online CPUs via num_online_cpus.
Identify the processor running the kernel control path by using smp_processor_id to obtain the current processor number, print it, and verify it against user-space observations.
Explore how a Linux kernel thread prints its processor id across init, thread function, and exit, revealing scheduler-driven switching between processors.
Explore how a uniprocessor system handles kernel and user processes through a Linux kernel module example, showing that /proc/cpuinfo reports one processor and how scheduling and preemption enable apparent multitasking.
Explore per-CPU variables as a simple, efficient synchronization technique: each CPU gets its own array element, which prevents race conditions, and the elements are aligned in main memory to avoid cache-line contention between CPUs.
Explore how per CPU variables are implemented with a proc file per CPU, using read and write handlers, get_cpu and put_cpu to disable and enable preemption while updating values.
The 2.6 kernel introduces a per-CPU interface to simplify per-CPU data. This lecture explains compile-time static definitions, the get and put interfaces, preemption control, and lvalues.
Define a per-CPU variable, assign an initial value of five, disable kernel preemption to obtain an lvalue, increment it, re-enable preemption, and verify the value increments from five to six.
Use per-CPU variables with a CPU id to access another processor's copy (which requires locking), initialize per-CPU counters, and increment to ten while reflecting updates across online CPUs.
Allocate per-CPU data at runtime using the per-CPU allocation wrapper (alloc_percpu), returning a void pointer for each processor. Access the data through the pointer, disabling preemption before the access and re-enabling it afterwards.
Explore the problems with per-cpu variables, including lack of protection against asynchronous functions and interrupt handler interactions. Apply additional synchronization primitives to safely share data across interrupt handlers and CPUs.
Two kernel threads increment a shared global variable using non-atomic read-modify-write operations, causing race conditions and inconsistent results due to non-atomic memory access and bus arbitration.
Learn how atomic operators in the Linux kernel ensure race-free read-modify-write sequences using atomic_t, atomic.h, and lock instructions, with differences on SMP versus uniprocessor kernels.
Explore atomic operations in Linux kernel programming, using atomic.h macros such as ATOMIC_INIT, atomic_inc, atomic_dec, atomic_set, atomic_read, atomic_add, and atomic_sub to ensure read-modify-write safety across CPUs.
Understand architecture-dependent atomic operations in the Linux kernel, including decrement, increment, subtract, and add with test variants, illustrated through atomic.h usage in arch and include directories.
Explore atomic add, subtract and return to perform modification and read the latest value in a single atomic call, building on atomic add, increment, decrement, and test APIs.
Explore 64-bit atomic operations in Linux kernel programming, using atomic64_t and the atomic64_* APIs, with hashed spinlocks as a fallback for unsupported architectures and applications in the performance counter subsystem.
Explore how the Linux kernel provides architecture-specific atomic bitwise operations via header implementations, using generic pointers to set, clear, and toggle bits (0–31 or 0–63) with practical examples.
Explore atomic bitwise operations in Linux kernel programming, including test_and_set_bit, test_and_clear_bit, and test_and_change_bit, all returning the old value (0 or 1).
Explore non-atomic bitwise operations in Linux kernel programming, comparing them with their atomic counterparts to understand when the non-atomic versions may be faster, depending on how the processor executes the operation.
Atomic operations are limited to word or doubleword sizes, so custom structures and larger shared data cannot be updated atomically; spinlocks protect short critical sections by allowing only one CPU inside at a time.
Learn how spinlocks in the Linux kernel protect short critical sections to ensure atomicity and prevent race conditions by using spin_lock and spin_unlock with spinlock_t.
Learn how to initialize a spinlock at runtime using kmalloc, choosing between static and dynamic allocation. Initialize it to the unlocked state, then lock, unlock, and free the memory.
Demonstrates a spinlock controlling access to a shared counter between two kernel threads, forming a critical section where only one CPU updates the counter at a time.
In the Linux kernel, spinlocks are not recursive; acquiring a spinlock already held by the same CPU causes busy spinning and deadlock, potentially stalling the CPU.
Explore implementing a busy loop using a spinlock in a character device driver, where the lock guarding a shared buffer is taken in open and released afterwards.
Use spinlocks with irqsave to protect resources shared by process and interrupt contexts; disable interrupts, enter a critical section, and restore prior interrupt state after unlock.
Learn how kernel preemption interacts with spinlocks: preemption is disabled in the spinlock critical region and re-enabled on unlock, with uniprocessor and multiprocessor implications.
This lecture demonstrates calling sleep (msleep) inside a spinlock with preemption disabled, explaining potential deadlock and why sleep in a spinlock is not recommended.
Explain spinlock behavior on uniprocessor systems, showing how preemption settings turn spinlocks into empty operations, especially when used between interrupts or in process context.
Explain how spinlocks implement mutual exclusion using a lock bit in a two-state model (locked and unlocked), with busy-wait loops and architecture-specific atomic operations in the Linux source code.
Learn how a Linux kernel semaphore uses an integer value and two operations, P and V, to control entry into critical sections; P blocks when zero, and V wakes waiters.
Explore whether counting semaphores can be used in a critical section and learn that binary semaphores are used in the kernel for mutual exclusion.
Learn the Linux semaphore API and its kernel implementation, including struct semaphore with a spinlock, usage count, and wait list, plus static and dynamic initialization and the down and up operations.
Allocate memory for a semaphore with kmalloc and initialize it to one to create a binary semaphore. Call down to enter the critical section and up to leave it.
Linux kernel module example using semaphore down and up to control access to a critical region, showing initialization, decrementing and incrementing the count, and printed values.
Demonstrate a Linux kernel module example of calling down twice on a binary semaphore, managing entry to a critical region and queuing when blocked, and examining uninterruptible task state.
Explore the distinction between down and down_interruptible in Linux kernel synchronization: how interruptible sleep lets a waiting process receive signals and return, unlike uninterruptible sleep.
Explore the semaphore down_trylock API: it acquires when available and returns non-zero if not. Starting with value one, down makes it zero, and a second down blocks.
Explore how down_killable restricts signal delivery to fatal signals in Linux kernel programming, while down_interruptible allows any signal to be delivered, as shown with SIGKILL.
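Use semaphores for long-held locks, not short ones, because of their queueing and sleeping overhead. Because waiters sleep, semaphores cannot be acquired with down in interrupt context; they do not disable preemption, and they offer better CPU utilization than spinlocks since waiters sleep instead of spinning.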
Demonstrates how to use a mutex to protect a critical section by including the header, allocating, initializing, locking, unlocking, and freeing the mutex.
Illustrate a Linux kernel module example using the mutex API with static initialization. Defining a mutex eliminates extra initialization steps, demonstrating how static initialization secures synchronization.
Explore why recursive mutex locks trigger errors in Linux kernel synchronization, inspect debug mutexes, spinlocks, and semaphore behavior, and understand the owner field role in mutex structures.
Explore the mutex_is_locked API, which reports whether a mutex is locked or unlocked and helps avoid recursive locking by checking before the lock is taken. It returns 1 when locked and 0 when unlocked.
Compare spinlocks and mutexes by evaluating locking overhead and lock hold time, and decide based on interrupt context and whether sleeping is required; spinlocks are needed in interrupt contexts.
Demonstrates a mutex-protected shared counter in a proc file with read and write operations, showing how a single lock blocks concurrent reads.
Introduce the read-write spinlock API, including the rwlock_t structure and read_lock/read_unlock and write_lock/write_unlock, and compare it to spinlock reader and writer variants.
Demonstrate a Linux kernel module using a rw spinlock with two readers and one writer, illustrating read and write locks, contention, and the impact of delays.
Acquire a read lock then request a write lock and see that upgrading does not occur here, causing a deadlock as the write lock waits for the reader to unlock.
Explore how a read lock allows multiple readers to proceed while a writer waits for exclusive access, illustrating first-in, first-out fairness and writer starvation avoidance in the Linux kernel.
Illustrates a Linux kernel module example using rwlocks with three kernel threads performing read and write locks, showing how lock contention and access order evolve during sleeps.
Examine how four kernel threads contend for read and write access with rwlocks, highlighting non-deterministic ordering due to spin locks and the observed race in lock acquisition.
Explore how read-write semaphores in the Linux kernel use binary mutexes for writer-only mutual exclusion, with the rw semaphore structure and an initial zero value.
Demonstrates how read and write locks are acquired and released using down_read, up_read, down_write, up_write, with upgrade and downgrade, including an uninterruptible sleep when locking.
Learn how down_read_trylock and down_write_trylock in Linux kernel synchronization work: they return 1 when the lock is acquired, unlike the normal semaphore behavior which returns 0 on success.
The downgrade_write call converts an acquired write lock to a read lock, enabling a quick write followed by longer read access in Linux kernel synchronization.
Enable fast, lock-free access for many readers; sequence locks added in Linux 2.6 permit writers to modify data during reads, while readers verify data validity with a sequence counter.
Learn how a sequence lock combines a sequence counter, initialized to zero, with a spinlock to synchronize readers and writers; the spinlock is taken only by writers.
Explain how a write operation uses a write sequence lock to ensure mutual exclusion with a spin lock, incrementing the sequence number on both lock and unlock, starting from zero.
Explore the Linux kernel sequence lock concept, using a sequence counter to guard reads and writes, illustrating retry on data invalidation and the behavior without locks.
Explore how sequence locks protect a 64-bit uptime value in the Linux kernel, with examples showing their use across kernel time operations and related functions.
Explore sequence locks and their interrupt-safe variants, and learn how to handle data accessed from both interrupt and process context by saving and restoring the interrupt state.
Discover read-copy-update (RCU) in the Linux kernel: enable a single writer to update pointer-based data structures without blocking readers, by copying the structure and swapping the pointer after readers finish.
Demonstrates lock-free linked-list deletion by bypassing node B (set A's next to C), waiting for readers to move on, and safely freeing B after no readers remain.
Demonstrate read and write threads using RCU to update a global pointer, with read side critical section and no locks, illustrating RCU’s speed.
Explore RCU memory management: removal and reclamation occur after pre-existing readers finish, with reads inside RCU read-side critical sections and pointers updated with rcu_assign_pointer.
Block the calling process until all pre-existing read-side critical sections on all CPUs complete, then safely reuse or remove the old memory after synchronize_rcu returns.
Use call_rcu to defer work until all read-side critical sections finish. Embed an rcu head, register rcu_free with call_rcu, and use container_of.
Shows that RCU read-side critical sections can be nested when there is no blocking or sleeping, using memory barriers instead of locks.
Explore how synchronize_rcu works internally by tracking completion of read side critical sections through read lock, read unlock, and preemption, using context switches as the signal that readers finished.
Learn how to implement a lock-free linked list using RCU variants for list APIs, addressing race conditions between readers and writers with RCU assigned pointers.
Explore synchronization in Linux kernel programming by comparing RCU and sequence locks, discussing readers and writers, grace periods, copy operations, and memory costs.
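The sequence-lock protocol summarized in the lectures above (a counter that starts at zero, is incremented on both write lock and write unlock, and makes readers retry when it changes) can be sketched in user space with C11 atomics. This is only an illustrative sketch, not the kernel implementation: every `demo_` name is hypothetical, an `atomic_flag` stands in for the writer-side spinlock, and the protected data is an assumed pair of time fields.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for seqlock_t; not kernel API. */
struct demo_seqlock {
    atomic_flag writer_lock;  /* plays the role of the writer-side spinlock */
    atomic_uint sequence;     /* starts at 0; odd while a write is in progress */
};

static struct demo_seqlock demo_lock = { ATOMIC_FLAG_INIT, 0 };

/* Assumed protected data: two fields that must be read as a consistent pair. */
static atomic_ulong demo_sec;
static atomic_ulong demo_nsec;

static void demo_write_lock(struct demo_seqlock *sl)
{
    while (atomic_flag_test_and_set(&sl->writer_lock))
        ;                                    /* busy-wait: one writer at a time */
    atomic_fetch_add(&sl->sequence, 1);      /* sequence becomes odd */
}

static void demo_write_unlock(struct demo_seqlock *sl)
{
    assert(atomic_load(&sl->sequence) & 1u); /* must be inside a write section */
    atomic_fetch_add(&sl->sequence, 1);      /* sequence becomes even again */
    atomic_flag_clear(&sl->writer_lock);
}

static unsigned demo_read_begin(struct demo_seqlock *sl)
{
    return atomic_load(&sl->sequence);
}

/* Nonzero if the reader started during a write or a write happened since. */
static int demo_read_retry(struct demo_seqlock *sl, unsigned start)
{
    return (start & 1u) || atomic_load(&sl->sequence) != start;
}

static void demo_set_time(unsigned long sec, unsigned long nsec)
{
    demo_write_lock(&demo_lock);
    atomic_store(&demo_sec, sec);
    atomic_store(&demo_nsec, nsec);
    demo_write_unlock(&demo_lock);
}

static void demo_get_time(unsigned long *sec, unsigned long *nsec)
{
    unsigned start;

    do {                                     /* readers take no lock, they retry */
        start = demo_read_begin(&demo_lock);
        *sec = atomic_load(&demo_sec);
        *nsec = atomic_load(&demo_nsec);
    } while (demo_read_retry(&demo_lock, start));
}
```

Calling `demo_set_time(5, 7)` and then `demo_get_time(&s, &ns)` from a `main` function yields the pair written, and the counter stands at 2 after one complete write. Unlike this sketch, the kernel's reader entry point waits for an even counter before returning.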
Update: Sep 15: Added RCU Section
What you will learn in this course
Various concepts related to concurrency like: preemption, context switch, reentrancy, critical section, race condition
Various Synchronization techniques
Per CPU Variables
Atomic Variables
Spinlocks
Semaphores
Mutexes
Read Write Locks
Sequence Locks
Read Copy Update(RCU)
APIs/Macros/Structures:
spinlock_t, DEFINE_SPINLOCK, spin_lock, spin_unlock, spin_trylock, spin_lock_irqsave, spin_unlock_irqrestore, spin_lock_irq, spin_unlock_irq
atomic_t, atomic64_t, ATOMIC_INIT, atomic_inc, atomic_dec, atomic_set, atomic_read, atomic_add, atomic_sub,
atomic_dec_and_test, atomic_inc_and_test, atomic_sub_and_test, atomic_add_negative, atomic_add_return, atomic_sub_return, atomic_inc_return, atomic_dec_return, atomic_fetch_add, atomic_fetch_sub, atomic_cmpxchg, atomic_xchg
set_bit, clear_bit, change_bit, test_and_set_bit, test_and_clear_bit, test_and_change_bit
NR_CPUS, num_online_cpus, smp_processor_id, get_cpu, put_cpu, DEFINE_PER_CPU, get_cpu_var, put_cpu_var, per_cpu, for_each_online_cpu, alloc_percpu, free_percpu, per_cpu_ptr
rcu_read_lock, rcu_read_unlock, synchronize_rcu, call_rcu, rcu_assign_pointer, rcu_dereference
seqlock_t, seqcount_t, DEFINE_SEQLOCK, seqlock_init, write_seqlock, write_sequnlock
struct rw_semaphore, DECLARE_RWSEM, init_rwsem, down_read, up_read, down_write, up_write, down_read_trylock, down_write_trylock, downgrade_write
rwlock_t, DEFINE_RWLOCK, rwlock_init, read_lock, read_unlock, write_lock, write_unlock
struct mutex, DEFINE_MUTEX, mutex_init, mutex_lock, mutex_unlock, mutex_trylock, mutex_lock_interruptible, mutex_is_locked
struct semaphore, sema_init, DEFINE_SEMAPHORE, down, up, down_interruptible, down_trylock, down_timeout, down_killable
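Several of the atomic helpers listed above differ mainly in what they return: the `_return` variants give the new value, the `fetch_` variants give the old value, the `_and_test` variants report whether the result is zero, and the `test_and_` bit operations report the previous bit. The sketch below mirrors those conventions in user space with C11 atomics. The `demo_` functions are hypothetical analogues written for illustration, not the kernel functions themselves.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Like atomic_add_return: returns the value after the addition. */
static int demo_add_return(atomic_int *v, int i)
{
    return atomic_fetch_add(v, i) + i;
}

/* Like atomic_fetch_add: returns the value before the addition. */
static int demo_fetch_add(atomic_int *v, int i)
{
    return atomic_fetch_add(v, i);
}

/* Like atomic_dec_and_test: true only when the decrement reaches zero. */
static bool demo_dec_and_test(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) - 1 == 0;
}

/* Like atomic_cmpxchg: stores newval only if *v == expected, and
 * returns the value that was actually found in *v. */
static int demo_cmpxchg(atomic_int *v, int expected, int newval)
{
    atomic_compare_exchange_strong(v, &expected, newval);
    return expected;   /* updated to the observed value on failure */
}

/* Like test_and_set_bit: sets bit nr and returns its previous state.
 * Assumes nr is smaller than the number of bits in unsigned long. */
static bool demo_test_and_set_bit(unsigned nr, atomic_ulong *word)
{
    unsigned long mask = 1UL << nr;

    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Walk through the return conventions on a counter that starts at 5. */
static void demo_return_conventions(void)
{
    atomic_int v = 5;
    atomic_ulong bits = 0;

    assert(demo_add_return(&v, 3) == 8);    /* new value */
    assert(demo_fetch_add(&v, 2) == 8);     /* old value; v is now 10 */
    assert(demo_cmpxchg(&v, 10, 1) == 10);  /* matched; v is now 1 */
    assert(demo_cmpxchg(&v, 10, 7) == 1);   /* no match; v stays 1 */
    assert(demo_dec_and_test(&v));          /* 1 -> 0 */
    assert(!demo_test_and_set_bit(3, &bits));
    assert(demo_test_and_set_bit(3, &bits));
}
```

The old-value form is the one to reach for when the caller needs to know what it replaced, for example to detect whether another thread had already set a flag bit.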
Commands used in the course
nproc
ps -eaF
ps aux
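For comparison with nproc, a user-space program can query the processor count itself. The following is a minimal sketch that assumes a Linux system with glibc, where `sysconf` accepts the `_SC_NPROCESSORS_ONLN` and `_SC_NPROCESSORS_CONF` names; the `demo_` function names are hypothetical. The figures may differ from what nproc prints, since nproc also takes the process's CPU affinity into account.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Processors currently online, as seen from user space. */
static long demo_online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Processors configured in the system (online or not). */
static long demo_configured_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_CONF);
}

/* Print both counts; returns the online count for convenience. */
static long demo_print_cpus(void)
{
    long online = demo_online_cpus();
    long configured = demo_configured_cpus();

    assert(online >= 1);   /* at least the CPU running this code */
    printf("online CPUs: %ld, configured CPUs: %ld\n", online, configured);
    return online;
}
```

Calling `demo_print_cpus()` from a `main` function prints one line with both counts, which can be checked against the processor entries in /proc/cpuinfo.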