
In this lecture:
Introduction and course overview
Parsing pipeline
Tokenizer, tokens
Abstract Syntax Trees (AST)
Static/Compile time vs. Runtime
Interpreters vs. Compilers
AST-interpreters (recursive)
Bytecode-interpreters
Ahead-of-time (AOT) compilers
Just-in-time (JIT) compilers
Transpilers
You will learn:
AST interpreters
AST explorer
JavaScript AST example
Bytecode interpreters
Stack machine vs. Register machine
Stack VM bytecode evaluation example
Compiler explorer
Java bytecode example
Python bytecode example
Register VM bytecode evaluation example
You will learn:
Ahead-of-time (AOT) compiler
Just-in-time (JIT) compiler
LLVM
Intermediate representation (IR)
Bytecode
Clang
x64 Assembly
AST-transformer (Transpiler)
You will learn:
AST formats
S-expression
Eva programming language
Eval -- the core of an interpreter
Statement vs. Expression
Closures
Lambda functions
IILE: Immediately-invoked lambda expressions
You will learn:
Eval -- the core of an interpreter
Self-evaluating expressions
Numbers and Strings
TDD (Test-driven development)
Addition operation
Assignments:
Implement and fix complex addition: (+ (+ 2 3) 5)
You will learn:
Recursive nature of the interpreter
Environment: repository of variables
Environment Record
Define, Assign, Lookup
The Global Environment
Example: environments in JavaScript: http://dmitrysoshnikov.com/ecmascript/javascript-the-core-2nd-edition/#environment
You will learn:
Blocks: groups of expressions
Block scope
Identifier resolution
Scope chain traversal
Variable assignment
You will learn:
Program control-flow
Branches: if-expression
Cycles: while-loops
Assignments:
Implement ++ operator: (++ foo)
Implement logical operators: (or foo default), (and x y), (not value)
You will learn:
S-expressions
BNF (Backus-Naur form) grammars
Syntax tool: language-agnostic parser generator
Atoms and Lists
LALR(1) parser creation
Assignments:
Add support for floating point numbers parsing: (+ 0.5 1.42)
You will learn:
Built-in functions
Native functions
Function calls
Global environment
You will learn:
User-defined functions
Closures: capturing environment
Function calls
Activation Environments and Activation Records
Static scope vs. Dynamic scope
Free variables
You will learn:
Lambda functions
Anonymous function expressions
Callbacks
IILE: Immediately-involved lambda expressions
Function declaration
Syntactic sugar
You will learn:
Call-stack
Recursive calls
Heap-environments
Execution context
Closures
Tail calls
Assignments:
Implement Execution context stack. A stack frame is pushed every time a function is called. The stack should contain a reference to the Environment, and store the called function name
Implement print_stack_trace function which prints the all stack frames info with function names
You will learn:
Transpilers
Nicer syntax, same semantics
Switch-expression
For-loops
JIT-translation
Assignments:
Implement for-loop translation to while-loop (see description in the lecture)
Implement ++, --, +=, -=, etc syntactic sugar operators, translating to get / set instructions
You will learn:
Object-oriented programming
Class-based semantics
Prototype-based semantics
Named environments
Instances
New expression
Property access
You will learn:
Class inheritance
Child and Parent classes
Super calls
You will learn:
Module expressions
Named environments
First-class module objects
Import and export expressions
Property access
Assignments:
Implement named imports: (import (abs, square) Math)
Implement module exports: (exports (abs, square))
Implement inline exports: (export (def square (x) (* x x)))
You will learn:
Eva interpreter
CLI support
Evaluating expressions and files
Further data structures
Objects, prototypes, lists, async functions
Eva specification
Instructions format, examples and spec
Building a Virtual Machine
Assignments:
Implement Eva REPL: read any expression, evaluate, print result, loop back to reading
How programming languages work under the hood? What’s the difference between compiler and interpreter? What is a virtual machine, and JIT-compiler? And what about the difference between functional and imperative programming?
There are so many questions when it comes to implementing a programming language!
The problem with “compiler classes” in school is they usually are presented as some “hardcore rocket science” which is only for advanced engineers.
Moreover, classic compiler books start from the least significant topic, such as Lexical analysis, going right away deep down to the theoretical aspects of formal grammars. And by the time of implementing a first Tokenizer module, students simply lose an interest to the topic, not having a chance to actually start implementing a programing language itself. And all this is spread to a whole semester of messing with tokenizers and BNF grammars, without understanding an actual semantics of programming languages.
I believe we should be able to build and understand a full programming language semantics, end-to-end, in 4-6 hours — with a content going straight to the point, showed in live coding session as pair-programming, and described in a comprehensible way.
In the Essentials of Interpretations class we focus specifically on runtime semantics, and build a interpreter for a programming language very similar to JavaScript or Python.
Implementing a programing language would also make your practical usage level of other programming languages more professional.
Who this class is for?
This class is for any curious engineer, who would like to gain skills of building complex systems (and building a programming language is really a pretty advanced engineering task!), and obtain a transferable knowledge for building such systems.
If you are interested specifically in compilers, interpreters, and source code transformation tools, then this class is also for you.
The only pre-requisite for this class is basic data structures and algorithms: trees, lists, traversal.
What is used for implementation?
Since we build a language very similar in semantics to JavaScript or Python (the two most popular programming languages today) we use specifically JavaScript — its elegant multi-paradigm structure which combines functional programming, class-based, and prototype-based OOP fits ideal for that.
Many engineers are familiar with JavaScript so it should be easier to start coding right away. However in implementation we don’t rely on too specific to JS constructs, and the code from the course is easily portable to TypeScript, Python, Java, C++, Rust, and any other language of your taste.
Note: we want our students to actually follow, understand and implement every detail of the interpreter themselves, instead of just copy-pasting from final solution. The full source code for the language is available in video lectures, showing and guiding how to structure specific modules.
What’s specific in this class?
The main features of these lectures are:
Concise and straight to the point. Each lecture is self-sufficient, concise, and describes information directly related to the topic, not distracting on unrelated materials or talks.
Animated presentation combined with live-editing notes. This makes understanding of the topics easier, and shows how (and when at time) the object structures are connected. Static slides simply don’t work for a complex content.
Live coding session end-to-end with assignments. The full source code, starting from scratch, and up to the very end is presented in the video lectures class. In the course we implement a full AST interpreter for our programming language.
Reading materials
As further reading and additional literature for this course the following books are recommended:
Structure and Interpretation of Computer Programs (SICP) by Harold Abelson and Gerald Jay Sussman
Programming Languages: Application and Interpretation (PLAI) by Shriram Krishnamurthi
What is in the course?
The course is divided into four parts, in total of 18 lectures, and many sub-topics in each lecture. Please address curriculum for detailed lectures descriptions.
PART 1: COMPILERS CRASH COURSE
In this part we describe different compilation and interpretation pipelines, see the difference between JIT-compilers and AOT-compilers, talk about what is a Virtual machine and Bytecode-interpreter, and how it difference from an AST-interpreter, show examples of native code, and LLVM IR, and other topics.
PART 2: INTERPRETERS: BASIC EXPRESSIONS AND VARIABLES
In this part we start building our programming language, and consider basic expressions, such as numbers, strings, talk about variables, scopes, and lexical environments, control structures, and touching parser generator.
PART 3: FUNCTIONS AND FUNCTIONAL PROGRAMMING
In this part we start talking and implementing function abstraction, and function calls. We describe concept of closures, lambda function, and IILEs (Immediately-invoked lambda expressions). In addition, we touch topics of Call-stack, recursion, and syntactic sugar.
PART 4: OBJECT-ORIENTED PROGRAMMING
The final part of the course is devoted to the object-oriented support in our language. We describe the class-based, and prototype-based approaches, implement concept of classes, instance and modules.
I hope you’ll enjoy the class, and will be glad to discuss any questions and suggestion in comments.
- Dmitry Soshnikov