
What this course covers (and what it doesn’t)
How developers should approach AI security
Real-world consequences of insecure GenAI
Tokens, context windows, inference
Why LLMs are probabilistic, not logical
Non-determinism explained with examples
Comparison: Traditional software vs LLM interface
Why validation, auth, and logic checks break
A new foundation for AI Security
Prompt-only apps
RAG systems
Tool-using agents
Autonomous agents
Prompt → Model → Tool → Action
Where humans exit the loop
Hidden control paths
User input vectors
Model behavior vectors
Tool & data vectors
Direct vs indirect injection
Why “ignore previous instructions” works
System, developer, user, tool messages
How hierarchy collapses
System prompt leakage
Policy bypass
Tool override
Chunking, embeddings, retrieval
Where security breaks
Data poisoning
Malicious document injection
Context hijacking
Filtering & validation
Metadata enforcement
Context boundaries
Recognize emergent agent behavior
Why agents behave unexpectedly
Collusion
Recursive loops
Goal hijacking
Action budgets
Execution limits
Kill switches
Training vs inference
Memorization risks
Privacy failures
Prompt & output risks
GDPR, CCPA
AI regulations overview
Chain of custody
Fine-tuning risks
Backdoors
Model versioning
Vendor risk vectors
Smoke bomb test
AI-First Threat Modeling (20 min)
AIA STRIDE framework
Abuse cases vs misuse cases
Prompt isolation
Deterministic outputs
Defense-in-depth for AI
A Prompt Injection Vulnerability occurs when user prompts alter the LLM’s behavior or output in unintended ways. These inputs can affect the model even if they are imperceptible to humans: a prompt injection does not need to be human-visible or human-readable, as long as the model parses the content.
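As a minimal sketch of the "imperceptible to humans" point, the snippet below flags characters that a model still receives but that a reader would not see, such as zero-width spaces and Unicode tag characters. The function names are hypothetical, and relying on the Unicode "Cf" (format) category is an assumption that covers only one family of hidden content; it will not catch white-on-white text, HTML comments, or text embedded in images.

```python
import unicodedata


def find_hidden_characters(text: str) -> list[tuple[int, str]]:
    """Return (position, code point) pairs for invisible format characters."""
    hidden = []
    for index, char in enumerate(text):
        # Category "Cf" includes zero-width spaces and joiners,
        # bidirectional controls, and Unicode tag characters.
        if unicodedata.category(char) == "Cf":
            hidden.append((index, f"U+{ord(char):04X}"))
    return hidden


def strip_hidden_characters(text: str) -> str:
    """Drop every format character before the text reaches the model."""
    return "".join(c for c in text if unicodedata.category(c) != "Cf")


if __name__ == "__main__":
    # Looks like a plain sentence, but carries two hidden characters.
    sample = "Summarize this page.\u200b\U000E0049"
    print(find_hidden_characters(sample))
    print(strip_hidden_characters(sample))
```

Stripping every "Cf" character is deliberately blunt: it also removes the zero-width joiner used in some emoji and scripts, so a production filter would likely log and review rather than silently delete.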
Sensitive information can affect both the LLM and its application context. This includes personally identifiable information (PII), financial details, health records, confidential business data, security credentials, and legal documents. Proprietary models may also have unique training methods and source code considered sensitive, especially in closed or foundation models.
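One common way to reduce this exposure is to redact obvious identifiers before text is sent to a model or written to logs. The sketch below is illustrative only: the two patterns (email addresses and US-style Social Security numbers) and the `redact` helper are assumptions chosen for brevity, not a complete PII detector.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
    # prints: Contact [EMAIL], SSN [SSN].
```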
LLM supply chains are susceptible to various vulnerabilities, which can affect the integrity of training data, models, and deployment platforms. These risks can result in biased outputs, security breaches, or system failures. While traditional software vulnerabilities focus on issues like code flaws and dependencies, in ML the risks also extend to third-party pre-trained models and data.
These external elements can be manipulated through tampering or poisoning attacks.
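A basic defense against tampered third-party artifacts is to pin a known-good digest and refuse to load anything that does not match. The following is a minimal sketch using only the standard library; `verify_artifact` and the idea of storing the expected SHA-256 next to your deployment config are assumptions for illustration, and a checksum only proves the file is the one you pinned, not that the pinned model is trustworthy.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model weights do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file matches the pinned digest."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A loader would call `verify_artifact` before deserializing the model and abort on `False`.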
Data poisoning occurs when pre-training, fine-tuning, or embedding data is manipulated to introduce vulnerabilities, backdoors, or biases. This manipulation can compromise model security, performance, or ethical behavior, leading to harmful outputs or impaired capabilities. Common risks include degraded model performance, biased or toxic content, and exploitation of downstream systems.
Improper Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems. Since LLM-generated content can be controlled by prompt input, this behavior is similar to providing users indirect access to additional functionality. Improper Output Handling differs from Overreliance in that it deals with LLM-generated outputs before they are passed downstream whereas Overreliance focuses on broader concerns around overdependence on the accuracy and appropriateness of LLM outputs. Successful exploitation of an Improper Output Handling vulnerability can result in XSS and CSRF in web browsers as well as SSRF, privilege escalation, or remote code execution on backend systems. The following conditions can increase the impact of this vulnerability:
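To make the core idea of this lecture concrete (model output deserves the same treatment as untrusted user input), here is a minimal sketch of escaping a reply before it is placed in an HTML page. The `render_model_reply` helper and the surrounding `div` are hypothetical; the only library call is the standard `html.escape`. Escaping addresses the XSS case only; SSRF, SQL, or shell contexts each need their own context-specific handling.

```python
import html


def render_model_reply(reply: str) -> str:
    """Wrap a model reply in HTML, escaping it first.

    The reply is treated as untrusted: any markup the model was
    tricked into emitting is displayed as text, not executed.
    """
    safe = html.escape(reply, quote=True)
    return f'<div class="reply">{safe}</div>'


if __name__ == "__main__":
    print(render_model_reply("<script>alert(1)</script>"))
    # prints: <div class="reply">&lt;script&gt;alert(1)&lt;/script&gt;</div>
```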
Excessive Agency is the vulnerability that enables damaging actions to be performed in response to unexpected, ambiguous or manipulated outputs from an LLM, regardless of what is causing the LLM to malfunction. Common triggers include:
hallucination/confabulation caused by poorly-engineered benign prompts, or just a poorly-performing model;
direct/indirect prompt injection from a malicious user, an earlier invocation of a malicious/compromised extension, or (in multi-agent/collaborative systems) a malicious/compromised peer agent.
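Whatever the trigger, the damage is limited by what the application lets the model do. The sketch below shows one possible shape for that limit: a tool allowlist enforced outside the model, with a human-approval flag on destructive actions. All names (`ToolPolicy`, `dispatch`, the ticket tools) are hypothetical and not taken from any agent framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolPolicy:
    handler: Callable[..., str]
    requires_approval: bool = False


def read_ticket(ticket_id: str) -> str:
    return f"ticket {ticket_id}: (contents)"


def delete_ticket(ticket_id: str) -> str:
    return f"ticket {ticket_id} deleted"


# Only tools listed here can ever run, whatever the model asks for.
ALLOWED_TOOLS = {
    "read_ticket": ToolPolicy(read_ticket),
    "delete_ticket": ToolPolicy(delete_ticket, requires_approval=True),
}


def dispatch(tool_name: str, arguments: dict, approved_by_human: bool = False) -> str:
    """Run a model-requested tool call only if policy allows it."""
    policy = ALLOWED_TOOLS.get(tool_name)
    if policy is None:
        raise PermissionError(f"tool not allowlisted: {tool_name}")
    if policy.requires_approval and not approved_by_human:
        raise PermissionError(f"human approval required for: {tool_name}")
    return policy.handler(**arguments)
```

The design choice is that the check lives in ordinary code: a hallucinated or injected tool call fails in `dispatch` no matter how persuasive the prompt was.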
The system prompt leakage vulnerability in LLMs refers to the risk that the system prompts or instructions used to steer the behavior of the model can also contain sensitive information that was not intended to be discovered. System prompts are designed to guide the model’s output based on the requirements of the application, but may inadvertently contain secrets. When discovered, this information can be used to facilitate other attacks.
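Because a system prompt should be assumed discoverable, one practical check is to scan it for secrets before it ships. The sketch below is a rough pre-deployment lint; the three regular expressions and the `find_secrets` helper are illustrative assumptions and will miss many real credential formats.

```python
import re

# Illustrative patterns only; tune and extend for your own secret formats.
SECRET_PATTERNS = {
    "api_key_like": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9_-]{16,}"),
    "credential_assignment": re.compile(
        r"(?i)\b(?:password|passwd|secret|token)\s*[:=]\s*\S+"
    ),
    "url_with_credentials": re.compile(
        r"[a-z][a-z0-9+.-]*://[^/\s:@]+:[^/\s@]+@"
    ),
}


def find_secrets(system_prompt: str) -> list[str]:
    """Return the names of every pattern that matches the prompt."""
    return [
        name
        for name, pattern in SECRET_PATTERNS.items()
        if pattern.search(system_prompt)
    ]


if __name__ == "__main__":
    prompt = (
        "You are a support bot. "
        "Use postgres://bot:hunter2@db.internal/tickets for lookups."
    )
    print(find_secrets(prompt))
    # prints: ['url_with_credentials']
```

A non-empty result suggests moving the value out of the prompt and into server-side configuration that the model never sees.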
Vector and embedding vulnerabilities present significant security risks in systems that use Retrieval Augmented Generation (RAG) with Large Language Models (LLMs). Weaknesses in how vectors and embeddings are generated, stored, or retrieved can be exploited, whether through deliberate attack or unintentional misuse, to inject harmful content, manipulate model outputs, or access sensitive information.
Retrieval Augmented Generation (RAG) is a model adaptation technique that enhances the performance and contextual relevance of responses from LLM applications by combining pre-trained language models with external knowledge sources. Retrieval augmentation uses vector mechanisms and embeddings. (Ref #1)
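One retrieval-side control is to attach access metadata to every chunk and enforce it in application code before anything enters the prompt. The sketch below assumes a hypothetical `Chunk` type with an `allowed_groups` field; real vector stores expose metadata filters in their own ways, so treat this as the shape of the check rather than any product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str
    allowed_groups: frozenset[str]


def authorize_chunks(chunks: list[Chunk], user_groups: set[str]) -> list[Chunk]:
    """Keep only chunks the requesting user is entitled to read."""
    return [c for c in chunks if c.allowed_groups & user_groups]


def build_context(chunks: list[Chunk], user_groups: set[str]) -> str:
    """Assemble prompt context from permitted chunks, labeled by source."""
    permitted = authorize_chunks(chunks, user_groups)
    return "\n\n".join(f"[source: {c.source}]\n{c.text}" for c in permitted)


if __name__ == "__main__":
    retrieved = [
        Chunk("Holiday schedule for 2025.", "hr/handbook.md", frozenset({"staff"})),
        Chunk("Executive salary bands.", "hr/comp.xlsx", frozenset({"hr-admin"})),
    ]
    print(build_context(retrieved, {"staff"}))
```

Filtering after retrieval means a similarity match alone never grants access; the restricted chunk above is dropped for a user who is only in `staff`.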
Inference is the process by which a Large Language Model (LLM) generates outputs based on input queries or prompts. It is a critical function of LLMs, involving the application of learned patterns and knowledge to produce relevant responses or predictions.
Attacks designed to disrupt service, deplete the target’s financial resources, or even steal intellectual property by cloning a model’s behavior all depend on a common class of security vulnerability in order to succeed. Unbounded Consumption occurs when a Large Language Model (LLM) application allows users to conduct excessive and uncontrolled inferences, leading to risks such as denial of service (DoS), economic losses, model theft, and service degradation. The high computational demands of LLMs, especially in cloud environments, make them vulnerable to resource exploitation and unauthorized usage.
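A simple way to bound consumption is a per-user token budget over a sliding time window, enforced before each inference call. The sketch below is a single-process illustration: the `TokenBudget` class, its parameters, and the injectable `clock` are assumptions, and a real deployment would typically keep this state in shared storage and combine it with request size limits and timeouts.

```python
import time
from collections import defaultdict, deque


class TokenBudget:
    """Sliding-window cap on tokens each user may consume."""

    def __init__(self, max_tokens: int, window_seconds: float, clock=time.monotonic):
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._clock = clock
        # user_id -> deque of (timestamp, tokens) entries inside the window
        self._usage = defaultdict(deque)

    def try_consume(self, user_id: str, tokens: int) -> bool:
        """Record the usage and return True, or return False if over budget."""
        now = self._clock()
        history = self._usage[user_id]
        # Forget usage that has aged out of the window.
        while history and now - history[0][0] >= self.window_seconds:
            history.popleft()
        used = sum(amount for _, amount in history)
        if used + tokens > self.max_tokens:
            return False
        history.append((now, tokens))
        return True
```

An application would call `try_consume(user_id, estimated_tokens)` before invoking the model and reject or queue the request when it returns `False`.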
Generative AI has changed how software is built — but it has also introduced entirely new security failures that traditional AppSec and cloud security models were never designed to handle.
This course is a deep, hands-on journey into the real security risks of modern GenAI systems, from prompt injection and RAG poisoning to tool abuse and autonomous agent failures. It is designed for software engineers, security engineers, architects, and AI practitioners who need to move beyond theory and understand how GenAI systems actually fail in production — and how to secure them properly.
Unlike high-level AI safety courses, this program is practical, adversarial, and systems-focused. You’ll break real GenAI workflows, observe emergent failures, and then implement concrete defenses using industry-aligned patterns.
By the end of this course, you won’t just understand GenAI security — you’ll know how to design, test, and govern AI systems safely at scale.
What You’ll Learn
Core Concepts
Why GenAI security is fundamentally different from traditional AppSec
How non-determinism breaks existing security assumptions
Where trust boundaries actually exist in AI systems
Why “prompt security” alone is insufficient
Hands-On Skills
Exploit prompt injection and instruction hierarchy failures
Poison RAG pipelines and observe real-world impact
Abuse tool calling and function execution
Trigger unintended behavior in multi-agent systems
Implement real mitigations using policies, constraints, and governance
Defensive Architecture
Secure RAG design patterns
Tool and function authorization models
Agent guardrails and bounded autonomy
Policy enforcement outside the model
Safe failure and human-in-the-loop design
What Makes This Course Different
Hands-on labs, not slides
Real failure modes, not hypothetical risks
Agentic AI coverage (rare and critical)
Security-first design mindset
Aligned with OWASP LLM Top 10 & MAESTRO
Built for production engineers, not researchers
Each week includes:
Conceptual video lessons
Attack walkthroughs
Jupyter-based labs
Defensive redesigns
Reflection and threat modeling exercises
Who This Course Is For
Software Engineers building AI-powered applications
Security Engineers responsible for AI risk
AI/ML Engineers deploying LLM systems
Architects designing agent-based workflows
Security leaders evaluating GenAI risk exposure
No prior AI security experience required — but comfort with APIs and basic Python is recommended.
Final Outcome
After completing this course, learners will be able to:
Identify real GenAI security risks
Design secure AI architectures
Prevent prompt, RAG, and tool-based attacks
Safely deploy agentic systems
Evaluate AI products with a security-first lens