Udemy
    •  
    •  
    •  
    •  
    •  
    •  
    •  
    •  
Turn what you know into an opportunity and reach millions around the world.
Learn More
Your cart is empty.
Keep shopping
AI Engineer Core Track: LLM Engineering, RAG, QLoRA, Agents
Bestseller
Highest Rated
Rating: 4.7 out of 5(36,372 ratings)
257,914 students

AI Engineer Core Track: LLM Engineering, RAG, QLoRA, Agents

Become an LLM Engineer in 8 weeks: Build and deploy 8 LLM apps, mastering Generative AI, RAG, LoRA and AI Agents.
Last updated 6/2026
English

What you'll learn

  • Project 1: Make AI-powered brochure generator that scrapes and navigates company websites intelligently.
  • Project 2: Build Multi-modal customer support agent for an airline with UI and function-calling.
  • Project 3: Develop Tool that creates meeting minutes and action items from audio using both open- and closed-source models.
  • Project 4: Make AI that converts Python code to optimized C++, boosting performance by 60,000x!
  • Project 5: Build AI knowledge-worker using RAG to become an expert on all company-related matters.
  • Project 6: Capstone Part A – Predict product prices from short descriptions using Frontier models.
  • Project 7: Capstone Part B – Execute Fine-tuned open-source model to compete with Frontier in price prediction.
  • Project 8: Capstone Part C – Build Autonomous multi agent system collaborating with models to spot deals and notify you of special bargains.
  • Compare and contrast the latest techniques for improving the performance of your LLM solution, such as RAG, fine-tuning and agentic workflows
  • Weigh up the leading 10 frontier and 10 open-source LLMs, and be able to select the best choice for a given task

Course content

8 sections210 lectures33h 27m total length
  • Day 1 - Running Your First LLM Locally with Ollama and Open Source Models10:44

    Fully refreshed as of 2026.

    __

    If you want to learn:


    - How to run LLMs locally on your own computer without relying on cloud services?

    - What is Ollama and how can you use it to deploy AI models on your machine?

    - How to install and interact with open source LLM models like Gemma, Phi-3, and GPT-OSS?

    - What are the differences between running small and large language models locally?

    - How to get started with local AI using simple command line tools?

    - How to set up your first local LLM workflow in just minutes?


    Then this lecture is for you!


    This hands-on lecture teaches you how to run LLMs locally using Ollama, a user-friendly tool that lets you deploy AI models directly on your machine. You'll learn how to install Ollama on Mac, PC, or Linux, and immediately start running open source models without any complex setup. The lecture walks you through downloading and running multiple LLM models of different sizes, from Google's lightweight Gemma 3.0 (270M parameters) to Microsoft's Phi-3 and larger models like GPT-OSS. You'll discover how to use the command line to interact with local AI, understand the differences between smaller models and larger models, and learn which models work best on consumer hardware. By the end of this lecture, you'll be able to run llms locally, experiment with different specialized models including multimodal models, and understand how to optimize your local llm workflow. This practical introduction to running local llms with Ollama provides the foundation for managing models on your local machine and building AI applications that don't depend on external APIs like OpenAI. Perfect for getting started with local AI development using Ollama's straightforward approach to deploying llms locally.

  • Day 1 - Spanish Tutor Demo with Open-Source Models & Course Overview8:54

    If you want to learn:


    How can I run powerful AI language models locally on my computer using Ollama?

    What's the difference between open source LLMs and frontier models like GPT-4o?

    How do I get started with local LLMs and build real-world AI applications?

    What will I learn in a comprehensive LLM engineering course over 8 weeks?

    How can I use local AI models to create practical applications like a Spanish tutor?

    What tools and frameworks do professional LLM engineers use in production?


    Then this lecture is for you!


    In this introductory lecture, you'll experience a live demonstration of Ollama building a Spanish tutor application, showcasing the practical capabilities of running LLMs locally with Ollama. You'll see firsthand how to use local AI models like Phi-3, Gemma, and LLaMA to create real conversational applications on your local machine. The lecture provides a comprehensive overview of an 8-week LLM engineering curriculum covering frontier models, open source models, model selection, RAG (Retrieval-Augmented Generation), fine-tuning techniques, and agentic AI workflows. You'll learn about essential tools and frameworks including Hugging Face, Gradio, LangChain, Weights & Biases, and Modal.com for deploying LLMs locally and in production. The course is designed for both beginners getting started with AI and experienced developers looking to optimize their workflow with specialized models. You'll discover how to install Ollama, run different models locally, use the Ollama API, and build commercial-grade multimodal applications using consumer hardware. The lecture emphasizes hands-on learning through building practical projects, from user-friendly consumer applications to technical implementations, preparing you to become a proficient LLM engineer capable of running and managing local LLMs effectively.

  • Your Path to Becoming a Proficient AI Engineer3:42
  • Day 1 - Setting Up Your LLM Development Environment with Cursor and UV5:54

    If you want to learn:


    How do I set up a professional LLM development environment for AI projects?

    What is Cursor IDE and how does it compare to VS Code for AI development?

    How do I configure OpenAI API keys and manage Python virtual environments?

    What is UV and why is it better than Anaconda for Python package management?

    How do I clone a GitHub repository and work with Jupyter Notebooks in Cursor?

    What are the common troubleshooting steps when setting up an LLM workflow?


    Then this lecture is for you!


    **SEO-Infused Lecture Description:**


    This lecture guides you through setting up a complete LLM development environment using Cursor IDE and UV package manager. You'll learn the five essential steps to configure your workspace for building AI solutions with frontier models like OpenAI, Anthropic Claude, Azure OpenAI, Gemini, and local models through Ollama.


    The setup process covers cloning the GitHub repository, installing and configuring Cursor IDE (an AI-powered alternative to VS Code), and using UV to manage your Python virtual environment with bulletproof dependency management. You'll create and configure your OpenAI API key, set up the .env file correctly to avoid common pitfalls, and prepare Jupyter Notebooks for data science and data analysis workflows.


    The lecture includes platform-specific instructions for both Windows and Mac users, comprehensive troubleshooting guidance for common issues like the Windows 260 character limit, and alternatives for working with LLMs without spending money. You'll also learn how to leverage AI tools like ChatGPT and Claude for debugging setup problems, navigate repository files, use the command palette in Cursor, and establish a professional coding workflow for the next eight weeks of LLM-powered development. Complete setup instructions and documentation are provided to ensure a smooth configuration experience.

  • Day 1 - Setting Up Your PC Development Environment with Git and Cursor9:50

    If you want to learn:


    How do I set up a PC development environment for AI and LLM projects?

    What is the difference between Git and GitHub, and how do I install them on Windows?

    How do I clone a repository from GitHub to my local computer?

    What is Cursor IDE and how does it compare to VS Code for AI-powered coding?

    How do I configure my workspace for working with OpenAI API, Jupyter Notebooks, and Python?

    What are common PC setup gotchas like VPN issues and the Windows 260 character limit?


    Then this lecture is for you!


    This lecture provides a complete walkthrough for PC users to set up their development environment for LLM engineering and AI projects. You'll learn how to install and configure Git on Windows using PowerShell, navigate the command line to create a Projects directory, and clone the LLM Engineering repository from GitHub to your local machine. The lecture covers installing Cursor IDE, an AI-powered development environment that's a fork of VS Code, and demonstrates how to properly open your project root to begin working with Python, Jupyter Notebooks, and AI tools. You'll discover essential troubleshooting tips for PC-specific issues including VPN conflicts and the Windows 260 character limit. The step-by-step workflow prepares you to work with OpenAI API, Anthropic, Azure OpenAI, and local models like Ollama for data science and AI development. By the end, you'll have a fully configured workspace ready for LLM-powered coding, data analysis, and building AI applications with proper repository files navigation and version control using Git and GitHub.

  • Day 1 - Mac Setup: Installing Git, Cloning the Repo, and Cursor IDE9:47

    If you want to learn:


    How do I install Git on Mac and clone a GitHub repository for AI development?

    What is Cursor IDE and how does it compare to VS Code for LLM engineering?

    How do I set up my Mac development environment for working with AI and LLMs?

    What are the best practices for organizing Python projects and repositories on macOS?

    How do I configure Cursor IDE for AI-powered coding with OpenAI and other LLM APIs?


    Then this lecture is for you!



    This lecture guides Mac users through the essential first steps of setting up a professional AI development environment for LLM engineering. You'll learn how to install Git, verify your installation using the terminal command line, and clone the LLM Engineering repository from GitHub to your local machine. The tutorial covers creating a proper project directory structure in your home folder while avoiding common pitfalls like iCloud synchronization issues with Desktop and Documents folders.


    You'll discover how to download and configure Cursor IDE, an AI-powered fork of VS Code that offers advanced features for working with OpenAI API, Anthropic, and other LLM providers. The lecture demonstrates how to open your cloned repository as a project in Cursor IDE and verify the correct folder structure containing all course materials from Week 1 through Week 8.


    The instructor provides detailed troubleshooting tips and references comprehensive setup instructions in the GitHub repository, including guides on Git vs GitHub, command line basics for beginners, and common gotchas specific to Mac users. You'll also learn about alternative IDEs like PyCharm and Windsurf, and understand how Cursor's AI features can enhance your workflow with Jupyter Notebooks and Python development for data science and AI tools integration.

  • Day 1 - Installing UV and Setting Up Your Cursor Development Environment7:52

    If you want to learn:


    How do I set up Cursor IDE for AI development and LLM engineering?

    What is UV and why is it better than Anaconda for Python package management?

    How do I install UV on Windows and Mac for my development workflow?

    How do I configure a terminal in Cursor and manage virtual environments?

    What are the fastest ways to set up a data science environment with Python?

    How do I sync dependencies and create a bulletproof development setup for AI projects?


    Then this lecture is for you!


    This lecture guides you through setting up your Cursor IDE development environment and installing UV, a fast Python package manager that's revolutionizing AI and data science workflows. You'll learn how to navigate the Cursor IDE interface, access repository files, and view Markdown documentation using the Explorer and Preview features. The lecture demonstrates how to open and manage multiple terminal windows within Cursor using keyboard shortcuts (Control + backtick), essential for efficient coding with AI tools.


    You'll discover how to install UV on both Mac and Windows systems, troubleshoot common installation issues, and verify your setup using command-line commands. The lecture covers the UV sync command, which builds your complete Python virtual environment with all necessary dependencies for LLM-powered development in minutes—dramatically faster than traditional tools like Anaconda. You'll understand why UV has become the standard for AI development, used by frameworks like CrewAI and MCP, thanks to its speed (written in Rust), reliability, and ease of use.


    By the end, you'll have a fully configured Cursor IDE workspace with a synchronized UV environment, ready for working with OpenAI API, Anthropic, Jupyter notebooks, and other AI and LLM tools. This setup provides the foundation for professional AI engineering, data analysis, and Python development workflows with local models and cloud-based APIs.

  • Day 1 - Setting Up Your OpenAI API Key and Environment Variables12:28

    If you want to learn:


    How do I set up an OpenAI API key for my development environment?

    What's the difference between ChatGPT and the OpenAI API?

    How do I configure environment variables in Cursor IDE for AI development?

    What are the steps to create and secure a .env file for API keys?

    How much does it cost to get started with OpenAI API for LLM projects?

    What should I do if my OpenAI payment gets declined?


    Then this lecture is for you!


    In this comprehensive tutorial, you'll learn how to set up your OpenAI API key and configure environment variables in Cursor IDE for LLM-powered development. This lecture walks you through creating an OpenAI platform account at platform.openai.com, understanding the pay-as-you-go billing model with a minimum $5 deposit, and troubleshooting common payment issues. You'll discover how to generate a secure API key through the OpenAI dashboard, create a properly formatted .env file in your project root, and safely store your credentials using the OPENAI_API_KEY environment variable. The tutorial covers essential security practices, including why Cursor IDE disables AI features for files containing secrets, and explains alternative options like using Gemini, Ollama, or Azure OpenAI for those who prefer free or different LLM providers. You'll also learn the critical distinction between ChatGPT as a product and the OpenAI API for developers, ensuring you understand the workflow for connecting your Python code to powerful AI models. By the end of this lecture, you'll have a fully configured development environment ready for building LLM applications with proper API authentication and secure credential management.

  • Day 1 - Installing Cursor Extensions and Setting Up Your Jupyter Notebook9:04

    If you want to learn:


    - How do I install Python and Jupyter extensions in Cursor IDE?

    - What's the difference between traditional coding and working with Jupyter notebooks?

    - How do I configure my Python environment and select the right kernel in Cursor?

    - What are the essential setup steps for starting LLM development with OpenAI API?

    - How can I troubleshoot common issues when setting up Jupyter notebooks in Cursor IDE?

    - What workflow should I follow to prepare my development environment for AI projects?


    Then this lecture is for you!


    In this hands-on lecture, you'll complete the final setup steps for your AI development environment by installing essential Cursor IDE extensions and configuring your first Jupyter notebook. You'll learn how to install Python extensions (available from both AnySphere and Microsoft) and the Jupyter extension through Cursor's extension marketplace. The lecture walks you through selecting and configuring the correct Python kernel (.venv with Python 3.12) for your virtual environment, ensuring your notebook is ready for LLM-powered development. You'll open your first .ipynb file, understand the structure of Jupyter notebooks and their cells, and learn best practices for working with these interactive data science tools. The lecture also covers essential troubleshooting resources and introduces you to the course's practical approach: building real LLM projects starting with a web scraping and summarization tool using OpenAI API. You'll discover how to navigate between code and formatted text in notebooks, access supplementary guides for Git, GitHub, command line, and Python foundations, and explore alternative options like Gemini, Azure OpenAI, Ollama, and local models for your AI workflow. By the end, you'll have a fully configured Cursor IDE workspace with VS Code-like functionality, ready to begin hands-on coding with AI tools and LLMs for practical data analysis and AI application development.

  • Day 1 - Running Your First OpenAI API Call and System vs User Prompts11:41

    If you want to learn:


    - How do I make my first OpenAI API call using Python?

    - What's the difference between system prompts and user prompts in prompt engineering?

    - How do I set up and use the OpenAI Python library with API keys?

    - What is the chat completions API and how does it work?

    - How can I control the tone and behavior of ChatGPT responses?

    - What are best practices for structuring messages in OpenAI API calls?


    Then this lecture is for you!


    In this hands-on lecture, you'll run your first OpenAI API call using the OpenAI Python library and master the fundamentals of prompt engineering. You'll start by setting up your development environment, configuring your API key securely using environment variables, and executing your first chat completion request to GPT models including gpt-5-nano.


    The lecture walks you through the essential structure of OpenAI API calls, demonstrating how to format messages as Python dictionaries with role and content parameters. You'll learn the critical distinction between system prompts and user prompts: system prompts frame the overall task, set the assistant's tone, and provide context, while user prompts contain the actual input from end users that the LLM responds to.


    Through practical examples, you'll see how different system prompts—from "helpful assistant" to "snarky assistant"—dramatically change the output and behavior of the same user message. The lecture includes a real-world use case of web scraping combined with AI text generation, where you'll use chat completions to analyze and summarize website content in Markdown format.


    You'll gain hands-on experience with key concepts including the chat completions API, token management, multi-turn conversations, JSON formatting, and parameter configuration. By the end, you'll understand how to structure effective prompts for various use cases and apply tips and tricks for working with the OpenAI API as a developer.

  • Day 1 - Building a Website Summarizer with OpenAI Chat Completions API10:14

    If you want to learn:


    - How do I use the OpenAI API to build real applications?

    - What is the Chat Completions API and how does it work?

    - How can I create effective system prompts and user prompts for GPT models?

    - What are the best practices for prompt engineering with OpenAI?

    - How do I structure messages for multi-turn conversations using the OpenAI Python library?

    - Can I build a website summarizer using GPT-4 and Python?


    Then this lecture is for you!


    In this hands-on lecture, you'll build a complete website summarizer application using the OpenAI Chat Completions API and Python. You'll learn how to structure messages as a list of dictionaries containing system prompts and user prompts, implement the openai.chat.completions.create method with proper parameters, and work with the GPT-4o-mini model for text generation. The lecture covers essential prompt engineering techniques, including how to craft effective system prompts that control the assistant's behavior and tone, format user messages with dynamic input content, and prevent unwanted output formatting. You'll discover best practices for using the OpenAI Python library, managing API keys securely, and handling tokens efficiently. The tutorial demonstrates practical use cases by fetching website contents, passing them as input to the chat completion endpoint, and processing the output for display. You'll also explore advanced concepts like adjusting prompts to change response tone (from helpful to snarky), applying the same techniques to different websites and use cases, and understanding how this fundamental pattern applies to real-world business applications including translation, content analysis, and data summarization. By the end, you'll have a working application and the knowledge to extend it for your own projects using the OpenAI API.

  • Day 1 - Hands-On Exercise: Building Your First OpenAI API Call from Scratch5:34

    If you want to learn:


    - How do I make my first OpenAI API call from scratch?

    - What's the difference between a system prompt and a user prompt in OpenAI?

    - How can I use the OpenAI API for real business tasks like email summarization?

    - What are the essential parameters needed for chat completions API?

    - How do I structure messages when using the OpenAI Python library?

    - What are practical tips and tricks for prompt engineering with GPT models?


    Then this lecture is for you!


    This hands-on exercise guides you through building your first OpenAI API call from the ground up. You'll learn to craft effective system prompts and user prompts for practical use cases like email summarization and subject line generation. The lecture walks you through structuring messages as a list of dictionaries with role and content parameters, implementing the openai.chat.completions.create method, and handling the response output. You'll gain experience with the OpenAI Python library, understand best practices for prompt engineering, and learn how to parse tokens from chat completion responses. The exercise includes practical examples of multi-turn conversations and text generation, with optional advanced challenges using Selenium or Playwright for web scraping integration. By the end, you'll have hands-on experience calling GPT models through the chat completions API, understanding input and output structures, and applying LLM capabilities to real-world developer tasks. The lecture also covers JSON formatting, API key usage, and few-shot prompting techniques for better results with ChatGPT and other OpenAI models.

  • Day 2 - LLM Engineering Building Blocks: Models, Tools & Techniques9:22

    If you want to learn:


    How do you choose the right LLM model for your specific use case?

    What are the essential building blocks every LLM engineer needs to master?

    Which tools and frameworks should you use to build production-ready LLM applications?

    What techniques like RAG, fine-tuning, and agentic AI can transform your AI projects?

    How can you go from beginner to LLM engineering master with a practical roadmap?


    Then this lecture is for you!


    This lecture covers the three core dimensions of LLM engineering: models, tools, and techniques. You'll learn how to recognize and select frontier models—both open-source and closed-source—for specific tasks, including multimodal models for image and audio generation. The session explores essential frameworks and libraries like Hugging Face, LangChain, Gradio, Weights & Biases, and Modell that power production-ready LLM applications.


    You'll discover practical techniques including API integration, multi-shot prompt engineering, retrieval augmented generation (RAG), fine-tuning, and agentic AI—the hottest topic in generative AI today. This hands-on session builds on Day 1's foundation with Ollama, OpenAI integration, and system versus user prompts, advancing your understanding of transformer architecture and LLM deployment strategies.


    The lecture provides a clear LLM engineer roadmap designed for intermediate Python developers, though complete beginners can succeed using the provided self-study guides. You'll learn best practices for building real-world AI applications with commercial impact, whether you're at a startup or Fortune 500 enterprise. The session emphasizes practical coding exercises, experimentation with embeddings and vector operations, and applying LLM inference techniques to solve concrete business problems. By following this roadmap, you'll develop the skills to evaluate, deploy, and scale large language models in production environments.

  • Day 2 - Your 8-Week Journey: From Chat Completions API to LLM Engineer2:51

    If you want to learn:


    - What is the complete roadmap to becoming an LLM engineer in just 8 weeks?

    - How do you progress from using the Chat Completions API to building production-ready AI applications?

    - What are the essential skills needed to work with large language models, from APIs to fine-tuning?

    - How can you build real-world LLM applications including RAG systems and AI agents?

    - What's the best way to transition from beginner to production-ready LLM engineer?


    Then this lecture is for you!


    This lecture provides a comprehensive overview of your 8-week journey to becoming an LLM engineer. You'll discover the complete roadmap that takes you from foundational concepts like the Chat Completions API through advanced topics including RAG (Retrieval Augmented Generation), fine-tuning, and agentic AI platforms.


    Week by week, you'll explore frontier model APIs and multimodality (Week 2), dive deep into open-source LLMs using Hugging Face and Ollama (Week 3), and learn how to select the right large language models for your projects (Week 4). The journey continues with building a knowledge worker expert using retrieval augmented generation techniques (Week 5), mastering data curation and preparation for ML applications (Week 6), and fine-tuning open source models for specific business tasks (Week 7).


    The specialization culminates in Week 8 with building a production-ready agentic platform that solves real-world commercial problems. Throughout this LLM engineer roadmap, you'll gain hands-on experience with Python, prompt engineering, embeddings, vector databases, transformer architecture, and deployment best practices. You'll receive a code cookbook with reusable components for your own AI applications, ensuring you're equipped with practical tools for building scalable, production-ready LLM solutions.

  • Day 2 - Frontier Models: OpenAI GPT, Claude, Gemini & Grok Compared6:14

    If you want to learn:


    • What are frontier models and how do they differ from open-source LLMs in 2025?

    • Which AI model is best for your needs: OpenAI GPT, Claude, Gemini, or Grok?

    • How do the top large language models compare in real-world performance and capabilities?

    • What makes Claude 3.7 Sonnet, GPT-5, Gemini 2.5 Pro, and Grok 3 stand out in the AI landscape?

    • Why are companies like OpenAI, Anthropic, Google, and x.ai leading the LLM revolution?

    • What's the difference between closed-source frontier models and open-source alternatives like LLaMA?


    Then this lecture is for you!


    This lecture provides a comprehensive comparison of the best LLMs in 2025, focusing on the four dominant frontier models that are reshaping the AI landscape. You'll explore OpenAI's GPT-5, understanding how it evolved from ChatGPT and the convergence of the O-series models into the latest release. The lecture delivers a detailed comparison of Claude 4.5 Sonnet by Anthropic, examining why this AI model has become a favorite among developers and its role in powering Claude Code for coding tasks.


    You'll discover how Google's Gemini 2.5 Pro transformed from the failed Bard experiment into a competitive large language model, with insights into the upcoming Gemini 3 release. The deep dive includes Grok from x.ai, completing the analysis of the top four proprietary AI models. This session clarifies the distinction between closed-source frontier models and open-source alternatives like Meta's LLaMA, explaining concepts like "open weight" versus true open-source models.


    The lecture covers real-world use cases, business value, and practical considerations for selecting the right LLM for specific tasks. You'll gain understanding of multimodal capabilities, context windows, token usage, and API access across these language models. Additional mentions of emerging players like Mistral AI, Cohere, and Perplexity provide a complete picture of the 2025 LLM leaderboard and AI tools ecosystem.

  • Day 2 - Open-Source LLMs: LLaMA, Mistral, DeepSeek, and Ollama12:03

    If you want to learn:


    - What are the best open-source LLMs in 2025 and how do they compare to proprietary AI models?

    - How does Meta's LLaMA 3.3 and LLaMA 4 differ from other large language models like Mistral and DeepSeek?

    - What makes DeepSeek V3 revolutionary in terms of training cost efficiency compared to GPT models?

    - How can you run AI models locally on your computer using Ollama and what are small language models (SLMs)?

    - What's the difference between using LLMs through APIs versus direct inference with open-source models?

    - How do mixture of experts models like Mixtral work and what are the real-world use cases for different open-source language models?


    Then this lecture is for you!


    This comprehensive lecture provides a detailed comparison of the top open-source large language models in 2025, including Meta's LLaMA 3.3 and LLaMA 4, Mistral AI's Mixtral, Alibaba Cloud's Qwen, Google's Gemma 2, Microsoft's Phi-4, and the groundbreaking DeepSeek V3. You'll discover why DeepSeek revolutionized the AI landscape by achieving frontier-level capabilities at a fraction of OpenAI's training costs ($4 million versus $100+ million), and explore OpenAI's recent GPT-OSS open-source release. The lecture covers practical implementation methods including running models locally through Ollama with GGUF files, using the Hugging Face Transformers Library for direct inference, and leveraging cloud APIs through services like Bedrock, Vertex AI, and OpenRouter. You'll learn about model distillation techniques, understand the difference between small language models (1-3 billion parameters) and large language models (671 billion parameters), explore multimodal capabilities, and discover how to choose between packaged products like ChatGPT and Claude versus direct API integration for coding, content creation, and AI agents. The session includes hands-on demonstrations in Cursor, comparing context windows, token efficiency, and real-world use cases across the best LLMs available for startups and enterprise applications, with a deep dive into the proprietary versus open-source model debate shaping AI in 2025.

  • Day 2 - Chat Completions API: HTTP Endpoints vs OpenAI Python Client10:11

    If you want to learn:


    - What is the Chat Completions API and why has it become the standard for interacting with LLMs?

    - How do you call OpenAI's API using HTTP endpoints directly with POST requests?

    - What's the difference between using raw HTTP endpoints versus the OpenAI Python client library?

    - How do you structure API requests with headers, payloads, and authentication tokens?

    - Why do developers prefer Python client libraries over manual HTTP requests for API calls?

    - How do you parse JSON responses from chat completion endpoints to extract AI-generated content?


    Then this lecture is for you!


    This lecture demonstrates two fundamental approaches to calling the Chat Completions API: direct HTTP endpoint requests and the OpenAI Python client library. You'll start by understanding what the Chat Completions API is and why it has become the ubiquitous standard across all AI model providers. The lecture walks through making raw HTTP POST requests to OpenAI's chat completions endpoint, showing you how to structure headers with authorization tokens, create JSON payloads with model parameters and message arrays, and parse the response to extract AI-generated content. You'll see a live example calling GPT-4o-mini and navigating through the JSON response structure including the choices array and message content fields. The lecture then explains why Python client libraries exist as convenient wrappers around HTTP endpoints, eliminating the need to manually construct requests and parse JSON dictionaries. You'll learn how these open-source client libraries transform API calls into clean, elegant Python code while performing the same underlying HTTP operations. The session includes practical setup steps like configuring your Python environment, loading API keys from .env files, and verifying your OpenAI credentials before making requests.

  • Day 2 - Using the OpenAI Python Client with Multiple LLM Providers7:41

    If you want to learn:


    - How does the OpenAI Python client actually work under the hood?

    - Can I use the OpenAI library to connect to other AI providers like Ollama or Gemini?

    - What is OpenAI compatibility and why do multiple LLM providers support it?

    - How do I switch between different AI model endpoints using the same Python code?

    - What's the difference between using the OpenAI API directly versus using a client wrapper?

    - How can I run AI models locally with Ollama using OpenAI-compatible APIs?


    Then this lecture is for you!


    In this hands-on lecture, you'll discover how the OpenAI Python client functions as a lightweight wrapper around HTTP API calls to chat completion endpoints. You'll learn to create an OpenAI client instance that automatically uses your API key from environment variables, then make chat completion requests using Python objects instead of raw JSON. The lecture demonstrates how OpenAI compatibility has become the standard interface across AI providers, with Gemini, Anthropic, and Ollama all offering OpenAI-compatible endpoints. You'll see practical examples of switching between different model providers by simply changing the base_url parameter while keeping your code identical. The tutorial covers installing and configuring the OpenAI Python client, understanding how chat completion requests work, examining response objects and message content, and seamlessly switching from OpenAI's GPT models to Google's Gemini or locally-hosted Ollama models. You'll gain a clear understanding of how API endpoints, authentication headers, and client libraries work together, enabling you to use multiple LLM providers with minimal code changes. By the end, you'll be able to leverage OpenAI-compatible APIs across different AI platforms, understand the relationship between HTTP requests and Python client methods, and confidently switch between cloud-based and open-source AI models in your applications.

  • Day 2 - Running Ollama Locally with OpenAI-Compatible Endpoints10:28

    If you want to learn:


    - How to run AI models locally on your computer without API costs?

    - What is an OpenAI-compatible endpoint and how does it work with Ollama?

    - How to switch between cloud-based and local AI models using the same code?

    - How to install Ollama and use open-source models like Llama and DeepSeek locally?

    - What are the trade-offs between frontier models and local open-source AI models?

    - How to ensure complete data privacy by running AI models offline?


    Then this lecture is for you!


    This lecture demonstrates how to run Ollama locally with OpenAI-compatible endpoints, enabling you to use open-source AI models on your computer. You'll learn to install Ollama, download models like Llama 3.2 and DeepSeek-R1, and configure the OpenAI Python client library to connect to your local Ollama instance using the localhost:11434/v1 base URL. The lecture covers switching between cloud-based APIs and local models by simply changing the base_url parameter, eliminating API charges while maintaining the same chat completion interface. You'll explore practical examples of generating responses using local models, understand the benefits of data privacy and offline functionality, and compare the performance differences between frontier models and smaller open-source alternatives. The session includes hands-on demonstrations of model distillation concepts, parameter variations (1b and 1.5b models), and streaming responses. You'll complete a homework assignment that combines Day 1's web page summarization with local Ollama models, reinforcing your understanding of OpenAI compatibility, endpoint configuration, and the flexibility of using the same API wrapper for both cloud and local AI applications.

  • Day 3 - Base, Chat, and Reasoning Models: Understanding LLM Types10:44

    If you want to learn:


    - What's the difference between ChatGPT, Claude, and Gemini AI models?

    - How do base models, chat models, and reasoning models actually work?

    - When should you use Claude vs Gemini vs ChatGPT for different use cases?

    - What are reasoning models and why do leading AI models like GPT and Gemini use them?

    - How do multimodal AI models decide when to "think" before responding?

    - Which AI model is best for coding tasks versus creative writing?


    Then this lecture is for you!


    This lecture breaks down the three fundamental types of large language models that power today's leading AI tools. You'll discover how base models work as the foundation of predictive text and AI assistants, then explore how OpenAI transformed GPT into ChatGPT through chat model training. Learn the evolution from simple completion models to sophisticated reasoning models that think step-by-step before responding.


    Compare how Claude, Gemini, and ChatGPT handle different scenarios, from coding tasks to creative writing. Understand the breakthrough of hybrid models like Gemini 2.5 and GPT-5 that dynamically adjust their reasoning effort based on question complexity. Discover practical techniques like chain-of-thought prompting and budget forcing that make AI models more powerful.


    You'll learn when to choose chat models for fast, interactive conversations versus reasoning models for complex problem-solving. Explore real-world use cases for each AI model type and understand why selecting the right LLM matters for your specific needs. Master the fundamentals of generative AI architecture that will help you integrate AI into your business effectively and choose the best AI tool for coding, analysis, or content generation tasks.

  • Day 3 - Frontier Models: GPT, Claude, Gemini & Their Strengths and Pitfalls12:56

    If you want to learn:


    - What are the leading AI models like ChatGPT, Claude, and Gemini, and how do they compare?

    - Which AI model is best for coding tasks versus creative writing?

    - What are the strengths and pitfalls of frontier LLMs like GPT-5, Claude 4.5, and Gemini 2.5?

    - How do you choose the best AI tool for your specific use case?

    - Why do AI models hallucinate and how can you work with them effectively?

    - What's the difference between ChatGPT vs Claude vs Gemini for real-world applications?


    Then this lecture is for you!


    This lecture provides a comprehensive comparison of leading AI models including OpenAI's GPT-5 and GPT-4.1, Anthropic's Claude (Haiku, Sonnet, and Opus variants), Google Gemini 2.5, x.ai's Grok, and DeepSeek AI. You'll explore the specific strengths of each multimodal AI model, from ChatGPT's reasoning capabilities to Claude Sonnet's coding excellence. The lecture covers practical use cases for each AI tool, including content synthesis, creative writing, coding tasks, and problem-solving. You'll learn how these large language models excel at generating structured answers, debugging code, and building project frameworks, while understanding their critical limitations including knowledge cutoffs, hallucinations, and confidence biases. The session examines why these powerful AI assistants have replaced traditional resources like StackOverflow for developers and how to integrate AI into your business workflow effectively. Through real-world examples, you'll discover why selecting the right AI model matters and how to supervise generative AI tools to avoid common pitfalls. Whether comparing Claude vs Gemini for multimodal tasks or evaluating ChatGPT vs Claude for coding, you'll gain practical insights into choosing and working with these foundation models. The lecture emphasizes ethical AI usage and best practices for leveraging LLMs as supervised assistants rather than autonomous decision-makers, ensuring you can harness the power of these AI assistants while maintaining quality control.

  • Day 3 - Testing ChatGPT-5 and Frontier LLMs Through the Web UI9:43

    If you want to learn:


    How does ChatGPT-5 compare to other top AI models like Claude, Gemini, Grok, and DeepSeek in 2025?


    What are the best use cases for different LLMs and which AI model should you choose for your specific needs?


    How do frontier AI models handle complex questions, self-reflection, and challenging reasoning tasks?


    What are the strengths and limitations of ChatGPT vs Claude vs Gemini vs Grok vs DeepSeek?


    How can you test and evaluate AI models through their web interfaces before integrating them via APIs?


    Then this lecture is for you!


    This lecture provides hands-on testing of ChatGPT-5 and leading frontier LLMs including Claude, Gemini 2.5 Pro, Grok, and DeepSeek through their web UI interfaces. You'll explore how to evaluate whether a business problem is suitable for an LLM solution and learn what types of questions each AI model excels at answering. The lecture demonstrates ChatGPT-5's capabilities with structured explanations, multi-step reasoning, and self-aware responses about its strengths in synthesis across domains and challenges with fresh information and mathematical precision. You'll see comparative analysis of how different AI models complement each other—Claude for human-like reasoning and long-context tasks, Gemini for real-time multimodal capabilities, and specialized models from OpenAI, Anthropic, and xAI. Through practical examples including emotional intelligence questions, meta-reasoning challenges, and classic AI test cases, you'll understand how generative AI has evolved and learn to identify which top AI tools are best suited for coding, deep research, AI agents, and various artificial intelligence applications in 2025.

  • Day 3 - Testing Claude, Gemini, Grok & DeepSeek with ChatGPT Deep Research11:33

    If you want to learn:


    • How do the top AI models like ChatGPT, Claude, Gemini, Grok, and DeepSeek compare in real-world testing scenarios?

    • Which AI model performs best for coding, creative writing, and complex reasoning tasks in 2025?

    • What is ChatGPT Deep Research and how can AI agents automate research work for you?

    • How do Claude vs ChatGPT vs Gemini vs Grok vs DeepSeek stack up when handling challenging prompts?

    • What are the key differences between GPT-5, Claude Sonnet 4.5, Gemini 2.5 Pro, Grok-4, and DeepSeek's latest models?

    • Which generative AI tools offer the most powerful capabilities for multimodal tasks and long-form content generation?


    Then this lecture is for you!


    This lecture provides hands-on testing and comparison of the top AI models available in 2025, including ChatGPT (GPT-5), Claude (Sonnet 4.5 and Opus 4.1), Gemini 2.5 Pro, Grok-4, and DeepSeek. You'll see live demonstrations testing each AI model with identical prompts to evaluate their strengths in creative writing, logical reasoning, and self-awareness. The lecture explores Claude's exceptional performance in nuanced reasoning and long-form writing, GPT-5's accuracy in meta-questions, Gemini's reasoning capabilities, Grok's fast processing from xAI, and DeepSeek's deep thinking mode. You'll discover how each LLM handles complex queries differently, from describing abstract concepts like color to solving self-referential puzzles. The session also introduces ChatGPT Deep Research, an agentic AI feature that automates comprehensive research tasks by conducting multiple searches and synthesizing information over extended periods. You'll learn how to leverage AI agents to delegate research work, ask clarifying questions, and generate detailed reports with cited sources. This practical comparison helps you understand which AI tool—whether from OpenAI, Anthropic, Google, or xAI—best suits your specific needs for coding, content generation, artificial intelligence research, or business applications.

  • Day 3 - Agentic AI in Action: Deep Research, Claude Code, and Agent Mode11:22

    If you want to learn:


    - What is agentic AI and how does it work autonomously to complete complex tasks?

    - How can AI agents like ChatGPT's Deep Research and Agent Mode perform real-world tasks for you?

    - What makes Claude Code different from traditional coding assistants and how can it solve programming challenges?

    - How do current AI models like Claude Sonnet 4, Gemini 2.5 Pro, and GPT-4 compare in agentic capabilities?

    - Can AI agents actually browse the web, make reservations, and write working code without human intervention?

    - What are the practical applications of agentic AI systems in 2025 for coding, research, and task automation?


    Then this lecture is for you!


    This lecture demonstrates three powerful agentic AI systems in action, showing how AI agents can work autonomously to complete complex, multi-step tasks. You'll witness ChatGPT's Deep Research feature conducting comprehensive analysis by autonomously searching and synthesizing information from multiple sources while you work on other tasks. The lecture showcases Agent Mode navigating real websites, interacting with reservation systems like Resy, and completing real-world tasks such as finding restaurants with specific criteria—all without manual intervention.


    You'll see a live demonstration of Claude Code (Claude Sonnet 4) integrated within Cursor, where the AI agent reads existing Jupyter notebooks, understands coding challenges, and autonomously writes a complete Python solution using Ollama and Llama 3.2. The lecture illustrates how Claude Code analyzes project context, comprehends requirements, and generates executable code that runs successfully with uvrun—solving in minutes what would typically require manual coding effort.


    Through these demonstrations, you'll understand the current state of agentic AI capabilities in 2025, including how large language models can reason through complex problems, execute code iteratively, leverage real-time data sources, and perform software engineering tasks. The lecture highlights the practical differences between AI assistants and true agentic systems, showing how these AI agents can autonomously handle coding tasks, conduct research, and interact with real-world applications to deliver better results while working independently in the background.

  • Day 3 - Frontier Models Showdown: Building an LLM Competition Game10:14

    If you want to learn:


    How do frontier AI models like Claude 4, GPT-5, and Gemini 2.5 Pro compare in real-world coding tasks and agentic capabilities? What makes certain large language models better at reasoning through complex problems than others? How can you leverage AI agents to autonomously compete and collaborate in multi-step workflows? What are the key differences between top AI models in 2025 when it comes to speed, price, and performance? How do you build an agentic system that allows LLMs to interact, strategize, and make decisions independently?


    Then this lecture is for you!


    This lecture explores the current state of frontier AI models through a hands-on demonstration of building an LLM competition game called "Outsmart." You'll gain deep insight into how leading large language models—including GPT-5, Claude Sonnet 4.5, Grok 4, and open-source alternatives—perform in real-world scenarios requiring strategic reasoning and agentic AI capabilities.


    The lecture showcases a practical coding project where AI agents compete autonomously in a multi-step game, exchanging messages, forming alliances, and making strategic decisions. You'll see how to leverage different AI models through API integration, compare their performance in real-time, and understand the trade-offs between intelligence, speed, and cost when choosing the right language model for enterprise AI applications.


    Key topics include understanding agentic systems, implementing AI code execution workflows, and analyzing how models like Claude 4, Gemini 2.5 Pro, and GPT-5 handle complex problem-solving tasks. The demonstration uses Streamlit for the UI and provides open-source code on GitHub for you to explore. You'll learn how to prompt AI models to articulate their reasoning strategies, creating better results through iterative, multi-step processes.


    By examining how these AI agents interact, strategize, and outperform each other in coding tasks and decision-making scenarios, you'll develop practical understanding of agentic capabilities and how to use AI effectively in software engineering workflows for 2025 and beyond.

  • Day 4 - Understanding Transformers: The Architecture Behind GPT and LLMs12:46

    If you want to learn:


    - What is transformer architecture and why does it power GPT, ChatGPT, and modern LLMs?

    - How did the "Attention Is All You Need" paper revolutionize AI and large language models?

    - What makes transformers different from traditional neural networks and deep learning models?

    - How do LLMs like GPT-4, Claude, and DeepSeek actually work under the hood?

    - Why are transformers so efficient for training large language models at scale?

    - What are the alternatives to transformer architecture and how do they compare?


    Then this lecture is for you!


    This lecture explores the transformer architecture that powers modern large language models like GPT, ChatGPT, and Claude. You'll discover the history behind the groundbreaking 2017 "Attention Is All You Need" paper and understand how self-attention mechanisms revolutionized AI. The lecture covers the evolution from GPT-1 to GPT-5, explaining how transformers differ from traditional neural networks and why they enable efficient scaling of LLMs. You'll learn about key concepts including tokens, context windows, parameters, and API costs, while gaining insight into how models process sequences through attention layers. The lecture also examines alternative architectures like state space models and mixture-of-experts (MoE) systems, comparing their performance to standard transformer models. Through practical examples and model comparisons using tools like OpenAI and Ollama, you'll understand why transformer architecture remains the dominant approach for training large language models, how RLHF (reinforcement learning from human feedback) improved ChatGPT, and what makes transformers an optimization breakthrough rather than a fundamental requirement for AI. Perfect for understanding how LLMs work, model performance factors, and the computational efficiency that makes modern AI applications possible.

  • Day 4 - From LSTMs to Transformers: Attention, Emergent Intelligence & Agentic A9:08

    If you want to learn:


    - How did we evolve from LSTMs to transformer architecture and why did transformers revolutionize AI?

    - What makes attention mechanisms so powerful that "Attention Is All You Need" became the foundation of modern LLMs?

    - Why do large language models like GPT and ChatGPT produce not just plausible responses, but accurate and intelligent ones?

    - What is emergent intelligence and how does it arise from scaling transformer models?

    - How has AI evolved from prompt engineering to context engineering and agentic AI systems?

    - What makes agentic AI with autonomous LLMs in loops the hottest topic in artificial intelligence today?


    Then this lecture is for you!


    This lecture explores the revolutionary shift from recurrent neural networks (LSTMs) to transformer architecture that powers modern large language models like GPT, ChatGPT, and DeepSeek. You'll understand why the transformer model's parallelization capabilities overcame the limitations of LSTMs, and how the attention mechanism became the cornerstone of state-of-the-art AI systems.


    Discover the phenomenon of emergent intelligence—why LLMs don't just predict likely tokens, but generate accurate, intelligent responses that surprise even frontier lab researchers. Learn how transformer blocks, self-attention layers, and multi-head attention enable language models to process input sequences and generate contextually relevant outputs.


    The lecture traces the evolution of working with LLMs: from prompt engineering techniques to context engineering strategies that optimize model performance through better input sequences and embeddings. You'll explore how to set up LLMs for success by providing business-specific information and structuring prompts effectively.


    Finally, dive into agentic AI—the cutting-edge approach where LLMs operate in loops with tool access, demonstrating autonomy in workflow control. Understand the two primary definitions of agentic systems: LLMs controlling workflows and calling other models, and LLMs operating in iterative loops with tools. See real-world examples like Claude and GitHub Copilot that showcase how humans and AI collaborate, with LLMs making autonomous decisions about next actions through intelligent token prediction and optimization.

  • Day 4 - Parameters: From Millions to Trillions in GPT, LLaMA & DeepSeek8:26

    If you want to learn:


    - What are parameters in large language models and why do they matter?

    - How did GPT models scale from 117 million to 1.76 trillion parameters?

    - What's the difference between training time scaling and inference time scaling?

    - How do open source models like LLaMA, DeepSeek, and GPT-OSS compare in parameter count?

    - Why can smaller models like Gemma outperform larger models like GPT-2?

    - What are mixture-of-experts (MoE) models and how do they work?


    Then this lecture is for you!



    This lecture explores the evolution of parameters in transformer architecture, from traditional machine learning's 20-200 parameters to today's frontier models with tens of trillions. You'll discover how GPT-1's 117 million parameters grew exponentially through GPT-2 (1.5 billion), GPT-3 (175 billion), and GPT-4 (1.76 trillion), and understand why parameter count directly impacts model intelligence and training capacity.


    The lecture examines training time scaling versus inference time scaling—two orthogonal approaches to improving LLM performance. You'll learn how training time scaling involves larger models with more parameters that absorb more training data, while inference time scaling uses techniques like reasoning prompts and RAG to enhance model performance during inference without changing the underlying architecture.


    You'll compare parameter counts across state-of-the-art open source models including LLaMA 3.2 (3 billion), LLaMA 3.1 (8 billion), LLaMA 3.3 (70 billion), LLaMA 4 (245 billion multimodal), GPT-OSS (120 billion), and DeepSeek (671 billion). The lecture explains mixture-of-experts architecture used in large language models, where multiple smaller models activate based on specific queries to optimize computation and model capacity.


    You'll understand the Chinchilla scaling laws that correlate parameter count with training data absorption, and discover why modern optimization techniques allow smaller models to outperform older, larger ones—demonstrating how efficiency improvements in transformer models enable more powerful AI with fewer parameters.

  • Day 4 - What Are Tokens? From Characters to GPT's Tokenizer4:02

    If you want to learn:


    - What are tokens and how do they work in large language models?

    - How does tokenization differ from character-by-character and word-by-word processing in AI?

    - Why do GPT and other LLMs use tokenization instead of processing full words?

    - What is the difference between tokens and embeddings in neural networks?

    - How can you visualize and understand GPT's tokenizer in action?

    - What makes subword tokenization the most efficient method for training language models?


    Then this lecture is for you!


    This lecture explores the fundamental concept of tokenization in large language models and LLMs like GPT and ChatGPT. You'll discover why modern AI systems use tokens—chunks of text that can represent words, word fragments, or character combinations—instead of processing individual characters or complete words. The lecture traces the evolution from early character-level neural networks through word-based approaches to today's subword tokenization methods, explaining how this compromise solution enables efficient training and processing while maintaining a manageable vocabulary size.


    You'll learn the key differences between token IDs and embeddings, understanding where tokens fit in the neural network architecture as the very first input layer. The lecture demonstrates OpenAI's GPT tokenizer using the platform.openai.com/tokenizer tool, giving you hands-on insight into how text gets converted into tokens. You'll understand why tokenization works so effectively for language models, including how it handles word stems, rare words, and proper nouns while keeping token counts and vocabulary size optimized. This foundational knowledge is essential for working with GPT-4, ChatGPT, and other LLMs, helping you grasp how tokenization methods like BPE (Byte Pair Encoding) and tools like tiktoken enable AI to process and generate human language efficiently.

  • Day 4 - Understanding Tokenization: How GPT Breaks Down Text into Tokens8:13

    If you want to learn:


    How does GPT break down text into tokens and why does it matter for AI applications?


    What is tokenization and how do language models like ChatGPT process your input text?


    Why do some words become single tokens while others split into multiple fragments?


    How can you estimate token counts for your prompts and understand token limits in LLMs?


    What tools can you use to visualize how the GPT tokenizer converts text into tokens?


    How do tokenization methods affect the performance of large language models?


    Then this lecture is for you!


    This lecture provides a hands-on exploration of tokenization in GPT and other large language models. You'll discover how the GPT tokenizer breaks down text into tokens using OpenAI's tokenizer interface at platform.openai.com/tokenizer. The lecture demonstrates how common words map to single tokens while rare or complex words split into subword tokenization fragments. You'll learn the practical difference between beginning-of-word tokens and mid-word tokens, see how numbers are tokenized into three-digit sequences, and understand why this matters for AI model performance. The lecture covers real examples showing how 50-66 characters convert to 9-18 tokens, introduces the rule of thumb that 1,000 tokens equals approximately 750 words, and explains how tokenization works differently for natural language versus code. You'll explore the tiktoken library, understand vocabulary size implications, and learn to estimate token counts for managing token limits in GPT-4, ChatGPT, and other LLMs. By the end, you'll have practical knowledge of how tokenization methods like BPE (Byte Pair Encoding) enable neural networks to process embeddings efficiently, plus hands-on experience using the GPT tokenizer to analyze your own text.

  • Day 4 - Tokenizing with tiktoken and Understanding the Illusion of Memory10:56

    If you want to learn:


    How does tokenization work in GPT models and language models?

    What is tiktoken and how do you use it to tokenize text?

    Why do LLMs seem to remember conversations when they're actually stateless?

    How do token counts affect API costs when using OpenAI and ChatGPT?

    What's the difference between token IDs and how does the GPT tokenizer break down words?

    How do you build conversation context for large language models using Python?


    Then this lecture is for you!


    In this hands-on lecture, you'll learn practical tokenization using tiktoken, OpenAI's official tokenizer library for GPT models. You'll discover how the GPT-4 tokenizer converts text into tokens by encoding strings into token IDs and decoding them back, understanding how subword tokenization breaks words into fragments. Through live Python coding demonstrations, you'll explore the tokenization method used by language models, experiment with vocabulary size and token limits, and learn to count tokens for managing API costs.


    The lecture then reveals a critical concept: the illusion of memory in LLMs. You'll understand why large language models like ChatGPT appear to remember previous messages when they're actually completely stateless. Through practical code examples using the OpenAI Python client, you'll learn how to build conversation context by passing the entire message history with each API call, including system, user, and assistant roles. You'll discover why token counts accumulate with each message, how this affects embeddings and neural network processing, and why input tokens cost money. This fundamental understanding of how tokenization works and how LLMs process conversation context is essential for anyone building AI applications with GPT, GPT-4, or other language models.

  • Day 4 - Context Windows, API Costs, and Token Limits in LLMs10:49

    If you want to learn:


    - What are context windows in LLMs and why do they matter for AI applications?

    - How do token limits affect your conversations with ChatGPT and Claude?

    - What are the real API costs when working with large language models?

    - How does token counting work for both input prompts and output responses?

    - Why do some LLMs handle million tokens while others max out at 100K?

    - How can you optimize your AI workflow to manage context length and reduce costs?


    Then this lecture is for you!


    This lecture provides a comprehensive breakdown of context windows, token limits, and API costs in large language models. You'll discover how context windows determine the maximum input an LLM can process, including the entire conversation history and generated tokens. Learn why models like Gemini offer million token context windows while others cap at 200K-400K tokens, and how this impacts techniques like RAG, multi-shot prompting, and retrieval-augmented workflows.


    The lecture explains API cost structures for OpenAI, Claude, and other LLMs, covering per-token pricing for both input and output, including hidden reasoning tokens in modern AI models. You'll explore the Vellum leaderboard to compare context length and costs across GPT-5, Claude Sonnet, and Gemini models, understanding how scaling from frontier models to nano versions affects pricing—from $10 to under $1 per million tokens.


    Discover practical insights on caching strategies to reduce costs, how chunking strategies and semantic search optimize token usage, and why understanding token limits is essential for building scalable AI applications. You'll gain clarity on when context window constraints require summarization, truncate methods, or agentic workflows to handle large documents efficiently.

  • Day 5 - Building a Sales Brochure Generator with OpenAI Chat Completions API9:03

    If you want to learn:


    - How to build a sales brochure generator using the OpenAI Chat Completions API?

    - What is one-shot prompting and how can it improve your AI outputs?

    - How to chain multiple LLM calls together to create commercial AI solutions?

    - How to use streaming responses and JSON output with ChatGPT API?

    - What makes a verticalized AI product valuable even when using GPT under the hood?

    - How to parse and filter website links intelligently using AI models?


    Then this lecture is for you!


    In this hands-on coding session, you'll build a complete sales brochure generator that transforms company websites into professional marketing materials. You'll master the OpenAI Chat Completions API by implementing one-shot prompting techniques, where you provide examples to guide AI output quality. The workflow involves chaining two LLM calls: first extracting and filtering relevant links from a target website, then generating a comprehensive sales brochure from multiple pages. You'll learn to implement streaming responses for real-time typewriter effects and work with both Markdown and JSON output formats. The lecture demonstrates how to use AI for nuanced content understanding—distinguishing relevant from irrelevant links and parsing web content intelligently. You'll write production-ready code using Python, Beautiful Soup for web scraping, and the gpt-5-nano model. This step-by-step guide shows you how to create a scalable, commercial AI tool that goes beyond simple prompt engineering, teaching you to think like an AI engineer building verticalized products. By the end, you'll understand how to apply these patterns to real-world business problems, from resume parsing to review analysis, and why carefully crafted AI workflows deliver commercial value even when built on foundation models like GPT.

  • Day 5 - Building JSON Prompts and Using OpenAI's Chat Completions API10:43

    If you want to learn:


    - How do I create effective JSON prompts for AI models like ChatGPT, Claude, and Gemini?

    - What is the best way to structure prompts for OpenAI's Chat Completions API?

    - How can I use JSON format to get consistent, structured output from AI tools?

    - What is one-shot prompting and how does it improve AI responses?

    - How do I build a complete workflow using natural language and structured JSON prompts?

    - What are the practical steps to integrate prompt engineering with API calls?


    Then this lecture is for you!


    In this comprehensive guide, you'll master the art of building structured JSON prompts and implementing OpenAI's Chat Completions API in real-world workflows. Learn how to craft effective system and user prompts using JSON notation—a format that AI models like ChatGPT, Claude, and Gemini naturally understand from their training data. Discover the power of one-shot prompting by providing example outputs that guide AI to generate exactly what you need.


    This step-by-step tutorial walks you through creating a practical AI tool that extracts and categorizes website links. You'll learn how to structure JSON prompt templates, use the response_format parameter to enforce valid JSON output, and understand how AI models constrain token generation at inference time. The lecture covers essential prompt engineering techniques including iterative refinement, handling edge cases, and building scalable AI workflows.


    By the end, you'll have hands-on experience with the Chat Completions API, understand how to work with structured JSON prompts versus natural language, and know how to build better prompts that produce consistent, parseable results. Perfect for developers looking to integrate AI into their applications using Python and JavaScript, this complete guide provides real-world examples and case studies for mastering JSON-based prompt generation with language models.

  • Day 5 - Chaining GPT Calls: Building an AI Company Brochure Generator9:07

    If you want to learn:


    - How to chain multiple GPT-4 calls together to build complex AI applications?

    - How to create an AI-powered brochure generator that analyzes company websites automatically?

    - How to convert GPT responses into structured data formats like JSON for real-time processing?

    - How to use prompt engineering strategies to extract relevant information from web content?

    - How to build an enterprise-grade AI tool that generates marketing materials using ChatGPT?

    - How to implement streaming data workflows with OpenAI's API for generative AI applications?


    Then this lecture is for you!


    In this hands-on lecture, you'll build a complete AI brochure maker using GPT-4 that automatically generates professional company brochures. You'll learn to chain multiple GPT calls together—first using AI to intelligently select relevant links from a website, then using those results to generate a comprehensive marketing brochure. The lecture walks through the entire process: scraping website content, crafting effective system and user prompts for ChatGPT, converting GPT output from text to JSON format using Python, and implementing an agentic workflow that combines AI with traditional code. You'll discover how to structure prompts that enable GPT to analyze data, make nuanced decisions about content relevance, and generate formatted output in Markdown. By the end, you'll have created a real-time generative AI tool that fetches company information, processes it through multiple AI calls, and produces easy-to-use marketing materials—demonstrating practical machine learning applications for enterprise use cases. This lecture provides insight into building sophisticated AI generators that go beyond simple chat interactions to create genuine business value.

  • Day 5 - Building a Brochure Generator with GPT-4 and Streaming Results11:17

    If you want to learn:


    - How to build an AI brochure maker using GPT-4 and streaming data in real-time?

    - What's the difference between using GPT-4 and GPT-4o mini for generating marketing content?

    - How to implement real-time generative AI with streaming results and typewriter animations?

    - How to send data to GPT models and customize prompts for enterprise brochure generation?

    - What are the best practices for using OpenAI's chat completions API with stream parameters?

    - How to convert website data into professional marketing brochures using AI?


    Then this lecture is for you!


    In this hands-on tutorial, you'll build a complete AI brochure generator using GPT-4 and OpenAI's streaming API. You'll learn how to implement real-time generative AI by setting stream=True in chat completions, enabling a typewriter-style output that displays results token by token as the model generates them. The lecture demonstrates using GPT-4.1 mini for brochure generation and GPT-5 nano for intelligent link parsing, showing you how to craft effective system prompts and user prompts to control AI output. You'll discover how to upload and send website data to GPT models, parse relevant URLs, and generate professional marketing brochures for customers, investors, and recruits. The tutorial includes practical examples using Hugging Face as a case study, demonstrating how to iterate on prompts to create different brochure styles—from professional to humorous. You'll master the chunk.choices[0].delta.content pattern for handling streaming data, learn to display real-time Markdown updates, and understand how to build an easy-to-use AI tool that converts raw website information into polished marketing materials within seconds.

  • Day 5 - Business Applications, Challenges & Building Your AI Tutor9:56

    If you want to learn:


    How can you integrate generative AI into real business workflows to automate content creation and solve practical problems?


    What are the common pitfalls when building AI applications with large language models like GPT-4?


    How do you create an agentic workflow using multiple LLM calls to synthesize information and generate business documents?


    What's the best way to build a personalized AI tutor that explains technical concepts tailored to your learning style?


    How can you use prompt engineering and multi-shot prompting to evaluate and improve AI solution outputs?


    Then this lecture is for you!


    This lecture guides you through practical generative AI applications for business, focusing on building real AI use cases that transform your workflow. You'll explore how to create an agentic workflow by chaining multiple LLM calls to synthesize data and generate business content like brochures, tutorials, and email campaigns. Learn to apply these AI tools to automate content creation in your organization while understanding common pitfalls when integrating generative AI solutions.


    The lecture challenges you to build your own AI tutor using OpenAI and Ollama, implementing prompt engineering techniques to create personalized learning experiences. You'll experiment with different language models including GPT-4, Llama, Qwen, and DeepSeek to evaluate which AI model produces the best outputs for technical explanations. Master multi-shot prompting by providing examples that improve AI chatbots' responses over time.


    Discover how to use generative AI tools in a notebook environment for rapid prototyping of AI projects, embracing experimentation to refine your AI system. You'll learn to leverage natural language processing capabilities of large language models to distill information, synthesize insights, and produce targeted outputs. The lecture covers practical genai in business applications including translation workflows, document generation, and productivity automation—providing a springboard for developing your own generative AI solution tailored to specific use cases in your industry.

Requirements

  • Familiarity with Python. This course will not cover Python basics and is completed in Python.
  • A PC with an internet connection is required. Either Mac (Linux) or Windows.
  • We recommend that you allocate around $5 for API costs to work with frontier models. However, you can complete the course using open-source models if you prefer.

Description

Mastering Generative AI and LLMs: An 8-Week Hands-On Journey


Accelerate your career in AI with practical, real-world projects led by industry veteran Ed Donner. Build advanced Generative AI products, experiment with over 20 groundbreaking models, and master state-of-the-art techniques like RAG, QLoRA, and Agents.


What you’ll learn


Build advanced Generative AI products using cutting-edge models and frameworks.

Experiment with over 20 groundbreaking AI models, including Frontier and Open-Source models.

Develop proficiency with platforms like HuggingFace, LangChain, and Gradio.

Implement state-of-the-art techniques such as RAG (Retrieval-Augmented Generation), QLoRA fine-tuning, and Agents.

Create real-world AI applications, including:

• A multi-modal customer support assistant that interacts with text, sound, and images.

• An AI knowledge worker that can answer any question about a company based on its shared drive.

• An AI programmer that optimizes software, achieving performance improvements of over 60,000 times.

• An ecommerce application that accurately predicts prices of unseen products.

Transition from inference to training, fine-tuning both Frontier and Open-Source models.

Deploy AI products to production with polished user interfaces and advanced capabilities.

Level up your AI and LLM engineering skills to be at the forefront of the industry.

About the Instructor


I’m Ed Donner, an entrepreneur and leader in AI and technology with over 20 years of experience. I’ve co-founded and sold my own AI startup, started a second one, and led teams in top-tier financial institutions and startups around the world. I’m passionate about bringing others into this exciting field and helping them become experts at the forefront of the industry.


Projects:

Project 1: AI-powered brochure generator that scrapes and navigates company websites intelligently.

Project 2: Multi-modal customer support agent for an airline with UI and function-calling.

Project 3: Tool that creates meeting minutes and action items from audio using both open- and closed-source models.

Project 4: AI that converts Python code to optimized C++, boosting performance by 60,000x!

Project 5: AI knowledge-worker using RAG to become an expert on all company-related matters.

Project 6: Capstone Part A – Predict product prices from short descriptions using Frontier models.

Project 7: Capstone Part B – Fine-tuned open-source model to compete with Frontier in price prediction.

Project 8: Capstone Part C – Autonomous agent system collaborating with models to spot deals and notify you of special bargains.


Why This Course?


Hands-On Learning: The best way to learn is by doing. You’ll engage in practical exercises, building real-world AI applications that deliver stunning results.

Cutting-Edge Techniques: Stay ahead of the curve by learning the latest frameworks and techniques, including RAG, QLoRA, and Agents.

Accessible Content: Designed for learners at all levels. Step-by-step instructions, practical exercises, cheat sheets, and plenty of resources are provided.

No Advanced Math Required: The course focuses on practical application. No calculus or linear algebra is needed to master LLM engineering.


Course Structure


Week 1: Foundations and First Projects


• Dive into the fundamentals of Transformers.

• Experiment with six leading Frontier Models.

• Build your first business Gen AI product that scrapes the web, makes decisions, and creates formatted sales brochures.


Week 2: Frontier APIs and Customer Service Chatbots


• Explore Frontier APIs and interact with three leading models.

• Develop a customer service chatbot with a sharp UI that can interact with text, images, audio, and utilize tools or agents.


Week 3: Embracing Open-Source Models


• Discover the world of Open-Source models using HuggingFace.

• Tackle 10 common Gen AI use cases, from translation to image generation.

• Build a product to generate meeting minutes and action items from recordings.


Week 4: LLM Selection and Code Generation


• Understand the differences between LLMs and how to select the best one for your business tasks.

• Use LLMs to generate code and build a product that translates code from Python to C++, achieving performance improvements of over 60,000 times.


Week 5: Retrieval-Augmented Generation (RAG)


• Master RAG to improve the accuracy of your solutions.

• Become proficient with vector embeddings and explore vectors in popular open-source vector datastores.

• Build a full business solution similar to real products on the market today.


Week 6: Transitioning to Training


• Move from inference to training.

• Fine-tune a Frontier model to solve a real business problem.

• Build your own specialized model, marking a significant milestone in your AI journey.


Week 7: Advanced Training Techniques


• Dive into advanced training techniques like QLoRA fine-tuning.

• Train an open-source model to outperform Frontier models for specific tasks.

• Tackle challenging projects that push your skills to the next level.


Week 8: Deployment and Finalization


• Deploy your commercial product to production with a polished UI.

• Enhance capabilities using Agents.

• Deliver your first productionized, agentized, fine-tuned LLM model.

• Celebrate your mastery of AI and LLM engineering, ready for a new phase in your career.

Who this course is for:

  • Aspiring AI engineers and data scientists eager to break into the field of Generative AI and LLMs.
  • Professionals looking to upskill and stay competitive in the rapidly evolving AI landscape.
  • Developers interested in building advanced AI applications with practical, hands-on experience.
  • Individuals seeking a career transition or aiming to enhance productivity through LLM-built frameworks.