How GPT-5 Thinks — OpenAI VP of Research Jerry Tworek
Video Description
What does it really mean when GPT-5 “thinks”? In this conversation, OpenAI’s VP of Research Jerry Tworek explains how modern reasoning models work in practice: why pretraining and reinforcement learning (RL/RLHF) are both essential, what the on-screen “thinking” actually does, and when extra test-time compute helps (or doesn’t). We trace the evolution from o1 (a tech demo good at puzzles) to o3 (the tool-use shift) to GPT-5 (Jerry calls it “o3.1-ish”), and talk through verifiers, reward design, and the real trade-offs behind “auto” reasoning modes. We also go inside OpenAI: how research is organized, why collaboration is unusually transparent, and how the company ships fast without losing rigor. Jerry shares the backstory on competitive-programming results like ICPC, what they signal (and what they don’t), and where agents and tool use are genuinely useful today. Finally, we zoom out: could pretraining + RL be the path to AGI? This is the MAD Podcast: AI for the 99%. If you’re curious about how these systems actually work (without needing a PhD), this episode is your map to the current AI frontier.
OpenAI
Website - https://openai.com
X/Twitter - https://x.com/OpenAI

Jerry Tworek
LinkedIn - https://www.linkedin.com/in/jerry-tworek-b5b9aa56
X/Twitter - https://x.com/millionint

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

LISTEN ON:
Spotify - https://open.spotify.com/show/7yLATDSaFvgJG80ACcRJtq
Apple - https://podcasts.apple.com/us/podcast/the-mad-podcast-with-matt-turck/id1686238724

00:00 - Intro
01:01 - What Reasoning Actually Means in AI
02:32 - Chain of Thought: Models Thinking in Words
05:25 - How Models Decide Thinking Time
07:24 - Evolution from o1 to o3 to GPT-5
11:00 - Before OpenAI: Growing up in Poland, Dropping out of School, Trading
20:32 - Working on Robotics and Rubik's Cube Solving
23:02 - A Day in the Life: Talking to Researchers
24:06 - How Research Priorities Are Determined
26:53 - Collaboration vs IP Protection at OpenAI
29:32 - Shipping Fast While Doing Deep Research
31:52 - Using OpenAI's Own Tools Daily
32:43 - Pre-Training Plus RL: The Modern AI Stack
35:10 - Reinforcement Learning 101: Training Dogs
40:17 - The Evolution of Deep Reinforcement Learning
42:09 - When GPT-4 Seemed Underwhelming at First
45:39 - How RLHF Made GPT-4 Actually Useful
48:02 - Unsupervised vs Supervised Learning
49:59 - GRPO and How DeepSeek Accelerated US Research
53:05 - What It Takes to Scale Reinforcement Learning
55:36 - Agentic AI and Long-Horizon Thinking
59:19 - Alignment as an RL Problem
1:01:11 - Winning ICPC World Finals Without Specific Training
1:05:53 - Applying RL Beyond Math and Coding
1:09:15 - The Path from Here to AGI
1:12:23 - Pure RL vs Language Models