How to Train LLMs to "Think" (o1 & DeepSeek-R1)
Video Description
Work with me: https://www.shawhintalebi.com

Here, I discuss the technical details behind the recent "advanced reasoning" models trained on large-scale reinforcement learning, i.e., o1 and DeepSeek-R1.

Read more: https://shawhin.medium.com/how-to-train-llms-to-think-like-o1-deepseek-r1-eabc21c8842d?source=friends_link&sk=ec3e7ca77cd47f76ce38015c87ba5084

References
[1] https://openai.com/index/learning-to-reason-with-llms/
[2] arXiv:2501.12948 [cs.CL]
[3] https://youtu.be/7xTGNNLPyMI
[4] https://huggingface.co/datasets/open-r1/OpenR1-Math-220k
[5] https://discovery.ucl.ac.uk/id/eprint/10045895/1/agz_unformatted_nature.pdf

Intro - 0:00
OpenAI's o1 - 0:33
Test-time Compute - 1:33
"Thinking" Tokens - 3:50
DeepSeek Paper - 5:58
Reinforcement Learning - 7:22
R1-Zero: Prompt Template - 9:28
R1-Zero: Reward - 10:53
R1-Zero: GRPO (technical) - 12:53
R1-Zero: Results - 20:00
DeepSeek R1 - 23:32
Step 1: SFT with CoT - 24:47
Step 2: R1-Zero Style RL - 26:14
Step 3: SFT with Mixed Data - 27:03
Step 4: RL & RLHF - 28:26
Accessing DeepSeek Models - 29:18
Conclusions - 30:10