
THIS is the REAL DEAL 🤯 for local LLMs

Alex Ziskind
382.4K views, 2 months ago
10.9K

Description

This is the stack that gets me over 4000 tokens per second locally. Download Docker Desktop here: https://dockr.ly/4mOdGMO to get up and running with Docker Model Runner quickly.

🛒 Gear Links 🛒
💻☕ Thunderbolt 5 external SSD: https://amzn.to/3XqetZO
💻☕ Favorite 15" display with magnet: https://amzn.to/3zD1DhQ
🎧⚡ Great 40Gbps T4 enclosure: https://amzn.to/3JNwBGW
🛠️🚀 My nvme ssd: https://amzn.to/3YLEySo
📦🎮 My gear: https://www.amazon.com/shop/alexziskind

🎥 Related Videos 🎥
🏆 Skip M3 Ultra & RTX 5090 for LLMs | NEW 96GB KING - https://youtu.be/bAao58hXo9w
💻 Smallest RTX Pro 6000 rig | OVERKILL - https://youtu.be/JbnBt_Aytd0
🔧 Cheap mini runs a 70B LLM 🤯 - https://youtu.be/xyKEQjUzfAk
🌙 RAM torture test on Mac - https://youtu.be/l3zIwPgan7M
🚀 FREE Local LLMs on Apple Silicon | FAST! - https://youtu.be/bp2eev21Qfo
🪞 REALITY vs Apple’s Memory Claims | vs RTX4090m - https://youtu.be/fdvzQAWXU7A
📦 Set up Conda - https://youtu.be/2Acht_5_HTo
🤖 INSANE Machine Learning on Neural Engine - https://youtu.be/Y2FOUg_jo7k

- Julia Turk's FP4 video: https://www.youtube.com/watch?v=-cRedoYETzQ
- NVIDIA post on quantization: https://developer.nvidia.com/blog/optimizing-llms-for-performance-and-accuracy-with-post-training-quantization/

🛠️ Developer productivity Playlist - https://www.youtube.com/playlist?list=PLPwbI_iIX3aQCRdFGM7j4TY_7STfv2aXX
🔗 AI for Coding Playlist 📚 - https://www.youtube.com/playlist?list=PLPwbI_iIX3aSlUmRtYPfbQHt4n0YaX0qw

❤️ SUBSCRIBE TO MY YOUTUBE CHANNEL 📺
Click here to subscribe: https://www.youtube.com/@AZisk?sub_confirmation=1

Join this channel to get access to perks: https://www.youtube.com/channel/UCajiMK_CY9icRhLepS8_3ug/join

📱 ALEX ON X: https://twitter.com/digitalix

#rtxpro6000 #llm #macbook