Featured Results

Mistral / Mixtral Explained: Sliding Window Attention, Sparse Mixture of Experts, Rolling Buffer
Umar Jamil, Nov 25, 2024
Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code.
Umar Jamil, Nov 23, 2024
PT5H46M5S
Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation
Umar Jamil, Aug 7, 2024
PT48M46S
Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math
Umar Jamil, Apr 14, 2024
PT1H14M29S
Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math
Umar Jamil, Jan 7, 2024
PT1H12M53S
Distributed Training with PyTorch: complete tutorial with cloud infrastructure and code
Umar Jamil, Dec 19, 2023
PT49M24S
Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)
Umar Jamil, Nov 27, 2023
PT54M52S
BERT explained: Training, Inference, BERT vs GPT/LLamA, Fine tuning, [CLS] token
Umar Jamil, Oct 26, 2023
PT1H10M55S
LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU
Umar Jamil, Aug 24, 2023
PT26M55S





