Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math

Umar Jamil April 14, 2024

Umar Jamil

About

I'm a Machine Learning Engineer from Milan, Italy, teaching complex deep learning and machine learning concepts to my cat, 奥利奥. I also speak Chinese.
