Merge LLMs to Make Best Performing AI Model
Maya Akim
@maya-akim
About
I'm exploring the ML/AI world and sharing everything I learn.
Video Description
This video is about mergekit and how to choose and blend models. It's non-technical, but links to technical papers are included. You need to know how to navigate the terminal, but no programming is required.

Substack: https://www.mayaakim.com
Medium: https://medium.com/@mayaakim
X: https://twitter.com/Maya_Akim
Discord: https://discord.gg/GGhr7pyTHD

To rent a GPU from Massed Compute (mergekit preinstalled), follow the link ⤵️
https://bit.ly/maya-akim
Code for 50% discount: MayaAkim

All links:
mergekit: https://github.com/arcee-ai/mergekit
Open LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
My Hugging Face profile (with model configs you can copy): https://huggingface.co/mayacinka
Git installation: https://gitforwindows.org/
Git LFS installation: https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage
Supported architectures for mergekit: https://github.com/arcee-ai/mergekit/tree/main/mergekit/_data/architectures
Best blog about mergekit: https://medium.com/towards-data-science/merge-large-language-models-with-mergekit-2118fb392b54
Another really good blog about mergekit: https://slgero.medium.com/merge-large-language-models-29897aeb1d1a
Charles Goddard's blog (author of mergekit): https://goddard.blog/about/
Mona Lisa with mohawk: https://www.designboom.com/technology/dalle-2-ai-imagery-tool-transform-famous-paintings-into-different-styles-06-06-2022/
What is YAML: https://www.techtarget.com/searchitoperations/definition/YAML-YAML-Aint-Markup-Language
What is data contamination: https://bdtechtalks.com/2023/07/17/llm-data-contamination/
Goodhart's law: https://www.cna.org/reports/2022/09/goodharts-law
LazyMergekit: https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing#scrollTo=1Wq4SB9A_9ic
Auto evaluation (requires a RunPod profile): https://colab.research.google.com/drive/1Igs3WZuXAIv9X0vwqiE90QlEPys8e8Oa?usp=sharing#scrollTo=elyxjYI_rY5W
Configuration with 14 models merged: https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.1
MoE instructions: https://github.com/arcee-ai/mergekit/blob/main/docs/moe.md
Higher density, better results: https://github.com/arcee-ai/mergekit/issues/26
Model family tree (visualization): https://colab.research.google.com/drive/1s2eQlolcI1VGgDhqWIANfkfKvcKrMyNr https://huggingface.co/spaces/mlabonne/model-family-tree
Cost of training Mistral: https://www.ft.com/content/387eeeab-1f95-4e3b-9217-6f69aeeb5399
Leaderboard is disgusting: https://www.reddit.com/r/LocalLLaMA/comments/18xbevs/open_llm_leaderboard_is_disgusting/
Merging models with different architectures (paper): https://arxiv.org/pdf/2401.10491.pdf
Merging models with different architectures (FuseLLM repository): https://github.com/18907305772/FuseLLM
Blending is all you need: https://arxiv.org/pdf/2401.02994.pdf
Model soups: https://arxiv.org/pdf/2203.05482.pdf
TIES-merging research paper: https://arxiv.org/pdf/2306.01708.pdf
DARE merge research paper: https://arxiv.org/pdf/2311.03099.pdf
Task arithmetic: https://arxiv.org/pdf/2212.04089.pdf

Benchmarks:
ARC: https://deepgram.com/learn/arc-llm-benchmark-guide https://arxiv.org/pdf/1803.05457.pdf
HellaSwag: https://arxiv.org/pdf/1905.07830.pdf
MMLU: https://arxiv.org/pdf/2009.03300.pdf
TruthfulQA: https://arxiv.org/abs/2109.07958
WinoGrande: https://arxiv.org/pdf/1907.10641.pdf
GSM8K: https://arxiv.org/pdf/2110.14168.pdf
Overfitting problem (Ann Lotz): https://arstechnica.com/tech-policy/2023/04/stable-diffusion-copyright-lawsuits-could-be-a-legal-earthquake-for-ai/
Benchmarks are a problem (screenshots): https://analyticsindiamag.com/the-problems-with-llm-benchmarks/ https://www.reddit.com/r/LocalLLaMA/comments/164fivc/llm_benchmarks_are_broken_what_can_we_do_to_fix/ https://www.reddit.com/r/LocalLLaMA/comments/1b933of/llm_benchmarks_are_bullshit/

Attributions:
[https://commons.wikimedia.org/wiki/File:Charles_Goodhart_at_the_Financial_Times_Economists'_Christmas_Drinks_Reception,_London._(2015).jpg](https://commons.wikimedia.org/wiki/File:Charles_Goodhart_at_the_Financial_Times_Economists%27_Christmas_Drinks_Reception,_London._(2015).jpg)

Timecodes:
0:00 - 1:47 - blending intro
1:48 - 3:36 - promise of blending
3:37 - 4:22 - blending steps and requirements
4:23 - 5:05 - all you need is hardware
5:06 - 5:30 - mergekit installation
5:31 - 9:23 - merge methods
10:48 - 13:31 - configurations and YAML
13:32 - 14:38 - how to run merge
14:39 - 14:42 - upload merged model
14:43 - 16:27 - best merge method
16:28 - 20:16 - benchmark problems, overfitting and contamination

#mergekit #llm #localmodels