ChatGPT is made from 100 million of these [The Perceptron]
Welch Labs
@welchlabs
New Book! The Welch Labs Illustrated Guide to AI is now available for pre-order: https://www.welchlabs.com/resources/ai-book
Video Description
Go to https://drinkag1.com/welchlabs to subscribe and save $20 off your first subscription of AG1! Thanks to AG1 for sponsoring today's video.

Imaginary Numbers book is back in stock! Update at 23:11
https://www.welchlabs.com/resources/imaginary-numbers-book

Welch Labs Posters: https://www.welchlabs.com/resources

Really nice Perceptron simulators built by viewers:
https://github.com/srives/Perceptron
https://priyangsubanerjee.github.io/perceptron-simulator/

Special Thanks to Patrons
https://www.patreon.com/welchlabs
Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti, Brian Henry, Tim Palade, Petar Vecutin, Nicolas baumann, Jason Singh, Robert Riley, vornska, Barry Silverman

References
Rumelhart, D. E., McClelland, J. L. (1987). Parallel Distributed Processing, Volume 1: Explorations in the Microstructure of Cognition: Foundations. United Kingdom: Penguin Random House LLC.
Talking Nets: An Oral History of Neural Networks. (2000). United Kingdom: MIT Press.
Prince, S. J. (2023). Understanding Deep Learning. United Kingdom: MIT Press.
Crevier, D. (1993). AI: The Tumultuous History of the Search for Artificial Intelligence. New York: Basic Books.
Cat and dog face dataset: https://www.kaggle.com/datasets/andrewmvd/animal-faces?resource=download
Minsky, M., Papert, S. (2017). Perceptrons: An Introduction to Computational Geometry. United Kingdom: MIT Press.
Widrow, Bernard, and Michael A. Lehr. "30 years of adaptive neural networks: perceptron, madaline, and backpropagation." *Proceedings of the IEEE* 78.9 (1990): 1415-1442.
Olazaran, Mikel. "A sociological history of the neural network controversy." *Advances in computers*. Vol. 37. Elsevier, 1993. 335-425.
Widrow, Bernard. "Generalization and information storage in networks of adaline neurons." *Self-organizing systems* (1962): 435-461.
Widrow, Bernard. "Thinking about thinking: the discovery of the LMS algorithm." *IEEE Signal Processing Magazine* 22.1 (2005): 100-106.

Technical Notes
Method for counting neurons in ChatGPT: Starting with the GPT-2 implementation here: https://github.com/karpathy/build-nanogpt/blob/master/train_gpt2.py

Keys, queries, and values are implemented in Linear layers with n_embd inputs and 3*n_embd outputs, where n_embd is the embedding dimension. The output projection layer has n_embd inputs and n_embd outputs. So a single attention layer will have ~4*n_embd neurons. GPT-3 has an embedding dimension of 12,288, so each attention layer has ~49,152 neurons.

Each MLP block has n_embd inputs, 4*n_embd hidden units, and n_embd outputs, so ~5*n_embd total neurons, or ~61,440.

GPT-3 has 96 layers, so its total neuron count is 96*(49,152+61,440)=10,616,832, ignoring the initial embedding and final unembedding.

Finally, GPT-4 reportedly has ~1.8 trillion parameters (https://semianalysis.com/2023/07/10/gpt-4-architecture-infrastructure/), making it ~10x larger than GPT-3. Note that GPT-4 is reportedly a mixture of experts, and not all experts are used for each inference, so it appears that not all 1.8 trillion parameters are used for a given inference call. Assuming that ~10x the parameters means ~10x the neurons, GPT-4 should have ~100M neurons.
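The neuron-counting arithmetic in the Technical Notes can be reproduced with a short Python sketch. The function and constant names below are made up for illustration and are not taken from the linked nanoGPT code. The counting convention (one neuron per Linear-layer output, plus the MLP hidden units) follows the notes above, and the 10x scaling from GPT-3 to GPT-4 is the same rough assumption the notes make.

```python
# Rough neuron count following the counting convention in the Technical Notes.
# All names here are hypothetical helpers, not part of any library.

def attention_neurons(n_embd: int) -> int:
    # Query/key/value projection has 3*n_embd outputs;
    # the output projection adds another n_embd outputs.
    return 3 * n_embd + n_embd


def mlp_neurons(n_embd: int) -> int:
    # 4*n_embd hidden units plus n_embd outputs.
    return 4 * n_embd + n_embd


def total_neurons(n_embd: int, n_layer: int) -> int:
    # Ignores the initial embedding and final unembedding, as the notes do.
    return n_layer * (attention_neurons(n_embd) + mlp_neurons(n_embd))


GPT3_N_EMBD = 12_288   # embedding dimension quoted in the notes
GPT3_N_LAYER = 96      # layer count used in the notes' formula

gpt3_total = total_neurons(GPT3_N_EMBD, GPT3_N_LAYER)

# Assumption from the notes: ~10x the parameters means ~10x the neurons.
gpt4_estimate = 10 * gpt3_total

print(f"GPT-3 neurons: {gpt3_total:,}")             # 10,616,832
print(f"GPT-4 estimate (10x): {gpt4_estimate:,}")   # 106,168,320
```

Under these assumptions the GPT-4 figure comes out to about 106 million, which the video title rounds to 100 million.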