Build a Large Language Model (From Scratch)
“Build a Large Language Model (From Scratch)” by Sebastian Raschka blew up on GitHub this week, collecting over 5,000 stars. The accompanying resources are designed to provide the hands-on experience and foundational knowledge necessary for building LLMs.
They focus on essential aspects such as data handling, attention mechanisms, and framework familiarity, setting the stage for more advanced topics in upcoming chapters.
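To give a flavor of the attention-mechanism material covered in Chapter 3, here is a minimal sketch of scaled dot-product attention in NumPy. The book itself works in PyTorch; the function name and toy data below are illustrative, not taken from the repo.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # numerically stable softmax over the key dimension
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# toy self-attention: 3 tokens with 4-dimensional embeddings, Q = K = V = x
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(x, x, x)
```

Each row of `w` is a probability distribution over the input tokens, so the output is a weighted mixture of the value vectors; multi-head attention (as in `multihead-attention.ipynb`) runs several such maps in parallel over learned projections.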
| Chapter Title | Main Code (for quick access) | All Code + Supplementary |
|---|---|---|
| Ch 1: Understanding Large Language Models | No code | No code |
| Ch 2: Working with Text Data | – ch02.ipynb – dataloader.ipynb (summary) – exercise-solutions.ipynb | ./ch02 |
| Ch 3: Coding Attention Mechanisms | – ch03.ipynb – multihead-attention.ipynb (summary) – exercise-solutions.ipynb | ./ch03 |
| Ch 4: Implementing a GPT Model from Scratch | – ch04.ipynb – gpt.py (summary) – exercise-solutions.ipynb | ./ch04 |
| Ch 5: Pretraining on Unlabeled Data | Q1 2024 | … |
| Ch 6: Finetuning for Text Classification | Q2 2024 | … |
| Ch 7: Finetuning with Human Feedback | Q2 2024 | … |
| Ch 8: Using Large Language Models in Practice | Q2/3 2024 | … |
| Appendix A: Introduction to PyTorch | – code-part1.ipynb – code-part2.ipynb – DDP-script.py – exercise-solutions.ipynb | ./appendix-A |
| Appendix B: References and Further Reading | No code | |
| Appendix C: Exercises | No code | |
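Chapter 2's `dataloader.ipynb` centers on sampling input–target pairs from a token stream for next-token prediction. A minimal pure-Python sketch of that sliding-window idea follows; the function name and parameter names are illustrative, not the book's own API.

```python
def sliding_window_pairs(token_ids, context_length=4, stride=1):
    """Yield (input, target) pairs; each target is the input shifted right by one token."""
    pairs = []
    for i in range(0, len(token_ids) - context_length, stride):
        inp = token_ids[i:i + context_length]
        tgt = token_ids[i + 1:i + context_length + 1]
        pairs.append((inp, tgt))
    return pairs

# with stride == context_length the windows do not overlap
pairs = sliding_window_pairs(list(range(10)), context_length=4, stride=4)
# first pair: ([0, 1, 2, 3], [1, 2, 3, 4])
```

In practice such pairs would be wrapped in a PyTorch `Dataset` and batched with a `DataLoader`, which is the framework material Appendix A introduces.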