
Build a Large Language Model (From Scratch)

“Build a Large Language Model (From Scratch)” by Sebastian Raschka blew up on GitHub this week, collecting over 5,000 stars. The repository's resources are designed to provide the hands-on experience and foundational knowledge necessary for building LLMs.

They focus on essential aspects such as data handling, attention mechanisms, and framework familiarity, setting the stage for more advanced topics in upcoming chapters.
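Chapter 3 is dedicated to coding attention mechanisms. As a rough illustration of the core idea (a generic sketch of scaled dot-product self-attention, not the book's actual code), a single attention head fits in a few lines of NumPy:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise token similarities
    scores -= scores.max(axis=-1, keepdims=True)    # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V                              # weighted sum of value vectors

# Toy example: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V
print(out.shape)
```

The book's chapter builds this up step by step and then extends it to causal masking and multiple heads (see multihead-attention.ipynb in the table below).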

| Chapter Title | Main Code (for quick access) | All Code + Supplementary |
| --- | --- | --- |
| Ch 1: Understanding Large Language Models | No code | No code |
| Ch 2: Working with Text Data | ch02.ipynb, dataloader.ipynb (summary), exercise-solutions.ipynb | ./ch02 |
| Ch 3: Coding Attention Mechanisms | ch03.ipynb, multihead-attention.ipynb (summary), exercise-solutions.ipynb | ./ch03 |
| Ch 4: Implementing a GPT Model from Scratch | ch04.ipynb, gpt.py (summary), exercise-solutions.ipynb | ./ch04 |
| Ch 5: Pretraining on Unlabeled Data | Q1 2024 | |
| Ch 6: Finetuning for Text Classification | Q2 2024 | |
| Ch 7: Finetuning with Human Feedback | Q2 2024 | |
| Ch 8: Using Large Language Models in Practice | Q2/3 2024 | |
| Appendix A: Introduction to PyTorch | code-part1.ipynb, code-part2.ipynb, DDP-script.py, exercise-solutions.ipynb | ./appendix-A |
| Appendix B: References and Further Reading | No code | |
| Appendix C: Exercises | No code | |
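The data-handling material in ch02 centers on turning a token stream into input-target pairs for next-token prediction. The sliding-window idea behind such a dataloader can be sketched as follows (function name and parameters are illustrative, not the repository's API):

```python
def sliding_windows(token_ids, context_length, stride):
    """Yield (input, target) pairs; the target is the input shifted by one token."""
    for start in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[start : start + context_length]
        targets = token_ids[start + 1 : start + context_length + 1]
        yield inputs, targets

tokens = list(range(10))  # stand-in for tokenizer output
pairs = list(sliding_windows(tokens, context_length=4, stride=4))
# First pair: inputs [0, 1, 2, 3] predict targets [1, 2, 3, 4]
```

With stride equal to context_length, windows do not overlap; a smaller stride yields more (overlapping) training examples from the same text.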

