Build a Large Language Model (From Scratch)
“Build a Large Language Model (From Scratch)” by Sebastian Raschka blew up on GitHub this week, collecting over 5,000 stars. The accompanying resources are designed to provide the hands-on experience and foundational knowledge necessary for building LLMs.
They focus on essential aspects such as data handling, attention mechanisms, and framework familiarity, setting the stage for more advanced topics in upcoming chapters.
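To give a flavor of the attention-mechanism material covered in Chapter 3, here is a minimal sketch of scaled dot-product attention in NumPy. The book itself works in PyTorch; the function name and toy data below are illustrative, not taken from the repo.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # numerically stable softmax over the key dimension
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# toy self-attention: 3 tokens with 4-dimensional embeddings, Q = K = V = x
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(x, x, x)
```

Each row of `w` is a probability distribution over the input tokens, so the output is a weighted mixture of the value vectors; multi-head attention (as in `multihead-attention.ipynb`) runs several such maps in parallel over learned projections.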
| Chapter Title | Main Code (for quick access) | All Code + Supplementary |
|---|---|---|
| Ch 1: Understanding Large Language Models | No code | No code |
| Ch 2: Working with Text Data | – ch02.ipynb – dataloader.ipynb (summary) – exercise-solutions.ipynb | ./ch02 |
| Ch 3: Coding Attention Mechanisms | – ch03.ipynb – multihead-attention.ipynb (summary) – exercise-solutions.ipynb | ./ch03 |
| Ch 4: Implementing a GPT Model from Scratch | – ch04.ipynb – gpt.py (summary) – exercise-solutions.ipynb | ./ch04 |
| Ch 5: Pretraining on Unlabeled Data | Q1 2024 | … |
| Ch 6: Finetuning for Text Classification | Q2 2024 | … |
| Ch 7: Finetuning with Human Feedback | Q2 2024 | … |
| Ch 8: Using Large Language Models in Practice | Q2/3 2024 | … |
| Appendix A: Introduction to PyTorch | – code-part1.ipynb – code-part2.ipynb – DDP-script.py – exercise-solutions.ipynb | ./appendix-A |
| Appendix B: References and Further Reading | No code | |
| Appendix C: Exercises | No code | |
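Chapter 2's `dataloader.ipynb` centers on sampling input–target pairs from a token stream for next-token prediction. A minimal pure-Python sketch of that sliding-window idea follows; the function name and parameter names are illustrative, not the book's own API.

```python
def sliding_window_pairs(token_ids, context_length=4, stride=1):
    """Yield (input, target) pairs; each target is the input shifted right by one token."""
    pairs = []
    for i in range(0, len(token_ids) - context_length, stride):
        inp = token_ids[i:i + context_length]
        tgt = token_ids[i + 1:i + context_length + 1]
        pairs.append((inp, tgt))
    return pairs

# with stride == context_length the windows do not overlap
pairs = sliding_window_pairs(list(range(10)), context_length=4, stride=4)
# first pair: ([0, 1, 2, 3], [1, 2, 3, 4])
```

In practice such pairs would be wrapped in a PyTorch `Dataset` and batched with a `DataLoader`, which is the framework material Appendix A introduces.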