Reading List

Publications and papers worth reading on large language models and the working practice of building with them. Grouped by category. None of this is paid, none is sponsored.

Foundational papers

2017

Attention Is All You Need.

Vaswani et al., the original Transformer paper. Nearly every modern language model traces its architecture to this work.

2022

Training language models to follow instructions with human feedback.

Ouyang et al., the InstructGPT paper, which set the template for applying RLHF to language models and for the modern instruction-following model.

2020

Language Models are Few-Shot Learners.

Brown et al., the GPT-3 paper. It showed that a large enough model can pick up new tasks from a few examples in the prompt, the first scaling result that hinted at what was coming.

2022

Large Language Models are Zero-Shot Reasoners.

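Kojima et al., the "let's think step by step" paper. Chain-of-thought prompting itself came slightly earlier (Wei et al., 2022); this paper showed it can be triggered zero-shot with a single added phrase, which is the form most people know.

2020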

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.

Lewis et al., the original RAG paper. The architecture underlying most production LLM systems in 2026.

Industry essays

Book-length reads

Worth following