Everything you need to know about Context Windows, Prompt Engineering, and Building Better AI Systems
Maintained by Milan Amrut Joshi, Professor of Data Science, Northwestern University
A curated, research-backed guide to the emerging discipline of Context Engineering for Large Language Models.
Papers · Videos · Blog Posts · Tools · Techniques · Courses · Roadmap
- What is Context Engineering?
- Context Engineering vs Prompt Engineering
- Why It Matters (2025-2026)
- Key Concepts
- Context Window Sizes
- Research Papers
- YouTube Videos & Talks
- Blog Posts & Articles
- Tools & Frameworks
- Courses & Tutorials
- Techniques & Patterns
- Roadmap
- Contributing
- Citation
Context Engineering is the art and science of designing, managing, and optimizing the information provided to Large Language Models (LLMs) within their context window to maximize the quality, accuracy, and relevance of their outputs.
While prompt engineering focuses on how you ask, context engineering focuses on what information surrounds your ask: the retrieval strategy, the memory architecture, the token budget allocation, the ordering of information, and the system-level design of context pipelines.
```
┌────────────────────────────────────────────────────────────┐
│                       CONTEXT WINDOW                       │
│                                                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │    SYSTEM    │  │  RETRIEVED   │  │   CONVERSATION   │  │
│  │    PROMPT    │  │  DOCUMENTS   │  │     HISTORY      │  │
│  │              │  │    (RAG)     │  │                  │  │
│  │ - Role       │  │ - Chunks     │  │ - Past turns     │  │
│  │ - Rules      │  │ - Metadata   │  │ - Summaries      │  │
│  │ - Examples   │  │ - Rankings   │  │ - Key facts      │  │
│  └──────────────┘  └──────────────┘  └──────────────────┘  │
│                                                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │   TOOLS &    │  │   FEW-SHOT   │  │       USER       │  │
│  │   SCHEMAS    │  │   EXAMPLES   │  │      QUERY       │  │
│  │              │  │              │  │                  │  │
│  │ - Functions  │  │ - Input/     │  │ - Current        │  │
│  │ - APIs       │  │   Output     │  │   request        │  │
│  │ - Formats    │  │   pairs      │  │ - Constraints    │  │
│  └──────────────┘  └──────────────┘  └──────────────────┘  │
│                                                            │
│               ▼  Token Budget Management  ▼                │
│                 ▼  Information Ordering  ▼                 │
│                 ▼  Relevance Filtering  ▼                  │
└────────────────────────────────────────────────────────────┘
```
| Dimension | Prompt Engineering | Context Engineering |
|---|---|---|
| Focus | Crafting the query/instruction | Designing the entire information environment |
| Scope | Single prompt | Full context pipeline (retrieval, memory, tools) |
| Abstraction | Text-level | System-level architecture |
| Key Question | "How do I phrase this?" | "What information does the model need, and how should it be structured?" |
| Includes | Instructions, few-shot examples | RAG, memory, tool definitions, token budgets, ordering |
| Skill Level | 🟢 Beginner to Intermediate | 🟡 Intermediate to Advanced |
| Optimization | Wording, formatting, chain-of-thought | Retrieval quality, chunking, compression, caching |
| Analogy | Writing a good exam question | Designing the entire exam prep system |
| Dynamic? | Mostly static templates | Dynamic, adapts per query and session |
| Measurable Impact | Quality of single response | System-level accuracy, cost, latency |
- Context windows are exploding: from 4K tokens (GPT-3) to 2M+ tokens (Gemini). Managing this space effectively is a core engineering challenge.
- RAG is now standard: every production LLM application uses some form of retrieval. Context engineering defines how retrieved data is structured and ranked.
- Agentic AI demands it: AI agents that use tools, maintain memory, and plan across steps require sophisticated context management.
- Cost optimization: tokens cost money. Smart context engineering reduces costs by 50-90% while maintaining quality.
- Accuracy at scale: the "lost in the middle" problem and context dilution mean that more context is not always better. Engineering is required.
Context Window
The fixed-size buffer of tokens an LLM can process in a single forward pass. Everything the model "knows" at inference time must fit within this window: system prompt, retrieved documents, conversation history, tool schemas, and the user query.
Token Limits & Budget Allocation
Given a finite context window, context engineering involves deciding how many tokens to allocate to each component. A common allocation:
- System prompt: 5-10%
- Retrieved documents: 40-60%
- Conversation history: 15-25%
- Few-shot examples: 5-10%
- User query + response buffer: 10-20%
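As a rough illustration, this split can be expressed as a budget helper. The component names and midpoint percentages below are illustrative defaults drawn from the ranges above, not a standard:

```python
# Hypothetical helper: split a model's context window into per-component
# token budgets using midpoints of the ranges suggested above.

DEFAULT_ALLOCATION = {
    "system_prompt": 0.075,        # 5-10%
    "retrieved_documents": 0.50,   # 40-60%
    "conversation_history": 0.20,  # 15-25%
    "few_shot_examples": 0.075,    # 5-10%
    "query_and_response": 0.15,    # 10-20%
}

def allocate_budget(context_window: int, allocation=DEFAULT_ALLOCATION) -> dict:
    """Return a token budget per component; fractions must sum to 1.0."""
    assert abs(sum(allocation.values()) - 1.0) < 1e-6
    return {name: round(context_window * frac) for name, frac in allocation.items()}

budget = allocate_budget(128_000)
print(budget["retrieved_documents"])  # 64000
```

In practice these fractions shift per application; a RAG-heavy system might push retrieved documents to 60% while an agentic system reserves more room for tool schemas and history.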
Retrieval-Augmented Generation (RAG)
The pattern of retrieving relevant documents from an external knowledge base and injecting them into the context window. RAG bridges the gap between parametric knowledge (model weights) and non-parametric knowledge (external data).
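A minimal sketch of the pattern, using bag-of-words cosine similarity as a stand-in for a real embedding model. The corpus, prompt wording, and `top_k` parameter are all illustrative:

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    # Bag-of-words term counts; a real system would use dense embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    q = vectorize(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, vectorize(doc)), reverse=True)
    return ranked[:top_k]

def build_context(query: str, corpus: list[str]) -> str:
    # Inject retrieved documents ahead of the question.
    docs = retrieve(query, corpus)
    context = "\n".join(f"[doc {i + 1}] {d}" for i, d in enumerate(docs))
    return f"Answer using only these documents:\n{context}\n\nQuestion: {query}"

corpus = [
    "The context window is the token buffer an LLM processes at once.",
    "Tomatoes are botanically fruits.",
    "RAG injects retrieved documents into the context window.",
]
print(build_context("What is a context window?", corpus))
```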
Memory Management
Strategies for maintaining information across sessions or long conversations: summarization, key-fact extraction, vector-based episodic memory, and hierarchical memory architectures (short-term, long-term, working memory).
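A sketch of the short-term/long-term split: keep the last few turns verbatim and fold older turns into a running summary. `summarize` is a stub standing in for what would typically be an LLM call:

```python
def summarize(old_summary: str, turns: list[str]) -> str:
    # Stub: a real system would ask the model to compress these turns.
    return (old_summary + " " + " ".join(t[:40] for t in turns)).strip()

class ConversationMemory:
    def __init__(self, keep_recent: int = 4):
        self.keep_recent = keep_recent
        self.summary = ""
        self.turns: list[str] = []

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        if len(self.turns) > self.keep_recent:
            # Fold overflow turns into the long-term summary.
            overflow = self.turns[:-self.keep_recent]
            self.summary = summarize(self.summary, overflow)
            self.turns = self.turns[-self.keep_recent:]

    def as_context(self) -> str:
        header = f"Summary of earlier conversation: {self.summary}\n" if self.summary else ""
        return header + "\n".join(self.turns)

mem = ConversationMemory(keep_recent=2)
for t in ["user: hi", "bot: hello", "user: what is RAG?", "bot: retrieval..."]:
    mem.add(t)
print(mem.as_context())
```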
Context Compression
Techniques to reduce token usage while preserving information: extractive summarization, LLMLingua-style token pruning, semantic deduplication, and information-theoretic compression.
Information Ordering
The position of information within the context window affects recall. Models exhibit primacy and recency biases. Context engineering accounts for this by placing critical information at the beginning and end of the context.
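One common mitigation can be sketched as an interleaving: given documents ranked best-first, place the strongest ones at the edges of the context so the weakest land in the middle. The function below is an illustrative helper, not a library API:

```python
def edge_ordering(ranked_docs: list[str]) -> list[str]:
    """Reorder best-first docs so the strongest sit at the start and end."""
    front, back = [], []
    for i, doc in enumerate(ranked_docs):
        # Even ranks go to the front, odd ranks to the (reversed) back.
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

ranked = ["doc1 (best)", "doc2", "doc3", "doc4", "doc5 (worst)"]
print(edge_ordering(ranked))
# → ['doc1 (best)', 'doc3', 'doc5 (worst)', 'doc4', 'doc2']
```

The best document stays first, the second-best moves to the end, and the worst ends up buried in the middle where recall is weakest.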
| Model | Context Window | Provider | Year | Notes |
|---|---|---|---|---|
| GPT-4o | 128K tokens | OpenAI | 2024 | Multimodal, widely deployed |
| o3 | 200K tokens | OpenAI | 2025 | Reasoning model, extended context |
| Claude 3.5 Sonnet | 200K tokens | Anthropic | 2024 | Strong long-context performance |
| Claude Opus 4 | 200K tokens | Anthropic | 2025 | Frontier model |
| Claude Sonnet 4 | 200K tokens | Anthropic | 2025 | Balanced performance and speed |
| Gemini 2.0 Flash | 1M tokens | Google | 2025 | Fast, extended context |
| Gemini 2.0 Pro | 2M tokens | Google | 2025 | Largest production context window |
| Llama 3.1 405B | 128K tokens | Meta | 2024 | Open-weight |
| Llama 4 Maverick | 1M tokens | Meta | 2025 | Open-weight, MoE architecture |
| Mistral Large 2 | 128K tokens | Mistral | 2024 | European AI lab |
| DeepSeek V3 | 128K tokens | DeepSeek | 2025 | MoE, cost-efficient |
| DeepSeek R1 | 128K tokens | DeepSeek | 2025 | Reasoning-focused |
| Command R+ | 128K tokens | Cohere | 2024 | RAG-optimized |
| Grok-2 | 128K tokens | xAI | 2024 | Real-time data access |
| Qwen 2.5 72B | 128K tokens | Alibaba | 2024 | Multilingual |
Note: Context window size alone does not determine quality. Effective utilization across the full window varies significantly between models. See the RULER benchmark and Needle-in-a-Haystack for empirical evaluations.
Full details, abstracts, and annotations available in papers/README.md
| # | Paper | Authors | Year | Key Contribution |
|---|---|---|---|---|
| 1 | Lost in the Middle: How Language Models Use Long Contexts | Liu et al. | 2023 | |
| 2 | Extending Context Window of LLMs via Positional Interpolation | Chen et al. | 2023 | |
| 3 | LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens | Ding et al. | 2024 | |
| 4 | Ring Attention with Blockwise Transformers | Liu et al. | 2023 | |
| 5 | YaRN: Efficient Context Window Extension of LLMs | Peng et al. | 2023 | |
| 6 | Effective Long-Context Scaling of Foundation Models | Xiong et al. (Meta) | 2023 | |
| 7 | LongLoRA: Efficient Fine-tuning of Long-Context LLMs | Chen et al. | 2023 | |
| 8 | RULER: What's the Real Context Size of Your LLM? | Hsieh et al. | 2024 | |
| 9 | Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Munkhdalai et al. (Google) | 2024 | |
| 10 | Data Engineering for Scaling Language Models to 128K Context | Fu et al. | 2024 | |
| # | Paper | Authors | Year | Key Contribution |
|---|---|---|---|---|
| 11 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | Lewis et al. | 2020 | |
| 12 | Self-RAG: Learning to Retrieve, Generate, and Critique | Asai et al. | 2023 | |
| 13 | RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval | Sarthi et al. | 2024 | |
| 14 | Corrective Retrieval-Augmented Generation (CRAG) | Yan et al. | 2024 | |
| 15 | Active Retrieval Augmented Generation (FLARE) | Jiang et al. | 2023 | |
| 16 | Dense Passage Retrieval for Open-Domain QA | Karpukhin et al. | 2020 | |
| 17 | ColBERT: Efficient and Effective Passage Search via Late Interaction | Khattab & Zaharia | 2020 | |
| 18 | Adaptive-RAG: Learning to Adapt Retrieval-Augmented LLMs | Jeong et al. | 2024 | |
| 19 | Seven Failure Points When Engineering a RAG System | Barnett et al. | 2024 | |
| 20 | A Survey on RAG Meets LLMs | Fan et al. | 2024 | |
| # | Paper | Authors | Year | Key Contribution |
|---|---|---|---|---|
| 21 | Chain-of-Thought Prompting Elicits Reasoning in LLMs | Wei et al. | 2022 | |
| 22 | Tree of Thoughts: Deliberate Problem Solving with LLMs | Yao et al. | 2023 | |
| 23 | DSPy: Compiling Declarative Language Model Calls | Khattab et al. | 2023 | |
| 24 | Automatic Prompt Optimization with Gradient Descent and Beam Search | Pryzant et al. | 2023 | |
| 25 | Large Language Models Are Human-Level Prompt Engineers | Zhou et al. | 2022 | |
| 26 | Principled Instructions Are All You Need | Bsharat et al. | 2023 | |
| 27 | Graph of Thoughts: Solving Elaborate Problems with LLMs | Besta et al. | 2023 | |
| # | Paper | Authors | Year | Key Contribution |
|---|---|---|---|---|
| 28 | MemGPT: Towards LLMs as Operating Systems | Packer et al. | 2023 | |
| 29 | Reflexion: Language Agents with Verbal Reinforcement Learning | Shinn et al. | 2023 | |
| 30 | LLMLingua: Compressing Prompts for Accelerated Inference | Jiang et al. | 2023 | |
| 31 | Voyager: An Open-Ended Embodied Agent with LLMs | Wang et al. | 2023 | |
| 32 | Cognitive Architectures for Language Agents | Sumers et al. | 2023 | |
| 33 | LongMem: Augmenting LLMs with Long-Term Memory | Wang et al. | 2023 | |
| 34 | Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading | Chen et al. | 2023 | |
Full playlist with timestamps and key takeaways in videos/README.md
Full list with summaries and key takeaways in blogs/README.md
Full comparison with features, pricing, and use cases in tools/README.md
| Tool | Description | Language | Stars | License |
|---|---|---|---|---|
| LangChain | Comprehensive LLM application framework | Python/JS | 98k+ | MIT |
| LlamaIndex | Data framework for LLM context augmentation | Python | 37k+ | MIT |
| DSPy | Programming (not prompting) language models | Python | 19k+ | MIT |
| Haystack | End-to-end NLP / RAG framework | Python | 17k+ | Apache 2.0 |
| Semantic Kernel | Microsoft's LLM orchestration SDK | C#/Python | 22k+ | MIT |
| CrewAI | Multi-agent orchestration framework | Python | 24k+ | MIT |
| AutoGen | Multi-agent conversation framework | Python | 35k+ | MIT |
| Database | Type | Hosted | Open Source | Key Feature |
|---|---|---|---|---|
| Pinecone | Cloud-native | Yes | No | Fully managed, enterprise-grade |
| Weaviate | Hybrid | Yes | Yes | GraphQL API, hybrid search |
| Chroma | Embedded/Cloud | Yes | Yes | Developer-friendly, lightweight |
| Milvus | Distributed | Yes | Yes | Billion-scale vector search |
| Qdrant | Cloud/Self-hosted | Yes | Yes | Rust-based, filtering support |
| pgvector | PostgreSQL extension | No | Yes | Use existing Postgres infra |
| FAISS | Library | No | Yes | Meta's similarity search library |
| Model | Provider | Dimensions | Context | Notes |
|---|---|---|---|---|
| text-embedding-3-large | OpenAI | 3072 | 8191 | Leading commercial embedding |
| text-embedding-3-small | OpenAI | 1536 | 8191 | Cost-effective |
| embed-v4 | Cohere | 1024 | 512 | Multilingual, compressed |
| voyage-3 | Voyage AI | 1024 | 32000 | Long-context embeddings |
| BGE-M3 | BAAI | 1024 | 8192 | Leading open-source multilingual |
| GTE-Qwen2 | Alibaba | 1536 | 32000 | Long-context open-source |
| NomicEmbed | Nomic | 768 | 8192 | Fully open-source, auditable |
| Tool | Purpose | Key Feature |
|---|---|---|
| MemGPT / Letta | LLM memory management | Virtual context, self-editing memory |
| Mem0 | Memory layer for AI | Personalized memory for agents |
| LangMem | Long-term memory for LangChain | Persistent conversational memory |
| Instructor | Structured output from LLMs | Pydantic-based extraction |
| Guardrails AI | LLM output validation | Structure, type, and quality checks |
Full list with curriculum details in courses/README.md
| # | Course | Provider | Level | Format | Cost |
|---|---|---|---|---|---|
| 1 | ChatGPT Prompt Engineering for Developers | DeepLearning.AI + OpenAI | π’ | Video | Free |
| 2 | Building Systems with the ChatGPT API | DeepLearning.AI + OpenAI | π‘ | Video | Free |
| 3 | LangChain for LLM Application Development | DeepLearning.AI + LangChain | π‘ | Video | Free |
| 4 | Building and Evaluating Advanced RAG | DeepLearning.AI + LlamaIndex | π‘ | Video | Free |
| 5 | Stanford CS324: Large Language Models | Stanford | π΄ | Lecture | Free |
| 6 | Stanford CS25: Transformers United | Stanford | π΄ | Seminar | Free |
| 7 | Hugging Face NLP Course | Hugging Face | π’ | Interactive | Free |
| 8 | Full Stack LLM Bootcamp | FSDL | π‘ | Video | Free |
| 9 | Practical Deep Learning for Coders | fast.ai | π‘ | Video | Free |
| 10 | LLM University | Cohere | π’ | Interactive | Free |
| 11 | Prompt Engineering Specialization | DeepLearning.AI | π’ | Video | Paid |
| 12 | UC Berkeley LLM Agents MOOC | UC Berkeley | π΄ | Video | Free |
Full deep-dive with code examples in techniques/README.md
The way you split documents determines retrieval quality.
| Strategy | Approach | Pros | Cons |
|---|---|---|---|
| Fixed-Size | Split every N tokens with overlap | Simple, predictable | Breaks meaning |
| Semantic | Split by meaning/topic boundaries | Coherent, better retrieval | Expensive, complex |
| Recursive | Try large chunks first, then split smaller if needed | Adaptive, respects document structure | More complex |
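A minimal sketch of the fixed-size strategy, splitting on words as a cheap proxy for tokens. The `chunk_size` and `overlap` values are illustrative:

```python
def chunk_fixed(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-count chunks, each sharing `overlap` words with its neighbor."""
    assert overlap < chunk_size
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final chunk already reaches the end of the text
    return chunks

chunks = chunk_fixed("word " * 500, chunk_size=200, overlap=40)
print(len(chunks))  # 3
```

A production pipeline would count real tokenizer tokens and split on sentence or section boundaries where possible, but the overlap mechanic is the same.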
```
 Naive RAG  (🟢)    ──▶   Advanced RAG  (🟡)   ──▶   Modular RAG  (🔴)

 Query                    Query Rewrite               Router ──▶ RAG
   ▼                        ▼                           ├───▶ Agent
 Retrieve                 HyDE                          └───▶ Direct
   ▼                        ▼
 Generate                 Retrieve                    Composable
                            ▼                         pipelines
                          Rerank
                            ▼
                          Generate
```
Reduce token usage while preserving signal:
- Extractive: Select only the most relevant sentences/paragraphs
- Abstractive: Summarize retrieved chunks before injection
- Token-level: Use LLMLingua to prune low-information tokens (up to 20x compression)
- Semantic deduplication: Remove redundant information across retrieved chunks
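The extractive approach can be sketched with a simple relevance score: keep only the sentences that share the most vocabulary with the query. A real system would use embedding similarity or LLMLingua-style token pruning instead of word overlap:

```python
import re

def compress(text: str, query: str, keep: int = 2) -> str:
    """Keep the `keep` sentences most lexically similar to the query, in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    q_terms = set(query.lower().split())
    scored = sorted(
        sentences,
        key=lambda s: len(q_terms & set(s.lower().split())),
        reverse=True,
    )
    kept = set(scored[:keep])
    # Preserve original order of the surviving sentences.
    return " ".join(s for s in sentences if s in kept)

text = ("Context windows are finite. The weather was nice yesterday. "
        "Compression keeps context windows small.")
print(compress(text, "context windows compression", keep=2))
```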
Reuse expensive context across requests:
- Prefix caching: Cache system prompts and few-shot examples (supported by Anthropic, Google)
- KV-cache sharing: Share key-value caches across similar requests
- Semantic caching: Cache responses for semantically similar queries
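At the application level, prefix caching can be sketched as a hash-keyed store: hash the static prefix (system prompt plus few-shot examples) and reuse any work keyed on it. Provider-side prefix caching works analogously but caches the model's KV state rather than application output; the class and `compute` callback below are illustrative:

```python
import hashlib

class PrefixCache:
    def __init__(self):
        self._store: dict[str, str] = {}

    @staticmethod
    def key(prefix: str) -> str:
        return hashlib.sha256(prefix.encode()).hexdigest()

    def get_or_compute(self, prefix: str, compute) -> tuple[str, bool]:
        """Return (result, was_cache_hit); `compute` runs only on a miss."""
        k = self.key(prefix)
        if k in self._store:
            return self._store[k], True
        self._store[k] = compute(prefix)  # expensive step runs once per prefix
        return self._store[k], False

cache = PrefixCache()
system_prompt = "You are a helpful assistant. Rules: ..."
result, hit = cache.get_or_compute(system_prompt, lambda p: f"processed:{len(p)}")
result2, hit2 = cache.get_or_compute(system_prompt, lambda p: f"processed:{len(p)}")
print(hit, hit2)  # False True
```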
```
 Full Attention (O(n²)):       Sliding Window (O(n·w)):

 ■ ■ ■ ■ ■ ■                   ■ · · · · ·
 ■ ■ ■ ■ ■ ■                   ■ ■ · · · ·
 ■ ■ ■ ■ ■ ■                   ■ ■ ■ · · ·
 ■ ■ ■ ■ ■ ■                   · ■ ■ ■ · ·
 ■ ■ ■ ■ ■ ■                   · · ■ ■ ■ ·
 ■ ■ ■ ■ ■ ■                   · · · ■ ■ ■
```
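The two patterns can be contrasted as boolean masks: full attention allows all n×n query-key pairs, while a causal sliding window allows only about n·w. The values of `n` and `w` below are illustrative:

```python
def full_mask(n: int) -> list[list[bool]]:
    # Every token attends to every other token: n*n allowed pairs.
    return [[True] * n for _ in range(n)]

def sliding_window_mask(n: int, w: int) -> list[list[bool]]:
    # Token i attends to itself and the previous w-1 tokens.
    return [[i - w < j <= i for j in range(n)] for i in range(n)]

n, w = 6, 3
full_pairs = sum(sum(row) for row in full_mask(n))                 # 36
window_pairs = sum(sum(row) for row in sliding_window_mask(n, w))  # 15
print(full_pairs, window_pairs)  # 36 15
```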
Chain multiple retrieval steps to answer complex questions:
1. Decompose the query into sub-questions
2. Retrieve for each sub-question independently
3. Synthesize intermediate answers
4. Use intermediate answers to refine retrieval
5. Generate the final comprehensive answer
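The loop above can be sketched with stub components; the knowledge base and the `decompose` and `synthesize` functions here are illustrative stand-ins for LLM and retriever calls:

```python
KB = {
    "capital of France": "Paris is the capital of France.",
    "population of Paris": "Paris has about 2.1 million residents.",
}

def decompose(query: str) -> list[str]:
    # Stub: a real system would ask an LLM to split the query.
    return ["capital of France", "population of Paris"]

def retrieve(sub_question: str) -> str:
    return KB.get(sub_question, "")

def synthesize(query: str, evidence: list[str]) -> str:
    # Stub: a real system would generate a final answer from the evidence.
    return f"Q: {query}\nEvidence: " + " ".join(evidence)

def multi_hop(query: str) -> str:
    evidence = []
    for sub_q in decompose(query):          # step 1: decompose
        evidence.append(retrieve(sub_q))    # step 2: retrieve per sub-question
    return synthesize(query, evidence)      # steps 3-5: synthesize the answer

print(multi_hop("How many people live in the capital of France?"))
```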
- Dynamic few-shot: Select examples most similar to the current query
- Diverse few-shot: Ensure coverage of edge cases and formats
- Ordered few-shot: Place most relevant examples closest to the query (recency bias)
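Dynamic selection with recency-aware ordering can be sketched as follows; word overlap stands in for embedding similarity, and the example pool is illustrative:

```python
EXAMPLES = [
    ("Translate 'cat' to French", "chat"),
    ("Summarize this article about economics", "..."),
    ("Translate 'dog' to French", "chien"),
]

def overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

def select_examples(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Pick the k most similar examples; ascending sort puts the most similar last,
    so it lands closest to the query in the assembled prompt (recency bias)."""
    ranked = sorted(EXAMPLES, key=lambda ex: overlap(query, ex[0]))
    return ranked[-k:]

query = "Translate 'bird' to French"
for inp, out in select_examples(query):
    print(f"{inp} -> {out}")
```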
```
┌─────────────────────────────────────────────┐
│            SYSTEM PROMPT LAYERS             │
├─────────────────────────────────────────────┤
│  1. IDENTITY  │ Role, persona, expertise    │
│  2. CONTEXT   │ Background information      │
│  3. RULES     │ Constraints, boundaries     │
│  4. FORMAT    │ Output structure            │
│  5. EXAMPLES  │ Reference behaviors         │
│  6. FALLBACK  │ Edge case handling          │
└─────────────────────────────────────────────┘
```
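Assembling the six layers into one prompt can be sketched as a simple join; the section contents below are placeholders for a hypothetical product:

```python
LAYERS = [
    ("IDENTITY", "You are a senior support engineer for AcmeDB."),
    ("CONTEXT", "AcmeDB is a managed Postgres service."),
    ("RULES", "Never reveal internal hostnames. Cite docs when possible."),
    ("FORMAT", "Answer in short paragraphs; use code blocks for SQL."),
    ("EXAMPLES", "User: How do I connect? -> Show a psql example."),
    ("FALLBACK", "If unsure, say so and link to support."),
]

def build_system_prompt(layers=LAYERS) -> str:
    # Each layer becomes a labeled section, in the fixed order above.
    return "\n\n".join(f"## {name}\n{body}" for name, body in layers)

print(build_system_prompt())
```

Keeping the layers as data rather than one prose blob makes it easy to version, test, and swap individual sections per deployment.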
```
               🎯 CONTEXT ENGINEERING MASTERY
                              │
                ┌─────────────┴─────────────┐
                │                           │
           FOUNDATIONS                APPLICATIONS
                │                           │
        ┌───────┴───────┐           ┌───────┴───────┐
        │               │           │               │
     THEORY         PRACTICE    PRODUCTION      RESEARCH
        ▼               ▼           ▼               ▼
```
🟢 BEGINNER (Weeks 1-4)

```
├── Understand transformer attention mechanisms
├── Learn token counting and context window basics
├── Master basic prompt engineering patterns
├── Study the Illustrated Transformer blog post
├── Complete DeepLearning.AI prompt engineering course
└── Build a simple chatbot with system prompts
```

🟡 INTERMEDIATE (Weeks 5-10)

```
├── Implement naive RAG with vector database
├── Learn chunking strategies (fixed, semantic, recursive)
├── Study embedding models and similarity search
├── Implement context compression techniques
├── Build an advanced RAG system with reranking
├── Learn evaluation metrics (faithfulness, relevance, recall)
├── Study "Lost in the Middle" paper and information ordering
└── Complete LangChain / LlamaIndex course
```

🔴 ADVANCED (Weeks 11-16)

```
├── Design multi-agent systems with shared context
├── Implement hierarchical memory (short/long/working)
├── Build modular RAG pipelines with routing
├── Study DSPy for programmatic prompt optimization
├── Implement context caching and cost optimization
├── Learn to evaluate with RAGAS, DeepEval, or custom evals
├── Study agentic RAG patterns (CRAG, Self-RAG, FLARE)
└── Build a production system with monitoring and fallbacks
```

⚫ EXPERT (Ongoing)

```
├── Contribute to open-source context engineering tools
├── Publish research on novel context management techniques
├── Design context architectures for enterprise systems
├── Optimize for cost, latency, and quality simultaneously
└── Mentor others in context engineering practices
```
We welcome contributions from the community. Here is how you can help:
- Add a resource: open a PR with a new paper, video, blog post, or tool
- Fix errors: found a broken link or incorrect information? Open an issue
- Improve explanations: help make the techniques section clearer
- Add code examples: contribute working code for context engineering patterns
- Translate: help translate this guide into other languages
Please read our contribution guidelines before submitting.
```bash
# Fork the repository
git clone https://github.com/mlnjsh/context-engineering.git
cd context-engineering

# Create a feature branch
git checkout -b add-new-resource

# Make your changes and commit
git add .
git commit -m "Add [resource type]: [resource name]"

# Push and create a PR
git push origin add-new-resource
```

If you find this resource helpful in your research or work, please consider citing it:
```bibtex
@misc{joshi2025contextengineering,
  title  = {Context Engineering: The Complete Guide},
  author = {Joshi, Milan Amrut},
  year   = {2025},
  url    = {https://github.com/mlnjsh/context-engineering},
  note   = {A curated guide to context engineering for large language models}
}
```

This work is licensed under the MIT License.
Built with care by Professor Milan Amrut Joshi
Professor of Data Science, Northwestern University
If this resource helped you, please consider giving it a star.
- Milan Amrut Joshi – Project Author
- Simon Willison – LLM context & prompt engineering expert
- Brex – Prompt engineering best practices