Best Practices for Large Codebases with Claude Code: Building a Level-Structured Knowledge Base
Abstract: The key to efficiently using Claude Code in large codebases lies in the principle of "Progressive Disclosure." By building a three-tier hierarchical knowledge base, AI can access just the right amount of context when needed, avoiding token waste and context confusion.
Introduction
As AI-assisted programming tools become ubiquitous, developers face a new challenge: how to use Claude Code effectively in large codebases. Placing a massive README file at the root is far from optimal—it leads to token waste, context confusion, and maintenance difficulties.
This article presents a hierarchical knowledge base best practice, using a three-tier documentation structure to ensure Claude Code receives the right information at the right time.
Core Principle: Progressive Disclosure
Progressive Disclosure means providing high-level rules at the root, then offering increasingly specific "why" and "how" details as AI drills down into subdirectories.
Claude Code natively recognizes CLAUDE.md files. We can extend this pattern into a hierarchical approach, aligning documentation structure with the directory tree.
The 3-Tier Hierarchy
Tier 1: Project Root (Global Context)
File Location: /CLAUDE.md
Purpose: The "Constitution"—defining the tech stack, core commands, and non-negotiable standards.
Example Content:
## Build Commands
- Production build: `pnpm build`
- Development mode: `pnpm dev`
## Test Commands
- Unit tests: `pnpm test:unit <file>`
- E2E tests: `pnpm test:e2e`
## Code Style
- Functional components only
- No default exports
- Use Tailwind CSS for styling
## Architecture
- Next.js monorepo
- Shared logic in `/packages/shared`
Tier 2: Module/Domain (Strategic Context)
File Location: /src/features/billing/CONTEXT.md
Purpose: Explaining business logic and invisible data flows that code alone won't reveal.
Example Content:
## Domain Logic
This module handles Stripe integration.
## Data Flow
All payments must trigger the `webhook-handler` before updating the DB.
## Security
- Never expose Secret_Key to the frontend
- Use the proxy in `/api/stripe` for backend calls
## Dependencies
- Depends on UserStore for tax calculations
- Bidirectional sync with OrderService
Tier 3: Leaf/Component (Tactical Context)
File Location: /src/components/DataGrid/NOTES.md
Purpose: Explaining "gotchas" and specific technical debt.
Example Content:
## Performance
- Uses react-virtualized for virtual scrolling
- Do not remove the `rowHeight` prop or it will crash on Safari
## Known Issues
- Sorting toggle has a race condition with the API
- Workaround: use `isLoading` ref to debounce
## TODO
- [ ] Extract sorting logic into a standalone Hook
- [ ] Replace deprecated componentWillReceiveProps
Best Practices for "AI-Friendly" Content
1. Be Explicit, Not Idiomatic
❌ Wrong: Just run the tests
✅ Right: Use npm run test
AI models respond better to explicit instructions. Avoid developer "jargon."
2. Use "Do/Don't" Lists
AI models respond exceptionally well to negative constraints:
## Type Safety
- ❌ Don't use 'any' types
- ✅ Use 'unknown' if the type is truly ambiguous
- ✅ Prefer TypeScript strict mode
3. Reference Specific Files
Let AI know where "The Truth" lives:
## Data Models
- Master schema defined in `/src/db/schema.ts`
- Type exports in `/src/types/index.ts`
4. Keep It Small
Individual context files should be under 50 lines of high-density information. If a file exceeds 10KB, AI may waste too many tokens reading it.
Hierarchical vs. Flat: Comparison
| Feature | Flat (One Big README) | Hierarchical (Level-Structured) |
|---|---|---|
| Token Usage | High (Claude reads everything every time) | Low (Claude only reads relevant folder's notes) |
| Precision | Low (May confuse Webhook rules with UI rules) | High (Specific rules for specific folders) |
| Maintainability | Hard (One file becomes a "junk drawer") | Easy (Small files stay close to the code they describe) |
| Scalability | Poor (Rapidly bloats as project grows) | Good (New modules just add corresponding files) |
Implementation Steps
You don't have to write all of this at once. Adopt a progressive approach:
Phase 1: Root Directory
- Create
/CLAUDE.md - Define tech stack, build commands, code conventions
Phase 2: Core Modules
- Identify 3-5 core business modules
- Create
CONTEXT.mdfor each module - Document data flows, security requirements, dependencies
Phase 3: Complex Components
- Identify components with technical debt or "gotchas"
- Create
NOTES.mddocumenting known issues and workarounds
Phase 4: Let Claude Help
Use Claude Code itself to help generate documentation:
# Ask Claude to analyze a module and generate CONTEXT.md
claude "Analyze the billing module and create a CONTEXT.md
that explains the data flow and security requirements"
Summary
The core value of a hierarchical knowledge base:
- Saves tokens: AI reads only relevant context
- Improves precision: Specific rules apply to specific scopes
- Easy maintenance: Small files stay close to code, low update cost
- Scalable: Grows naturally with the project
When starting, begin with /CLAUDE.md at the root, then expand downward. Remember: The value of documentation lies in being used, not in being finished.
References

