Code Mapping for AI-Assisted Development

@austegard.com

The Problem

When working with unfamiliar codebases, Claude needs context about code structure before making changes. Reading every file wastes tokens and time. Asking Claude to "explore the codebase" produces inconsistent results and burns through context windows.

The Solution

The mapping-codebases skill generates static _MAP.md files that provide hierarchical code structure without requiring file reads. Each map shows:

  • Directory statistics (file count, subdirectory count)
  • Subdirectories with navigation links
  • Files with their exports and imports
  • Total counts when a list is truncated (e.g., "exports (23)" when only 8 of the 23 are shown)

Maps use AST parsing (tree-sitter) to extract structural information deterministically. No LLM calls, no hallucination, just facts.

How It Works

  1. Install once per session:
uv pip install tree-sitter==0.21.3 tree-sitter-languages==1.10.2
  2. Generate maps for your codebase:
python scripts/codemap.py /path/to/repo
  3. For large codebases, skip noise directories:
python scripts/codemap.py /path/to/repo --skip locale,migrations,tests

The script walks your directory tree and creates one _MAP.md per directory. Each map links to subdirectory maps, creating a navigable hierarchy.
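The walk-and-link step can be sketched roughly as follows. This is an illustration under assumptions, not the code in codemap.py: the function name `write_maps` is hypothetical, it lists file names without exports or imports, and it writes a map for every directory it visits, which the real script may not do.

```python
import os
from pathlib import Path


def write_maps(root: str, skip: frozenset = frozenset()) -> int:
    """Write one _MAP.md per directory under root; return the number written."""
    root_name = Path(root).resolve().name
    written = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped and hidden directories in place so os.walk never enters them.
        dirnames[:] = sorted(
            d for d in dirnames if d not in skip and not d.startswith(".")
        )
        files = sorted(f for f in filenames if f != "_MAP.md")
        rel = Path(dirpath).relative_to(root).as_posix()
        title = root_name if rel == "." else f"{root_name}/{rel}"
        lines = [
            f"# {title}/",
            f"*Files: {len(files)} | Subdirectories: {len(dirnames)}*",
            "",
        ]
        if dirnames:
            lines.append("## Subdirectories")
            # Each entry links to the child directory's own map.
            lines.extend(f"- [{d}/](./{d}/_MAP.md)" for d in dirnames)
            lines.append("")
        if files:
            lines.append("## Files")
            lines.extend(f"- **{f}**" for f in files)
            lines.append("")
        Path(dirpath, "_MAP.md").write_text("\n".join(lines), encoding="utf-8")
        written += 1
    return written
```

Pruning `dirnames` in place is what makes skipping cheap: a skipped directory such as locale is never traversed at all.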

Example Output

# django/utils/
*Files: 40 | Subdirectories: 1*

## Subdirectories
- [translation/](./translation/_MAP.md)

## Files
- **cache.py** — exports (10): `patch_cache_control, get_max_age`... — imports (11): `time, collections, hashlib`...
- **crypto.py** — exports: `InvalidAlgorithm, salted_hmac, get_random_string, constant_time_compare, pbkdf2` — imports: `hashlib, hmac, secrets, django.conf`
- **timezone.py** — exports (15): `get_fixed_timezone, get_default_timezone_name`... — imports (6): `functools, zoneinfo, contextlib`...

Impact on Development

Before mapping:

  • Claude asks what files exist
  • Reads files blindly
  • Misses relevant modules
  • Burns tokens on exploration

After mapping:

  • Claude reads root map, sees top-level structure
  • Navigates to relevant subdirectory maps
  • Identifies target files by exports
  • Reads only necessary source files

Real-world example (Django):

  • 883 Python files across 2,454 directories
  • Generated 138 maps (with locale/migrations skipped)
  • Root map: 27 lines showing 15 top-level areas
  • Average map: 8 lines
  • Largest map: 50 lines (utils with 40 files)

You only load maps as you navigate. Even massive codebases stay manageable.

Supported Languages

Python, JavaScript, TypeScript, TSX, Go, Rust, Ruby, Java.

Using Maps with Claude

Add this to your CLAUDE.md or project instructions:

## Codebase Navigation

This repository has `_MAP.md` files in each directory providing structural overviews.

When working with code:
1. Start by reading the root `_MAP.md` to understand top-level structure
2. Navigate to relevant subdirectory maps to find target modules
3. Use export information to identify which files contain needed functionality
4. Read actual source files only after identifying targets via maps

Maps show:
- Directory statistics (file/subdirectory counts)
- Subdirectories with links to their maps
- Files with exports and imports
- Truncation counts (e.g., "exports (23)" when showing subset)

Always check maps before exploring directories or reading files.
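The same navigation steps can also be scripted. The sketch below parses a map in the layout shown under Example Output and reports which files list a given symbol. The helper names `parse_map` and `files_exporting` are hypothetical, and the regular expressions assume that layout exactly.

```python
import re

LINK_RE = re.compile(r"^- \[(?P<name>[^\]]+)\]\((?P<target>[^)]+)\)")
FILE_RE = re.compile(r"^- \*\*(?P<file>[^*]+)\*\*")
EXPORTS_RE = re.compile(r"exports(?: \((?P<total>\d+)\))?: `(?P<names>[^`]*)`")


def parse_map(text: str):
    """Return (subdirectory links, per-file export info) from one _MAP.md."""
    subdirs, files = {}, {}
    for line in text.splitlines():
        if m := LINK_RE.match(line):
            subdirs[m["name"]] = m["target"]
        elif m := FILE_RE.match(line):
            e = EXPORTS_RE.search(line)
            names = [n.strip() for n in e["names"].split(",")] if e else []
            names = [n for n in names if n]
            # A count in parentheses means the visible list is truncated.
            total = int(e["total"]) if e and e["total"] else len(names)
            files[m["file"]] = {"exports": names, "total": total}
    return subdirs, files


def files_exporting(text: str, symbol: str):
    """List files in this map whose visible exports include symbol."""
    _, files = parse_map(text)
    return [name for name, info in files.items() if symbol in info["exports"]]
```

A miss is not conclusive when a file's `total` exceeds the number of visible names, since the symbol may sit in the truncated part of the list.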

Maintenance

Maps are static snapshots. Regenerate after structural changes:

python scripts/codemap.py /path/to/repo

Or add a git hook to keep maps fresh automatically:

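#!/bin/sh
# Save as .git/hooks/pre-commit and make it executable (chmod +x).
python /path/to/codemap.py . >/dev/null
# The pattern also matches the root-level _MAP.md, not only maps in subdirectories.
git add -- '*_MAP.md'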

Performance

Map generation is fast and cheap:

  • Django (883 files): ~2 seconds, generates 138 maps
  • Click (17 files): <1 second, generates 1 map
  • No LLM calls - uses deterministic AST parsing (tree-sitter)
  • No token cost during generation
  • Maps are static files - reading them uses minimal tokens (typical map: 8-50 lines)

Generated maps are cached on disk. Regenerate only after structural changes.
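One rough way to decide when to regenerate is to compare modification times. The sketch below is a heuristic under assumptions, not part of the skill: `stale_map_dirs` is a hypothetical helper, and it also flags content-only edits that leave the structure unchanged.

```python
from pathlib import Path


def stale_map_dirs(root: str) -> list:
    """Return directories whose _MAP.md is missing or older than a sibling file."""
    root_path = Path(root)
    candidates = [root_path, *sorted(p for p in root_path.rglob("*") if p.is_dir())]
    stale = []
    for directory in candidates:
        # Ignore hidden directories such as .git.
        if any(part.startswith(".") for part in directory.relative_to(root_path).parts):
            continue
        sources = [
            p for p in directory.iterdir() if p.is_file() and p.name != "_MAP.md"
        ]
        if not sources:
            continue  # nothing to map in this directory
        map_file = directory / "_MAP.md"
        newest = max(p.stat().st_mtime for p in sources)
        if not map_file.exists() or map_file.stat().st_mtime < newest:
            stale.append(directory)
    return stale
```

An empty result suggests the maps are at least as new as the files they describe.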

Key Features

Hierarchical disclosure: Navigate progressively without loading everything at once.

Skip patterns: Exclude directories that add noise (locale with 100+ language subdirs, migrations, test snapshots).

Export/import counts: Quickly assess module complexity. "exports (23)" signals a large API surface, "exports: 2" signals a simple utility.
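The count-and-truncate convention can be sketched as below. The function name `format_names` and the default limit of 8 are assumptions for illustration, and the real script may choose its cutoff differently.

```python
def format_names(label: str, names: list, limit: int = 8) -> str:
    """Render a name list, adding the total count only when it is truncated."""
    if not names:
        return ""
    if len(names) <= limit:
        return f"{label}: `{', '.join(names)}`"
    shown = ", ".join(names[:limit])
    return f"{label} ({len(names)}): `{shown}`..."
```

Showing the count only on truncated lists keeps short entries compact while still flagging modules with a large API surface.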

Statistics: File and subdirectory counts help assess scope at a glance.

When to Use

  • Any codebase where you want Claude to understand structure before making changes
  • Unfamiliar codebases (inherited projects, open source contributions)
  • Before major refactoring
  • When teaching Claude about your project structure

Works at any size: small projects get an instant overview, and large codebases (500+ files) stay navigable through hierarchical maps.

Limitations

  • Extracts structure only (exports/imports), not semantic descriptions
  • Private symbols (Python _prefix) excluded from exports
  • Requires regeneration after structural changes
  • Maps capture what exists, not what it does

For semantic understanding, read the actual source files. Maps tell you where to look.

Installation

The skill includes:

  • SKILL.md - Instructions for Claude
  • scripts/codemap.py - Map generator
  • CHANGELOG.md - Version history and testing results

Download and add to your Skills directory, or run the script standalone.

austegard.com
Oskar
