# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Purpose

UnrealDocGenerator is a tool for generating documentation from Unreal Engine C++ header files and exposing it to Claude via an MCP server for item-granularity lookups.

## Current State

Implementation complete. Scripts live in `docgen/`:
- `docgen/ue_parser.py` — Parses UE headers into dataclasses
- `docgen/ue_markdown.py` — Renders parsed data as Markdown (ultra-compact format, documented items only)
- `docgen/generate.py` — CLI entry point; two-pass pipeline (parse-all → build type index → render-all)
- `docgen/ue_mcp_server.py` — MCP server exposing 5 tools for item-granularity doc lookups
- `.mcp.json` — Registers the MCP server with Claude Code (stdio transport)

## Usage

```bash
python docgen/generate.py <input> [input2 ...] <output_dir>

python docgen/generate.py Runtime/Engine/ Runtime/AIModule/ docs/   # multiple directories
python docgen/generate.py Runtime/Engine/Classes/GameFramework/Actor.h docs/  # single file
```

Output: one `.md` per `.h` + `docs/type-index.txt` (compact `TypeName: path/to/File.md` lookup).

The last argument is always the output directory. All preceding arguments are inputs (files or directories, processed recursively).

## MCP Server

`docgen/ue_mcp_server.py` exposes 5 tools Claude can call directly:

| Tool | Purpose |
|---|---|
| `search_types(pattern)` | Regex search over `type-index.txt` — locate a type before fetching details |
| `get_class_overview(class_name)` | Compact view: description + inherits + property/function name lists |
| `get_member(class_name, member_name)` | Full doc for one function or property (all overloads) |
| `get_file(relative_path)` | Full `.md` file — use when you need delegates, enums, or multiple classes |
| `search_source(pattern, path_hint)` | Grep UE source `.h` files via `$UE_ENGINE_ROOT` |

Requires `pip install mcp`. Reads `$UE_DOCS_PATH`; `search_source` also requires `$UE_ENGINE_ROOT`.

Typical query flow: `search_types` → `get_class_overview` → `get_member` (~3 calls, ~15 lines vs ~100 lines for a full file read).

## Architecture Notes

- Parser uses a **position-based scanner**, not line-by-line regex, to handle nested braces correctly.
- `find_matching_close()` extracts balanced `{}`/`()` while skipping `//`, `/* */`, and string literals.
- Only items with actual C++ doc comments (`/** */`) are included in output (`_has_doc()` filter).
- Two-pass pipeline in `generate.py`: Pass 1 parses all files, Pass 2 renders with cross-reference links.
- Multi-input: each `(header, base)` pair tracks its own base dir for correct relative output paths.
- Inline cross-references: base class names linked to their `.md` when in corpus (`_make_type_link()`).
- Deprecated functions excluded from output (`not f.is_deprecated` in visibility filter).
- Enums with no value descriptions use compact inline format (`Values: A, B, C`) instead of a table.
- Placeholder descriptions (`------//`) filtered by `_PLACEHOLDER_RE` in `_fn_body()`.

## Critical Gotchas

- **Never** use `\s` inside a `[...]` character class with `*?` on C++ source — causes catastrophic regex backtracking. Use line-based scanning instead.
- Identifier regex must be `r'\w+'`, **not** `r'[\w:~]+'` — the latter consumes `public:` as one token, breaking access specifier detection (all members become `private`).
- Module name casing: use `_caps_to_camel()` (not `.title()`) — `AIMODULE.title()` → `Aimodule` instead of `AIModule`.