Repoindex and DAG grep

Repoindex is foxctl’s per-workspace code graph. It stores packages, files, symbols, concepts, and typed edges so agents can navigate code by relationships (calls, references, imports) rather than by text search alone.

For the data model details (node kinds, edge kinds, IDs, weights), see repoindex model.

The repo graph is derived from source code and must be built before it can be queried. Builds are incremental by default — only changed files are re-indexed.
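As a rough illustration of what "only changed files are re-indexed" can mean, the sketch below compares a stored per-file state against the current file contents. It assumes the stored state is a content hash keyed by path; the names `content_hash` and `files_to_reindex` are hypothetical and are not foxctl APIs, and the real change detection may use different signals.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file contents; stands in for the stored file state (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def files_to_reindex(current: dict[str, bytes], stored: dict[str, str]) -> list[str]:
    """Return paths whose current hash is missing from, or differs from, the stored state."""
    return sorted(
        path
        for path, data in current.items()
        if stored.get(path) != content_hash(data)
    )
```

Under this model, passing --incremental=false corresponds to ignoring the stored state and treating every file as changed.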

For a Go + TypeScript + Elixir project:

foxctl index repo build --workspace . --go --typescript --elixir

For repos without Go source, disable Go indexing:

foxctl index repo build --workspace . --go=false --typescript --elixir

Include Terraform, Kubernetes manifests, and shell scripts when those files matter for navigation:

foxctl index repo build --workspace . --terraform --kubernetes --shell

| Flag | Default | Purpose |
| --- | --- | --- |
| --go | true | Index Go source files |
| --typescript | false | Index TypeScript .ts/.tsx files |
| --elixir | false | Index Elixir .ex/.exs files |
| --terraform | false | Index Terraform .tf files |
| --kubernetes | false | Index Kubernetes YAML manifests |
| --shell | false | Index shell scripts |
| --incremental | true | Skip rebuild when stored file state is current |
| --include-tests | false | Index test files |
| --go-pattern | ./... | Scope Go packages |
| --include-semantic-anchors | false | Include semantic anchor concept nodes and edges |
| --include-cochange | false | Include empirical git co-change file edges |
| --dry-run | false | Build without writing to the index |

Force a full rebuild:

foxctl index repo build --workspace . --go --typescript --incremental=false

View the active database path and build metadata:

foxctl index repo status --workspace .

Generated repoindex databases live under:

~/.foxctl/storage/repoindex/<repo>-repoindex-<hash>.db

These are local artifacts and should not be committed to version control.

Repoindex build does not attach summaries by default. After building the graph, generate and attach stored file and symbol summaries to graph nodes:

# Generate summaries
foxctl index file-summaries --workspace .
foxctl index symbol-summaries --workspace .
# Attach summaries to graph nodes
foxctl index repo enrich summaries --workspace .

Summaries come from the same file summary store that the semantic tree uses. After enrichment, graph search and open output include summary text alongside node metadata.

Equivalent skill wrappers:

foxctl run repo/index_build --input '{"workspace": ".", "include_go": true, "include_typescript": true}'
foxctl run repo/index_enrich_summaries --input '{"workspace": "."}'

After a repoindex schema change, rebuild any affected skill artifacts:

make skill SKILL=repo_index_search

Search runs a full-text query over node names, signatures, docs, and summaries, using SQLite FTS with BM25 scoring:

foxctl index repo search --workspace . --query "Supervisor" --limit 10

Search fallback behavior:

  1. Raw trimmed query
  2. Quoted query fallback
  3. OR-joined multi-word fallback

This fallback sequence affects seed selection for DAG grep.
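The fallback sequence above can be sketched as a function that produces the candidate queries in order. This is a minimal illustration: `fallback_queries` is a hypothetical name, and the exact quoting, tokenization, and the condition that triggers each fallback are assumptions, not the documented foxctl behavior.

```python
def fallback_queries(query: str) -> list[str]:
    """Return candidate FTS queries in the order they would be tried."""
    raw = query.strip()
    if not raw:
        return []

    # 1. Raw trimmed query.
    candidates = [raw]

    # 2. Quoted query; embedded double quotes are doubled, as in FTS string syntax.
    escaped = raw.replace('"', '""')
    candidates.append(f'"{escaped}"')

    # 3. OR-joined words, only meaningful for multi-word queries.
    words = raw.split()
    if len(words) > 1:
        candidates.append(" OR ".join(words))

    return candidates
```

Because DAG grep takes its seeds from search, a query that only matches at the OR-joined stage can yield broader, less precise seeds than one that matches as typed.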

Retrieve a single node by its ID:

foxctl index repo open --workspace . --id "<node-id>"

Copy node IDs from search, expand, or dag_grep output. Do not hand-write IDs unless a test owns the fixture.

Run an LLM tool loop over repoindex to answer natural-language questions:

foxctl index repo ask --workspace . --question "Where is repoindex built?"

Starting from a seed node, traverse typed edges to discover connected code:

foxctl index repo expand --workspace . \
  --seed "sym:go:github.com/joshka0/foxctl/internal/intelligence/indexing/repoindex:internal/intelligence/indexing/repoindex/builder.go:Builder.addGoReferenceEdges" \
  --edge CALLS --edge REFERS_TO \
  --direction out --depth 2 --budget 50

| Parameter | Default | Purpose |
| --- | --- | --- |
| --seed | required | Starting node ID |
| --edge | none | Edge types to traverse (repeatable) |
| --direction | out | out (forward) or in (reverse) |
| --depth | 2 | Maximum traversal depth |
| --budget | 50 | Maximum distinct nodes to return |

{
  "data": {
    "result": {
      "nodes": [
        {
          "id": "sym:go:...:Builder.addGoReferenceEdges",
          "kind": "symbol",
          "file": "internal/intelligence/indexing/repoindex/builder.go",
          "name": "Builder.addGoReferenceEdges"
        },
        {
          "id": "sym:go:...:goCallTargetNodeID",
          "kind": "symbol",
          "file": "internal/intelligence/indexing/repoindex/builder.go",
          "name": "goCallTargetNodeID"
        }
      ],
      "edges": [
        {
          "src": "sym:go:...:Builder.addGoReferenceEdges",
          "dst": "sym:go:...:goCallTargetNodeID",
          "type": "CALLS",
          "weight": 1
        }
      ]
    }
  }
}
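
To make the interaction of --edge, --direction, --depth, and --budget concrete, here is a minimal breadth-first sketch over an in-memory edge list. The function `expand` and its tuple-based edge shape are hypothetical; the real command reads from the repoindex database and may order or cap results differently.

```python
from collections import deque

Edge = tuple[str, str, str]  # (src, dst, edge type)


def expand(
    edges: list[Edge],
    seed: str,
    edge_types: list[str],
    direction: str = "out",
    depth: int = 2,
    budget: int = 50,
) -> tuple[list[str], list[Edge]]:
    """Return (nodes in discovery order, traversed edges) within depth and budget."""
    allowed = set(edge_types)
    seen = {seed}
    nodes = [seed]
    kept: list[Edge] = []
    queue = deque([(seed, 0)])

    while queue:
        node, d = queue.popleft()
        if d >= depth:
            continue  # do not expand past the depth limit
        for src, dst, etype in edges:
            if etype not in allowed:
                continue
            if direction == "out" and src == node:
                nxt = dst
            elif direction == "in" and dst == node:
                nxt = src
            else:
                continue
            if nxt not in seen:
                if len(seen) >= budget:
                    continue  # budget caps the number of distinct nodes
                seen.add(nxt)
                nodes.append(nxt)
                queue.append((nxt, d + 1))
            kept.append((src, dst, etype))

    return nodes, kept
```

Note that the returned edges keep their stored src and dst even when traversing with direction "in"; only the walk is reversed.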

DAG grep is an explanation-subgraph query over repoindex. It starts from query seeds, expands a bounded graph neighborhood, and renders a compact layered explanation subgraph.

Use it to answer questions like:

  • “What calls or uses X?”
  • “Show me the nearby graph around Y”
  • “What code and doc relationships surround this topic?”

foxctl run code/dag_grep --input '{
  "query": "buildEvidencePack",
  "workspace": ".",
  "render": "tree",
  "edge_sets": ["structural"],
  "depth": 2,
  "budget": 80,
  "k": 5
}'

| Field | Default | Purpose |
| --- | --- | --- |
| query | required | Text query for seed selection |
| mode | hybrid | Query mode hint: fts, semantic, hybrid |
| k | 10 | Number of top seeds to keep |
| node_kinds | none | Filter seeds by node kind |
| edge_types | structural set | Explicit edge types to traverse |
| direction | out | out or in |
| depth | 2 | Traversal depth limit |
| budget | 80 | Maximum distinct nodes in result |
| per_node_cap | 20 | Maximum edges fetched per expanded node |
| include_anchors | true | Add package/file anchor context |
| render | none | tree or mermaid output format |

| Edge set | Includes |
| --- | --- |
| structural | CALLS, REFERS_TO, IMPORTS, CONTAINS, IMPLEMENTS, EMBEDS, TESTS |
| Custom | Pass specific edge_types for targeted traversal |

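
When calling the skill from a script, it can help to build the --input payload programmatically. The sketch below fills in the defaults from the field table and lets the caller override them; `dag_grep_command` is a hypothetical helper, and it assumes that sending default values explicitly behaves the same as omitting them.

```python
import json

# Defaults copied from the field table above.
DEFAULTS = {
    "mode": "hybrid",
    "k": 10,
    "direction": "out",
    "depth": 2,
    "budget": 80,
    "per_node_cap": 20,
    "include_anchors": True,
}


def dag_grep_command(query: str, workspace: str = ".", **overrides) -> list[str]:
    """Build the argv list for a dag_grep skill call (does not run it)."""
    payload = {"query": query, "workspace": workspace, **DEFAULTS, **overrides}
    return ["foxctl", "run", "code/dag_grep", "--input", json.dumps(payload)]
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with the JSON argument.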
  1. Seed selection — Runs FTS search with limit k * 3, normalizes BM25 scores (1 / (1 + bm25)), and keeps the top k results.
  2. Weighted expansion — Traverses edges using a max-heap frontier keyed by relevance score. Score decays with depth: next_score = current_score * edge_weight * 0.85^(next_depth).
  3. Anchoring — When include_anchors is true, adds containing file and package nodes for symbol results.
  4. Layering — Computes a DAG view with forward edges and back-edges. Seeds are layer 0; reachable neighbors get increasing layer numbers.
  5. Output — Returns nodes, edges, DAG layers, forward edges, back-edges, and statistics.
  • direction: "out" — expansion fetches outgoing edges from each node; forward DAG layering treats src -> dst as forward.
  • direction: "in" — expansion fetches incoming edges to each node; DAG layering inverts edge orientation.

The direction setting only changes how the graph is traversed and layered; the edge direction stored in the graph is unchanged.
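
The seed selection and weighted expansion steps can be sketched as follows. This is an illustrative model, not the foxctl implementation: it assumes BM25 values are non-negative with smaller meaning a better match, it ignores per_node_cap and anchoring, and the names `select_seeds` and `weighted_expand` are hypothetical.

```python
import heapq

DECAY = 0.85  # per-depth decay factor from the scoring formula


def normalize(bm25: float) -> float:
    """Map a BM25 value to a score using 1 / (1 + bm25)."""
    return 1.0 / (1.0 + bm25)


def select_seeds(hits: list[tuple[str, float]], k: int) -> list[tuple[str, float]]:
    """hits are (node_id, bm25) pairs from an FTS query issued with limit k * 3."""
    scored = [(node, normalize(bm25)) for node, bm25 in hits[: k * 3]]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def weighted_expand(
    seeds: list[tuple[str, float]],
    out_edges: dict[str, list[tuple[str, float]]],
    depth: int = 2,
    budget: int = 80,
) -> dict[str, float]:
    """Best-first expansion; returns node -> score at the time it was accepted."""
    best: dict[str, float] = {}
    # heapq is a min-heap, so scores are negated to pop the highest first.
    heap = [(-score, 0, node) for node, score in seeds]
    heapq.heapify(heap)

    while heap and len(best) < budget:
        neg_score, d, node = heapq.heappop(heap)
        if node in best:
            continue
        best[node] = -neg_score
        if d >= depth:
            continue
        for neighbor, weight in out_edges.get(node, []):
            if neighbor not in best:
                next_depth = d + 1
                # next_score = current_score * edge_weight * 0.85^(next_depth)
                heapq.heappush(heap, (neg_score * weight * DECAY**next_depth, next_depth, neighbor))

    return best
```

As a worked example, a seed with score 1.0 followed across a 0.9-weight edge yields 1.0 × 0.9 × 0.85 = 0.765 at depth 1, and a further 1.0-weight edge yields 0.765 × 1.0 × 0.85² ≈ 0.553 at depth 2, so heuristic edges and deeper hops both lose priority in the frontier.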

  • tree — ASCII tree rendering of the explanation subgraph
  • mermaid — Mermaid diagram syntax

A DAG grep result includes:

| Field | Contents |
| --- | --- |
| seeds | Starting nodes with normalized scores |
| graph.nodes | All discovered nodes |
| graph.edges | All discovered edges |
| dag.layers | Layer assignments (seeds = layer 0) |
| dag.edges | Forward edges (source layer < destination layer) |
| dag.back_edges | Cross-links, reverse links, same-layer links |
| stats | Seed count, node count, edge count |
| warnings | Any notices about mode fallback or limits |

Back-edges are not errors — they capture cross-links, reverse links, and other non-layer-advancing edges.
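
One way to picture the split between dag.edges and dag.back_edges is a breadth-first layering from the seeds, followed by a classification of each edge by layer order. The sketch below is an assumption-laden model (the function name `layer_dag` is hypothetical, and the real layering may break ties differently), but it follows the stated rule that forward edges go from a lower layer to a higher one.

```python
from collections import deque


def layer_dag(
    seeds: list[str], edges: list[tuple[str, str]]
) -> tuple[dict[str, int], list[tuple[str, str]], list[tuple[str, str]]]:
    """Assign layers by BFS from the seeds, then split edges into forward and back."""
    adjacency: dict[str, list[str]] = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    layers = {seed: 0 for seed in seeds}  # seeds are layer 0
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for dst in adjacency.get(node, []):
            if dst not in layers:
                layers[dst] = layers[node] + 1
                queue.append(dst)

    forward, back = [], []
    for src, dst in edges:
        if src in layers and dst in layers and layers[src] < layers[dst]:
            forward.append((src, dst))
        else:
            # Same-layer links, reverse links, and cycle-closing edges land here.
            back.append((src, dst))
    return layers, forward, back
```

For direction "in", the same routine applies after flipping each (src, dst) pair, which mirrors the inverted edge orientation described earlier. A cycle such as a → b → c → a is handled without special cases: the closing edge simply becomes a back-edge.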

Weights encode confidence and traversal preference, not cost. Higher weight means stronger preference:

| Weight | Edge types | Interpretation |
| --- | --- | --- |
| 1.0 | Go CALLS, REFERS_TO, IMPORTS, CONTAINS; synthetic anchors | Exact structural facts from the indexer |
| 0.9 | TypeScript heuristic CALLS | Strong heuristic, not type-checked |
| 0.85 | Elixir heuristic REFERS_TO | Strong heuristic, name-matched |
| 0.75 | Doc-to-symbol edges (DOC_RELATED, DOC_FLOW) | Contextual hints, not runtime coupling |
| 0.7 | TypeScript IMPORTS | Moderate confidence |
| 0.6 | Concept edges from doc index blocks | Soft discoverability hints |

Practical rules:

  • Prefer CALLS over REFERS_TO when reasoning about executable flow.
  • Treat TypeScript and Elixir non-1.0 edges as heuristic evidence.
  • Treat doc-derived edges as contextual hints, not proof of runtime coupling.
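
These rules translate into simple filters over the edges returned by expand or DAG grep. The sketch assumes edges are dictionaries with the src, dst, type, and weight keys shown in the expand output; the helper names are hypothetical.

```python
EXACT_WEIGHT = 1.0  # weight the indexer assigns to exact structural facts


def is_heuristic(edge: dict) -> bool:
    """Edges below weight 1.0 are heuristic or contextual evidence."""
    return edge["weight"] < EXACT_WEIGHT


def executable_flow_edges(edges: list[dict], min_weight: float = EXACT_WEIGHT) -> list[dict]:
    """Keep CALLS edges at or above min_weight when reasoning about executable flow."""
    return [e for e in edges if e["type"] == "CALLS" and e["weight"] >= min_weight]
```

Lowering `min_weight` to 0.9 would admit TypeScript heuristic CALLS edges, which is a reasonable trade-off when the workspace has little or no Go code.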

| Language | Nodes | Edges | Notes |
| --- | --- | --- | --- |
| Go | packages, files, symbols | CONTAINS, IMPORTS, CALLS, REFERS_TO | Exact typed AST resolution; complete when target symbol is in the graph |
| TypeScript | packages, files, symbols | CONTAINS, IMPORTS, heuristic CALLS | Call edges are not type-checked; treat as heuristic |
| Elixir | packages, files, symbols | CONTAINS, heuristic REFERS_TO | Name-matched references; package identity is directory-based |
| Terraform | packages, files, concepts | | Resource, module, provider, variable, output concepts |
| Kubernetes | packages, files, concepts | | Resource concepts from apiVersion + kind manifests |
| Shell | packages, files, concepts | | Command and environment variable concepts |

For convenience, repoindex operations are also available as skill invocations:

# Build via skill
foxctl run repo/index_build --input '{"workspace": ".", "include_go": true, "include_typescript": true}'
# DAG grep via skill
foxctl run code/dag_grep --input '{
  "query": "RetrieveMixed",
  "workspace": ".",
  "render": "tree",
  "edge_sets": ["structural"],
  "depth": 2,
  "budget": 80
}'

  • Repoindex is derived from source and can be rebuilt at any time. Source files, docs, and stores remain canonical.
  • TypeScript CALLS edges and Elixir REFERS_TO edges are heuristic, not complete language-server-quality call graphs.
  • DAG grep seeds from FTS search, not vector search. The “hybrid” mode currently falls back to FTS behavior.
  • The graph is a directed property graph, not a strict DAG — cycles can exist.
  • Generated repoindex database files are local artifacts and should not be committed to version control.
  • Semantic anchor edges are opt-in (--include-semantic-anchors) and should not be assumed for every graph.

Repoindex databases live under local foxctl storage:

~/.foxctl/storage/repoindex/<repo>-repoindex-<hash>.db

Each workspace gets its own database scoped by repo key. Cross-repo edges are not stored in a single shared graph.

Repoindex queries emit repo_index events into the observability stream. See the storage documentation for the default observability directory.