Agent Testing Guide

Sample prompts and validation checklists for testing the VS Code custom agents.

Prerequisites

Before You Test

The codewiki-mcp MCP server must be running and configured in .vscode/mcp.json (included in this repo).
All 6 .agent.md files must be in .github/agents/.
Test in the VS Code Chat panel (Ctrl+Shift+I), not inline chat. The .agent.md custom agents only activate when invoked via @codewiki in the Chat panel; the runSubagent tool in regular Copilot chat does NOT wire up MCP tools to subagents.
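
The first two prerequisites can be checked with a small pre-flight script. The following is a minimal sketch, assuming the script is pointed at the repo root; the function name `check_prerequisites` is hypothetical, and the check only confirms that the files exist (it does not verify that the MCP server is actually running).

```python
from pathlib import Path


def check_prerequisites(repo_root: str, expected_agents: int = 6) -> list[str]:
    """Return a list of problems; an empty list means the expected files are in place."""
    root = Path(repo_root)
    problems = []

    # The MCP server configuration is expected at .vscode/mcp.json.
    if not (root / ".vscode" / "mcp.json").is_file():
        problems.append("missing .vscode/mcp.json")

    # Agent definitions are expected in .github/agents/ with the .agent.md suffix.
    agents_dir = root / ".github" / "agents"
    found = list(agents_dir.glob("*.agent.md")) if agents_dir.is_dir() else []
    if len(found) != expected_agents:
        problems.append(
            f"expected {expected_agents} .agent.md files, found {len(found)}"
        )

    return problems
```

Calling `check_prerequisites(".")` from the repo root and printing the returned list is enough for a quick sanity check before opening the Chat panel.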

Agent Configuration Reference

Current YAML frontmatter for each agent (must match .github/agents/ files):

Master Orchestrator (`codewiki.agent.md`)

name: CodeWiki
description: Master agent that routes your request to the right CodeWiki specialist
model: GPT-5.3-Codex
tools:
  [read, agent, codewiki-mcp/*]
agents:
  [CodeWiki Researcher, CodeWiki Code Review, CodeWiki Architecture Explorer, CodeWiki Comparison, CodeWiki Synthesizer]

⚠️ Model: The master must use a 1× credit model like GPT-5.3-Codex. Free/low-tier models (GPT-5 mini) produce inconsistent routing, truncated results, and skipped delegation.

Why codewiki-mcp/* on the master? The master must declare the MCP tools so that they are exposed to subagents when spawned. The master itself still acts as a router: it delegates via the `agent` tool and does not call CodeWiki tools directly.

Subagents

# Researcher, Code Review, Architecture Explorer, Comparison:
model: GPT-5 mini
user-invokable: false
tools:
  [read, codewiki-mcp/*]

# Synthesizer (needs stronger reasoning for multi-repo integration):
model: GPT-5.3-Codex
user-invokable: false
tools:
  [read, codewiki-mcp/*]

Agent File	Name	Specialty
`codewiki-researcher.agent.md`	CodeWiki Researcher	General exploration
`codewiki-reviewer.agent.md`	CodeWiki Code Review	Module/function analysis
`codewiki-architect.agent.md`	CodeWiki Architecture Explorer	System design
`codewiki-comparison.agent.md`	CodeWiki Comparison	Multi-repo comparison
`codewiki-synthesizer.agent.md`	CodeWiki Synthesizer	Combine parts from multiple repos

Routing Quick Reference

User Intent	Subagent	Signal Words
General exploration	CodeWiki Researcher	"what is", "explain", "tell me about", "overview"
Code analysis	CodeWiki Code Review	"review", "analyse", "module", "function", "code"
System design	CodeWiki Architecture Explorer	"architecture", "design", "structure", "hierarchy"
Multi-repo comparison	CodeWiki Comparison	"compare", "vs", "difference", "or"
Multi-repo synthesis	CodeWiki Synthesizer	"combine", "merge", "build using", "take X from A and Y from B"
Unindexed repo	CodeWiki Researcher	Subagent detects NOT_INDEXED and calls `codewiki_request_indexing`
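
When writing test cases it can help to predict which subagent a prompt should reach. The sketch below encodes the signal words from the table as regular expressions. It is only a test-side helper under stated assumptions: the master is a language model and does not route by keyword matching, the name `expected_subagent` is hypothetical, the rule order (most specific intent first) is a choice made here, and the very broad signal word "or" is deliberately left out.

```python
import re

# Ordered most-specific first: a synthesis prompt may also say "design",
# and a comparison prompt may also say "architecture".
ROUTING_RULES = [
    ("CodeWiki Synthesizer",
     [r"\bcombin\w*", r"\bmerge\b", r"\bbuild\b", r"\btake\b.+\bfrom\b"]),
    ("CodeWiki Comparison",
     [r"\bcompare\b", r"\bvs\b", r"\bdifferences?\b"]),
    ("CodeWiki Architecture Explorer",
     [r"\barchitect\w*", r"\bdesign\b", r"\bstructured?\b", r"\bhierarchy\b"]),
    ("CodeWiki Code Review",
     [r"\breview\b", r"\banaly[sz]e\b", r"\bmodule\b", r"\bfunction\b", r"\bcode\b"]),
]

# General exploration is the fallback when no stronger signal is present.
DEFAULT_AGENT = "CodeWiki Researcher"


def expected_subagent(prompt: str) -> str:
    """Return the subagent name a tester would expect for this prompt."""
    for agent, patterns in ROUTING_RULES:
        if any(re.search(p, prompt, re.IGNORECASE) for p in patterns):
            return agent
    return DEFAULT_AGENT
```

Word boundaries keep `vs` from matching inside `microsoft/vscode` and `code` from matching inside `@codewiki`. A mismatch between this helper and the master's actual choice is a prompt worth inspecting by hand, not necessarily a routing bug.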

1. CodeWiki Researcher (General Exploration)

Routing trigger: General "what is", "explain", "tell me about" questions.

Sample Prompts

@codewiki What is facebook/prophet and what are its main features?

@codewiki Explain the key concepts behind pallets/flask

@codewiki What topics does CodeWiki have for microsoft/vscode?

# Bare keyword — resolves automatically (v1.2.0+)
@codewiki What is prophet and what are its main features?

Expected Behaviour

Step	What Should Happen
1	Master spawns CodeWiki Researcher via the `agent` tool
2	Researcher calls `codewiki_list_topics` to discover available wiki sections
3	Researcher calls `codewiki_read_structure` and/or `codewiki_read_contents`
4	Researcher synthesises a summary from CodeWiki content

Validation Checklist

Master does not answer from its own knowledge
Master uses the agent tool (not direct tool calls to read/search)
Researcher cites CodeWiki sections in its answer
Response contains real documentation content, not generic descriptions
Master presents the full subagent response (not a brief summary)
If a bare keyword was used, the response is prefixed with a resolution note (e.g. `> Resolved: keyword "prophet" → facebook/prophet`)
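
The resolution-note check can be automated against a saved response. This is a rough sketch that assumes the note is the first non-empty line and follows the two shapes shown in this guide (with or without a star count); the helper name `parse_resolution_note` and the returned dictionary layout are hypothetical.

```python
import re
from typing import Optional

# Matches: > Resolved: keyword "vue" → vuejs/core (52,900★)
# The star count in parentheses is optional.
NOTE_RE = re.compile(
    r'^>\s*Resolved: keyword "(?P<keyword>[^"]+)" → '
    r"(?P<repo>[\w.-]+/[\w.-]+)"
    r"(?: \((?P<stars>[\d,]+)★\))?"
)


def parse_resolution_note(response: str) -> Optional[dict]:
    """Return keyword, repo and stars from the leading note, or None if absent."""
    first_line = next(
        (line.strip() for line in response.splitlines() if line.strip()), ""
    )
    match = NOTE_RE.match(first_line)
    if match is None:
        return None
    stars = match.group("stars")
    return {
        "keyword": match.group("keyword"),
        "repo": match.group("repo"),
        "stars": int(stars.replace(",", "")) if stars else None,
    }
```

A `None` result for a bare-keyword prompt means the checklist item failed; a non-`None` result for an owner/repo prompt would also be unexpected.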

2. CodeWiki Code Review (Module / Function Analysis)

Routing trigger: "review", "analyse", "what does module X do", code-level questions.

Sample Prompts

@codewiki Review the forecaster module in facebook/prophet — what does it do?

@codewiki What code patterns are used in the routing module of pallets/flask?

@codewiki Analyse the error handling approach in fastapi/fastapi

Expected Behaviour

Step	What Should Happen
1	Master spawns CodeWiki Code Review via the `agent` tool
2	Reviewer calls `codewiki_search_wiki` to find relevant code documentation
3	Reviewer calls `codewiki_read_contents` for detailed section content
4	Reviewer provides code-level analysis with citations

Validation Checklist

Master delegates to CodeWiki Code Review, not Researcher
Reviewer focuses on code structure, patterns, and implementation details
Response references specific modules, classes, or functions
No hallucinated code — all content sourced from CodeWiki
Master presents the full subagent response (not a brief summary)

3. CodeWiki Architecture Explorer (System Design)

Routing trigger: "architecture", "design", "how is X structured", "component hierarchy".

Sample Prompts

@codewiki Explain the overall architecture of facebook/react

@codewiki How is the plugin system architected in vitejs/vite?

@codewiki Describe the component hierarchy and data flow in vuejs/core

Expected Behaviour

Step	What Should Happen
1	Master spawns CodeWiki Architecture Explorer via the `agent` tool
2	Explorer calls `codewiki_read_structure` to map the documentation tree
3	Explorer calls `codewiki_read_contents` for architecture-related sections
4	Explorer produces a structured architecture overview

Validation Checklist

Master delegates to CodeWiki Architecture Explorer
Response covers high-level design (layers, components, data flow)
Includes or references structural breakdowns from CodeWiki
Does not devolve into code-level details (that's the Reviewer's job)
Master presents the full subagent response (not a brief summary)

4. CodeWiki Comparison (Multi-Repo)

Routing trigger: "compare", "vs", "difference between", "X or Y".

Sample Prompts

@codewiki Compare fastapi/fastapi vs pallets/flask — architecture, performance, and developer experience

@codewiki Compare facebook/react vs vuejs/core in terms of rendering strategy

@codewiki What are the differences between expressjs/express and koajs/koa?

Expected Behaviour

Step	What Should Happen
1	Master spawns CodeWiki Comparison via the `agent` tool
2	Comparison agent calls CodeWiki tools for each repo independently
3	Agent builds a side-by-side analysis from real documentation
4	Agent produces a structured comparison table or narrative

Validation Checklist

Master delegates to CodeWiki Comparison, not Researcher
Agent fetches documentation from both repos (not just one)
Comparison is grounded in CodeWiki content, not generic knowledge
Response includes a structured comparison (table, bullet list, or sections)
Master presents the full subagent response (not a brief summary)

5. Request Indexing (Unindexed Repo — Subagent Handles It)

Routing trigger: A repo for which any CodeWiki tool returns NOT_INDEXED.

New in v1.3.0: `codewiki_request_indexing` now uses MCP Elicitation to confirm with the user before submitting an indexing request.

Sample Prompts

@codewiki Check if Snowflake-Labs/agent-world-model is available on CodeWiki

@codewiki What does CodeWiki have for some-org/obscure-repo?

Expected Behaviour

Step	What Should Happen
1	Master classifies this as a general exploration request
2	Master spawns CodeWiki Researcher via the `agent` tool
3	Researcher calls a CodeWiki tool and gets a `NOT_INDEXED` error
4	Researcher calls `codewiki_request_indexing` to submit the repo
5	Server asks user for confirmation via MCP Elicitation (v1.3.0+)
6	Researcher reports back; Master presents the full result to user

Validation Checklist

Master does not call any MCP tools directly (it delegates via agent)
A subagent detects NOT_INDEXED and calls codewiki_request_indexing
User is asked to confirm indexing via elicitation prompt (v1.3.0+)
User is informed the repo has been submitted for indexing
Master presents the full subagent response (not a brief summary)
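
The ordering in this flow (a `NOT_INDEXED` result first, then the indexing request) can be asserted against a recorded transcript. The sketch below assumes the tester has already reduced the transcript to ordered `(tool_name, status)` pairs; that shape and the function name are hypothetical and not something the server emits.

```python
def indexing_requested_after_not_indexed(events: list[tuple[str, str]]) -> bool:
    """True only if codewiki_request_indexing is called after a NOT_INDEXED result."""
    saw_not_indexed = False
    for tool, status in events:
        if tool == "codewiki_request_indexing":
            # The request only counts if a NOT_INDEXED result came first.
            return saw_not_indexed
        if status == "NOT_INDEXED":
            saw_not_indexed = True
    # The indexing request was never made.
    return False
```

The elicitation confirmation itself happens in the VS Code UI, so it still needs to be checked by eye.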

6. CodeWiki Synthesizer (Multi-Repo Solution Building)

Routing trigger: The user wants to BUILD something new by combining parts from multiple repos. This is distinct from Comparison, which evaluates and contrasts repos.

Sample Prompts

@codewiki I want to build an API server that uses the routing system from pallets/flask and the async handling from fastapi/fastapi. Help me design it.

@codewiki Take the plugin architecture from vitejs/vite and the component model from vuejs/core — design a new framework that combines both.

@codewiki Combine the authentication approach from supabase/supabase with the event pipeline from apache/kafka into a real-time auth notification system.

# Intentionally vague — Synthesizer should DISCOVER which parts to take
@codewiki Can you combine the best parts from fastapi/fastapi and pallets/flask into a new web framework solution?

Expected Behaviour

Step	What Should Happen
1	Master detects synthesis intent (“build”, “combine”, “take X from A and Y from B”)
2	Master spawns CodeWiki Synthesizer via the `agent` tool
3	Synthesizer researches each repo using CodeWiki tools (`read_structure`, `read_contents`, `search_wiki`)
4	Synthesizer extracts the specific parts the user requested from each repo
5	Synthesizer identifies cross-repo conflicts and proposes adapters
6	Synthesizer delivers a blueprint: architecture diagram, directory structure, integration code, implementation guide
7	For vague requests: Synthesizer shows a “Parts Selected” table explaining WHY it chose each part

Validation Checklist

Master delegates to CodeWiki Synthesizer, not Comparison
Synthesizer fetches documentation from all mentioned repos
Response includes a Parts Extracted table citing source repos
Response includes Compatibility Analysis (conflicts + resolutions)
Response includes Integration Architecture (Mermaid diagram or description)
Response includes Directory Structure for the new project
Response includes Implementation Guide with actionable steps
For vague requests: includes Parts Selected table with reasoning
All content is grounded in CodeWiki data, not generic knowledge
Master presents the full subagent response (not a brief summary)

7. Keyword Resolution & Disambiguation (Bare Product Names)

Routing trigger: Any prompt using a bare keyword instead of owner/repo format.

Sample Prompts

@codewiki What is vue?

@codewiki Explain the architecture of react

@codewiki Compare vue vs react

@codewiki What topics does openclaw have?

Expected Behaviour

Step	What Should Happen
1	Master delegates to the appropriate subagent (Researcher, Comparison, etc.)
2	Subagent calls a CodeWiki tool with the bare keyword (e.g. `repo_url="vue"`)
3	Tool detects bare keyword and triggers MCP Elicitation (if multiple ambiguous repos found)
4	VS Code shows a selection prompt: “Multiple repositories match 'vue'. Which do you want?”
5	User selects the desired repo (e.g. `vuejs/core` for Vue 3)
6	Response includes resolution note: `> Resolved: keyword "vue" → vuejs/core (52,900★)`
7	Response shows top alternative candidates
8	The rest of the response contains normal CodeWiki documentation

Auto-select (no elicitation):

Canonical match: “openclaw” → openclaw/openclaw (owner == repo == keyword)
Single result: only one repo found → auto-selected

Fallback: If elicitation is unavailable (client doesn’t support it), heuristic selection by star count is used.

Validation Checklist

Bare keyword “vue” triggers elicitation with multiple options (vuejs/vue, vuejs/core, etc.)
User can select vuejs/core (Vue 3) instead of auto-picking vuejs/vue (Vue 2)
Bare keyword “openclaw” auto-resolves to openclaw/openclaw (canonical match, NO elicitation)
Bare keyword “react” triggers elicitation showing facebook/react and alternatives
Resolution note appears at the top of the response with star count
Alternative candidates are listed
Declining/cancelling elicitation falls back to heuristic selection
owner/repo format still works as before (no resolution note, no elicitation)
Full URLs still work as before (no resolution note, no elicitation)
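
The selection rules described in this section can be summarised as a small decision function, which is useful for working out what a given test case should produce. This is a sketch of the documented behaviour, not the server's code: the `Candidate` type, the `elicit` callback, and the method labels in the return value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    full_name: str  # "owner/repo"
    stars: int


def resolve_keyword(
    keyword: str,
    candidates: list[Candidate],
    elicit: Optional[Callable[[list[Candidate]], Optional[str]]] = None,
) -> tuple[str, str]:
    """Return (repo, method) following the rules in this section."""
    # owner/repo and full URLs are passed through without resolution.
    if "/" in keyword:
        return keyword, "explicit"
    if not candidates:
        raise LookupError(f"no repositories match {keyword!r}")

    # Canonical match: owner == repo == keyword, no elicitation.
    canonical = f"{keyword}/{keyword}".lower()
    for candidate in candidates:
        if candidate.full_name.lower() == canonical:
            return candidate.full_name, "canonical"

    # Single result: auto-selected.
    if len(candidates) == 1:
        return candidates[0].full_name, "single"

    # Ambiguous: ask the user when the client supports elicitation.
    if elicit is not None:
        choice = elicit(candidates)
        if choice is not None:
            return choice, "elicited"

    # Elicitation unavailable, declined or cancelled: pick by star count.
    best = max(candidates, key=lambda c: c.stars)
    return best.full_name, "heuristic"
```

Passing `elicit=None` models a client without elicitation support, and a callback that returns `None` models the user declining or cancelling the prompt.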

Full Workflow Scenario

This single prompt is designed to trigger all 5 workflow steps (Discover → Navigate → Read → Search → Synthesize) of the Researcher subagent.

Multi-Step Scenario Prompt

@codewiki I want a deep technical explanation of the React Compiler's
compilation pipeline.

Specifically:
1. First, check what topics Google CodeWiki has for facebook/react.
2. Then look at the full table of contents to find sections about
   the compiler.
3. Read the section on "React Compiler Internals" to understand the
   multi-stage compilation pipeline, the IRs (HIR and ReactiveFunction),
   and the key optimization passes.
4. Search for "How does the React Compiler handle memoization and
   reactive scopes?" to get implementation-level details.
5. Combine everything into a single technical summary that covers:
   - The overall compilation pipeline (AST → HIR → ReactiveFunction → codegen)
   - The key intermediate representations and their purpose
   - How reactive scopes are inferred and merged
   - How the compiler replaces manual useMemo/useCallback
   - Cite which CodeWiki sections your answer comes from

Expected Tool Calls

Step	Phase	Tool Call	Purpose
1	Discover	`codewiki_list_topics("facebook/react")`	Verify wiki exists, see available topics
2	Navigate	`codewiki_read_structure("facebook/react")`	Get full ToC with section hierarchy
3	Read	`codewiki_read_contents("facebook/react", "React Compiler Internals")`	Fetch detailed compiler pipeline docs
4	Search	`codewiki_search_wiki("facebook/react", "How does the React Compiler handle memoization?")`	Get Gemini-powered implementation details
5	Synthesize	(no tool call — agent combines results)	Produce cited summary from steps 1-4

Full Workflow Validation

Step 1: Returns status: "ok" with topic list (expect ~26 sections)
Step 2: Returns status: "ok" with hierarchical section structure
Step 3: Returns status: "ok" with detailed content about HIR, ReactiveFunction, compilation passes
Step 4: Returns status: "ok" OR RETRY_EXHAUSTED (upstream timeout is a known CodeWiki issue)
Step 5: Agent produces a coherent summary citing specific CodeWiki sections
All responses include content_hash and idempotency_key
Subsequent identical calls return in <10ms (cache hit)
Agent does NOT call the same tool more than twice for the same repo
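
Several of these checks are mechanical and can be run over a recorded tool-call log. The sketch below assumes each call has been normalised by the tester into a dictionary with `tool`, `repo`, `status` and `response` keys, where `status` is either `"ok"` or an error code such as `"RETRY_EXHAUSTED"`. That record shape and the function name are hypothetical. It also assumes the `content_hash` and `idempotency_key` requirement applies to successful responses only.

```python
from collections import Counter

REQUIRED_KEYS = ("content_hash", "idempotency_key")


def validate_tool_log(calls: list[dict]) -> list[str]:
    """Return a list of checklist failures; an empty list means the log passes."""
    failures = []

    # No tool should be called more than twice for the same repo.
    per_tool_repo = Counter((c["tool"], c["repo"]) for c in calls)
    for (tool, repo), count in per_tool_repo.items():
        if count > 2:
            failures.append(f"{tool} called {count} times for {repo}")

    for index, call in enumerate(calls, start=1):
        status = call["status"]
        # An upstream timeout is tolerated for the search step only.
        search_timeout = (
            call["tool"] == "codewiki_search_wiki" and status == "RETRY_EXHAUSTED"
        )
        if status != "ok" and not search_timeout:
            failures.append(f"call {index} ({call['tool']}): unexpected status {status!r}")
        if status == "ok":
            for key in REQUIRED_KEYS:
                if key not in call["response"]:
                    failures.append(f"call {index} ({call['tool']}): missing {key}")

    return failures
```

The cache-hit timing and the quality of the final summary are not covered here and still need manual review.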

Alternative Repos for Testing

If a repo is too large or slow during testing, try these. You can use bare keywords (v1.2.0+) — the server resolves them automatically:

Input	Resolves To	Notes
`anthropics/anthropic-sdk-python`	exact match	Fast wiki generation, good for quick tests
`fastapi`	`fastapi/fastapi`	Bare keyword — well-indexed, good for architecture + review tests
`prophet`	`facebook/prophet`	Bare keyword — good for researcher + review tests
`vscode`	`microsoft/vscode`	Bare keyword — large repo, may be slower
`microsoft/vscode-copilot-chat`	exact match	Microsoft tooling

See also: Agentic AI Guide — full agent definitions, architecture, and lessons learned.
