Developer Guide — apigee2mulesoft

2-min read · Updated Apr 28, 2026

The pipeline has five stages. Each stage is a function in `src/pipeline.py`, each handoff between stages is a pydantic model in `src/models.py`, and the CLI in `src/cli.py` wires them together.

---

## Architecture

```mermaid
flowchart TD
    A["Ingestion<br/>XML → PolicyList"]
    B["RAG<br/>PolicyList → CandidateList"]
    C["Mapping<br/>CandidateList → MappingList"]
    D["Execution<br/>MappingList → MulesoftProject"]
    E{"Review Gate<br/>confidence threshold"}
    F["Done Queue"]
    G["Needs-Review Queue"]
    H["Corpus Feedback<br/>(future)"]

    A --> B --> C --> D --> E
    E -->|≥ 0.8| F
    E -->|< 0.8| G
    G --> H

    style A fill:#E3F2FD,color:#0D47A1
    style B fill:#E3F2FD,color:#0D47A1
    style C fill:#E3F2FD,color:#0D47A1
    style D fill:#E3F2FD,color:#0D47A1
    style F fill:#E8F5E9,color:#1B5E20
    style G fill:#FBE9E7,color:#BF360C
    style H fill:#F3E5F5,color:#4A148C
```

---

## File layout

```
src/
  models.py     # pydantic types for every stage handoff
  pipeline.py   # ingest → rag → map_policies → execute → review
  cli.py        # typer CLI entry point

samples/
  apigee-bundle/
    apiproxy/
      proxy.xml
      policies/
        VerifyAPIKey.xml   # reference test case

pyproject.toml
```

---

## Data flow

Each stage receives the previous stage's output type and returns the next.

| Stage | Input | Output | Location |
|---|---|---|---|
| `ingest` | bundle path | `list[Policy]` | `pipeline.py:ingest` |
| `rag` | `list[Policy]` | `list[tuple[Policy, Candidate]]` | `pipeline.py:rag` |
| `map_policies` | `list[tuple[Policy, Candidate]]` | `list[Mapping]` | `pipeline.py:map_policies` |
| `execute` | `list[Mapping]` | `MulesoftProject` | `pipeline.py:execute` |
| `review` | `MulesoftProject` | side effects (files, stdout) | `pipeline.py:review` |

---

## Models (`src/models.py`)

```python
Policy(name, type: PolicyType, config: dict, chain_pos: int)
Candidate(mulesoft_equiv, retrieval_score, notes)
Mapping(policy, candidate, confidence, reason, needs_review)
MulesoftProject(output_path, mappings: list[Mapping])
```

`PolicyType` is an enum. Add a new type there before wiring any stage logic for it.
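To make the data-flow table and the model signatures concrete, here is a minimal sketch of how the types and the first four stages might fit together. It is not the code in `src/`. Stdlib dataclasses stand in for the pydantic models so the sketch runs without dependencies. The field types, the stage bodies, the `output` parameter on `execute`, and the `placeholder-equivalent` value are all assumptions made for illustration. `review` is left out because it only produces side effects.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path


class PolicyType(str, Enum):
    VERIFY_API_KEY = "VerifyAPIKey"
    UNKNOWN = "Unknown"


@dataclass
class Policy:
    name: str
    type: PolicyType
    config: dict = field(default_factory=dict)
    chain_pos: int = 0


@dataclass
class Candidate:
    mulesoft_equiv: str
    retrieval_score: float
    notes: str = ""


@dataclass
class Mapping:
    policy: Policy
    candidate: Candidate
    confidence: float
    reason: str
    needs_review: bool


@dataclass
class MulesoftProject:
    output_path: Path
    mappings: list[Mapping]


def ingest(bundle: Path) -> list[Policy]:
    # Stand-in: the real stage reads policy XML from the bundle.
    return [Policy("verify-api-key", PolicyType.VERIFY_API_KEY, chain_pos=0)]


def rag(policies: list[Policy]) -> list[tuple[Policy, Candidate]]:
    # Stand-in: one fixed, made-up candidate per policy.
    return [(p, Candidate("placeholder-equivalent", 0.9)) for p in policies]


def map_policies(pairs: list[tuple[Policy, Candidate]]) -> list[Mapping]:
    # Mirrors the stubbed behaviour: confidence is the retrieval score.
    return [
        Mapping(
            policy=p,
            candidate=c,
            confidence=c.retrieval_score,
            reason="retrieval score used directly (stub)",
            needs_review=c.retrieval_score < 0.8,
        )
        for p, c in pairs
    ]


def execute(mappings: list[Mapping], output: Path) -> MulesoftProject:
    # Stand-in: the real stage also writes files under `output`.
    return MulesoftProject(output_path=output, mappings=mappings)


# Each stage consumes exactly what the previous one returned.
bundle = Path("samples/apigee-bundle")
project = execute(map_policies(rag(ingest(bundle))), Path("/tmp/out"))
```

Because every stage takes the previous stage's return type, a single stage can be swapped out without touching the others as long as its signature stays the same.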

---

## Adding a policy mapping

**1. Register the policy type**

```python
# models.py
class PolicyType(str, Enum):
    VERIFY_API_KEY = "VerifyAPIKey"
    SPIKE_ARREST = "SpikeArrest"   # add here
    UNKNOWN = "Unknown"
```

**2. Add a RAG stub**

```python
# pipeline.py — _RAG_STUBS dict
PolicyType.SPIKE_ARREST: Candidate(
    mulesoft_equiv="throttling:rate-limit",
    retrieval_score=0.85,
    notes="Maps SpikeArrest to Mulesoft rate-limit policy",
),
```

**3. Add a sample policy XML** under `samples/apigee-bundle/apiproxy/policies/` for manual testing.

**4. Smoke test**

```bash
python src/cli.py samples/apigee-bundle --output /tmp/out
cat /tmp/out/migration-report.md
```

---

## Confidence threshold

Set in `map_policies`:

```python
needs_review = confidence < 0.8
```

Adjust the threshold as corpus quality improves. A possible future change is a separate threshold per policy type.
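A per-policy-type threshold could look roughly like the sketch below. The `THRESHOLDS` table, the `needs_review` helper, and the override values are hypothetical and untuned. Only the 0.8 default comes from the current code.

```python
from enum import Enum


class PolicyType(str, Enum):
    VERIFY_API_KEY = "VerifyAPIKey"
    SPIKE_ARREST = "SpikeArrest"
    UNKNOWN = "Unknown"


DEFAULT_THRESHOLD = 0.8  # the value currently hardcoded in map_policies

# Hypothetical overrides; the numbers are illustrative only.
THRESHOLDS = {
    PolicyType.SPIKE_ARREST: 0.9,      # stricter for this type
    PolicyType.UNKNOWN: float("inf"),  # always send unknown types to review
}


def needs_review(policy_type: PolicyType, confidence: float) -> bool:
    """True when confidence falls below the threshold for this type."""
    return confidence < THRESHOLDS.get(policy_type, DEFAULT_THRESHOLD)
```

Types without an override fall back to the default, so the existing behaviour is unchanged until an entry is added.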

---

## What is stubbed (replace these to go to production)

| Stage | Stub | Replace with |
|---|---|---|
| RAG | hardcoded `_RAG_STUBS` dict | pgvector + OpenAI embeddings query |
| Mapping | confidence = retrieval score directly | LLM call via OpenRouter / pydantic-ai |
| Execution | writes comment-only XML | real Mulesoft XML generation |
| Review feedback | terminal prompt only | corpus write-back to pgvector |
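One way to make the RAG replacement a drop-in change is to put the stub behind a small interface first. The sketch below is a suggestion, not the project's design. `Retriever`, `StubRetriever`, and the `retriever` parameter on `rag` are hypothetical names, policy types are plain strings to keep it short, and the vector-store implementation is deliberately not shown.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Candidate:
    mulesoft_equiv: str
    retrieval_score: float
    notes: str = ""


class Retriever(Protocol):
    """Anything that can propose a candidate for a policy type."""

    def retrieve(self, policy_type: str) -> Optional[Candidate]: ...


class StubRetriever:
    """Wraps a hardcoded dict, like the current _RAG_STUBS."""

    def __init__(self, stubs: dict[str, Candidate]) -> None:
        self._stubs = stubs

    def retrieve(self, policy_type: str) -> Optional[Candidate]:
        return self._stubs.get(policy_type)


def rag(
    policy_types: list[str], retriever: Retriever
) -> list[tuple[str, Optional[Candidate]]]:
    # The stage depends only on the interface, not on where candidates come from.
    return [(t, retriever.retrieve(t)) for t in policy_types]


stub = StubRetriever({"SpikeArrest": Candidate("throttling:rate-limit", 0.85)})
pairs = rag(["SpikeArrest", "Unknown"], stub)
```

A vector-store-backed class would implement the same `retrieve` method, leaving `rag` and the later stages untouched.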

---

## Running locally

```bash
# install deps
uv pip install -e .

# run against sample bundle
python src/cli.py samples/apigee-bundle --output /tmp/mulesoft-out

# inspect output
cat /tmp/mulesoft-out/migration-report.md
```

> [!note] PYTHONPATH
> Run from the repo root so `src/` is on the path, or add `src/` to `PYTHONPATH`.

---

## Build order (next phases)

```tasks
title: Pipeline Build Order
width: 90vw
height: 60vh
---
task P1 "Ingestion & Parsing"
  estimate: 1w
  phase: Spine
task P2 "Policy RAG Layer"
  estimate: 2w
  phase: Spine
  depends_on: [P1]
task P3 "Policy Mapping + Model Routing"
  estimate: 2w
  phase: Spine
  depends_on: [P2]
task P4 "Migration Execution"
  estimate: 2w
  phase: Spine
  depends_on: [P3]
task P5 "Human Review Gate"
  estimate: 2w
  phase: Spine
  depends_on: [P4]
task B1 "Model Routing Config"
  estimate: 1w
  phase: Bulge
  depends_on: [P3]
task B2 "Progress Tracker"
  estimate: 1w
  phase: Bulge
  depends_on: [P5]
task B3 "Corpus Feedback Loop"
  estimate: 1w
  phase: Bulge
  depends_on: [P5]
```
