This is the most significant risk in the entire system. Without a validation strategy, a migration marked "completed" means only that the code was generated, not that the migrated API behaves identically to the source. These are not the same thing.
Proving that two APIs behave equivalently requires either running both against the same inputs and comparing outputs, or having a formal specification of what the source API is supposed to do. Most source APIs have neither. Unit tests may not exist. Postman collections, if they exist, usually cover only the happy path. Custom scripts in policy chains may have side effects that are not visible from the policy configuration alone.
The least-bad options are: (1) contract testing — generate an OpenAPI or RAML spec from the source API and use it as a contract both implementations must satisfy; (2) traffic replay — capture live traffic from the source API and replay it against the migrated version, comparing responses; (3) LLM-generated test cases — use the LLM to generate test cases from the policy chain logic, with the explicit caveat that LLM-generated tests cannot catch what the LLM's code generation missed.
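The comparison step at the heart of traffic replay can be sketched as follows. This is a minimal illustration, not a prescribed implementation: it assumes responses from the source and migrated APIs have already been captured, and the names `RecordedResponse`, `diff_responses`, and the `VOLATILE_HEADERS` set are hypothetical. Which headers count as volatile would need to be decided per API.

```python
import json
from dataclasses import dataclass, field

# Assumption: headers expected to differ between any two runs.
# This set is illustrative and would need tuning per API.
VOLATILE_HEADERS = {"date", "x-request-id", "server"}


@dataclass
class RecordedResponse:
    """A captured response from either the source or the migrated API."""
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def _normalize_headers(headers):
    # Header names are case-insensitive; drop the ones known to vary.
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() not in VOLATILE_HEADERS
    }


def _normalize_body(body):
    # Compare JSON bodies structurally so key order does not matter;
    # fall back to the raw text for anything that is not valid JSON.
    try:
        return json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return body


def diff_responses(source, migrated):
    """Return a list of human-readable differences (empty if equivalent)."""
    diffs = []
    if source.status != migrated.status:
        diffs.append(f"status: {source.status} != {migrated.status}")

    src_h = _normalize_headers(source.headers)
    mig_h = _normalize_headers(migrated.headers)
    for name in sorted(set(src_h) | set(mig_h)):
        if src_h.get(name) != mig_h.get(name):
            diffs.append(
                f"header {name}: {src_h.get(name)!r} != {mig_h.get(name)!r}"
            )

    if _normalize_body(source.body) != _normalize_body(migrated.body):
        diffs.append("body differs")
    return diffs
```

An empty result means only that the two responses matched for that one request. Side effects that never appear in the response, such as those from custom scripts in policy chains, remain invisible to this kind of comparison.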
For the initial workbench of 20–30 APIs, the recommended approach is contract testing where specs exist, and LLM-generated tests with mandatory human sign-off where they do not. Traffic replay is the right long-term answer, but it requires operational infrastructure (traffic capture, replay tooling) that is out of scope for the initial build.
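One way to make this recommendation enforceable is to derive a migration's status from its validation evidence rather than setting it by hand. The sketch below is an assumption about how that could look. `MigrationRecord`, `Strategy`, and the status strings `"generated"`, `"awaiting_signoff"`, and `"validated"` are all hypothetical names, not part of any existing system.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    CONTRACT_TEST = "contract_test"
    LLM_TESTS_WITH_SIGNOFF = "llm_tests_with_signoff"


@dataclass
class MigrationRecord:
    """Hypothetical per-API record of validation evidence."""
    api_name: str
    has_spec: bool              # an OpenAPI or RAML spec exists for the source API
    tests_passed: bool = False
    human_signed_off: bool = False


def choose_strategy(record):
    # Contract testing where a spec exists; otherwise LLM-generated tests,
    # which carry a mandatory human sign-off.
    if record.has_spec:
        return Strategy.CONTRACT_TEST
    return Strategy.LLM_TESTS_WITH_SIGNOFF


def migration_status(record):
    """Derive the status from evidence instead of storing a 'completed' flag."""
    if not record.tests_passed:
        return "generated"          # code exists, behavior is unverified
    needs_signoff = choose_strategy(record) is Strategy.LLM_TESTS_WITH_SIGNOFF
    if needs_signoff and not record.human_signed_off:
        return "awaiting_signoff"   # LLM-generated tests passed, no human review yet
    return "validated"
```

With this shape, a migration whose code was generated but never tested cannot be reported as anything stronger than `"generated"`, which keeps the distinction between generated code and verified behavior visible in reporting.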