STET

flux-commit-0064304a

Zod (TypeScript) · W2 · GPT-5.1 Codex Mini

fail_high_conf

Tests failed. 1/9 commands passed. Strength: strong.

61.5% run pass rate
Tier 1
primary tests failed · command source drift · non equivalent · fail
yarn test -- --runInBand
gold pass · agent pass
pytest -q tests/behavior/test_index_exports_lowercase_infer_alias_only.py
gold pass · agent fail
pytest -q tests/behavior/test_readme_parse_deep_clone_wording.py
gold pass · agent fail
pytest -q tests/behavior/test_readme_prefers_lowercase_infer_everywhere.py
gold pass · agent fail
pytest -q tests/behavior/test_index_exports_lowercase_infer_alias.py
gold pass · agent fail
pytest -q tests/behavior/test_readme_prefers_lowercase_infer.py
gold pass · agent fail
pytest -q tests/behavior/test_readme_records_section.py
gold unknown · agent (status not shown)
pytest -q tests/behavior/test_readme_parse_deep_clone.py
gold pass · agent fail
pytest -q tests/behavior/test_readme_records_with_infer.py
gold pass · agent fail

Partial score: 1/8
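
The headline counts 1/9 commands passed, while the partial score is 1/8 and the behavioral figure is 12.5%. One reading that reconciles these numbers is sketched below in TypeScript. It assumes that commands whose gold result is unknown are excluded from the partial score; the report does not state this rule. The names `Outcome`, `CommandResult`, and `partialScore` are illustrative and not part of Stet.

```typescript
type Outcome = "pass" | "fail" | "unknown";

interface CommandResult {
  command: string;
  gold: Outcome;
  agent: Outcome;
}

// Results as listed in this report. The agent status for
// test_readme_records_section.py is not shown, so it is recorded as
// "unknown" here (an assumption).
const results: CommandResult[] = [
  { command: "yarn test -- --runInBand", gold: "pass", agent: "pass" },
  { command: "test_index_exports_lowercase_infer_alias_only.py", gold: "pass", agent: "fail" },
  { command: "test_readme_parse_deep_clone_wording.py", gold: "pass", agent: "fail" },
  { command: "test_readme_prefers_lowercase_infer_everywhere.py", gold: "pass", agent: "fail" },
  { command: "test_index_exports_lowercase_infer_alias.py", gold: "pass", agent: "fail" },
  { command: "test_readme_prefers_lowercase_infer.py", gold: "pass", agent: "fail" },
  { command: "test_readme_records_section.py", gold: "unknown", agent: "unknown" },
  { command: "test_readme_parse_deep_clone.py", gold: "pass", agent: "fail" },
  { command: "test_readme_records_with_infer.py", gold: "pass", agent: "fail" },
];

// Assumed rule: only commands that pass on the gold patch are scored.
function partialScore(rs: CommandResult[]): { passed: number; total: number; percent: number } {
  const scored = rs.filter((r) => r.gold === "pass");
  const passed = scored.filter((r) => r.agent === "pass").length;
  return { passed, total: scored.length, percent: (passed / scored.length) * 100 };
}

// Headline count: agent passes over every listed command.
const commandsPassed = results.filter((r) => r.agent === "pass").length;

const score = partialScore(results);
console.log(`${commandsPassed}/${results.length} commands passed`);
console.log(`partial score ${score.passed}/${score.total} (${score.percent}%)`);
```

Under that assumption the nine listed commands reduce to eight scored ones, which matches both the 1/8 partial score and the 12.5% behavioral figure.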

Publishable: yes · Cache: miss

Trajectory

unknown · partial order only

Canonical trajectory missing; showing coarse derived order only.

#1 patch written: Patch captured
Stet captured agent.patch for this trial.

#2 validation: Tests failed

#3 equivalence: Equivalence judgment
non_equivalent

#4 code review: Code review judgment
fail

#5 decision: Final decision
fail_high_conf

Quality

equivalence: non_equivalent · 86% confidence
code review: fail · 3 findings
footprint: high (1.00)
behavioral: 12.5%
cost: $0.82 · 2.2M

Equivalence Reasoning

behavioral

The patch appears to miss the core intent. The only README change it shows is a table-of-contents tweak: there is no visible Records documentation, no parse deep-clone clarification, and no broad `z.infer` doc update. Most of its changes and additions land in generated-style files under `app/lib/src` rather than in the expected source and docs locations. It does not convincingly implement the requested documentation and public API alignment change.

Code Review

correctness: 0/4 · introduced bug risk: 0/4 · edge case handling: 0/4 · maintainability idioms: 0/4

The agent patch is very unlikely to satisfy the intended task: it only partially touched README navigation, missed key documentation and source-export updates, and added extensive generated `lib/` files that increase risk and maintenance cost.

3 findings
README task content is largely missing
major

Only the Records table-of-contents link was added; the required parse deep-clone explanation, widespread `z.infer` replacement examples, and full Records section content were not implemented.

README.md:25

Inference alias update was applied in built output, not source API file
major

The patch adds `TypeOf as infer` in `lib/src/index.d.ts`, but the requested public API change should be made in source (`src/index.ts`) so docs/tests/builds remain consistent.

app/lib/src/index.d.ts:41

Patch introduces broad generated `lib/` churn unrelated to requested change
major

Large additions of compiled JS, `.d.ts`, and source-map files under `lib/src` add noise and maintenance burden, and are not needed to satisfy the targeted docs and source updates.

app/lib/src/ZodError.js:1
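
The second finding turns on where the lowercase alias is declared. As a rough, single-file sketch of what a lowercase `infer` alias for `TypeOf` looks like to a caller: `SchemaLike`, the `z` namespace, and `stringSchema` below are hypothetical stand-ins, not Zod's actual source. In a module file such as `src/index.ts` the alias would more likely be an export specifier of the form the review quotes (`TypeOf as infer`).

```typescript
// Hypothetical stand-in for a schema; real Zod schema types are richer.
interface SchemaLike<Output> {
  readonly _type: Output; // phantom field, used only for type extraction
  parse(data: unknown): Output;
}

// Extracts the output type carried by a schema.
type TypeOf<S extends SchemaLike<unknown>> = S["_type"];

// Single-file stand-in for the library's public entry point.
namespace z {
  // Lowercase alias for TypeOf, declared in source rather than in built output.
  export type infer<S extends SchemaLike<unknown>> = TypeOf<S>;
}

// Toy string schema, for illustration only.
const stringSchema: SchemaLike<string> = {
  _type: undefined as unknown as string,
  parse(data: unknown): string {
    if (typeof data !== "string") throw new Error("expected a string");
    return data;
  },
};

// Callers derive the static type through the lowercase alias.
type UserName = z.infer<typeof stringSchema>;
const userName: UserName = stringSchema.parse("Ada");
console.log(userName);
```

Declaring the alias once in source means the generated declaration files pick it up on the next build, so there is no need to hand-edit or commit output under `lib/`.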