STET

flux-pr-1308

graphql-go-tools (Go) · W2 · GPT-5.3 Codex

pass_with_warn

Tests passed. 1/1 commands passed. Strength: strong.

93.1% run pass rate
Tier 1
primary tests passed · non equivalent · fail
go test -C v2 ./... -count=1 -timeout=300s
gold pass · agent pass

Partial score: 1/1

Publishable: yes · Cache: miss

Trajectory

codex · partial order only

provider-native trajectory captured; validation and decision steps are appended with coarse ordering only

session start
Session started
#1
assistant turn
Assistant turn
#2
tool call
Command started
#3

shell command

tool call
Command started
#4

shell command

tool result
Command finished
#5

shell command exit code 0

tool result
Command finished
#6

shell command exit code 0

tool call
Command started
#7

shell command

tool result
Command finished
#8

shell command exit code 0

assistant turn
Assistant turn
#9
tool call
Command started
#10

shell command

tool call
Command started
#11

shell command

tool call
Command started
#12

shell command

tool result
Command finished
#13

shell command exit code 0

tool result
Command finished
#14

shell command exit code 0

tool result
Command finished
#15

shell command exit code 0

tool call
Command started
#16

shell command

tool call
Command started
#17

shell command

tool result
Command finished
#18

shell command exit code 0

tool result
Command finished
#19

shell command exit code 0

tool call
Command started
#20

shell command

tool result
Command finished
#21

shell command exit code 0

assistant turn
Assistant turn
#22
tool call
Command started
#23

shell command

tool call
Command started
#24

shell command

tool call
Command started
#25

shell command

tool result
Command finished
#26

shell command exit code 0

tool result
Command finished
#27

shell command exit code 0

tool result
Command finished
#28

shell command exit code 0

tool call
Command started
#29

shell command

tool result
Command finished
#30

shell command exit code 0

tool call
Command started
#31

shell command

tool call
Command started
#32

shell command

tool call
Command started
#33

shell command

tool result
Command finished
#34

shell command exit code 0

tool result
Command finished
#35

shell command exit code 0

tool result
Command finished
#36

shell command exit code 0

tool call
Command started
#37

shell command

tool result
Command finished
#38

shell command exit code 0

tool call
Command started
#39

shell command

tool call
Command started
#40

shell command

tool call
Command started
#41

shell command

tool result
Command finished
#42

shell command exit code 0

tool result
Command finished
#43

shell command exit code 0

tool result
Command finished
#44

shell command exit code 0

tool call
Command started
#45

shell command

tool call
Command started
#46

shell command

tool call
Command started
#47

shell command

tool result
Command finished
#48

shell command exit code 0

patch written
Patch captured
#49

Flux captured agent.patch for this trial

validation
Tests passed
#50

go

equivalence
Equivalence judgment
#51

non_equivalent

code review
Code review judgment
#52

fail

decision
Final decision
#53

pass_with_warn

Quality

equivalence
non_equivalent
84% confidence
code review
fail
4 findings
footprint
medium (0.65)
behavioral
100.0%
cost
$11.74 · 6.0M

Equivalence Reasoning

behavioral

The patch appears to change the schema and introspection surface and to update some operation-level error locations, but it does not clearly implement the full required behavior. In particular, OneOf enforcement in runtime variable validation (`variablesvalidation`) and exposure of the directive in all expected introspection outputs are not evident. The introspection changes also center on an `isOneOf` field, which differs from the stated intent of exposing the `@oneOf` directive itself.
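To make the missing runtime check concrete, here is a minimal sketch of OneOf enforcement on a JSON-encoded variable value. It assumes the rule described in this report (exactly one field, and that field must not be null). `validateOneOfVariable` is a hypothetical helper written for illustration; it is not the `variablesvalidation` API and not the gold patch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// validateOneOfVariable is a hypothetical helper (not a graphql-go-tools API).
// It checks a JSON-encoded variable value that targets a OneOf input object:
// the value must be an object with exactly one key, and that key's value
// must not be null.
func validateOneOfVariable(raw []byte) error {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(raw, &fields); err != nil {
		return fmt.Errorf("OneOf value must be a JSON object: %w", err)
	}
	if len(fields) != 1 {
		return fmt.Errorf("OneOf value must set exactly one field, got %d", len(fields))
	}
	for name, value := range fields {
		// Treat an empty raw value or the literal null as a null field.
		if len(value) == 0 || bytes.Equal(bytes.TrimSpace(value), []byte("null")) {
			return fmt.Errorf("OneOf field %q must not be null", name)
		}
	}
	return nil
}

func main() {
	inputs := []string{
		`{"id": "1"}`,                   // one non-null field: accepted
		`{}`,                            // no field: rejected
		`{"id": "1", "email": "a@b.c"}`, // two fields: rejected
		`{"id": null}`,                  // one field, but null: rejected
	}
	for _, in := range inputs {
		fmt.Printf("%-32s -> %v\n", in, validateOneOfVariable([]byte(in)))
	}
}
```

A real implementation would walk the variable value against the schema so that nested OneOf input objects are checked as well; this sketch only covers a single top-level object.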

Code Review

correctness: 1/4 · edge case handling: 0/4 · introduced bug risk: 1/4 · maintainability idioms: 2/4

The patch appears only partially aligned with the task and likely does not satisfy the intended OneOf feature set end-to-end; the core validation behavior and the direction of the introspection changes both differ from the requested change.

4 findings
OneOf validation logic appears missing in Values rule
major

The shown change in `operation_rule_values.go` only introduces a directive-name constant and does not include OneOf enforcement (exactly one field, non-null, nullable-variable checks) required by the task.

v2/pkg/astvalidation/operation_rule_values.go:11
Introspection change targets `isOneOf` field rather than directive exposure
major

The patch modifies the `__Type` selections and fixtures to include `isOneOf`; the requested behavior is to expose and support `@oneOf` as a built-in directive in schema and introspection, so this is likely misaligned with the intended API output.

v2/pkg/engine/datasource/introspection_datasource/config_factory.go:71
No visible runtime variables-level OneOf enforcement
major

The task requires validation at runtime variables level as well; the provided diff excerpt does not show corresponding `variablesvalidation` OneOf checks, leaving key edge cases uncovered.

v2/pkg/astvalidation/operation_rule_values.go:11
Base schema OneOf directive lacks descriptive block expected by fixtures/introspection
major

The directive is added without the descriptive doc string, which can desynchronize generated schema/introspection descriptions from expected outputs and reduce compatibility with existing golden expectations.

v2/pkg/asttransform/baseschema.go:168
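
For reference, the operation-level checks named in the first finding (exactly one field, non-null, nullable-variable) can be sketched as below. This is an illustration under assumptions: `fieldValue` and `checkOneOfLiteral` are hypothetical names over a simplified model of an input object literal, whereas the real Values rule in `operation_rule_values.go` operates on AST nodes.

```go
package main

import "fmt"

// fieldValue is a hypothetical, simplified stand-in for one field of an
// input object literal in an operation document.
type fieldValue struct {
	Name             string
	IsNull           bool // the value is the literal null
	IsVariable       bool // the value is a variable reference such as $id
	VariableNullable bool // the variable's declared type is nullable (only meaningful if IsVariable)
}

// checkOneOfLiteral returns the problems found for a literal supplied to a
// OneOf input object type, or nil if the literal satisfies the three checks.
func checkOneOfLiteral(typeName string, fields []fieldValue) []string {
	if len(fields) != 1 {
		return []string{fmt.Sprintf(
			"OneOf input object %q must specify exactly one field, got %d",
			typeName, len(fields))}
	}
	f := fields[0]
	switch {
	case f.IsNull:
		return []string{fmt.Sprintf(
			"field %q of OneOf input object %q must not be null", f.Name, typeName)}
	case f.IsVariable && f.VariableNullable:
		return []string{fmt.Sprintf(
			"variable used for field %q of OneOf input object %q must be non-nullable",
			f.Name, typeName)}
	}
	return nil
}

func main() {
	cases := map[string][]fieldValue{
		"one literal field":   {{Name: "id"}},
		"null literal":        {{Name: "id", IsNull: true}},
		"nullable variable":   {{Name: "id", IsVariable: true, VariableNullable: true}},
		"non-null variable":   {{Name: "id", IsVariable: true}},
		"two fields supplied": {{Name: "id"}, {Name: "email"}},
	}
	for label, fields := range cases {
		fmt.Printf("%s: %v\n", label, checkOneOfLiteral("UserBy", fields))
	}
}
```

The type name `UserBy` is made up for the example. Map iteration order in Go is unspecified, so the cases may print in any order.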