Judged-open pocket penalty.
A public-safe dossier for the first M3 decision-quality trial. The exact operational artifacts remain private or delayed; this page publishes the hypothesis, boundary, and evidence contract.
Hypothesis
A narrow judged-debate OPEN_LONG context appears weaker than surrounding decision pockets. Applying a small additive rerank penalty to marginal candidates in that context should either demote weak opens in favor of safer alternatives or improve target-pocket quality, while preserving stronger judged-open behavior.
Target context
- Lane/origin: judged_debate
- Regime: trending_up
- Confidence bucket: 70-79
- Volatility bucket: 2-4%
- Action family: open_long
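The target context can be read as a conjunction of five conditions. The following is a minimal sketch of such a matcher, not the operational implementation, which remains private. The `DecisionContext` type, its field names, and the bucket edge handling (confidence treated as the half-open range 70 to 80, volatility as the closed range 2% to 4%) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionContext:
    """Hypothetical record describing one candidate decision."""
    lane: str
    regime: str
    confidence: float      # percent, 0-100
    volatility_pct: float  # percent
    action: str


def in_target_pocket(ctx: DecisionContext) -> bool:
    """Return True only when every target-context condition holds.

    Bucket boundaries are assumptions: the dossier names the buckets
    "70-79" and "2-4%" but does not state how edges are treated.
    """
    return (
        ctx.lane == "judged_debate"
        and ctx.regime == "trending_up"
        and 70 <= ctx.confidence < 80
        and 2.0 <= ctx.volatility_pct <= 4.0
        and ctx.action == "open_long"
    )
```

Because every condition must match, a candidate that differs in any single dimension (for example, a confidence of 80 or a different regime) falls outside the pocket and is left untouched.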
Allowed intervention
- One additive, reversible pocket penalty.
- Explicit experiment identifier.
- Feature-gated activation.
- Persisted audit metadata proving whether the rerank fired.
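The four allowed properties above can be sketched together in one small rerank step. This is an illustrative sketch under stated assumptions: the experiment identifier, the penalty magnitude, the `Candidate` type, and the audit field names are all hypothetical and are not taken from the private artifacts.

```python
from dataclasses import dataclass, field

# Hypothetical values, chosen only for illustration.
EXPERIMENT_ID = "judged-open-pocket-penalty-example"
PENALTY = 0.05


@dataclass
class Candidate:
    action: str
    score: float
    in_target_pocket: bool
    audit: dict = field(default_factory=dict)


def rerank(candidates, enabled):
    """Apply one additive penalty to in-pocket candidates, then sort.

    The input candidates are not mutated, so disabling the flag
    restores the original ordering (the change is reversible).
    """
    reranked = []
    for c in candidates:
        fired = enabled and c.in_target_pocket
        adjusted = c.score - PENALTY if fired else c.score
        # Audit metadata records whether the rerank fired for this candidate.
        audit = {
            "experiment_id": EXPERIMENT_ID,
            "flag_enabled": enabled,
            "penalty_fired": fired,
            "score_before": c.score,
            "score_after": adjusted,
        }
        reranked.append(Candidate(c.action, adjusted, c.in_target_pocket, audit))
    return sorted(reranked, key=lambda c: c.score, reverse=True)
```

With the flag enabled, a marginal in-pocket open that narrowly outscores a safer alternative can drop below it; with the flag disabled, scores and ordering are unchanged, and the audit record still shows that the penalty did not fire.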
Not allowed
- Global threshold rewrite.
- Broad exploration expansion.
- Sizing changes.
- Execution or fill changes.
- Current order-routing changes.
Decision gate
Promote or continue tuning only if the target pocket contracts or improves in quality, stronger judged-open buckets do not degrade, HOLD-rate movement is localized and explainable, and audit metadata is complete.
Kill or revert if judged-open quality worsens, stronger buckets degrade, action distribution shifts outside the target context, metadata is ambiguous, or the change behaves like a stealth global policy rewrite.
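The gate can be expressed as a small decision function. The sketch below is one possible encoding, assuming each criterion has already been reduced to a boolean by upstream analysis. The `TrialSummary` type and its field names are hypothetical, incomplete audit metadata is treated as ambiguous, and the "inconclusive" outcome is an assumption for trials that meet neither rule. The "stealth global policy rewrite" criterion is a judgment call and is approximated here only by the out-of-context shift flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialSummary:
    """Hypothetical boolean summary of one trial's evidence."""
    target_pocket_contracted: bool
    target_pocket_quality_improved: bool
    judged_open_quality_worsened: bool
    stronger_buckets_degraded: bool
    hold_shift_localized: bool
    shift_outside_target: bool
    audit_metadata_complete: bool


def decide(s: TrialSummary) -> str:
    # Kill conditions are checked first: any single one is sufficient.
    if (
        s.judged_open_quality_worsened
        or s.stronger_buckets_degraded
        or s.shift_outside_target
        or not s.audit_metadata_complete  # assumption: incomplete means ambiguous
    ):
        return "kill_or_revert"
    # Promotion requires every remaining condition to hold.
    if (
        (s.target_pocket_contracted or s.target_pocket_quality_improved)
        and s.hold_shift_localized
    ):
        return "promote_or_continue"
    # Assumption: the dossier does not name an outcome for this case.
    return "inconclusive"
```

Checking the kill conditions before the promotion conditions means that a trial showing target-pocket improvement is still reverted if stronger buckets degrade or the audit trail is incomplete.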