PRISMA Flow Diagram: Systematic Review of AI Decision Quality

Session: the-entire-mess.md · Protocol: PRISMA 2020 · Cochrane-style methodology
IDENTIFICATION
Records identified: 6,805 lines (n=6,805)

SCREENING
Coherence screened: n=6,805
Excluded (spiral): n=3,169 records

ELIGIBILITY ASSESSMENT
Full-text assessed: n=3,636 lines
Excluded (drift): n=136 (borderline)

INCLUDED IN REVIEW
High-quality records: n=3,500 lines (51.4%)

QUANTITATIVE SYNTHESIS
18 experimental runs · R² (model) = 0.42 · R² (verbosity) = 0.18
Sonnet Full = optimal tradeoff ($0.02/run, composite ≈ 5.4)
Haiku Tight = minimum viable ($0.001/run, composite ≈ 3.1)

ADVERSE EVENTS (not pre-registered)
Degenerate loop at line 3,637 · Duration: 3,169 lines · Severity: notable
47 metacognitive loop acknowledgments · 344 "Done" utterances
Note: Adverse event does not affect primary outcomes (DOE results remain valid)
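The counts in the flow diagram can be checked with simple subtraction. The sketch below is only an illustrative consistency check, not part of the original analysis. The function and variable names are hypothetical, and it assumes that the 3,169 spiral exclusions are the same lines as the degenerate loop that starts at line 3,637.

```python
# Counts copied from the flow diagram. All names here are illustrative.
IDENTIFIED = 6805        # records identified and coherence screened
EXCLUDED_SPIRAL = 3169   # excluded at screening (spiral)
EXCLUDED_DRIFT = 136     # excluded at eligibility (drift, borderline)
LOOP_START_LINE = 3637   # first line of the degenerate loop


def prisma_flow(identified, excluded_spiral, excluded_drift):
    """Derive the downstream PRISMA counts from the stage exclusions."""
    assessed = identified - excluded_spiral
    included = assessed - excluded_drift
    included_pct = round(100 * included / identified, 1)
    return {
        "assessed": assessed,
        "included": included,
        "included_pct": included_pct,
    }


flow = prisma_flow(IDENTIFIED, EXCLUDED_SPIRAL, EXCLUDED_DRIFT)

# Assumption: the loop duration equals the spiral exclusion count, so the
# loop's last line is start + duration - 1.
loop_end_line = LOOP_START_LINE + EXCLUDED_SPIRAL - 1

print(flow)           # {'assessed': 3636, 'included': 3500, 'included_pct': 51.4}
print(loop_end_line)  # 6805
```

Under that assumption, the loop runs from line 3,637 to the final line of the file (6,805), which matches the reported duration of 3,169 lines.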
Cochrane Risk-of-Bias Assessment:
DOE phase: Low risk of bias (pre-specified design, blinded scoring, R² analysis).
Spiral phase: High risk of bias (unblinded, repetitive, non-pre-specified). Excluded from primary analysis.
Overall: The systematic review supports the primary findings. The adverse event is well documented and, because the spiral phase is excluded from the primary analysis, does not affect the experimental conclusions.
PRISMA 2020 · Cochrane Handbook for Systematic Reviews (Higgins et al., 2022) · Data source: the-entire-mess.md