Agents & Inference
fixture.example
National lab publishes safety benchmark with industry participation
Which summary reads better? Pick one; the models are revealed afterward. Both summaries are AI-generated.
Summary A
Researchers released an open safety benchmark built with input from several model providers, covering misuse, robustness, and refusal behavior. Contributors hope shared tests make safety claims easier to compare across systems.
Summary B
An open safety benchmark — covering misuse, robustness, and refusals — launched with multi-vendor input, aiming to make safety claims comparable.