National lab publishes safety benchmark with industry participation

Which summary reads better? Pick one; the models are revealed afterward. Both summaries are AI-generated.

Match the models (Optional)

Which model wrote which summary? Select a matchup mapping below before voting.

Summary A

Researchers released an open safety benchmark built with input from several model providers, covering misuse, robustness, and refusal behavior. Contributors hope shared tests make safety claims easier to compare across systems.

Summary B

An open safety benchmark — covering misuse, robustness, and refusals — launched with multi-vendor input, aiming to make safety claims comparable.
