Which AI writes the better take? You decide — blind.

Two top models go head-to-head on today's AI news. Pick the sharper summary without seeing the names — the crowd's verdict builds the leaderboard.

Agents & Inference · Simon Willison

GLM-5.2 is probably the most powerful text-only open weights LLM

Which summary reads better? Pick one (models revealed after). Both summaries are AI-generated.

Match the models (Optional)

Which model wrote which summary? Select a matchup mapping below before voting.

Summary A

Chinese AI lab Z.ai released GLM-5.2, a 753 billion-parameter open-weight language model under an MIT license, positioning it as the most powerful text-only open model available. Benchmarks show it leads in performance but consumes more tokens than competitors, while early tests highlight strong coding and creative output, though some results lag behind its predecessor. The model is now accessible via multiple providers at competitive pricing.

Summary B

Z.ai has released GLM-5.2 as open weights under an MIT license, a 753B-parameter text-only mixture-of-experts model with a 1 million token context window. It leads open-weight models on Artificial Analysis’s Intelligence Index and ranks second on the Code Arena WebDev leaderboard, though benchmarks suggest it uses more output tokens per task than peers.

What you'll learn · Jun 18, 2026 · 6 stories

More stories

Agents & Inference · OpenAI

A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry

Summary A

A near-autonomous AI chemist powered by advanced language models has successfully enhanced a complex drug-making reaction, marking a significant step forward in medicinal chemistry. The breakthrough demonstrates how AI can accelerate and refine challenging processes in pharmaceutical research.

Summary B

OpenAI and Molecule.one demonstrated a near-autonomous AI chemist powered by GPT-5.4 that improved a challenging reaction used in drug development. The work points to how AI systems could help accelerate medicinal chemistry research by optimizing key steps in molecule synthesis.

Agents & Inference · arXiv

When Rules Learn: A Self-Evolving Agent for Legal Case Retrieval

Summary A

Researchers developed a self-evolving AI agent that improves legal case retrieval by automatically refining search rules without additional training. The system, tested on a Chinese legal benchmark, outperformed traditional methods by using large language models to iteratively test and eliminate ineffective rules. Findings highlight the AI's ability to leverage past results and built-in knowledge to enhance search precision.

Summary B

Researchers proposed a self-evolving LLM-based agent that improves legal case retrieval by automatically creating, testing and pruning query-rewriting rules for BM25 without parameter training. Evaluated on the Chinese LeCaRD-v2 benchmark, the framework outperformed non-evolutionary baselines such as human-designed rules and greedy rule selection, with gains tied to the LLM’s ability to use prior experimental feedback and eliminate weak rules.

Agents & Inference · Hugging Face

Agentic Resource Discovery: Let agents search

Summary A

Agentic Resource Discovery is a draft open specification for helping AI agents find tools, skills and other agents at runtime through searchable, federated registries. Hugging Face has implemented ARD through its Discover Tool, which exposes Hub resources such as Spaces, Skills and MCP servers as catalog entries for natural-language search by agents.

Summary B

Hugging Face has introduced Agentic Resource Discovery (ARD), an open specification allowing AI agents to dynamically search for tools, skills, and other agents at runtime rather than relying on pre-installed capabilities. Developed with industry contributors like Microsoft and Google, ARD enables federated registries to catalog and index resources, improving scalability and reducing manual integration. Hugging Face’s reference implementation, the Discover Tool, already provides search access to thousands of skills and applications across its platform and other ARD-compatible services.

Agents & Inference · TechCrunch

How to turn off AI in your Google Docs

Summary A

Google Docs users can disable AI features like Gemini pop-ups by adjusting settings in Gmail to turn off smart features across Google Workspace. The process avoids the frustration of individually closing intrusive AI prompts, though some users report persistent hover tools that may require separate adjustments. The change helps prevent disruptions while writing or editing documents.

Summary B

Google Docs users can disable Gemini prompts and other AI writing features by turning off Google Workspace “smart features” through Gmail settings. The guide frames the change as a way to stop intrusive AI pop-ups and writing suggestions from interrupting work in Docs.

Agents & Inference · Mistral

Introducing Search Toolkit

Summary A

Mistral released Search Toolkit in public preview, an open-source framework for building production search pipelines for AI applications. It combines ingestion, retrieval, and evaluation under a shared interface, aiming to reduce integration work for teams building enterprise search, RAG workflows, internal knowledge systems, and domain-specific retrieval.

Summary B

Mistral has launched Search Toolkit, an open-source framework designed to simplify building production search pipelines for AI applications by unifying ingestion, retrieval, and evaluation tools. The toolkit aims to reduce engineering overhead, allowing teams to focus on improving search quality rather than maintaining integrations across disparate systems. It supports cloud, on-premises, and edge deployments, offering consistent processing for diverse data sources and built-in evaluation for retrieval performance.

Takeaways written by DeepSeek V3 — not one of this week's two contestants.