shipfeed · Research

Research · 17 clusters

N° 001 · ▲ biggest story · ai · 10:00 CET

Tilde Research Introduces Aurora: A Leverage-Aware Optimizer ...

Tilde Research introduces Aurora, a leverage-aware optimizer that addresses a hidden neuron death problem in Muon.

via marktechpost.com

N° 002 · ▶ ai · 02:00:00 · RESEARCH

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Proposes AutoTTS, an environment-driven framework using agentic methods for LLMs to discover and apply test-time scaling strategies, shifting from manual heuristics to automated discovery.

via reddit.com

N° 003 · ▶ research · 12:39:22 · RESEARCH

State space models emerge as serious transformer competitor

Inside the core ideas, potential and challenges of SSMs

via thesequence.substack.com

N° 004 · ▶ qwen · 02:00:00 · RESEARCH

Hosting Qwen on Blackwell

Details Perplexity's inference setup for serving post-trained Qwen3 235B models on NVIDIA Blackwell GPUs, optimizing for cost and performance.

via research.perplexity.ai

N° 005 · ▶ research · 02:00:00 · RESEARCH

What Parameter Golf taught us about AI-assisted research

Parameter Golf brought together 1,000+ participants and 2,000+ submissions to explore AI-assisted machine learning research, coding agents, quantization, and novel model design under strict constraints.

via openai.com

Monday, May 11, 2026’s edition

N° 001 · ▶ ai · 19:52:15 · RESEARCH

Meta and Stanford propose transformer cutting inference memory by 50%

Meta, Stanford, and University of Washington researchers propose methods to accelerate Byte Latent Transformer (BLT) generation, reducing inference memory bandwidth by over 50% without tokenization using diffusion and…

via marktechpost.com

N° 002 · ▶ ai · 02:00:00 · RESEARCH

SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests

Microsoft Research introduced SocialReasoning-Bench, a benchmark evaluating AI agents' social reasoning in calendar coordination and marketplace negotiation, testing outcome optimality and due diligence.

via microsoft.com

N° 003 · ▶ agents · 19:49:43 · RESEARCH

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

via arxiv.org

N° 004 · ▶ safety · 17:58:37 · RESEARCH

MATRA: Modeling the Attack Surface of Agentic AI Systems -- OpenClaw Case Study

via arxiv.org

Sunday, May 10, 2026’s edition

N° 001 · ▶ agents · 13:45:41 · RESEARCH

AI agents that hack computers and replicate themselves, and they're getting better fast

Palisade Research shows that AI agents can hack remote computers, copy themselves onto them, and form replication chains. In one year, the success rate jumped from 6 to 81 percent. The researchers expect remaining…

via the-decoder.com

Saturday, May 9, 2026’s edition

N° 001 · ▶ mcp · 21:25:56 · RESEARCH

MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

via arxiv.org

N° 002 · ▶ claude · 02:00:00 · RESEARCH

METR evaluated an early version of Claude Mythos

METR conducted risk assessment on an early version of Anthropic's Claude Mythos Preview in March 2026, estimating significant capabilities.

via reddit.com (primary) · the-decoder.com

Friday, May 8, 2026’s edition

N° 001 · ▶ research · 18:03:50 · RESEARCH

EMO: Pretraining mixture of experts for emergent modularity

via huggingface.co

N° 002 · ▶ ai · 02:00:00 · RESEARCH

UCLA gets $5M DARPA grant for AI math proof tools

UCLA awarded $5M DARPA grant for ALPHA project to develop AI for automating mathematical proof synthesis and verification in domains like PDEs and number theory.

via completeaitraining.com

Thursday, May 7, 2026’s edition

N° 001 · ▶ claude · 19:54:02 · RESEARCH

Natural Language Autoencoders: Turning Claude's Thoughts into Text

via anthropic.com (primary) · the-decoder.com · techmeme.com

N° 002 · ▶ research · 19:59:20 · RESEARCH

EMO: Pretraining Mixture of Experts for Emergent Modularity

via arxiv.org

Sunday, July 27, 2025’s edition

N° 001 · ▶ ai · 02:00:00 · RESEARCH

LLM Agents Making Agent Tools

ACL Anthology PDF version of the work showing autonomous tool creation that can go beyond simple Python functions and produce tools for real-world scientific tasks.

via aclanthology.org
