Introducing GPT-5.4
Introducing GPT-5.4, OpenAI’s most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and 1M-token context.
New foundation model releases and major version bumps
GPT-5.2 is our most advanced frontier model for everyday professional work, with state-of-the-art reasoning, long-context understanding, coding, and vision. Use it in ChatGPT and the OpenAI API to power faster, more…
OpenAI released GPT-Realtime-2, a voice model with GPT-5-class reasoning, tool use, interruption handling, and extended context windows up to 128K tokens, achieving top scores on Big Bench Audio and Conversational…
Gemma 4 was launched by Google under an Apache 2.0 license, marking a significant open-model release focused on reasoning, agentic workflows, multimodality, and on-device use. It outperforms models 10x larger and has…
Google launched Gemini 3.1 Flash Live, a realtime voice and vision agent model with 2x longer conversation memory, supporting 70 languages and 128k context. Mistral AI released Voxtral TTS, a low-latency, open-weight…
OpenAI released GPT-5.4 mini and GPT-5.4 nano, their most capable small models optimized for coding, multimodal understanding, and subagents, featuring a 400k context window and over 2x speed compared to GPT-5 mini…
NVIDIA’s Nemotron 3 Super is a 120B parameter / ~12B active open model featuring a hybrid Mamba-Transformer / SSM Latent MoE architecture and 1M context window, delivering up to 2.2x faster inference than GPT-OSS-120B…
Nimbus builds production AI systems — internal tools, customer agents, retrieval pipelines — combining humans and AI end-to-end. From scoped pilot to production in 4–8 weeks.
OpenAI rolled out GPT-5.4, tying Gemini 3.1 Pro Preview for #1 on the Artificial Analysis Intelligence Index with a score of 57 (up from 51 for GPT-5.2 xhigh). GPT-5.4 features a larger ~1.05M token context window…
Google DeepMind launched Gemini 3.1 Flash-Lite, emphasizing dynamic thinking levels for adjustable compute, with notable metrics like $0.25/M input, $1.50/M output, 1432 Elo on LMArena, and 2.5× faster…
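To put the listed rates in concrete terms, here is a minimal cost estimate for one request, assuming simple per-token billing at the quoted prices with no caching or batch discounts:

```python
# Estimate request cost from the quoted Gemini 3.1 Flash-Lite rates.
# Prices are per million tokens; real billing may add caching/batch discounts.
INPUT_PRICE_PER_M = 0.25   # $ per 1M input tokens (from the announcement)
OUTPUT_PRICE_PER_M = 1.50  # $ per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a 10k-token prompt producing a 2k-token response:
print(round(estimate_cost(10_000, 2_000), 6))  # → 0.0055
```

At these rates, even long-context prompts stay in fractions of a cent, which is the point of the Flash-Lite tier.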
GPT-5.2-Codex is OpenAI’s most advanced coding model, offering long-horizon reasoning, large-scale code transformations, and enhanced cybersecurity capabilities.
Introducing GPT-5 in our API platform—offering high reasoning performance, new controls for devs, and best-in-class results on real coding tasks.
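As a rough sketch of what the "new controls for devs" look like in a request body: the payload below follows the familiar chat-completions shape, with `reasoning_effort` shown as one such control. Treat the model name and parameter as assumptions from the announcement, not a verified API reference.

```python
import json

# Illustrative request body only; nothing is sent over the network here.
payload = {
    "model": "gpt-5",
    "messages": [
        {"role": "user", "content": "Refactor this function to be iterative."}
    ],
    "reasoning_effort": "medium",  # example developer control (assumed)
}
print(json.dumps(payload, indent=2))
```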
OpenAI rapidly expanded the GPT-5.5 family with multiple variants including gpt-image-2, GPT-5.5 Pro, and GPT-5.5 Cyber, receiving positive feedback for efficiency and usability. Codex evolved into a long-running agent…
Sam Sabin / Axios: OpenAI is rolling out GPT-5.5-Cyber, a security-focused variant of the model, in a limited preview capacity to vetted cybersecurity teams — The capabilities of the new models have sparked an urgent…
Best-in-class open omni-modal reasoning model delivers the highest efficiency and accuracy to power agentic workflows such as computer use, document intelligence and audio-video reasoning.
Alibaba released Qwen3.6-27B, a dense, Apache 2.0 open coding model with thinking and non-thinking modes, outperforming the larger Qwen3.5-397B-A17B on multiple coding benchmarks including SWE-bench and Terminal-Bench…
Moonshot's Kimi K2.6 is a major open-weight 1T-parameter MoE model featuring 32B active parameters, 384 experts, MLA attention, 256K context window, native multimodality, and INT4 quantization. It supports day-0…
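The sparse-MoE numbers above (1T total, 32B active, 384 experts) come down to top-k gating: a router scores every expert per token, and only the k highest-scoring experts actually run. A toy sketch of that routing step, with sizes shrunk for illustration (these are not Kimi K2.6's actual routing internals):

```python
import math
import random

NUM_EXPERTS = 8  # toy stand-in for K2.6's 384 experts
TOP_K = 2        # only k experts run per token, keeping active params small

def route(router_logits):
    """Select the top-k experts for one token; softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:TOP_K]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
print(route(logits))  # TOP_K (expert_index, weight) pairs; weights sum to 1
```

Because only the selected experts execute, per-token compute scales with the 32B active parameters rather than the full 1T.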
Meta Superintelligence Labs launched Muse Spark, a natively multimodal reasoning model featuring tool use, visual chain of thought, and multi-agent orchestration. It is live on meta.ai and the Meta AI app with a…
Alibaba released the Qwen 3.5 series with models ranging from 0.8B to 9B parameters, featuring native multimodality, scaled reinforcement learning, and targeting edge and lightweight agent deployments. The models…
GLM-4.7 and MiniMax M2.1 open-weight model releases highlight day-0 ecosystem support, coding throughput, and agent workflows, with GLM-4.7 achieving a +9.5% improvement over GLM-4.6 and MiniMax M2.1 positioned as an…
Advancing cost-efficient reasoning
Hey HN, Henry here from Cactus. We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices. We were always frustrated by…
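A function-calling model like this is typically used by handing it tool schemas and parsing a structured call back out. A minimal sketch of that loop, where both the schema style and the model's output format are generic assumptions rather than Needle's documented interface:

```python
import json

# A tool the model is allowed to call, in a generic JSON-schema style.
tools = [{
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

# Stand-in for the model's output: a JSON tool call (format assumed).
model_output = '{"tool": "get_weather", "arguments": {"city": "Berlin"}}'

def dispatch(output: str, registry: dict):
    """Parse the model's tool call and invoke the matching Python function."""
    call = json.loads(output)
    return registry[call["tool"]](**call["arguments"])

registry = {"get_weather": lambda city: f"weather({city})"}
print(dispatch(model_output, registry))  # → weather(Berlin)
```

The appeal of a 26M-parameter model here is that this parse-and-dispatch loop can run entirely on-device at the quoted token rates.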
OpenAI expands Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber, helping verified defenders accelerate vulnerability research and protect critical infrastructure.
Release v5.5.0: new model additions. Gemma 4 is a multimodal model with pretrained and instruction-tuned variants, available in 1B, 13B, and 27B parameters. The architecture is mostly the same as the previous…
Gemini 3.1 Pro demonstrates strong retrieval capabilities and cost efficiency compared to GPT-5.2 and Opus 4.6, though users report tooling and UI issues. The SWE-bench Verified evaluation methodology is under scrutiny…
South Korea's Ministry of Science launched a coordinated program with 5 companies to develop sovereign foundation models from scratch, featuring large-scale MoE architectures like SK Telecom A.X-K1 (519B total / 33B…
Alibaba released Qwen-Image-Layered, an open-source model enabling Photoshop-grade layered image decomposition with recursive infinite layers and prompt-controlled structure. Kling 2.6 introduced advanced motion…
NousResearch's Nomos 1 is a 30B open math model achieving a top Putnam score with only ~3B active parameters, enabling consumer Mac inference. AxiomProver also posts top Putnam results using ThinkyMachines' RL stack…