§ tools · cluster

vLLM v0.10.2

Sep 13 · 08:37:01 · primary fetch · 1 source · cluster c5b229f8 · updated Sep 13 · 08:37:01

Highlights

This release contains 740 commits from 266 contributors (97 new)!

Breaking Changes: This release includes a PyTorch 2.8.0 upgrade, V0 deprecations, and API changes - please review the changelog carefully.

aarch64 support: This release features native support for aarch64, allowing vLLM to be used on the GB200 platform. The docker image `vllm/vllm-openai` should already be multiplatform. To install the wheels, you can download them from this release's artifacts or install via

```
uv pip install vllm==0.10.2 --extra-index-url https://wheels.vllm.ai/0.10.2/ --torch-backend=auto
```

Model Support

New model families and enhancements: Apertus (#23068), LFM2 (#22845), MiDashengLM (#23652), Motif-1-Tiny (#23414), Seed-Oss (#23241), Google EmbeddingGemma-300m (#24318), GTE sequence classification (#23524), Donut OCR model (#23229), KeyeVL-1.5-8B (#23838), R-4B vision model (#23246), Ernie4.5 VL (#22514), MiniCPM-V 4.5 (#23586), Ovis2.5 (#23084), Qwen3-Next with hybrid attention (#24526), InternVL3.5 with video support (#23658), Qwen2Audio embeddings (#23625), NemotronH Nano VLM (#23644), BLOOM V1 engine support (#23488), and Whisper encoder-decoder for V1 (#21088).

Pipeline…
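Because this release carries breaking changes, it can help to confirm which vLLM version an environment actually resolved after running the install command. The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `matches_release` are hypothetical and not part of vLLM, and the assumption that a wheel may carry a local version label after a `+` is ours, not something the release notes state.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_release(package: str, expected: str) -> bool:
    """True only if `package` is installed and its public version equals `expected`."""
    found = installed_version(package)
    if found is None:
        return False
    # Assumption: a build may append a local version label after "+"
    # (PEP 440), so only the part before "+" is compared.
    return found.split("+", 1)[0] == expected


if __name__ == "__main__":
    # Hypothetical helpers defined above; they read package metadata only
    # and do not import vLLM itself.
    print("vllm installed:", installed_version("vllm"))
    print("is 0.10.2:", matches_release("vllm", "0.10.2"))
```

Reading package metadata rather than importing `vllm` keeps the check cheap and avoids loading PyTorch just to read a version string.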


§ sources · 1 publication · timeline below

github.com · vllm v0.10.2 · primary · 08:37:01