Pinned

  1. vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 70.4k stars · 13.5k forks

  2. llm-compressor Public

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python · 2.7k stars · 393 forks

  3. recipes Public

    Common recipes to run vLLM

    Jupyter Notebook · 429 stars · 143 forks

  4. speculators Public

    A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM

    Python · 241 stars · 39 forks

  5. semantic-router Public

    System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

    Go · 3.2k stars · 534 forks

Repositories

Showing 10 of 34 repositories
  • vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 70,369 stars · Apache-2.0 · 13,464 forks · 1,685 open issues (46 issues need help) · 1,639 pull requests · Updated Feb 16, 2026
  • guidellm Public

    Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

    Python · 852 stars · Apache-2.0 · 124 forks · 62 open issues (5 issues need help) · 20 pull requests · Updated Feb 16, 2026
  • production-stack Public

    vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

    Python · 2,169 stars · Apache-2.0 · 366 forks · 94 open issues (3 issues need help) · 55 pull requests · Updated Feb 16, 2026
  • semantic-router Public

    System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

    Go · 3,193 stars · Apache-2.0 · 534 forks · 109 open issues (22 issues need help) · 56 pull requests · Updated Feb 16, 2026
  • ci-infra Public

    This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

    HCL · 29 stars · Apache-2.0 · 59 forks · 0 open issues · 28 pull requests · Updated Feb 15, 2026
  • tpu-inference Public

    TPU inference for vLLM, with unified JAX and PyTorch support.

    Python · 235 stars · Apache-2.0 · 100 forks · 45 open issues (1 issue needs help) · 128 pull requests · Updated Feb 15, 2026
  • vllm-metal Public

    Community maintained hardware plugin for vLLM on Apple Silicon

    Python · 454 stars · Apache-2.0 · 45 forks · 8 open issues (2 issues need help) · 5 pull requests · Updated Feb 15, 2026
  • vllm-daily Public

    vLLM Daily Summarization of Merged PRs

    40 stars · 3 forks · 0 open issues · 0 pull requests · Updated Feb 15, 2026
  • recipes Public

    Common recipes to run vLLM

    Jupyter Notebook · 429 stars · Apache-2.0 · 143 forks · 11 open issues · 42 pull requests · Updated Feb 15, 2026
  • bart-plugin Public

    vLLM Model plugin for the encoder-decoder BART model

    Python · 7 stars · Apache-2.0 · 1 fork · 0 open issues · 1 pull request · Updated Feb 15, 2026