pyannote

Speaker Intelligence Platform for developers

Identify who speaks when with pyannote

💚 Simply detect, segment, label, and separate speakers in any language

Github Hugging Face Discord LinkedIn X
Playground Documentation

pyannoteAI facilitates the understanding of speakers and conversation context. We focus on identifying speakers and conversation metadata under conditions that reflect real conversations rather than controlled recordings.

🎤 What is speaker diarization?

Speaker diarization is the process of automatically partitioning the audio recording of a conversation into segments and labeling them by speaker, answering the question "who spoke when?". As the foundational layer of conversational AI, speaker diarization provides high-level insights for human-human and human-machine conversations, and unlocks a wide range of downstream applications: meeting transcription, call center analytics, voice agents, video dubbing.

▶️ Getting started

Install the latest release of pyannote.audio with either uv (recommended) or pip:

$ uv add pyannote.audio
$ pip install pyannote.audio

Enjoy state-of-the-art speaker diarization:

# download the pretrained pipeline from Hugging Face
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-community-1", token="HUGGINGFACE_TOKEN")

# perform speaker diarization locally
output = pipeline("/path/to/audio.wav")

# print who speaks when
for turn, speaker in output.speaker_diarization:
    print(f"{speaker} speaks between t={turn.start}s and t={turn.end}s")

Read the community-1 model card to make the most of it.
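
Once speaker turns are available, a common next step is to total the speaking time per speaker. The following is a minimal sketch in plain Python. It assumes the turns have already been collected as `(start, end, speaker)` tuples in seconds; the helper name `speaking_time` and the sample values are hypothetical and are not part of pyannote.audio.

```python
from collections import defaultdict


def speaking_time(turns):
    """Sum speech duration per speaker from (start, end, speaker) tuples."""
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


# Hypothetical turns, in seconds. With the loop shown above, they could be
# collected as:
#   [(turn.start, turn.end, speaker) for turn, speaker in output.speaker_diarization]
turns = [
    (0.0, 2.5, "SPEAKER_00"),
    (2.5, 4.0, "SPEAKER_01"),
    (4.0, 7.0, "SPEAKER_00"),
]

for speaker, seconds in sorted(speaking_time(turns).items()):
    print(f"{speaker}: {seconds:.1f}s")
```

Overlapping speech is counted once per speaker here, so the totals can add up to more than the duration of the recording.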

🏆 State-of-the-art models

The pyannoteAI research team trains cutting-edge speaker diarization models, thanks to the Jean Zay 🇫🇷 supercomputer managed by GENCI 💚. They come in two flavors:

  • pyannote.audio open models, available on Hugging Face and used by 140k+ developers around the world;
  • premium models, available on pyannoteAI cloud (and on-premise for enterprise customers), that provide state-of-the-art speaker diarization as well as additional enterprise features.

| Benchmark (last updated 2025-09) | legacy (3.1) | community-1 | precision-2 |
|----------------------------------|--------------|-------------|-------------|
| AISHELL-4                        | 12.2         | 11.7        | 11.4 🏆     |
| AliMeeting (channel 1)           | 24.5         | 20.3        | 15.2 🏆     |
| AMI (IHM)                        | 18.8         | 17.0        | 12.9 🏆     |
| AMI (SDM)                        | 22.7         | 19.9        | 15.6 🏆     |
| AVA-AVD                          | 49.7         | 44.6        | 37.1 🏆     |
| CALLHOME (part 2)                | 28.5         | 26.7        | 16.6 🏆     |
| DIHARD 3 (full)                  | 21.4         | 20.2        | 14.7 🏆     |
| Ego4D (dev.)                     | 51.2         | 46.8        | 39.0 🏆     |
| MSDWild                          | 25.4         | 22.8        | 17.3 🏆     |
| RAMC                             | 22.2         | 20.8        | 10.5 🏆     |
| REPERE (phase2)                  | 7.9          | 8.9         | 7.4 🏆      |
| VoxConverse (v0.3)               | 11.2         | 11.2        | 8.5 🏆      |

Diarization error rate (in %, the lower, the better)

Our models achieve competitive performance across multiple public diarization datasets. Explore the pyannoteAI performance benchmark ➡️ https://www.pyannote.ai/benchmark
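
For readers new to the metric, diarization error rate is conventionally defined as the sum of false alarm, missed detection, and speaker confusion durations, divided by the total duration of reference speech. The sketch below only illustrates that arithmetic. The function name and the sample durations are hypothetical and are not taken from the benchmark; for real annotations, a dedicated toolkit such as pyannote.metrics is the better choice.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total):
    """DER = (false alarm + missed detection + confusion) / total reference speech.

    All arguments are durations in seconds.
    """
    if total <= 0:
        raise ValueError("total reference speech duration must be positive")
    return (false_alarm + missed_detection + confusion) / total


# Hypothetical error durations for a 600-second reference (illustration only).
der = diarization_error_rate(
    false_alarm=12.0, missed_detection=30.0, confusion=18.0, total=600.0
)
print(f"DER = {100 * der:.1f}%")
```

Because false alarms are counted in the numerator but not in the denominator, the rate can exceed 100% on very noisy output.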

⏩️ Going further, better, and faster

The precision-2 premium model further improves accuracy and processing speed, and brings additional features.

Features compared between community-1 and precision-2:

  • Set exact/min/max number of speakers
  • Exclusive speaker diarization (for transcription)
  • Segmentation confidence scores
  • Speaker confidence scores
  • Voiceprinting
  • Speaker identification
  • STT Orchestration

Time to process 1h of audio (on H100): 37s with community-1, 14s with precision-2

Create a pyannoteAI account, change one line of code, and enjoy free cloud credits to try precision-2 premium diarization:

# perform premium speaker diarization on pyannoteAI cloud
pipeline = Pipeline.from_pretrained('pyannote/speaker-diarization-precision-2', token="PYANNOTEAI_API_KEY")
better_output = pipeline('/path/to/audio.wav')

🔌 Get speaker-attributed transcripts

We host open-source transcription models like Nvidia Parakeet-tdt-0.6b-v3 and OpenAI whisper-large-v3-turbo with specialized STT + diarization reconciliation logic for speaker-attributed transcripts.

STT orchestration combines pyannoteAI Precision-2 diarization with transcription services. Instead of running diarization and transcription separately, then reconciling the outputs manually, you make one API call and receive speaker-attributed transcripts.

To use this feature, make a request to the diarize API endpoint with the transcription:true flag.

# pip install pyannoteai-sdk

from pyannoteai.sdk import Client
client = Client("your-api-key")

job_id = client.diarize(
	"https://www.example/audio.wav",
	transcription=True)

job_output = client.retrieve(job_id)

for word in job_output['output']['wordLevelTranscription']:
	print(word['start'], word['end'], word['speaker'], word['text'])

for turn in job_output['output']['turnLevelTranscription']:
	print(turn['start'], turn['end'], turn['speaker'], turn['text'])
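
As a follow-up, the turn-level entries can be rendered as a readable transcript. This is a minimal sketch that assumes each entry is a dict with numeric `start` and `end` values in seconds plus `speaker` and `text` strings, mirroring the keys printed above. The helper name `format_transcript` and the sample payload are hypothetical and are not part of the SDK.

```python
def format_transcript(turns):
    """Render turn-level entries as '[start-end] speaker: text' lines."""
    lines = []
    for turn in turns:
        lines.append(
            f"[{turn['start']:.2f}s-{turn['end']:.2f}s] "
            f"{turn['speaker']}: {turn['text']}"
        )
    return "\n".join(lines)


# Hypothetical payload shaped like job_output['output']['turnLevelTranscription'].
sample_turns = [
    {"start": 0.0, "end": 1.8, "speaker": "SPEAKER_00", "text": "Hello, how are you?"},
    {"start": 2.1, "end": 3.4, "speaker": "SPEAKER_01", "text": "Fine, thanks."},
]

print(format_transcript(sample_turns))
```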

Pinned

  1. pyannote-audio (Public)

    Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

    Jupyter Notebook · 9.2k stars · 1k forks

  2. pyannoteAI-python-sdk (Public)

    pyannoteAI Python SDK

    Python · 11 stars · 2 forks

  3. pyannote-metrics (Public)

    A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

    Python · 242 stars · 43 forks

  4. aws-marketplace-docs (Public)

    pyannoteAI AWS Marketplace Diarization model

    Jupyter Notebook
