Speech & TranscriptionDocumentedScanned

parakeet-stt

>-.

Installation

npx clawhub@latest install parakeet-stt

View the full skill documentation and source below.

Documentation

Parakeet TDT (Speech-to-Text)

Local transcription using NVIDIA Parakeet TDT 0.6B v3 with ONNX Runtime.
Runs on CPU — no GPU required. ~30x faster than realtime.

Installation

# Clone the repo
git clone 
cd parakeet-tdt-0.6b-v3-fastapi-openai

# Run with Docker (recommended)
docker compose up -d parakeet-cpu

# Or run directly with Python
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 5000

Default port is 5000. Set PARAKEET_URL to override (e.g., ). ## API Endpoint OpenAI-compatible API at$PARAKEET_URL (default: ).

Quick Start

# Transcribe audio file (plain text)
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=text"

# Get timestamps and segments
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=verbose_json"

# Generate subtitles (SRT)
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=srt"

Python / OpenAI SDK

import os
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("PARAKEET_URL", "") + "/v1",
    api_key="not-needed"
)

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="parakeet-tdt-0.6b-v3",
        file=f,
        response_format="text"
    )
print(transcript)

Response Formats

Format

Output

`text`	Plain text
`json`	`{"text": "..."}`
`verbose_json`	Segments with timestamps and words
`srt`	SRT subtitles
`vtt`	WebVTT subtitles

Supported Languages (25)

English, Spanish, French, German, Italian, Portuguese, Polish, Russian,
Ukrainian, Dutch, Swedish, Danish, Finnish, Norwegian, Greek, Czech,
Romanian, Hungarian, Bulgarian, Slovak, Croatian, Lithuanian, Latvian,
Estonian, Slovenian

Language is auto-detected — no configuration needed.

Web Interface

Open $PARAKEET_URL in a browser for drag-and-drop transcription UI.

Docker Management

# Check status
docker ps --filter "name=parakeet"

# View logs
docker logs -f <container-name>

# Restart
docker compose restart

# Stop
docker compose down

Why Parakeet over Whisper?

Speed: ~30x faster than realtime on CPU
Accuracy: Comparable to Whisper large-v3
Privacy: Runs 100% locally, no cloud calls
Compatibility: Drop-in replacement for OpenAI's transcription API

Back to Skills Directory

Documentation

Parakeet TDT (Speech-to-Text)

Local transcription using NVIDIA Parakeet TDT 0.6B v3 with ONNX Runtime.
Runs on CPU — no GPU required. ~30x faster than realtime.

Installation

# Clone the repo
git clone 
cd parakeet-tdt-0.6b-v3-fastapi-openai

# Run with Docker (recommended)
docker compose up -d parakeet-cpu

# Or run directly with Python
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 5000

Default port is 5000. Set PARAKEET_URL to override (e.g., ). ## API Endpoint OpenAI-compatible API at$PARAKEET_URL (default: ).

Quick Start

# Transcribe audio file (plain text)
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=text"

# Get timestamps and segments
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=verbose_json"

# Generate subtitles (SRT)
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=srt"

Python / OpenAI SDK

import os
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("PARAKEET_URL", "") + "/v1",
    api_key="not-needed"
)

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="parakeet-tdt-0.6b-v3",
        file=f,
        response_format="text"
    )
print(transcript)

Response Formats

Format

Output

`text`	Plain text
`json`	`{"text": "..."}`
`verbose_json`	Segments with timestamps and words
`srt`	SRT subtitles
`vtt`	WebVTT subtitles

Supported Languages (25)

Language is auto-detected — no configuration needed.

Web Interface

Open $PARAKEET_URL in a browser for drag-and-drop transcription UI.

Docker Management

# Check status
docker ps --filter "name=parakeet"

# View logs
docker logs -f <container-name>

# Restart
docker compose restart

# Stop
docker compose down

Why Parakeet over Whisper?

Speed: ~30x faster than realtime on CPU
Accuracy: Comparable to Whisper large-v3
Privacy: Runs 100% locally, no cloud calls
Compatibility: Drop-in replacement for OpenAI's transcription API

parakeet-stt

Installation

Documentation

Parakeet TDT (Speech-to-Text)

Installation

Quick Start

Python / OpenAI SDK

Response Formats

Supported Languages (25)

Web Interface

Docker Management

Why Parakeet over Whisper?

Related Skills in Speech & Transcription

addis-assistant-stt

assemblyai-transcribe

audio-gen

audio-reply

critical-article-writer

parakeet-stt

Installation

Documentation

Parakeet TDT (Speech-to-Text)

Installation

Quick Start

Python / OpenAI SDK

Response Formats

Supported Languages (25)

Web Interface

Docker Management

Why Parakeet over Whisper?

Related Skills in Speech & Transcription

addis-assistant-stt

assemblyai-transcribe

audio-gen

audio-reply

critical-article-writer