Overview

Smallest AI provides real-time speech-to-text transcription over a WebSocket connection to its Waves API. The service streams audio continuously to the Pulse model and receives interim and final transcription results with low latency.

Smallest AI STT API Reference

Complete API reference for all parameters and methods

Example Implementation

Complete example with WebSocket streaming

Installation

pip install "pipecat-ai[smallest]"

Prerequisites

  1. Smallest AI Account: Sign up at Smallest AI
  2. API Key: Generate an API key from your account dashboard
Set the following environment variable:
export SMALLEST_API_KEY=your_api_key
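
Reading the key once at startup and failing early makes a missing variable easier to diagnose than an authentication error later in the pipeline. The following is a minimal sketch; the helper name `load_smallest_api_key` is hypothetical and not part of Pipecat.

```python
import os


def load_smallest_api_key(env=None):
    """Return the Smallest AI API key, or raise a clear error if it is unset.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("SMALLEST_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SMALLEST_API_KEY is not set; export it before starting the bot."
        )
    return key
```

The returned value can then be passed as the api_key argument instead of calling os.getenv directly.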

Configuration

  • api_key (str, required): Smallest AI API key for authentication.
  • base_url (str, default: "wss://api.smallest.ai"): Base WebSocket URL for the Smallest API. Override for custom or proxied deployments.
  • encoding (str, default: "linear16"): Audio encoding format.
  • sample_rate (int, default: None): Audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
  • settings (SmallestSTTService.Settings, default: None): Runtime-configurable settings. See Settings below.
  • ttfs_p99_latency (float, default: SMALLEST_TTFS_P99): P99 latency from speech end to final transcript in seconds. Used for processing metrics.
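
The encoding and sample_rate values together determine how many bytes each audio chunk carries. The sketch below works through that arithmetic, assuming "linear16" has its conventional meaning of 16-bit PCM (2 bytes per sample) and that the audio is mono. The function name `linear16_chunk_bytes` is hypothetical.

```python
def linear16_chunk_bytes(sample_rate, chunk_ms, channels=1):
    """Number of bytes in one chunk of 16-bit PCM audio.

    Assumes 2 bytes per sample, the conventional meaning of "linear16".
    """
    bytes_per_sample = 2
    samples_per_chunk = sample_rate * chunk_ms // 1000
    return samples_per_chunk * bytes_per_sample * channels


# 16 kHz mono audio in 20 ms chunks: 320 samples * 2 bytes each
print(linear16_chunk_bytes(16000, 20))  # 640
```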

Settings

Runtime-configurable settings passed via the settings constructor argument using SmallestSTTService.Settings(...). These can be updated mid-conversation with STTUpdateSettingsFrame. See Service Settings for details.
Parameter            Type            Default      Description
model                str             pulse        Model identifier. Currently only pulse is supported.
language             Language | str  Language.EN  Language code for transcription.
word_timestamps      bool            False        Include word-level timestamps in transcription results.
full_transcript      bool            False        Include cumulative transcript in results.
sentence_timestamps  bool            False        Include sentence-level timestamps in transcription results.
redact_pii           bool            False        Redact personally identifiable information from transcripts.
redact_pci           bool            False        Redact payment card information from transcripts.
numerals             str             auto         Convert spoken numerals to digits. Options: auto, always, or none.
diarize              bool            False        Enable speaker diarization to identify different speakers.

Usage

Basic Setup

import os

from pipecat.services.smallest.stt import SmallestSTTService
from pipecat.transcriptions.language import Language

stt = SmallestSTTService(
    api_key=os.getenv("SMALLEST_API_KEY"),
    settings=SmallestSTTService.Settings(
        language=Language.EN,
    ),
)

With Advanced Features

stt = SmallestSTTService(
    api_key=os.getenv("SMALLEST_API_KEY"),
    settings=SmallestSTTService.Settings(
        language=Language.ES,
        word_timestamps=True,
        diarize=True,
        redact_pii=True,
    ),
)

Updating Settings at Runtime

Transcription settings can be changed mid-conversation using STTUpdateSettingsFrame:
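from pipecat.frames.frames import STTUpdateSettingsFrame
from pipecat.services.smallest.stt import SmallestSTTSettings
from pipecat.transcriptions.language import Language

# `task` is the running PipelineTask for this conversation.
await task.queue_frame(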
    STTUpdateSettingsFrame(
        delta=SmallestSTTSettings(
            language=Language.FR,
            word_timestamps=False,
        )
    )
)
Changing settings will trigger a WebSocket reconnection, which may cause a brief interruption in transcription.
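
The delta argument name suggests that only the fields you set are changed and the rest keep their current values. The toy model below illustrates that idea with a plain dataclass. It is not Pipecat's implementation: `SketchSettings` and `apply_delta` are hypothetical names, and using None to mean "unchanged" is an assumption made for the example.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class SketchSettings:
    # None means "not specified" in this toy model (an assumption).
    language: Optional[str] = None
    word_timestamps: Optional[bool] = None
    diarize: Optional[bool] = None


def apply_delta(current, delta):
    """Return a copy of `current` with every field set in `delta` overridden."""
    changes = {
        f.name: getattr(delta, f.name)
        for f in fields(delta)
        if getattr(delta, f.name) is not None
    }
    return replace(current, **changes)


current = SketchSettings(language="es", word_timestamps=True, diarize=True)
updated = apply_delta(
    current, SketchSettings(language="fr", word_timestamps=False)
)
# language and word_timestamps change; diarize keeps its previous value
print(updated)
```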

Notes

  • WebSocket streaming: The service uses WebSocket connections for real-time streaming. The connection is automatically managed and will reconnect if interrupted.
  • VAD integration: Uses Pipecat’s VAD to detect when the user stops speaking and sends a finalize message to flush the final transcript.
  • Keepalive: The service sends periodic keepalive messages (every 5 seconds) to prevent idle timeouts on the WebSocket connection.
  • Language support: Supports over 30 languages, including Bulgarian, Bengali, Czech, Danish, German, English, Spanish, Estonian, Finnish, French, Gujarati, Hindi, Hungarian, Italian, Kannada, Lithuanian, Latvian, Malayalam, Marathi, Maltese, Dutch, Odia, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Tamil, Telugu, and Ukrainian.
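
The service handles keepalives for you, so no extra code is required. For readers curious about the pattern, here is a minimal asyncio sketch of a periodic keepalive loop. The `send` callable and the message payload are hypothetical, since the actual wire format is not described here.

```python
import asyncio


async def keepalive_loop(send, interval=5.0, message='{"type": "keepalive"}'):
    """Call `send(message)` every `interval` seconds until the task is cancelled."""
    while True:
        await asyncio.sleep(interval)
        await send(message)


async def demo():
    """Run the loop briefly against a fake sender and return what was sent."""
    sent = []

    async def fake_send(msg):
        sent.append(msg)

    # A short interval keeps the demo fast; the documented interval is 5 seconds.
    task = asyncio.create_task(keepalive_loop(fake_send, interval=0.01))
    await asyncio.sleep(0.2)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sent
```

Running the loop as a separate task means it can be cancelled cleanly when the connection closes.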

Event Handlers

Smallest AI STT supports the standard service connection events:
Event                Description
on_connected         Connected to Smallest AI WebSocket
on_disconnected      Disconnected from Smallest WebSocket
on_connection_error  WebSocket connection error occurred
@stt.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Smallest AI STT")
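
The other two events can be handled the same way. The sketch below groups them in a helper that logs each event. The name `register_connection_handlers` is hypothetical, and the extra `error` argument on the error handler is an assumption about the callback signature; check the API reference before relying on it.

```python
import logging

logger = logging.getLogger("smallest_stt")


def register_connection_handlers(stt):
    """Attach logging handlers for disconnect and error events to `stt`."""

    @stt.event_handler("on_disconnected")
    async def on_disconnected(service):
        logger.info("Disconnected from Smallest AI STT")

    # Assumption: the error handler receives the error as a second argument.
    @stt.event_handler("on_connection_error")
    async def on_connection_error(service, error):
        logger.error("Smallest AI STT connection error: %s", error)
```

Call register_connection_handlers(stt) once, after constructing the service and before running the pipeline.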