Overview
Smallest AI provides real-time speech-to-text transcription through a WebSocket-based integration with their Waves API. The service uses the Pulse model to stream audio continuously and receive interim and final transcription results with low latency.

- Smallest AI STT API Reference: complete API reference for all parameters and methods
- Example Implementation: complete example with WebSocket streaming
Installation
Prerequisites
- Smallest AI Account: Sign up at Smallest AI
- API Key: Generate an API key from your account dashboard
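Pipecat ships vendor integrations as optional dependency extras. Assuming the Smallest AI integration lives in a `smallest` extra (the extra name is an assumption; check your Pipecat version), installation would look like:

```shell
# Install Pipecat with the Smallest AI extra (extra name is an assumption)
pip install "pipecat-ai[smallest]"
```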
Configuration
- API key: Smallest AI API key for authentication.
- WebSocket URL: base WebSocket URL for the Smallest API. Override for custom or proxied deployments.
- Encoding: audio encoding format.
- Sample rate: audio sample rate in Hz. When `None`, uses the pipeline's configured sample rate.
- Settings: runtime-configurable settings. See Settings below.
- P99 latency: P99 latency from speech end to final transcript in seconds. Used for processing metrics.
Settings
Runtime-configurable settings are passed via the `settings` constructor argument using `SmallestSTTService.Settings(...)`. They can be updated mid-conversation with `STTUpdateSettingsFrame`. See Service Settings for details.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | str | `pulse` | Model identifier. Currently only `pulse` is supported. |
| `language` | Language \| str | `Language.EN` | Language code for transcription. |
| `word_timestamps` | bool | `False` | Include word-level timestamps in transcription results. |
| `full_transcript` | bool | `False` | Include cumulative transcript in results. |
| `sentence_timestamps` | bool | `False` | Include sentence-level timestamps in transcription results. |
| `redact_pii` | bool | `False` | Redact personally identifiable information from transcripts. |
| `redact_pci` | bool | `False` | Redact payment card information from transcripts. |
| `numerals` | str | `auto` | Convert spoken numerals to digits. Options: `auto`, `always`, or `none`. |
| `diarize` | bool | `False` | Enable speaker diarization to identify different speakers. |
Usage
Basic Setup
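A minimal setup sketch. The `api_key` argument and default settings (Pulse model, English) come from the sections above; the import path is an assumption based on Pipecat's usual vendor module layout:

```python
import os

# Import path is an assumption; check your Pipecat version's module layout.
from pipecat.services.smallest.stt import SmallestSTTService

# Basic setup: API key from the environment, defaults for everything else
# (pulse model, English, no timestamps or redaction).
stt = SmallestSTTService(api_key=os.getenv("SMALLEST_API_KEY"))
```

The service is then placed between the transport input and downstream processors in the pipeline, like any other Pipecat STT service.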
With Advanced Features
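A sketch enabling several of the settings documented in the table above (word timestamps, diarization, PII redaction, numeral conversion). Import paths are assumptions; the `Settings` fields match the Settings table:

```python
import os

# Import paths are assumptions based on Pipecat's usual layout.
from pipecat.services.smallest.stt import SmallestSTTService
from pipecat.transcriptions.language import Language

# Enable word-level timestamps, speaker diarization, and PII redaction,
# and always convert spoken numerals to digits.
stt = SmallestSTTService(
    api_key=os.getenv("SMALLEST_API_KEY"),
    settings=SmallestSTTService.Settings(
        language=Language.EN,
        word_timestamps=True,
        diarize=True,
        redact_pii=True,
        numerals="always",
    ),
)
```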
Updating Settings at Runtime
Transcription settings can be changed mid-conversation using `STTUpdateSettingsFrame`:
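A sketch of a mid-call update, assuming `STTUpdateSettingsFrame` takes a settings mapping and is queued on the pipeline task (the frame name comes from this section; the exact constructor signature and import path are assumptions):

```python
# Import path is an assumption; check your Pipecat version.
from pipecat.frames.frames import STTUpdateSettingsFrame

# Switch the transcription language and enable word timestamps mid-call.
# The keys mirror the Settings table above.
await task.queue_frame(
    STTUpdateSettingsFrame(settings={"language": "hi", "word_timestamps": True})
)
```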
Notes
- WebSocket streaming: The service uses WebSocket connections for real-time streaming. The connection is automatically managed and will reconnect if interrupted.
- VAD integration: Uses Pipecat’s VAD to detect when the user stops speaking and sends a finalize message to flush the final transcript.
- Keepalive: The service sends periodic keepalive messages (every 5 seconds) to prevent idle timeouts on the WebSocket connection.
- Language support: Supports 32 languages including Bulgarian, Bengali, Czech, Danish, German, English, Spanish, Estonian, Finnish, French, Gujarati, Hindi, Hungarian, Italian, Kannada, Lithuanian, Latvian, Malayalam, Marathi, Maltese, Dutch, Odia, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Tamil, Telugu, and Ukrainian.
Event Handlers
Smallest AI STT supports the standard service connection events:

| Event | Description |
|---|---|
| `on_connected` | Connected to the Smallest AI WebSocket |
| `on_disconnected` | Disconnected from the Smallest AI WebSocket |
| `on_connection_error` | WebSocket connection error occurred |