Getting Started¶

Installation¶

pip install pyvoicebox-sap                # core (numpy, scipy, soundfile)
pip install "pyvoicebox-sap[plot]"        # with matplotlib for plotting functions

For development (includes pytest and matplotlib):

pip install -e ".[dev]"

Dependencies¶

Package	Purpose
`numpy`	Array operations and linear algebra
`scipy`	Signal processing, special functions, sparse matrices, optimization
`soundfile`	Audio file I/O (WAV, FLAC, AIFF, AU) via libsndfile
`matplotlib`	(optional) Plotting and display functions

Quick Start¶

Frequency conversions¶

from pyvoicebox import v_frq2mel, v_mel2frq
mel = v_frq2mel(440)          # Hz to Mel
hz = v_mel2frq(mel)           # back to Hz

MFCC extraction¶

from pyvoicebox import v_melcepst
import soundfile as sf
signal, fs = sf.read('speech.wav')
mfcc = v_melcepst(signal, fs, 'M0dD', 12)  # 12 MFCCs + deltas

LPC analysis¶

from pyvoicebox import v_lpcauto, v_lpcar2cc
ar, e, k = v_lpcauto(signal, 12)     # 12th-order LPC
cc = v_lpcar2cc(ar)                    # AR -> cepstral coefficients

Quaternion operations¶

from pyvoicebox import v_roteu2qr, v_rotqr2ro
import numpy as np
q = v_roteu2qr('xyz', np.array([0.1, 0.2, 0.3]))  # Euler -> quaternion
R = v_rotqr2ro(q)                                     # quaternion -> rotation matrix

Noise estimation & speech enhancement¶

from pyvoicebox import v_estnoiseg, v_specsub
noise_psd = v_estnoiseg(signal, fs)       # noise PSD estimate
clean = v_specsub(signal, fs)             # spectral subtraction

Function naming¶

All functions are available with both the v_ prefix (matching the MATLAB original) and without:

from pyvoicebox import frq2mel       # same as v_frq2mel
from pyvoicebox import melcepst      # same as v_melcepst
from pyvoicebox import lpcauto       # same as v_lpcauto

The v_ prefix avoids naming collisions and makes it easy to grep for VOICEBOX functions in your codebase. Use whichever style you prefer.

Type annotations¶

All functions have return type annotations, so your IDE can show what each function returns:

v_frq2mel(frq) -> tuple[np.ndarray, np.ndarray]
v_lpcauto(s, p=12, ...) -> tuple[np.ndarray, np.ndarray, np.ndarray]
v_midi2frq(n, s='e') -> np.ndarray
v_writewav(d, fs, filename, ...) -> None

Many VOICEBOX functions return tuples — for example, v_frq2mel returns (mel_values, gradient). The type annotations make this visible without reading the docstring.

Next steps¶

Browse the API Reference to find functions by category
See how pyvoicebox compares to librosa and openSMILE
Read about the testing approach and how correctness is verified