zos-voice — voice-note ingestion.

Privacy-first voice ingestion, derived from the source of zos_voice 0.1.0. Working code, June 2026.

zos-voice turns a folder of voice notes into a folder of Markdown transcripts, privately. It sweeps a plain local directory of audio files, transcribes each one with a LOCAL backend you supply (mlx_whisper, whisper.cpp, or any callable), and lands one .md per note, with state markers, partial-success sweeps, retry of failures, and a true dry-run.

Python 3.11+ · stdlib only · BUSL-1.1

Status: working code · 49 tests · CI green · BUSL-1.1 · repository private until public launch.

The privacy contract

No function in this package opens a network connection, ever. No bundled model, no hosted API, no telemetry. Subprocesses are only spawned for the command template you configure, with shell=False and a timeout. The test suite enforces this (tests/test_privacy.py): a full sweep runs with socket.socket monkeypatched to raise, and a static guard fails the build if the package ever imports socket/http/urllib/requests.
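The two guards described above can be sketched roughly as follows. This is an illustration only, not the contents of tests/test_privacy.py: the names `no_network`, `forbidden_imports`, and `FORBIDDEN` are hypothetical, and the real suite may structure its checks differently.

```python
import ast
import socket
from contextlib import contextmanager

# Top-level module names the static guard rejects (taken from the text above).
FORBIDDEN = {"socket", "http", "urllib", "requests"}


@contextmanager
def no_network():
    """Make any attempt to create a socket raise while the block runs."""
    original = socket.socket

    def refuse(*args, **kwargs):
        raise RuntimeError("network access attempted")

    socket.socket = refuse
    try:
        yield
    finally:
        socket.socket = original  # always restore the real class


def forbidden_imports(source: str) -> set[str]:
    """Return the forbidden top-level modules imported by some source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in FORBIDDEN:
                found.add(top)
    return found
```

A test along these lines would run a full sweep inside `with no_network():` and call `forbidden_imports` on every source file of the package, failing if the returned set is non-empty.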

The input is a plain local folder. If a cloud-sync client (Drive, Dropbox, Syncthing, …) happens to populate that folder, that is entirely the caller's business — zos-voice neither knows nor cares, and never talks to any service on your behalf. What happens to transcripts afterwards (classification, sync, landing into your notes system) is deliberately out of scope — zos-voice stops at the .md on local disk.

Transcribers — zos_voice.transcriber

class Transcriber(Protocol):
    name: str
    def transcribe(self, audio_path: Path, workdir: Path) -> str: ...

workdir is a fresh per-note scratch directory provided by the sweep. Failures must raise TranscriptionError (a RuntimeError subclass), which signals that one note failed; the sweep records it and continues (partial success).
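As a sketch of what a custom backend might look like, the class below satisfies the protocol by reading a text file that sits next to the audio. `SidecarTextTranscriber` is a hypothetical example, and `TranscriptionError` and `Transcriber` are redefined here as stand-ins so the snippet runs on its own; in real use they would come from zos_voice.transcriber.

```python
from pathlib import Path
from typing import Protocol


class TranscriptionError(RuntimeError):
    """Stand-in for zos_voice's TranscriptionError: one note failed."""


class Transcriber(Protocol):
    name: str

    def transcribe(self, audio_path: Path, workdir: Path) -> str: ...


class SidecarTextTranscriber:
    """Hypothetical backend: 'transcribes' by reading <note>.txt beside the audio."""

    name = "sidecar-text"

    def transcribe(self, audio_path: Path, workdir: Path) -> str:
        sidecar = audio_path.with_suffix(".txt")
        try:
            text = sidecar.read_text(encoding="utf-8").strip()
        except OSError as exc:
            # Any failure is reported as TranscriptionError so the sweep can continue.
            raise TranscriptionError(f"cannot read {sidecar.name}: {exc}") from exc
        if not text:
            raise TranscriptionError(f"{sidecar.name} is empty")
        return text
```

Because the protocol is structural, the class does not need to inherit from `Transcriber`; having a `name` attribute and a matching `transcribe` method is enough.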

CommandTranscriber(template, output="stdout", timeout=600, name=None)

Wraps any local CLI.

CallableTranscriber(fn, name="callable")

Wraps fn(audio_path: Path) -> str (an in-process model, or a test fake). Non-string/empty results and any exception become TranscriptionError.
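The error-normalising behaviour described above could look roughly like this. It is a re-implementation for illustration, not the package's code: `CallableTranscriberSketch` is a hypothetical name, `TranscriptionError` is a local stand-in, and whether whitespace-only output counts as "empty" is not stated in the source (here only the empty string does).

```python
from pathlib import Path
from typing import Callable


class TranscriptionError(RuntimeError):
    """Stand-in for zos_voice's TranscriptionError."""


class CallableTranscriberSketch:
    """Wrap fn(audio_path) -> str and normalise every failure mode."""

    def __init__(self, fn: Callable[[Path], str], name: str = "callable"):
        self.fn = fn
        self.name = name

    def transcribe(self, audio_path: Path, workdir: Path) -> str:
        try:
            result = self.fn(audio_path)  # workdir is unused by in-process callables
        except Exception as exc:
            raise TranscriptionError(f"{self.name} raised: {exc}") from exc
        if not isinstance(result, str):
            raise TranscriptionError(f"{self.name} returned {type(result).__name__}, not str")
        if result == "":  # assumption: only the empty string is treated as empty
            raise TranscriptionError(f"{self.name} returned an empty transcript")
        return result
```

This shape makes a test fake trivial: `CallableTranscriberSketch(lambda path: "hello")` behaves like any other backend from the sweep's point of view.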

| Example backend | Command template | output |
| --- | --- | --- |
| mlx_whisper (Apple Silicon) | mlx_whisper {audio} --model mlx-community/whisper-large-v3-turbo --output-format txt --output-dir {workdir} | txt |
| whisper.cpp | whisper-cli -m /path/to/ggml-base.en.bin -f {audio} -np -nt | stdout |

Sweep — zos_voice.sweep

Sweep(source_dir, out_dir, transcriber, root=None, extensions=DEFAULT_EXTENSIONS, state=None)
discover() -> list[str]
run(dry_run=False, max_notes=None, only_failed=False) -> SweepReport
retry_failed(dry_run=False, max_notes=None) -> SweepReport
status() -> dict
transcript_path(rel) -> Path

SweepReport (dataclass): dry_run, discovered, skipped_done: [rel], would_process: [rel] (dry-run only), processed: [{rel, transcript}], failed: [{rel, error, attempts}]; to_dict() for JSON.
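To make the report fields and the partial-success behaviour concrete, here is a minimal sketch. `SweepReportSketch` and `sweep_sketch` are hypothetical names; treating `discovered` as a count, setting `attempts` to 1, and naming transcripts `<rel>.md` are all assumptions, since the source does not specify them.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SweepReportSketch:
    dry_run: bool = False
    discovered: int = 0  # assumption: a count, not a list
    skipped_done: list[str] = field(default_factory=list)
    would_process: list[str] = field(default_factory=list)  # dry-run only
    processed: list[dict] = field(default_factory=list)
    failed: list[dict] = field(default_factory=list)

    def to_dict(self) -> dict:
        return asdict(self)  # plain dicts and lists, ready for json.dumps


def sweep_sketch(rels, transcribe, already_done=(), dry_run=False):
    """Process each note; one failure is recorded and never aborts the sweep."""
    report = SweepReportSketch(dry_run=dry_run, discovered=len(rels))
    for rel in rels:
        if rel in already_done:
            report.skipped_done.append(rel)
        elif dry_run:
            report.would_process.append(rel)  # nothing is transcribed or written
        else:
            try:
                transcribe(rel)
            except RuntimeError as exc:  # TranscriptionError is a RuntimeError
                report.failed.append({"rel": rel, "error": str(exc), "attempts": 1})
            else:
                # Hypothetical transcript naming, for illustration only.
                report.processed.append({"rel": rel, "transcript": rel + ".md"})
    return report
```

The design point is that failures land in `failed` alongside successes in `processed`, so a later retry can target only the failed entries.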

Transcript format

---
source: <rel path>
source_bytes: <int>
duration_seconds: <float>     # only when cheaply available (WAV header)
processed_at: <iso8601 UTC>
transcriber: <name>
---

<text>

State — zos_voice.state

A record: {"status": "done"|"failed", "processed_at", "transcript", "error", "attempts"} keyed by source-relative path. Constants: STATUS_DONE, STATUS_FAILED.

class StateStore(Protocol):
    def get(self, rel) -> dict | None: ...
    def set(self, rel, record): ...
    def all(self) -> dict[str, dict]: ...

CLI — zos-voice

Subcommands: sweep, status, retry-failed.
Shared flags: --root, --marker {state,sidecar}, --extensions.
sweep and retry-failed flags: --command (required), --command-output {stdout,txt}, --timeout, --max, --dry-run.
The command prints a JSON report. Exit codes: 0 success · 1 at least one note failed · 2 usage error.

# See what WOULD process — writes nothing at all
zos-voice sweep ~/voice-notes ~/voice-transcripts --dry-run \
  --command "mlx_whisper {audio} --model mlx-community/whisper-large-v3-turbo --output-format txt --output-dir {workdir}" \
  --command-output txt

# Run it for real
zos-voice sweep ~/voice-notes ~/voice-transcripts \
  --command "mlx_whisper {audio} --model mlx-community/whisper-large-v3-turbo --output-format txt --output-dir {workdir}" \
  --command-output txt

# Where do things stand?
zos-voice status ~/voice-notes ~/voice-transcripts

# Re-attempt only the notes that failed last time
zos-voice retry-failed ~/voice-notes ~/voice-transcripts --command "..." --command-output txt

--command is substituted ({audio} = the audio file, {workdir} = a per-note scratch directory), split with shlex, and executed without a shell.
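The substitute, split, execute sequence can be sketched as below. `build_argv` and `run_template` are hypothetical helpers, and quoting each substituted path with shlex.quote (so paths containing spaces survive the split) is an assumption about how the package handles them, not something the source confirms.

```python
import shlex
import subprocess


def build_argv(template: str, audio, workdir) -> list[str]:
    """Substitute the placeholders, then split into an argv list."""
    filled = (
        template
        .replace("{audio}", shlex.quote(str(audio)))      # assumption: paths are quoted
        .replace("{workdir}", shlex.quote(str(workdir)))
    )
    return shlex.split(filled)


def run_template(template: str, audio, workdir, timeout: float = 600) -> str:
    """Run the command without a shell and return its stdout."""
    argv = build_argv(template, audio, workdir)
    done = subprocess.run(
        argv,
        shell=False,          # argv is executed directly, never via /bin/sh
        capture_output=True,
        text=True,
        timeout=timeout,      # raises subprocess.TimeoutExpired when exceeded
        check=False,
    )
    if done.returncode != 0:
        raise RuntimeError(f"{argv[0]} exited with {done.returncode}: {done.stderr.strip()}")
    return done.stdout.strip()
```

Because the command never passes through a shell, pipes, redirects, and variable expansion in the template are not interpreted; each token reaches the program as a literal argument.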

This page mirrors docs/API.md in the zos-voice repository, derived from the source at 0.1.0.