OiPer Desktop
native voice command center

Hold key. Speak. Release. Continue writing.

Local-first dictation engineered for real desktop speed

OiPer turns short recordings into ready-to-insert text with very low latency. It stays on-device by default, supports optional online cleanup, and keeps the workflow focused on the active app instead of another recording window.

Privacy-first architecture
Fastest benchmarked result

quick workflow

One hotkey, no detour

01

Hold to record

A global hotkey starts capture instantly from anywhere on the desktop.

02

Release to transcribe

The moment you release, OiPer processes the clip with your selected backend.

03

Inject and refine

Text lands in the active app and can be cleaned locally or through an optional online model.
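The three steps above can be sketched as a small state machine. This is an illustrative sketch only, not OiPer's actual API: the hotkey hooks, audio capture, and text injection are platform-specific, so plain callables stand in for them here.

```python
# Minimal sketch of the hold / release / inject cycle.
# All names are hypothetical; real hotkey and injection APIs are OS-specific.

class DictationSession:
    def __init__(self, transcribe, inject):
        self.transcribe = transcribe  # backend callable: audio bytes -> text
        self.inject = inject          # delivers text to the active app
        self.recording = False
        self.buffer = []

    def on_hotkey_press(self):
        # 01: start capturing audio the instant the key goes down
        self.recording = True
        self.buffer = []

    def on_audio_chunk(self, chunk):
        # accumulate audio only while the key is held
        if self.recording:
            self.buffer.append(chunk)

    def on_hotkey_release(self):
        # 02: stop capture and hand the clip to the selected backend
        self.recording = False
        text = self.transcribe(b"".join(self.buffer))
        # 03: inject the result into the focused application
        self.inject(text)
        return text
```

With stub callables, a press/chunk/release cycle produces one injected string and leaves the session ready for the next clip.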

User experience

Hold a global hotkey, speak naturally, release, and keep working in the same application without context switching.

Performance

Native code, GPU acceleration where available, and low-latency transcription tuned for heavy daily usage.

Privacy

Audio, logs, and transcription stay on your machine by default. Online services only run when you opt in.

privacy and security

A local core with optional cloud edges

local processing

Transcription runs on your machine. Audio clips and activity logs remain on-device, so the default setup does not export your speech anywhere.

online services

Cleanup and optimization are optional. Use your own API key, choose the provider, and switch the feature off whenever you want.

advanced accuracy

Bring in an LLM when the language gets specialized

OiPer can route transcription through lightweight LLMs when you need stronger handling for technical terms, product names, or domain-specific phrasing. Smaller fast models such as Gemini 2.5 Flash Lite fit well here.

Choose local-only, online cleanup, or a hybrid stack depending on whether privacy, speed, or precision matters most for the task in front of you.
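The local-only versus cleanup trade-off can be expressed as a simple routing function. This is a hypothetical sketch of the decision, not OiPer's internals; the local transcriber and the optional LLM cleanup step are represented as plain callables.

```python
# Hypothetical routing between local-only and LLM-assisted modes.
# Function and parameter names are assumptions for illustration.

def process_clip(audio, mode, local_transcribe, llm_cleanup=None):
    """Return final text for a clip under the chosen privacy/accuracy trade-off."""
    # Transcription itself always runs on-device first.
    text = local_transcribe(audio)
    if mode == "local" or llm_cleanup is None:
        # Default path: nothing leaves the machine.
        return text
    # Opt-in path: refine technical terms and phrasing via the chosen model.
    return llm_cleanup(text)
```

The key property is that the on-device transcript is always produced first, so switching the cleanup step off never changes the baseline behavior.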

settings and configuration

Tune models, backends, and providers without losing simplicity

backend selector

Auto / CPU / GPU
Speech model selection with simple downloads and size choices
Backend preferences for auto, CPU-only, or GPU acceleration
Provider controls for base URL, API key, and model name
Local or online text cleanup depending on your workflow
LLM-assisted transcription for technical terminology and niche vocabulary
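The settings above map naturally onto a small configuration object. The shape below is an assumption for illustration, not OiPer's real schema; field names and the validation rule are hypothetical.

```python
from dataclasses import dataclass, field

# Illustrative settings shape covering the controls listed above.
# Field names are assumptions, not OiPer's actual configuration format.

@dataclass
class ProviderConfig:
    base_url: str = ""   # provider endpoint
    api_key: str = ""    # user-supplied key
    model: str = ""      # model name, e.g. a small fast LLM

@dataclass
class OiPerSettings:
    speech_model: str = "small"      # downloaded speech model / size choice
    backend: str = "auto"            # "auto", "cpu", or "gpu"
    cleanup: str = "local"           # "local" or "online" text cleanup
    llm_transcription: bool = False  # LLM assist for niche vocabulary
    provider: ProviderConfig = field(default_factory=ProviderConfig)

    def validate(self):
        if self.backend not in ("auto", "cpu", "gpu"):
            raise ValueError("backend must be auto, cpu, or gpu")
        if self.cleanup not in ("local", "online"):
            raise ValueError("cleanup must be local or online")
        # Online features are opt-in and need provider details to be set.
        if (self.cleanup == "online" or self.llm_transcription) and not (
            self.provider.base_url and self.provider.api_key
        ):
            raise ValueError("online features require a base URL and API key")
```

The defaults encode the local-first posture: everything runs on-device until a provider is explicitly configured.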