Samuel is a free, open-source voice AI desktop assistant for macOS that provides real-time language interpretation, ambient screen understanding, and hands-free desktop control. Version 0.1.0 is available now as a 376 MB DMG download for Apple Silicon Macs (macOS 14 Sonoma or newer); Intel Mac, Windows, and Linux builds are on the public roadmap. He continuously watches your screen via GPT-4o Vision, continuously listens to system audio via ScreenCaptureKit, and answers by voice through the OpenAI Realtime API. Common use cases: learning Japanese, Spanish, Mandarin, or any other language while watching anime, foreign news, or lectures; live meeting translation; hands-free web browsing such as "show me my emails" or "check my GitHub notifications" via real Chromium browser automation; building custom AI tools by voice — Samuel writes the code himself with GPT-5.5 and auto-repairs it when it breaks. A free trial is included on first launch — no OpenAI API key required to try it. Every current beta feature is free forever. Released under the MIT license. Built by Sam Feng.
Meet Samuel
An AI that lives with you.
And grows with you.
Other AIs sit behind a tab. Samuel lives on your Mac — quiet, attentive, and a little curious. He sees what you see, hears what you hear, and the moment you ask for something he can't do yet, he simply learns how.
Live capture · Real-time language interpretation
Real Japanese. Real time.
A four-second peek at ambient language tutoring. Lyrics on screen, a vocabulary question whispered aloud — Samuel translates and explains without you pausing the video. The same flow works for Spanish, Mandarin, French, Korean, or any language you're learning.
Not an app. A presence.
You don't open Samuel. You just talk to him.
No menus. No tabs. No copy-pasting your screen into a chat box. He's already there — watching what you're working on, listening for the moment you need him.
He sees what you see.
Watching anime, reading a paper, debugging a recipe — Samuel quietly takes in the screen so you never have to explain context.
He hears what you hear.
The Japanese news on your screen, the lecture in your headphones, the meeting in another tab — he listens alongside you.
He answers in your voice's pause.
No typing, no buttons. Speak the way you'd speak to a friend sitting beside you. Samuel replies in roughly half a second.
The part nobody else does
Ask for anything.
He'll build the ability if it doesn't exist.
Most AI products are limited to whatever the company shipped. Samuel isn't. When you ask for something new — a weather widget, a stock tracker, a way to read your calendar out loud — he writes the tool, tests it, and installs it. While you're still talking.
And when something he built breaks later? He fixes it himself, then casually mentions what happened.
The real problem
Stop switching tabs.
Just talk.
The average knowledge worker switches between apps 33 times a day. Slack, Notion, Gmail, GitHub, Figma — every switch costs roughly 20 minutes of focus to recover. Most people lose 2+ hours a week to tool fatigue alone.
Samuel is the only app you talk to. He opens the others for you, reads them, drives them, and answers — without you ever leaving what you were doing.
Where he lives, what he can do
A new shape for AI.
We didn't make a smarter chatbot. We made something with a different silhouette — closer to a roommate than a tool.
The potential
The ceiling isn't what we shipped.
It's what you'll ask for.
Every conversation teaches Samuel a little more about how you work. Every new ability he builds becomes part of him. The longer you live with him, the more your Samuel he becomes — quieter where you want quiet, sharper where you want sharp, fluent in the things you care about.
Questions, plainly answered
FAQ
- What is Samuel?
- Samuel is a free, open-source voice-first AI companion for macOS — wake-word activated ("Hey Samuel"), speaks back in under half a second, sees your screen and hears your system audio when you allow it, drives any Mac app, browses the web like a human, and writes his own tools on demand using GPT-5.5. Released under the MIT license.
- How is Samuel different from ChatGPT, Siri, or Alexa?
- ChatGPT lives in a browser tab and only sees what you paste; Siri and Alexa run scripted commands; meeting tools like Granola or Otter summarize after the call. Samuel is the only one you can just talk to, in real time, about whatever just happened on your screen or in your audio — wake word in, voice out, sub-500 ms — and the only one that writes brand-new tools for itself when you ask for something he can't yet do.
- What can I use Samuel for?
- "What did they just say?" mid-meeting, podcast, or lecture · live in-call assist for sales, support, and interviews · hands-free Mac control for RSI or VoiceOver users · ambient language learning while watching anime, K-drama, foreign news, or YouTube · meeting summarization without a bot joining · voice-controlled web browsing ("show me my Gmail") · self-building AI tools by voice ("build me a weather widget") · ambient monitoring ("tell me when you hear X").
- How fast does Samuel respond?
- The voice loop is roughly half a second end-to-end, from wake word in to reply out. That's the OpenAI Realtime API speed. Tasks that need a screen read or audio recall add a couple of seconds; deep reasoning (e.g., writing a new tool with GPT-5.5) takes 3–8 s, but Samuel narrates what he's doing while he works so you're never left wondering.
- Can Samuel actually do things, or only answer?
- He can do things. Samuel drives any macOS app via the Accessibility tree (clicks, types, scrolls, switches tabs, opens apps) and falls back to GPT-5.5 visual computer-use when an app's accessibility info is thin. You choose how aggressive he is: background_workspace (zero-touch ambient), observe_only (read-only), ask_before_action (asks before writes), or takeover (full keyboard and mouse).
- Does Samuel really write his own tools?
- Yes. When you ask for something Samuel doesn't already do — for example, a weather widget — he generates the code with GPT-5.5, has it reviewed by GPT-4o-mini, validates it, and installs it without restarting. If the new tool breaks later (because an external API changed, say), he diagnoses the failure and patches the code automatically. Maximum two repair attempts; after that he explains in plain language what he needs from you.
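The repair cap described above can be pictured as a bounded retry loop. This is a minimal sketch, not Samuel's actual code: `run_tool` and `patch_tool` are hypothetical callables standing in for "execute the plugin" and "ask the model for a fix". Only the two-attempt limit comes from the answer above.

```python
from typing import Callable

MAX_REPAIR_ATTEMPTS = 2  # from the FAQ: "Maximum two repair attempts"


def run_with_repair(
    run_tool: Callable[[str], str],
    patch_tool: Callable[[str, Exception], str],
    code: str,
) -> str:
    """Run a generated tool, patching it at most MAX_REPAIR_ATTEMPTS times.

    run_tool and patch_tool are hypothetical stand-ins (assumptions),
    not part of any published Samuel API.
    """
    attempts = 0
    while True:
        try:
            return run_tool(code)
        except Exception as err:
            if attempts >= MAX_REPAIR_ATTEMPTS:
                # Out of repairs: hand the problem back to the user.
                raise RuntimeError(
                    f"Tool still failing after {attempts} repairs: {err}"
                ) from err
            code = patch_tool(code, err)  # ask for a patched version
            attempts += 1
```

A tool that needs one or two patches returns normally. A third consecutive failure raises, which is the point where the assistant would explain what it needs from you.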
- Can Samuel translate or interpret in real time?
- Yes. While you watch foreign-language content — anime, news, lectures, YouTube — you can ask any question by voice and Samuel answers within roughly half a second without pausing the video. Works for Japanese, Spanish, Mandarin, French, Korean, German, and any other language pair the underlying models support. The same flow works for live meetings.
- Can I ask Samuel about audio he just heard?
- Yes. Once audio listening is allowed, Samuel keeps a rolling local audio buffer running silently. When you ask "what did they just say?" / "translate the last 30 seconds" / "teach me the words from that clip", he ffmpeg-trims the tail of the buffer to your window, transcribes it with gpt-4o-transcribe, and answers. Your question is the boundary: no polling cadence, no auto-pause/resume keystroke fights, and zero transcription cost while idle.
- How does Samuel browse the web?
- Three tiers, chosen automatically. Quick search via SerpAPI for "look up X" requests. Deep research via OpenAI Responses API with web search for "find more details" requests, returned with cited sources. Real browser automation via Playwright for any login-required site (Gmail, GitHub, your bank, internal tools) — Samuel opens a visible Chromium window, you sign in once, and he reads and clicks through the page like a human.
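To make the three tiers concrete, here is a rough sketch of a router. How Samuel actually picks a tier is not documented on this page, so the keyword lists and the `choose_tier` function below are purely illustrative assumptions. Only the three tier names map to the answer above.

```python
from enum import Enum


class Tier(Enum):
    QUICK_SEARCH = "quick_search"      # SerpAPI lookup
    DEEP_RESEARCH = "deep_research"    # Responses API with web search
    BROWSER = "browser_automation"     # Playwright-driven Chromium


# Hypothetical hint lists (assumptions); the real selection logic may
# use a model-based classifier rather than keywords.
LOGIN_SITES = ("gmail", "github", "bank")
DEEP_HINTS = ("more details", "research", "in depth")


def choose_tier(request: str) -> Tier:
    """Pick a browsing tier for a spoken request using simple keyword hints."""
    text = request.lower()
    if any(site in text for site in LOGIN_SITES):
        return Tier.BROWSER
    if any(hint in text for hint in DEEP_HINTS):
        return Tier.DEEP_RESEARCH
    return Tier.QUICK_SEARCH
```

With these hints, "show me my Gmail" routes to browser automation, "find more details about X" to deep research, and everything else to quick search.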
- Which operating systems does Samuel support?
- The v0.1.0 release is a 376 MB DMG for Apple Silicon Macs running macOS 14 (Sonoma) or newer. An Intel Mac build, a Windows port, and a Linux port are on the public roadmap — drop your email in the notify section and we'll let you know when yours is ready.
- Is Samuel free?
- Yes — every beta feature available today will stay free forever. That's a commitment, not a trial period. Samuel is open source under the MIT license, and the entire current capability set — voice conversation, ambient screen and audio, browser automation, plugin generation, auto-repair, memory — is yours to keep. You only pay OpenAI for model usage (wake-word listening ~$0.006/min, ambient assistance ~$0.02–0.05/min, voice conversation at standard Realtime API rates). Plugins and browser automation run locally and cost nothing.
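For a feel of what those per-minute figures add up to, here is a small back-of-the-envelope estimator. It uses only the approximate rates quoted above and leaves out voice conversation, which is billed at standard Realtime API rates not listed here. The function name and the example minutes are assumptions for illustration.

```python
WAKE_WORD_PER_MIN = 0.006        # USD per minute, approximate, from the FAQ
AMBIENT_PER_MIN = (0.02, 0.05)   # USD per minute, low/high, from the FAQ


def estimate_daily_cost(
    wake_minutes: float, ambient_minutes: float
) -> tuple[float, float]:
    """Return a (low, high) USD estimate, excluding voice conversation."""
    wake = wake_minutes * WAKE_WORD_PER_MIN
    low = wake + ambient_minutes * AMBIENT_PER_MIN[0]
    high = wake + ambient_minutes * AMBIENT_PER_MIN[1]
    return round(low, 2), round(high, 2)


# Example: 8 hours of wake-word listening plus 1 hour of ambient assistance.
low, high = estimate_daily_cost(wake_minutes=480, ambient_minutes=60)
print(f"${low:.2f} to ${high:.2f} per day")
```

Under these assumptions, the example day comes to roughly $4.08 to $5.88 in OpenAI usage, most of it from wake-word listening.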
- Do I need an OpenAI API key to try Samuel?
- No — v0.1.0 ships with a free trial proxy so you can use Samuel without bringing your own key on first launch. When the trial credits are used up, paste your own OpenAI API key in Settings → API Key for unlimited use; the app then talks to OpenAI directly and never contacts the proxy again. Samuel itself is free forever; the key only pays OpenAI for the model calls.
- Is my data private?
- Memory, preferences, skills, plugins, and API keys are stored locally in ~/.samuel/. Browser sessions run locally via Playwright. Screen captures and audio are sent to OpenAI only while a feature is active, and every privacy-sensitive surface (continuous listening, continuous screen watching) requires an explicit Allow on a consent card the first time it flips on, with no auto-approve countdown for those two. Toggle screen watching and audio listening off at any time in Settings, or say "stop listening to my speakers."
- What models does Samuel use?
- OpenAI Realtime API for voice conversation (~500 ms), GPT-5.5 with reasoning for plugin code generation and visual computer control (3–8 s), GPT-4o Vision for screen understanding (3–5 s), GPT-4o-mini for code review and trigger classification (~1 s), and gpt-4o-transcribe for high-fidelity audio recall (3–10 s).
- How do I install Samuel?
- Download the DMG (376 MB), open it, and drag Samuel into Applications. On first launch, right-click Samuel and choose Open to bypass the "Apple cannot verify" notice — notarization is on the roadmap. The free trial proxy lets you start without an API key; for unlimited use, paste your own OpenAI key in Settings.
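For the curious: the audio-recall answer in the FAQ mentions trimming the tail of a rolling buffer with ffmpeg. One way that step could look is sketched below. It uses ffmpeg's `-sseof` input option, which seeks relative to the end of the file. The file names and the `tail_trim_command` helper are hypothetical, and Samuel's real invocation may differ.

```python
def tail_trim_command(buffer_path: str, out_path: str, seconds: int) -> list[str]:
    """Build an ffmpeg command that keeps only the last `seconds` of a file.

    Paths are placeholders (assumptions); this only builds the argument
    list and does not run ffmpeg.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-sseof", f"-{seconds}",   # seek to N seconds before end of input
        "-i", buffer_path,
        "-c", "copy",              # copy the stream without re-encoding
        out_path,
    ]
```

Passing the returned list to `subprocess.run` would produce the clip for "translate the last 30 seconds", ready to hand to a transcription model.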
Available now · v0.1.0
Bring Samuel home.
Free, open source, and yours to keep. The first public build is for Apple Silicon Macs — Intel, Windows, and Linux are next on the roadmap.
- Open Samuel-0.1.0-arm64.dmg and drag Samuel into Applications.
- First launch: right-click Samuel in Applications and choose Open to dismiss the "Apple cannot verify" notice. (Notarization is on the roadmap.)
- A free trial is included, with no OpenAI API key required to try it. For unlimited use, paste your own key in Settings.
On an Intel Mac, Windows, or Linux?
Drop your email and we'll tell you the day your build is ready.
Source code — github.com/sambuild04/screen-voice-agent