I can’t take credit for this as an original idea. A colleague mentioned offhandedly that he’d rolled his own digital note-taker (a transcriptionist, essentially) in a matter of minutes. I decided to follow in his footsteps without any guidance to see if it was really that easy. I also had a good use case if it worked.
A while back, I wrote about how I use Claude Code to manage all my meeting notes as markdown files in a GitHub repository. That workflow is great; I type stream-of-consciousness observations and Claude organizes them into a clean agenda or summary. But there’s a problem: it’s hard to both conduct a meeting and act as a high-speed typist at the same time.
Many meetings announce upfront that they’re being recorded, but if you didn’t initiate the meeting, you may never end up with the recording. Even for meetings I do initiate, the audio takes a few minutes to become available afterward, and in back-to-back meetings, I’ve already moved on by then.
Building Hearsay
So I asked Claude Code for help building a transcription tool for Windows that could capture both online meeting audio (Teams, Zoom, whatever) and my microphone. We were successful.
The result is Hearsay, a Windows desktop app that records system audio and microphone input and transcribes it in real time using OpenAI’s open-source Whisper model. Everything runs locally on your machine: no cloud services, no API calls, no data leaving your computer. A Windows installer is available under Releases on the GitHub repo, and it can be cleanly removed later via Add or remove programs.
The application runs from the system tray, so the workflow is simple: right-click, start recording, and it quietly transcribes everything it hears into timestamped markdown files. It detects whether you have an NVIDIA GPU and selects an appropriate model, with the larger models on capable hardware and smaller quantized models on CPU-only machines. A first-run wizard handles hardware detection, audio device selection, and model download.
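To make the hardware check and the timestamped output a bit more concrete, here is a minimal Python sketch of how those two pieces could look. This is not Hearsay’s actual code. The `ModelChoice` class, the function names, the specific model sizes chosen, and the markdown line format are all hypothetical, and the GPU check assumes the NVIDIA driver puts `nvidia-smi` on the PATH.

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    name: str        # Whisper model size, e.g. "base" or "medium"
    device: str      # "cuda" or "cpu"
    quantized: bool  # whether to load a smaller quantized variant


def has_nvidia_gpu() -> bool:
    # Assumption: a working NVIDIA driver install exposes nvidia-smi on PATH.
    return shutil.which("nvidia-smi") is not None


def choose_model(gpu_available: bool) -> ModelChoice:
    # Hypothetical mapping; the sizes Hearsay actually selects may differ.
    if gpu_available:
        return ModelChoice(name="medium", device="cuda", quantized=False)
    return ModelChoice(name="base", device="cpu", quantized=True)


def format_segment(offset_seconds: float, text: str) -> str:
    # Render one transcribed segment as a timestamped markdown bullet.
    total = int(offset_seconds)
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"- **[{hours:02d}:{minutes:02d}:{seconds:02d}]** {text.strip()}"
```

Calling `choose_model(has_nvidia_gpu())` at startup would give the first-run wizard a default to offer, and each transcribed chunk could be appended to the meeting file via `format_segment`.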
The Result
Today, I produce more detailed meeting notes than I ever have before, in every meeting I participate in online from my Windows desktop. I haven’t leaned into prettying up the output yet. Right now I just dump the raw transcript into Claude Code after each meeting and let it merge the transcription with whatever I managed to type during the call. But there’s a lot of opportunity to massage the output into something immediately distributable: auto-generated summaries, action items, the works.
I also need to build a Linux version. I end up taking my Dell XPS 13 running Linux to any face-to-face meetings, and it would be nice to have the same capability there.
A Word on Consent
A word of warning: roughly a dozen U.S. states, including California, Florida, Pennsylvania, and Washington, require all-party consent to record a conversation. That means every participant must know about and agree to the recording. You should always alert your meeting guests that you’re using a transcription tool.
That said, if someone has already invited their own digital note-taker into a meeting and advertised its presence, I just fire up mine in the background. Fair is fair.
Why This Matters
This was a fun little project, and a good example of how agentic, command-line coding can turn an idea into a working tool in short order. No SDK documentation to wade through, no boilerplate to scaffold; just describe what you want and iterate. If you’ve been looking for a weekend AI project that solves a real problem, building your own transcription tool is a great place to start.
