Hotmic: Rolling Audio Buffer

LLM assistance was used to refine this post.

I wrote hotmic, a small CLI tool that keeps a rolling buffer of microphone and system audio and lets me save the last few minutes on demand. It works for both in-person and virtual discussions (Slack huddles, Google Meet, etc.).

This is useful for moments when I realize slightly too late that something should have been recorded: a good discussion in a meeting, a thought that came up while talking to people, or some context I want to turn into notes later.

I did not want a meeting recorder that constantly writes files. I wanted a scratch buffer. In its most basic form, it works like this:

hotmic listen --buffer 30
hotmic save 5 --name "Weekly Review"

The first command keeps the last 30 minutes in memory. The second command saves the last 5 minutes into a timestamped directory.

There are also more advanced capabilities (transcription, summarization, diarization) that I’ll go through below.

Ring buffer

The buffer starts with a small allocation and grows lazily until it reaches the configured size. After that it behaves like a normal ring buffer: new samples overwrite the oldest ones and no further allocations are needed.
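Roughly, the grow-then-wrap behavior can be sketched as below. This is not hotmic's actual code: the class name `LazyRingBuffer`, its methods, the doubling growth policy, and the 16-bit sample type are all assumptions made for illustration.

```python
from array import array


class LazyRingBuffer:
    """Ring buffer that grows on demand up to a fixed capacity (hypothetical sketch)."""

    def __init__(self, capacity, initial=1024):
        self.capacity = capacity                      # maximum number of samples kept
        # Start small; 16-bit signed samples are an assumption.
        self.data = array("h", [0]) * min(initial, capacity)
        self.write_pos = 0                            # next index to write
        self.total = 0                                # samples written since start

    def _grow(self):
        # Only reached before the first wrap, so the stored data is still linear.
        new_size = min(self.capacity, 2 * len(self.data))
        self.data.extend(array("h", [0]) * (new_size - len(self.data)))

    def write(self, samples):
        # Per-sample loop for clarity; a real implementation would copy in chunks.
        for sample in samples:
            if self.write_pos == len(self.data):
                if len(self.data) < self.capacity:
                    self._grow()                      # still filling: allocate more room
                else:
                    self.write_pos = 0                # full: wrap and overwrite the oldest
            self.data[self.write_pos] = sample
            self.write_pos += 1
            self.total += 1

    def last(self, n):
        """Return the most recent n samples in chronological order."""
        n = min(n, self.total, self.capacity)
        start = self.write_pos - n
        if start >= 0:
            return self.data[start:self.write_pos]
        # The requested range wraps around the end of the buffer.
        return self.data[start:] + self.data[:self.write_pos]
```

Once the buffer has reached `capacity`, `write` never allocates again; it only moves `write_pos`.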

RAM usage by buffer size and sample rate

Buffer	44100 Hz	16000 Hz
5 min	26 MB	9 MB
30 min	159 MB	58 MB
60 min	317 MB	115 MB
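These figures line up with about 2 bytes per sample on a single channel (16-bit mono), counting 1 MB as 10^6 bytes. That is an inference from the numbers, not something confirmed against hotmic's source. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def buffer_megabytes(minutes, sample_rate, bytes_per_sample=2, channels=1):
    """Approximate buffer size in MB (10**6 bytes).

    The defaults (16-bit samples, one channel) are assumptions inferred
    from the table, not taken from hotmic itself.
    """
    return minutes * 60 * sample_rate * bytes_per_sample * channels / 1e6


for minutes in (5, 30, 60):
    at_44k = buffer_megabytes(minutes, 44100)
    at_16k = buffer_megabytes(minutes, 16000)
    print(f"{minutes} min: {at_44k:.1f} MB at 44100 Hz, {at_16k:.1f} MB at 16000 Hz")
```

Under those assumptions every value comes out within 1 MB of the table.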

Command model

hotmic listen owns the microphone stream. Other commands talk to the running listener through a FIFO at /tmp/hotmic.pipe.
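A minimal sketch of this kind of FIFO control channel is below. Only the path `/tmp/hotmic.pipe` comes from hotmic; the function names and the wire format (one whitespace-separated command per line) are assumptions for illustration.

```python
import os
import stat

PIPE_PATH = "/tmp/hotmic.pipe"  # path used by hotmic, per the text above


def ensure_fifo(path=PIPE_PATH):
    """Create the FIFO unless one already exists at this path."""
    try:
        os.mkfifo(path)
    except FileExistsError:
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise  # something else is occupying the path


def send_command(line, path=PIPE_PATH):
    """Client side: write one command line.

    Opening a FIFO for writing blocks until a listener has it open for reading.
    """
    with open(path, "w") as pipe:
        pipe.write(line + "\n")


def read_commands(path=PIPE_PATH):
    """Listener side: yield (command, args) pairs as clients write them."""
    while True:
        # open() blocks until a writer connects; the inner loop ends at EOF,
        # which arrives when that writer closes its end.
        with open(path) as pipe:
            for line in pipe:
                parts = line.split()
                if parts:
                    yield parts[0], parts[1:]
```

With this shape, a short-lived client process only has to open the pipe, write a line, and exit, which is why it fits hotkey bindings well.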

This makes it easy to wire into a hotkey daemon like skhd:

cmd + shift - s : hotmic save 5
cmd + shift - a : hotmic save
cmd + shift - m : hotmic mark
cmd + shift - p : hotmic pause
cmd + shift - r : hotmic resume

I have also set up macOS notifications that pop up in the top right whenever a hotkey fires, just so I know the action was actually triggered.

Marks

For more planned talks, it’s easier to mark “start” and “end” and save only the audio between the marks. The marks can also just serve as audio bookmarks for later.

hotmic mark meeting-start
hotmic mark meeting-end
hotmic save --between-marks --name "Architecture Review"

Internally a mark stores the current total sample count and timestamp. When saving between marks, hotmic converts those sample-count snapshots back into a range inside the ring buffer.

This also gives a clean failure mode. If the marked audio has already been overwritten by the rolling buffer, hotmic can say that instead of silently saving the wrong range.
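The conversion and the overwrite check can be sketched as follows. The function name and the layout assumption (sample number k lives at index k % capacity) are hypothetical, not taken from hotmic.

```python
def mark_range(start_total, end_total, total_written, capacity):
    """Map two mark snapshots (absolute sample counts) to a ring-buffer range.

    Returns (start_index, length). Assumes sample number k is stored at
    index k % capacity, which is an assumption about the buffer layout.
    """
    if end_total < start_total:
        raise ValueError("end mark is before start mark")
    # Oldest sample still held: everything earlier has been overwritten.
    oldest_available = max(0, total_written - capacity)
    if start_total < oldest_available:
        raise LookupError("marked audio has already been overwritten")
    return start_total % capacity, end_total - start_total
```

Because a mark stores an absolute count rather than a buffer index, checking whether the audio still exists is a single comparison against `total_written - capacity`.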

System audio

Virtual meetings need the other party’s audio along with the microphone, so I also added optional system audio capture on macOS using audiotee.

Tools like Granola work well in the same way: they capture meeting notes cleanly without being invasive in the meeting tool itself.

hotmic listen --buffer 60 --system-audio

With system audio enabled, each save writes a few files:

audio.wav         mixed mono mic + system audio
mic.wav           microphone only
system.wav        system audio only
audio_stereo.wav  mic on left, system audio on right
metadata.json

The split files are useful because different downstream tools behave better with different inputs. For quick listening, the mixed file is enough. For debugging capture quality or trying diarization/transcription, having the separate tracks is probably much better.
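As an illustration of how the mixed and stereo files relate to the two source tracks, here is a standard-library sketch. It assumes both tracks are 16-bit mono at the same sample rate and length, and it mixes by averaging; how hotmic actually mixes is not covered here, and the function names are hypothetical.

```python
import wave
from array import array


def mix_mono(mic, system):
    """Average two 16-bit mono tracks sample by sample.

    Averaging keeps the result inside the 16-bit range; it is an assumed
    mixing strategy, not necessarily the one hotmic uses.
    """
    return array("h", ((m + s) // 2 for m, s in zip(mic, system)))


def interleave_stereo(mic, system):
    """Build stereo frames: mic on the left channel, system audio on the right."""
    stereo = array("h")
    for m, s in zip(mic, system):
        stereo.append(m)   # left
        stereo.append(s)   # right
    return stereo


def write_wav(path, samples, sample_rate, channels):
    """Write 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)            # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(samples.tobytes())
```

For example, `write_wav("audio_stereo.wav", interleave_stereo(mic, system), 16000, 2)` would produce a two-channel file where each track can still be isolated downstream.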

This part is macOS specific right now because it depends on Core Audio taps.

Transcription and notes

Transcription is done with mlx-whisper.

hotmic listen --buffer 30 --transcribe
hotmic listen --buffer 30 --diarize --summarize

Transcription writes .txt and .srt files next to the saved audio. Diarization adds speaker labels. Summarization currently makes a claude -p subprocess call and writes the result to a .summary.md file. I took the lazy route here and hardcoded the claude -p call, which leaves little flexibility. I’ll probably switch to a more flexible SDK that can call any model, but for now it is what it is.
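For the .srt output, the formatting step can be sketched like this. The input shape (a list of dicts with `start`, `end` in seconds, and `text`) is an assumption loosely modeled on Whisper-style segment output, and the function names are hypothetical.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm, the timestamp form used in SRT files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render segments as SRT: numbered cues separated by blank lines."""
    cues = []
    for index, segment in enumerate(segments, start=1):
        start = srt_timestamp(segment["start"])
        end = srt_timestamp(segment["end"])
        cues.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}\n")
    return "\n".join(cues)
```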

The transcription and summarization work runs after save in background threads, so the listener can keep recording while the previous save is being processed. On exit, hotmic waits for those workers to finish.

Diarization is slow and not very accurate right now. I might just remove it.

There’s one more reason I built this. I (and a lot of people in my current org) use Granola, but I also wanted to own the audio of the meeting along with the usual transcripts and summary that it provides. Basically, I wanted to store the meeting in the rawest form possible, which means keeping the audio too. That lets me re-transcribe it later with a possibly better STT system, or summarize it with a better LLM, which Granola doesn’t provide.