Jon Moshier / Notes / Audiobook Player Improvements draft
Note · From the Notebook

Audiobook Player Improvements

A list of improvements for my audiobook player.

I recently built an audiobook player for my bookclub. We’re reading Children of Time, and I couldn’t find the audiobook in the Libby app, so I decided to build my own.

Yes, I use AI. Once I was done, I asked it for suggestions on what I could do next. Here’s the response.

  • “Recap where I left off.” A 16-hour book listened to over weeks — when you resume after a few days, a button generates a “Previously…” recap of everything up to your current position. You already extract the full EPUB text, you know the current timestamp, and the char→time mapping tells you exactly how far into the text you are. Feed that slice to Claude → a spoiler-safe catch-up. For long books this is genuinely useful and I haven’t seen it anywhere.
  • Text search → jump to the audio. Search the book’s text (“where does Portia first appear?”), and because we proved the chars-per-second alignment, you can translate a text position into an audio timestamp and seek right to it. The alignment work we just did is exactly what makes this possible — it’d be a standout feature.
  • Read-along transcript. Show the text scrolling as it’s narrated (karaoke-ish), at least at paragraph granularity, using the same mapping. More ambitious, but the data’s already there.
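
All three ideas above lean on the same char→time mapping, so here is a minimal sketch of how it could work. It assumes the alignment is one constant chars-per-second rate over the whole book. The `Alignment` shape and every function name are hypothetical, made up for illustration and not taken from the player.

```typescript
// Hypothetical shape: assumes a single constant narration rate for the whole
// book, and that totalChars and totalSeconds are both positive.
interface Alignment {
  totalChars: number;   // length of the extracted EPUB text
  totalSeconds: number; // duration of the audio
}

function charsPerSecond(a: Alignment): number {
  return a.totalChars / a.totalSeconds;
}

// Audio timestamp (seconds) -> approximate character offset in the text.
function timeToChar(a: Alignment, seconds: number): number {
  const clamped = Math.min(Math.max(seconds, 0), a.totalSeconds);
  return Math.round(clamped * charsPerSecond(a));
}

// Character offset -> approximate audio timestamp (seconds).
function charToTime(a: Alignment, charIndex: number): number {
  const clamped = Math.min(Math.max(charIndex, 0), a.totalChars);
  return clamped / charsPerSecond(a);
}

// "Recap where I left off": only the text already heard, so whatever
// summarizes it never sees anything past the current position.
function textHeardSoFar(text: string, a: Alignment, seconds: number): string {
  return text.slice(0, timeToChar(a, seconds));
}

// "Text search -> jump": timestamp of the first case-insensitive match, or
// null when the query is not found. Lowercasing can change string length for
// a few unusual characters, which is ignored in this sketch.
function findTimestamp(text: string, a: Alignment, query: string): number | null {
  const idx = text.toLowerCase().indexOf(query.toLowerCase());
  return idx === -1 ? null : charToTime(a, idx);
}
```

A read-along view could reuse `timeToChar` to pick the paragraph that contains the current offset. If narration speed drifts from chapter to chapter, a per-chapter rate would probably be more accurate than a single global one.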

Practical daily-use wins (you run with this a lot)

  • Offline for runs (the planned Phase 4) — pre-download a book so a dead-zone run doesn’t stall. The real win for your use case; the only wrinkle is iOS storage limits for a full book.
  • Cross-device resume — position lives in localStorage now (per-device). A tiny DynamoDB store (you’ve got the muscle from analytics) would let you pause on the phone mid-run and pick up on the laptop.
  • Shelf progress — progress bars on the cover cards + a “Continue listening” card up top, so the shelf shows where you are in each book.
  • Sleep timer — obvious, easy, nice for bedtime listening.
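
For cross-device resume and shelf progress, here is a rough sketch of the two pure pieces of logic involved. It assumes each device saves a small record with a timestamp, and that the most recently saved one wins. The record shape and function names are hypothetical, and the storage layer (localStorage or a remote table) is deliberately left out.

```typescript
// Hypothetical record; field names are illustrative, not from the player.
interface SavedPosition {
  bookId: string;
  seconds: number;   // playback position in the audio
  updatedAt: number; // epoch milliseconds when this position was saved
}

// Last-write-wins: whichever copy was saved most recently is used.
// On a tie the local copy is kept.
function pickResumePosition(
  local: SavedPosition | null,
  remote: SavedPosition | null
): SavedPosition | null {
  if (local === null) return remote;
  if (remote === null) return local;
  return remote.updatedAt > local.updatedAt ? remote : local;
}

// Fraction between 0 and 1 for a progress bar on a cover card.
function progressFraction(seconds: number, durationSeconds: number): number {
  if (durationSeconds <= 0) return 0;
  return Math.min(Math.max(seconds / durationSeconds, 0), 1);
}
```

Last-write-wins is the simplest rule, but it can jump you backwards if an old device saves a stale position late. Taking the furthest position instead is another option, with a different trade-off when you deliberately rewind.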

Bookclub (you mentioned it)

  • Shared bookmarks/discussion — drop a timestamped bookmark (“discuss this”) that your bookclub friend also sees. Spoiler-aware progress sharing (“Jon’s at ch. 30”) is a fun twist. This is where the gated-multi-user angle could actually shine.
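
One way the spoiler-aware part could work is to show a listener only the bookmarks at or before their own position. This is a sketch under that assumption; the `Bookmark` shape and function names are hypothetical.

```typescript
// Hypothetical bookmark shape, for illustration only.
interface Bookmark {
  author: string;
  seconds: number; // audio position the bookmark points at
  note: string;
}

// Bookmarks the viewer has already listened past, in listening order.
// filter() returns a new array, so the input is not reordered by sort().
function visibleBookmarks(all: Bookmark[], viewerSeconds: number): Bookmark[] {
  return all
    .filter((b) => b.seconds <= viewerSeconds)
    .sort((x, y) => x.seconds - y.seconds);
}

// How many bookmarks are still ahead, so the UI can hint at them
// ("3 more notes waiting") without revealing their contents.
function hiddenBookmarkCount(all: Bookmark[], viewerSeconds: number): number {
  return all.filter((b) => b.seconds > viewerSeconds).length;
}
```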

The quiet payoff

  • Actually generate a Kokoro audiobook. We built the whole EPUB→TTS pipeline and never ran it — it needs an OPENROUTER_API_KEY. Picking a public-domain book and producing your first real Kokoro narration would close the loop and let you compare it against the commercial recording. That’s the thing the project was for.

My vote: “recap where I left off” and text-search→jump are the two that are both novel and uniquely enabled by what we’ve already built (extracted text + proven alignment). Offline is the most practically valuable for your runs.

Which thread pulls at you — the clever AI/search stuff, the run-practical stuff, the bookclub angle, or finally generating a real Kokoro book?
