MANIFOLD
Will calibre ebook manager have easy integration with text to audio apps by mid 2025?
Resolved N/A on Aug 3

Something like: enter your apikey to openai or elevenlabs, click book, click generate, audio is being produced and will show up

It doesn't have to be perfect but at least those basic steps without doing a bunch of manual hacking yourself.

  • Update 2025-06-30 (PST) (AI summary of creator comment): The creator is looking for an integration that allows a user to easily flip back and forth between reading the text and listening to the generated audio.

  • Update 2025-07-13 (PST) (AI summary of creator comment): The creator has specified that for a YES resolution, the integration should ideally treat the generated audio as another 'version' or format of the book (like epub, mobi, etc.).

An implementation that requires going through the 'edit book' UI to manage the audio is considered a point against a 'YES' resolution.

  • Update 2025-07-13 (PST) (AI summary of creator comment): In response to a user, the creator has clarified their original intent for a YES resolution:

    • The integration must produce a portable and sharable audio file (e.g., an MP3).

    • The desired outcome is a tangible audio file that can be kept and listened to on other devices, not just a real-time "read aloud" feature within the Calibre application.


Good luck @Ernie, let me know if it works well, as I have a project in mind where I can use this

@Ernie any chance we can resolve this?

@JoshDreckmeyr yeah, good Saturday project

@JoshDreckmeyr

trying this... having to open up the edit book UI seems weird. I had assumed calibre would treat the audio of a book as another "version" along with the existing concept of "version" like epub/mobi/pdf etc. Let's see how it goes

Seems fairly close to my intention. I'm getting latest calibre to try. The web version of piper seems just barely okay (compared to whisper etc) but it does seem like what I was pointing at.

I'll have to test the integration on the product side, though. I want to be able to easily just flip back and forth

@Ernie flip back and forth between ‘Read Aloud’ and reading the text yourself?

I just did that myself last night and found it easy and convenient (at least imo).

@snazzlePop @Ernie I like the ljspeech voice myself. I found some of the other “high quality” voices lacking.

@Ernie Hmm. It's allegedly playing although i can't hear anything. I could dig into windows error messages. But this doesn't seem close to what I wanted. I currently use a little script like this to take a text file and make mp3s out of it to listen to

https://github.com/ernop/make-audio/blob/master/make-audio.py
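For readers unfamiliar with this kind of workflow, here is a minimal sketch of a text-file-to-MP3 script. It is not the linked script and does not reflect its contents. The names `split_text` and `make_audio`, the 4000-character chunk size, and the `tts` callable (any function that turns a string into MP3 bytes, for example a thin wrapper around a hosted text-to-speech service) are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, List


def split_text(text: str, max_chars: int = 4000) -> List[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def make_audio(
    text_path: Path,
    out_dir: Path,
    tts: Callable[[str], bytes],  # hypothetical backend: text in, MP3 bytes out
    max_chars: int = 4000,
) -> List[Path]:
    """Write one numbered MP3 file per chunk of the input text file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    text = text_path.read_text(encoding="utf-8")
    written: List[Path] = []
    for i, chunk in enumerate(split_text(text, max_chars), start=1):
        target = out_dir / f"{text_path.stem}_{i:03d}.mp3"
        target.write_bytes(tts(chunk))
        written.append(target)
    return written
```

The output is a set of ordinary files (`book_001.mp3`, `book_002.mp3`, ...) that can be copied to a phone or shared, which is the distinction being drawn here against an in-app read-aloud feature.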

I couldn't get the calibre one to work, and if it did, it doesn't seem like what I want at all.

But, from my description, this seems hard to have known unless you knew what I was thinking. I had meant that we are producing a thing that we have, an audio file that's good and can be shared, listened to on the road etc.

But this seems like a classic NA.

bought Ṁ3,500 YES

calibre 8.0 added integration with the Piper local text-to-speech engine.

@snazzlePop looks like it was added in 7.18, just a day or two after market creation https://calibre-ebook.com/whats-new. So @Ernie will have to try it himself and see if it satisfies all the functionality he had in mind (looks like it doesn't let you use an api key for a higher quality service as in the description, but is strictly a local model?)

sold Ṁ841 YES

@Ziddletwix I sold some of my shares because I’m not sure how @Ernie will interpret it. Like you said, no api key is required, but it is certainly easy to use and it’s integrated into the calibre program.

@snazzlePop yeah i think it's at least a little bit ambiguous.

Something like: enter your apikey to openai or elevenlabs, click book, click generate, audio is being produced and will show up

It doesn't have to be perfect but at least those basic steps without doing a bunch of manual hacking yourself.

it doesn't satisfy the specific scenario described. but that exact scenario isn't required, it can be "something like" that scenario. i do think the description implies that there should be at least a vaguely similar degree of functionality. so if the piper text-to-speech local model is really really bad, then this integration is fundamentally different than what ernie described. since it's about "easy integration" with "apps"—i.e. not just whether there's any feature that nominally generates audio based on text, but rather whether calibre lets you leverage the new AI apps (which is notable bc they are so much better than what was available everywhere 5 years ago). if piper in practice is more like the oldschool terrible text-to-audio features that countless apps already had, then i wouldn't think it'd count (i.e. NO). if it could plausibly fill the same role as someone who had previously wanted to use openai/elevenlabs for text-to-audio, then it'd likely be YES.
