#voice synthasis
Digital Narration from Apple Books
Apple just launched a digital narration service using voice synthesis!
From the Forbes article:
The initial target: long-tail books that will never be worth paying a human narrator for.
“More and more book lovers are listening to audiobooks, yet only a fraction of books are converted to audio — leaving millions of titles unheard,” Apple says. “Many authors — especially independent authors and those associated with small publishers — aren’t able to create audiobooks due to the cost and complexity of production.”
Apple is releasing four voices to start, two female and two male. Voices are optimized for specific genres of books, so Jackson is intended for fiction or romance with a deep, somewhat husky voice, while Helen is a soprano designed for nonfiction and self-development.
If you click through to the Apple authors announcement you can hear samples of the four voices. They are, in my opinion, extremely listenable.
Understandably, I've seen a lot of people railing against the move. People are worried about putting voice actors and narrators out of work. Other commentators question listener fatigue and how much attention anyone can give an AI-generated voice over a long period. But it's also good for small authors: the cost of producing an audiobook (even if you do it yourself) is colossal. I think AI-voiced long-tail and self-published books will be really popular.
I know people (Ross, I'm looking at you) who use overcranked TTS voices to read PDFs and other nonfiction to them in the car while they are driving around. These voices are much more natural and will be popular if *built in* to devices in the near future.
When I was growing up in the '90s, talking AIs and computers were supposed to sound like HAL, Marvin the Robot or Stephen Hawking.
Instead we got the TikTok lady, which had its own crazy 'ethics in AI model creation' story a few years ago.
I was thinking recently that the TikTok lady has - at this point - got to be the most famous voice in the western world? An order of magnitude more famous than any living person surely?
Also I'm still laughing about the 'Boiling my husband alive in oil 🥰' post I reblogged yesterday.
Anyways, I built my own voice model last year using the tens of hours of podcasts I've recorded. I've been sneakily using it for replacement here and there in my new interview show. Unless they look at my working files, no one will ever, ever know. I wonder how many professional voice actors who narrate for a living also have their own custom models?
What's interesting to me about Apple's move is that the four voices are optimised for different genres right out of the gate.
The chirpy upward inflection of the TikTok lady voice is optimised to deliver meaning as clearly as possible inside the typographic container of TikTok's media environment.
Audiobook producers have, of course, been making casting decisions around content and narrator for years, but it seems significant to me that right out of the gate certain kinds of voices are being matched to certain kinds of content.
There's a big difference, of course, between hearing and listening (cf. Pauline Oliveros, R. Murray Schafer, Deep Listening etc).
Comprehension is always an active process. The philosopher Mortimer J. Adler talks about how teaching and learning are a reciprocal process:
“Just as teaching will not avail unless there is a reciprocal activity of being taught, so no author, regardless of his skill in writing, can achieve communication without a reciprocal skill on the part of readers.”
The same is also true of effective storytelling. There is the teller of stories and the person being told.
I want AI voice narration to be turned up to 11, tomorrow. I'm convinced the optimisation of human-like voices to deliver meaning could be pushed a lot further.
Why not create a human-like voice that is optimised to deliver meaning at 2x or 3x speed?
What does a synthesised voice that gets a willing and active listener to 'I know kung-fu' levels of comprehension even sound like?
At that level of speed and active comprehension, do we need to tweak for realism or optimise for something else?
#Ai#audiobook#audiobook narrator#narrator#human narrator#ai voice#tiktok#meaning#reading#reading comprehension#voice synthasis#apple#audiobooks#self publishing
Early-'80s clip about how electronic music legend Suzanne Ciani created the soundtrack and sound effects for the Xenon pinball game. Xenon was the first talking Bally pinball game and the first pinball game voiced by a woman.