Jay Rose Posted January 28, 2018
Lyrebird.ai is a Canadian company that uses AI to create new speech from sample recordings. They have an online demo: you record a minimum of one minute (they guide you through sentences, so the samples have a key), it runs the samples through a neural net, and then it'll create your voice saying anything you type.
Pros: almost real-time, with a web interface. Cons: still somewhat artificial sounding, but a lot better than previous while-you-wait examples. ... And this is just a first-gen beta demo, with a really small training set and no ability to tweak the results.
The company isn't posting anything about their technique, so I'm just guessing (from how it operates and from the principals' bios) that it's a neural net. They seem to be interested in selling their SDK to other developers, rather than offering a service to filmmakers... but that's also just a guess.
VASI Posted January 28, 2018
This is not good... Just saying.
Constantin Posted January 28, 2018
3 hours ago, Jay Rose said: "They seem to be interested in selling their SDK to other developers, rather than offering a service to filmmakers... but that's also just a guess."
No no, they are going to sell it to super PACs and the like, so these can create their own "undercover" videos, putting any line of speech they wish into their opponents' mouths.
BAB414 Posted January 30, 2018
Still no way to program subtext/sarcasm/inflection/intention... yet.
Jay Rose (Author) Posted January 30, 2018
BAB414, other artificial speech apps -- some I've posted here -- have controls for prosody and inflection. [I'm assuming you weren't being sarcastic in your post, but actually looking for information...]
What hasn't been done ... or rather, what hasn't been published yet ... is training a neural net to generate those wrinkles automatically. It still has to be done by a human operator. But as soon as someone comes up with a training library that's properly keyed for these elements, it'll happen. The mass market is there, for digital assistants with an edge. And building the library won't be hard, as soon as someone develops a consumer app that gives users a benefit for tagging the subtext.
BAB414 Posted January 30, 2018
I stand corrected. Thanks for the info. Very interesting stuff. A colleague of mine believes this technology will be used to recreate noisy/scratchy/unusable dialogue in post.
Jay Rose (Author) Posted January 30, 2018
BAB414 said: "A colleague of mine believes this technology will be used to recreate noisy/scratchy/unusable dialogue in post."
Add me to that list. I first proposed it about a dozen years ago, in a DV Magazine column.
Cleaning up bad production dialog is one thing. But these apps are (or soon will be) capable of making a convincing recording of anybody saying anything you want to type as an input. All you need is enough samples of their voice to use as training material.
And as I reported about a month ago, a different app can take video of someone and make an absolutely convincing new lipsync video of them speaking a new audio input. Demos are already online, and of course that's also still in beta.
"Oh brave new world, that has such [nonexistent] people in it!" -- Shakespeare