
A.I. Generated Dialog


Fox News Article

 

Has anyone seen these A.I.-generated fake Joe Rogan podcasts? I immediately started thinking about how it could be used for ADR or even to clean up reference tracks.

 

I remember reading years ago about Roger Ebert and how they used a large sample library of his DVD commentary to create a synthesized voice / sampler for him to use with a keyboard. 


Let’s just hope that SAG/AFTRA makes things difficult for any AI voice use, and hopefully synthesizing people’s voices becomes outright illegal worldwide. People really are taking AI too far. Sometimes I feel like people are self-fulfilling prophecies of doom. Didn’t we learn anything from 1984 or Skynet?


Hopefully discussions are well under way. Remember last year the controversy around the documentary “Roadrunner: A Film About Anthony Bourdain”?

 

Here's a brief and non-paywalled news story from The Guardian:

Anthony Bourdain documentary sparks backlash for using AI to fake voice

 

And here's a longer article that includes comments from the director and others from The New Yorker:

The Ethics of a Deepfake Anthony Bourdain Voice

 

Here's a short news story with some faked audio from the film:

 

 


  • 3 weeks later...
On 4/14/2023 at 5:21 PM, JonG said:

Let’s just hope that SAG/AFTRA makes things difficult for any AI voice use, and hopefully synthesizing people’s voices becomes outright illegal worldwide. People really are taking AI too far. Sometimes I feel like people are self-fulfilling prophecies of doom. Didn’t we learn anything from 1984 or Skynet?

Couldn't agree more!! You know productions will REALLY want to use AI for things like ADR, etc. SAG/AFTRA need to draw a line in the sand.

This kind of stuff should also fall under NIL (name, image, likeness) and have to be approved by the subject. Then again, productions will bury it deep in a contract that they can use AI to synthesize...


On 4/14/2023 at 5:21 PM, JonG said:

Let’s just hope that SAG/AFTRA makes things difficult for any AI voice use, and hopefully synthesizing people’s voices becomes outright illegal worldwide. People really are taking AI too far. Sometimes I feel like people are self-fulfilling prophecies of doom. Didn’t we learn anything from 1984 or Skynet?


I have a different take on whether voice synthesis should be banned. I can think of many legitimate uses that already rely on synthesized voices, and making those voices sound more human can be a good thing. Think of services for the blind, or automated telephone systems, Siri and Alexa, etc.


I do strongly believe your voice is your likeness and it should never be allowed for someone to synthesize your voice without your express permission.  

That being said, I’m sure you could find plenty of actors who would be happy to let a studio ADR a line or two for a show they star in, as long as there are clear limits on what the studio can and can’t do with that technology. SAG/AFTRA will have a lot of work to do in that regard.

Then what about an author who wants to release some of their old books as audio books, but maybe in their younger voice? Or who just doesn’t care to spend the time recording? I see no harm in letting them synthesize their own voice.

 

What about a company that allows voice actors to license their voice to be synthesized and get paid a royalty for every minute generated? I think this will become a lucrative market in the next decade or so, and with the amount of money being poured into generative AI I only see this sector growing too.


2 hours ago, Wandering Ear said:

What about a company that allows voice actors to license their voice to be synthesized and get paid a royalty for every minute generated? I think this will become a lucrative market in the next decade or so, and with the amount of money being poured into generative AI I only see this sector growing too.

Maybe…  but would you feel that listening to an audio book read by AI with the sound of a famous person’s voice is something you would buy? Isn’t the selling point that “your favorite person” is actually the one reading the book in the way that they interpret the story?

 

Maybe it’s just me, but I would probably have a hard time focusing on the story, being distracted by the fact that “it really sounds like so and so”. And it’s kinda creepy, too haha!

 

 


2 minutes ago, Johnny Karlsson said:

Maybe…  but would you feel that listening to an audio book read by AI with the sound of a famous person’s voice is something you would buy? Isn’t the selling point that “your favorite person” is actually the one reading the book in the way that they interpret the story?

Sure, there will always be some people who prefer the old way vs the new. 

Just like there are still some people who prefer the human touch of an artist doing their painting for their portrait. 

But most people will end up preferring the approach which costs 1% as much and is 100x faster, of simply snapping a photograph to create their portrait. 


1 minute ago, IronFilm said:

Sure, there will always be some people who prefer the old way vs the new. 

Just like there are still some people who prefer the human touch of an artist doing their painting for their portrait. 

But most people will end up preferring the approach which costs 1% as much and is 100x faster, of simply snapping a photograph to create their portrait. 

So diluting the value of art in order to save money? Haha! I'm not sure I agree this is the same thing as taking a photo vs painted portrait.

 

Check this out:

 

https://www.wsj.com/articles/i-cloned-myself-with-ai-she-fooled-my-bank-and-my-family-356bd1a3

 


1 hour ago, Johnny Karlsson said:

So diluting the value of art in order to save money? Haha! I'm not sure I agree this is the same thing as taking a photo vs painted portrait.

 

I am just pointing out that people will eventually end up choosing the cheaper and better option. (And believe me, in the eyes of directors/producers/editors the AI voice will not just be cheaper, but overall "better" too!)

 

Back in the day of painted portraits, portraits had both a utility value and an artistic value. If a painting was the only way to accurately preserve a person's image, then that utility value was very high! Once photography arrived, the utility value of painted portraits dropped to near zero relative to their artistic value.

 

That is why we saw a radical shift in painting styles in the 1800s and early 1900s: there was no longer the same value in perfecting your artistic craft in realism, so painters leaned more heavily into a more abstract artistic style, to set themselves apart by offering something unique that cameras didn't.

 

That's why Academism / Realism saw their peak in the 1800s and didn't last as a dominant force into the 1900s, because as photography got better and better we saw instead the rise of other non-realistic painting styles such as Impressionism, Symbolism, and Cubism that painters could use to set themselves apart from photography.

 

That was an understandable reaction to the rapid rise in technology and the threat it posed to them. I'm not so sure what the right response is right now to this new wave of technology, but people will need to figure out how to pivot into a new niche if AI takes over their current niche. For myself, I'm just glad I'm not a Voice Over Artist or a Stock Photographer! (But I'm consciously aware that in the long run I won't be immune to feeling the impacts of AI either, one of the many reasons I'm back part time at uni to upskill myself.)

 

1 hour ago, Johnny Karlsson said:


Yup, I read that a couple of days ago. I told my mother (who is almost at retirement age) many months ago to watch out for scams, and told her about how these AI methods are going to make scams 1000x more powerful than they used to be. Although I regard her as well above average when it comes to technological savviness (heck, she was even a COBOL programmer in her youth!), I was a little worried for her safety and wanted her to be aware of what is coming. And I keep on reminding her every few weeks; for instance, I sent her that WSJ article after I read it.


1 hour ago, Johnny Karlsson said:

Maybe…  but would you feel that listening to an audio book read by AI with the sound of a famous person’s voice is something you would buy? Isn’t the selling point that “your favorite person” is actually the one reading the book in the way that they interpret the story?

 

Maybe it’s just me, but I would probably have a hard time focusing on the story, being distracted by the fact that “it really sounds like so and so”. And it’s kinda creepy, too haha!

 

 


My hypothetical about an author wasn’t meant to say that I think it is better that way. I think the technology still has a long way to go before I can’t tell the fake from the real, especially in long form. Once it gets there, though... I don’t know if I’ll still care whether it’s the original voice or the synth voice, and the distraction of a new reality usually fades pretty quickly as we get used to the technology. If I don’t notice and can again get lost in the story, well, that’s what I want from an audio book anyway. Lots of audio books are not read by the original author and are very enjoyable.
As long as the author is in control of the process the same way they control the text, story, characters, etc., I think I would not have a hard time accepting it.

What I was trying to highlight was that I think it’s perfectly OK for that author to have the option to do so if they feel it is the right choice for them.
 

Or in other words, I am quite optimistic about this technology and what it can add to our art, despite the potential for abuse, which is always a possibility with new technologies and should be mitigated.
 


1 hour ago, IronFilm said:

 

That's why Academism / Realism saw their peak in the 1800s and didn't last as a dominant force into the 1900s, because as photography got better and better we saw instead the rise of other non-realistic painting styles such as Impressionism, Symbolism, and Cubism that painters could use to set themselves apart from photography.

 

That was an understandable reaction to the rapid rise in technology and the threat it posed to them. I'm not so sure what the right response is right now to this new wave of technology, but people will need to figure out how to pivot into a new niche if AI takes over their current niche. For myself, I'm just glad I'm not a Voice Over Artist or a Stock Photographer! (But I'm consciously aware that in the long run I won't be immune to feeling the impacts of AI either, one of the many reasons I'm back part time at uni to upskill myself.)

Just read a good article that presented the optimistic side of why AI won't be immediately replacing all of everyone's jobs: 

https://www.understandingai.org/p/software-didnt-eat-the-world 

I'm a little bit more pessimistic myself, as I think AI is going to have a huge impact over the next decade. But the author did a good job covering what jobs will still exist and why. 


On the topic of generative AI, check out this beverage advert made by AI:
 

 

It might seem hilariously bad right now, but compare this to what AI video looked like 6 months ago vs 12 months ago vs 18 months ago vs 3yrs ago?

The pace of improvement is ASTONISHING!!

Look at how fast AI photos improved over the last five years. Look at how good they are today. (basically, stock photography is going to be DEAD)

You might laugh at this video for being the garbage it is, but if you blink for a second and take your eye off the progress being made, you might discover it's become better than what many humans could do.

 

Here is Midjourney v3 vs v5, merely 8 months apart:

[Image: a lovebird and a parrot, with the caption 'Exact same prompt 8 months apart.']

 

Not difficult to foresee how fast this will improve.


 

In the future we will be watching the dreams of AI. If AI can make GoPro footage look like it came from an ARRI camera, make an amateur DP's work look like Roger Deakins', and make a sound mixer sound like Simon Hayes, then there will be an abundance of quality, like there is now... all that will be different is the spirit. It's quite exciting and spiritual to contemplate... what makes us human?


"SAG-AFTRA, the actors’ union, says more of its members are flagging contracts for individual jobs in which studios appear to claim the right to use their voices to generate new performances.

A recent Netflix contract sought to grant the company free use of a simulation of an actor’s voice “by all technologies and processes now known or hereafter developed, throughout the universe and in perpetuity.”

Netflix said the language had been in place for several years and allowed the company to make the voice of one actor sound more like the voice of another in case of a casting change between seasons of an animated production."

 

https://www.nytimes.com/2023/04/29/business/media/writers-guild-hollywood-ai-chatgpt.html


11 hours ago, Izen Ears said:

Was all of the sound on that also AI generated?

Not in this case, but audio is the relatively easy part of generative AI video.

 

For example:

 

https://time.com/6273529/drake-the-weeknd-ai-song/ 

 

Clearly that audio is miles better than the visuals from that beverage advert. It is the visuals which need to catch up. 

 


AI video just keeps on getting better and better:

 

The video-to-video feature is especially interesting: I could imagine a very rough video being made cheaply, then being heavily stylized to create a much more polished output for use in an advert.

 

A longer look at AI tools for video: 

 

 


  • 2 weeks later...

Unfortunately, it seems that sound engineers on set will no longer be needed... or rather, if there is any live recording at all, one person with a recorder and one microphone will be enough, and the rest will be fine-tuned by AI, and that's already the present... There will be no need at all for wireless systems and the like... I'm trying to find a single reason why a sound engineer with a car full of equipment would be needed, and I can't think of anything at all...


3 hours ago, humbuk said:

Unfortunately, it seems that sound engineers on set will no longer be needed... or rather, if there is any live recording at all, one person with a recorder and one microphone will be enough, and the rest will be fine-tuned by AI, and that's already the present... There will be no need at all for wireless systems and the like... I'm trying to find a single reason why a sound engineer with a car full of equipment would be needed, and I can't think of anything at all...

No engineer needed. Mic plugged directly into the camera. Only thing staving this off in the near future is unions, right? Right?


I don't see that at all.  Decades of feature films were recorded with a recorder and one microphone ... that didn't make our department any less critical.  We started using wireless because it helped us do more.  And AI will help us do even more.  The tools will change, but ultimately someone still needs to use those tools.

Maybe there's a world where expensive tent-pole features shift to a workflow that is entirely AI-based (this seems to be happening a bit already on the visual side, where less and less filming happens without VFX and environments are created virtually), but for the vast majority of productions it will still be cheaper to capture the actual performances as they happen than to try and re-create "better" performances in post.  Doing so effectively doubles the work that the actors have to do, and either way, a technician is needed to run the recording.

Not to mention, many, many genres like news, documentary, reality etc. rely on capturing a reality that is actually happening in a way that can't be recreated by AI in post.  And, most actors and directors are still going to prefer the "real" performances that happen when staging scenes.

Whatever happens, and whatever the workflow, there will always be a need for someone to be "responsible" for sound, and to have knowledge of the tools needed to realize whatever the creative vision.  In some cases, maybe the "sound engineer" will become some kind of AI technician as well, but the basic need to record audio well isn't going away.

 


21 hours ago, humbuk said:

Unfortunately, it seems that sound engineers on set will no longer be needed... or rather, if there is any live recording at all, one person with a recorder and one microphone will be enough, and the rest will be fine-tuned by AI, and that's already the present... There will be no need at all for wireless systems and the like... I'm trying to find a single reason why a sound engineer with a car full of equipment would be needed, and I can't think of anything at all...

Might still be a need for lav mics / earwigs / etc due to the demands of a scene, and the need for the director and others on set to be able to hear the performance. 

 

And no matter how powerful AI-assisted postproduction techniques might get, it's a different matter to have them done live. (The computational demands of doing it instantaneously, vs taking a minute or two, are massively higher.)

 

Plus there are cultural effects too, just because something is technically possible doesn't mean it will necessarily happen. Work culture tends to change a lot slower than technology does. 

 

That's why I'm very confident that 5yrs from now we'll still be needed and our jobs will still be mostly "more or less" the same. 

 

But if you ask about 10yrs, or 15yrs, or 25yrs from now?? That's much trickier for me to predict. 

 

It wouldn't shock me at all if, within my working life span, our job radically changes to the extent that people no longer care about quality. So long as they can hear "something" live on set (and even then, the live "something" might be radically cleaned up to the extent that quality doesn't even matter for that either), then that is all that matters. Because it will become the norm that not even one second of production audio will be used in the final edit.

 

Now, this is by no means certain to happen; I don't have a perfectly working crystal ball. But I would not bet against this outcome (and on the status quo) happening during my lifetime.

 

 

17 hours ago, The Documentary Sound Guy said:

And AI will help us do even more.  The tools will change, but ultimately someone still needs to use those tools.

 

But what if the tools make it so easy a monkey could operate it? (even a cameraman...)

 

Take portrait painters, for instance: there was a huge demand for them in the 18th century! How else could you capture someone's image??

 

A quote: 

" in 1669 the Secretary of the French Academie des Beaux-Arts proclaimed portraiture to be the second most important genre of fine art. That proclamation, needless to say, drove many of the greatest painters of the time towards the art of portraiture."

https://www.chairish.com/blog/complete-history-portraiture-artists/

 

But once cameras arrived in the 1800s (especially as they got better) the need for portrait painters & sketchers quickly plummeted.

 

When the first reports about photography came out in 1839, one Dutch periodical published a letter warning of “an invention…which could cause some alarm to our Dutch painters. A method has been found whereby sunlight itself is elevated to the rank of drawing master, and faithful depictions of nature are made the work of a few minutes.”

 

Of course a few such artists still exist today, but nobody would seriously suggest portrait painting as a viable career.  Thus the decline of the traditional painted portrait.

 

Today you'd just snap a photo of yourself with your cellphone, rather than commission a portrait.

 

17 hours ago, The Documentary Sound Guy said:

Maybe there's a world where expensive tent-pole features shift to a workflow that is entirely AI-based (this seems to be happening a bit already on the visual side, where less and less filming happens without VFX and environments are created virtually), but the for the vast majority of productions it will still be cheaper to capture the actual performances as they happen than to try and re-create "better" performances in post. 

 

I disagree, as it might be the mega budget productions which will have the money to put into R&D to develop innovative new AI techniques. Just like how CGI effects first happened with mega budget films such as Jurassic Park before filtering down to low budget films, such as Gareth Edwards' "Monsters" (2010), which was shot with the Sony EX3, with Edwards doing the visual effects himself in his bedroom!

 

But who knows, it is tricky to accurately predict which niche of the industry will embrace all of the new AI audio tools first. 

 

17 hours ago, The Documentary Sound Guy said:

Doing so effectively doubles the work that the actors have to do, and either way, a technician is needed to run the recording.

 

Not at all, AI can reduce the workload of actors. 

 

17 hours ago, The Documentary Sound Guy said:


Not to mention, many, many genres like news, documentary, reality etc. rely on capturing a reality that is actually happening in a way that can't be recreated by AI in post. 

 

Personally, I suspect Reality etc will be one of the early adopters of new advanced AI techniques. 

 

Look at the terrible quality they have to deal with already, and how often you see subtitles used in reality TV!! They'd love to replace that with crystal clear audio on the cheap.

 

17 hours ago, The Documentary Sound Guy said:

And, most actors and directors are still going to prefer the "real" performances that happen when staging scenes.

 Will the director prefer having full creative control during the edit? (which AI can give them) 

 

Or will they prefer being locked into only being able to choose from the five performances/takes that the actor gave them during the shoot itself?


30 minutes ago, IronFilm said:

Will the director prefer having full creative control during the edit? (which AI can give them) 

 

Or will they prefer being locked into only being able to choose from the five performances/takes that the actor gave them during the shoot itself?

I think that most directors will prefer to get it right on the set and not "waste" time to fiddle around with AI for every scene. Of course there might be moments which they'll like to alter. But that's what ADR is for - it's not just a technical thing, but also a creative exchange between director and actor/actress where both sides bring their ideas and interpretations to the table.

And I think that the actors' guild will fight against actors' voices being copied by AI, and to keep the voice as the intellectual property of the actors.

