
Adobe technology sneak: VisualSpeechEditor


Jim Feeley


The demo doesn't give us much to go on, but I think they are indicating pitch... or at least identifying by color the predominant band (maybe an octave wide) in the audio at any one time.
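
To make that guess concrete, here is a rough sketch of what "color by predominant octave band" could look like. It is only an illustration of the idea, not Adobe's method: the function names, the 62.5 Hz starting frequency, the frame size, and the palette are all assumptions made up for this example, and the naive DFT is slow but keeps the code dependency-free.

```python
import cmath
import math

# Hypothetical palette: one color per octave band, lowest band first.
PALETTE = ["red", "orange", "yellow", "green", "cyan", "blue", "violet", "magenta"]


def dominant_octave_band(frame, sample_rate, f_low=62.5, n_bands=8):
    """Return the index of the octave band holding the most energy in `frame`.

    Band k spans [f_low * 2**k, f_low * 2**(k + 1)) Hz. Returns None for silence.
    """
    n = len(frame)
    energy = [0.0] * n_bands
    # Naive DFT over the positive-frequency bins (slow, but needs no libraries).
    for b in range(1, n // 2):
        freq = b * sample_rate / n
        if freq < f_low:
            continue
        band = int(math.log2(freq / f_low))
        if band >= n_bands:
            break
        coeff = sum(x * cmath.exp(-2j * math.pi * b * i / n)
                    for i, x in enumerate(frame))
        energy[band] += abs(coeff) ** 2
    if max(energy) == 0.0:
        return None
    return energy.index(max(energy))


def color_track(samples, sample_rate, frame_size=256):
    """Assign one palette color to each non-overlapping frame of `samples`."""
    colors = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        band = dominant_octave_band(frame, sample_rate)
        colors.append("black" if band is None else PALETTE[band])
    return colors
```

Fed a steady tone, every frame comes back the same color; fed speech, the color would presumably shift as the energy moves between bands. Either way it carries far less information than a full spectrogram.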

 

So they've invented a very simplified, gross spectrogram. That is, a gross version of the more refined display mode that the free, open-source Audacity has had for years. If Audacity could only scrub, it would blow this new Adobe feature out of the water.

 

And don't worry... it won't turn anyone into an editor any more than a waveform display did. But the tutorials supplied with this new program will probably be as bad as the tutorials with current waveform-displaying NLEs, which tell you to edit music by looking for a big blob that could be a drum beat.

 

"Professional sound editors" edit sound, not blobs on a screen. Putting the blobs in color might save a little time, but it's not how you find the cuts.

 

Actually, Audacity's (or anybody's) spectrogram is potentially a lot more useful than this simplified display. If the track is tall enough, you can see the characteristic formant patterns of different vowels. And if you know what the words are, you can even read the phonemes to a degree. But speaking as a sound editor who studied phonetics: I still find it a lot faster to scrub for the phonemes, rather than trying to find their lines on a spectrogram.

 

 

(By the way, props to Audacity for also being one of the few waveform editors with a log waveform display mode as well as a voltage-linear one. And the ability to choose different display modes for different tracks in the same project. Ah... if only it could scrub and mark as quickly as PT and Nuendo...)
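
For anyone who hasn't compared the two modes, a minimal sketch of the difference follows. This is not Audacity's actual code; the function names and the -60 dB floor are assumptions chosen for illustration. The point is simply that a log (dB) view gives quiet material far more screen height than a voltage-linear view does.

```python
import math


def linear_height(sample, half_height):
    """Voltage-linear view: pixel offset proportional to the sample value (-1.0..1.0)."""
    return sample * half_height


def log_height(sample, half_height, floor_db=-60.0):
    """Log (dB) view: pixel offset proportional to the level in dB above `floor_db`."""
    magnitude = abs(sample)
    if magnitude == 0.0:
        return 0.0
    level_db = 20.0 * math.log10(magnitude)
    if level_db <= floor_db:
        return 0.0  # anything at or below the floor is drawn on the center line
    fraction = 1.0 - level_db / floor_db  # 0.0 at the floor, 1.0 at full scale
    return math.copysign(fraction * half_height, sample)
```

With a track 200 pixels tall (`half_height=100`), a sample at -40 dBFS (0.01) sits about one pixel off the center line in the linear view but roughly a third of the way up in the log view, which is why low-level breaths and room tone become visible.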


The idea of overlaying a transcript to a waveform isn't bad, though.

First, read the transcript, setting markers on the fly. Then, edit as usual.

I don't think software can be any good at smoothing background jumps, so you'd still need someone who knows their job.

 

This is not for editing a scripted scene, is it?


So this is what Adobe has been doing instead of fixing the myriad of things keeping Audition from being a really great audio program?  Like automation snap-shotting?  Like untangling the messes that the app's duelling automation schemes make?  Like figuring out how to get enough tracks with enough auto data on the screen at once to allow the mix of a project of scale?   Like making exportable change lists?  This guy is a brilliant programmer no doubt, but has fallen victim to Ocker's First Fallacy:  "if I can do it, you can do it".  I don't see how I would use his system at all in the course of fine niggly dialog editing of material that had a lot of problems, in performance, with a nasty BG and possibly with other issues like lav clothing noise, wind rumble, boom handling noise etc thrown in.  He's shooting fish in a barrel--two exposed lav mics on a TV soundstage.  (And thus bringing to mind "Berger's Law": it works the best when you need it the least.)  If they want to put the colors-vs-pitch thing into Audition and PP that's fine, but please allow us to turn it off--it would kind of suck for FX and music editing, be a distraction.

 

philp


So this is what Adobe has been doing instead of fixing the myriad of things keeping Audition from being a really great audio program? 

 

Ya, they have some work to do with Audition. It turns out the presenter is Andy Moorer. You know: CCRMA, SoundDroid, SonicSolutions NoNoise, and for a dozen years Adobe. But he was in the (probably no-longer extant) DVD group and I don't know if he's involved with Audition. More on Moorer:

 

http://en.wikipedia.org/wiki/James_A._Moorer

 

http://www.jamminpower.com/main/articles.html

 

He's the real thing. Perhaps his claims were a bit of marketing coaching and (somewhat deserved) hubris. My guess is his team isn't doing their core development on an iPad...so there might be a lot more going on behind the scenes that we haven't yet seen. 

 

BTW- here's another (local to me) group looking into visual/content-aware audio editing tools. I think a couple people from this project are now at Adobe.

 

Content-based tools for editing audio stories

http://vis.berkeley.edu/papers/audiostories/

 

 

(By the way, props to Audacity for also being one of the few waveform editors with a log waveform display mode as well as a voltage-linear one. And the ability to choose different display modes for different tracks in the same project. Ah... if only it could scrub and mark as quickly as PT and Nuendo...)

 

An aside: That log display helped my high-school son with last year's science fair project. He compared the harmonic structure of different saxophone reeds. The different strength reeds have significantly different sonic profiles; Audacity really helped him understand and display those differences.


I know who Andy M. is.  And I think his genius could be applied to more interesting and useful things in the audio world.  Re Audition--Adobe needs to amp it up or mothball it.  PP does some sophisticated audio things that Audition can't.  Re the presentation--the audio guys needed something visual to show, I guess, and they found it.  Note how the Adobe cheerleaders work the crowd--a roomful of Adobe PP and PS etc users and programmers and marketers are really that excited about basic dialog editing?

 

philp


Phil, I figured you know who Moorer is. But I also figured some of us don't (I initially thought the presenter was another seminal guy). And like you say, it seems the Adobe MAX conference audience is fairly visually oriented, so that probably limits what the audio team can successfully show. But I do wonder if there's something cooler running under the hood here (and perhaps that cooler thing will end up in Adobe's visual editing tools like Prelude and Premiere...who knows?). A couple years ago they demoed what became Audition's Sound Remover, that spectral editing tool, which is visual and demoes well. And that segues to your main point.

 

I haven't found Sound Remover to be super helpful in my work. Could be user error, but compared to Izotope RX, well... It really seems like there's an opening right now for Audition to do well, and the problems/issues seem knowable. But will their priorities match those of people deeply immersed in audio post? 

 

 

Jim

PS- I need to disclose that I've done some consulting work for Adobe, though not directly with the Audition team.


I've worked for them too.  I mentioned the need for the audio guys at any company (this was explained to me by Apple guys actually) to come up with stuff that LOOKS cool in a demo because the higher ranking marketing people are all visually oriented, and things that help unfashionable types like sound editors do their jobs better but that don't look especially interesting are automatically sent down the to-do list in favor of more cool visual fx etc.  If this gets the Audition guys some budget to fix the real issues in their app (or what I think are the major issues) then great.  For itself, who is it even for?

 

Agree about Sound Remover--not a patch on RX.

 

philp


  • 2 weeks later...

Not ready for Broadway, but it’s a very thought-provoking, if not a bit impatient, presentation.

 

During a period of “I just can’t edit dialogue anymore,” I worked for Sonic Solutions. Andy Moorer is truly scary smart. He was in Marin County; I was in Paris, so no regular interaction. But those quarterly meetings kept me aware of the “out there” ideas behind the machine.

 

Having Andy fronting a product is not necessarily a sign of imminent success, but it shouts that it’s worth your while to pay attention.

 

I think it will be a while before the visual speech editor steals our jobs. For now I use the more earth-bound but very clever ReVoice Pro from SynchroArts for those impossible ADR and alternate take moments. No speech-to-text, but very smart matching.

 

Hopefully I’ll be out to pasture before “everyone can be a sound editor.” I can’t compete with that.

