Jay Rose

Everything posted by Jay Rose

  1. Re-edit of film years later using 5.1 stems

    Your plan makes sense, except I wouldn't totally throw out LFE on the stereo mix... there might be something important there. And there might be some level trimming necessary if the scene-to-scene flow has changed. The unanswered question, of course, is will the new edit disrupt any of those stems? I don't know what cutting they did, but music frequently gets mangled both inside and across scenes, and bgs can have abrupt shifts if there are edits within a scene. That might take some bandaids -- or at least, offsetting the edits -- before you mix.
  2. The "Best client/producer/agency quotes" thread

    Or brag that you know a pix editor with a new 'visual noise reducer' plug-in, and they should be using your friend instead of the editor they've booked.
  3. The "Best client/producer/agency quotes" thread

    VO for large financial firm, making their first foray into the Salt Lake City media market. Ad manager, straight arrow Boston financial type: "That's good, but can you make it sound more Mormon?"
  4. Content Jobs

    Forgive an old postie who stopped doing location mixing a long time ago: When did the term become 'content job'? A web search on that exact phrase suggests it refers to people who manage, edit, grab user-generated stuff from the web... but not a kind of shoot. (Without 'exact phrase' clicked, it gives me a bunch of job listings for things like web content creator.) Or is this a special kind of job that's supposed to give you contentment for little money, because it's "new media"...
  5. The "Best client/producer/agency quotes" thread

    "Can you make her sound like I love her?" see article...
  6. Sound sync advice using film 0.01%

    I won't address why you're shooting micro-budget on 16mm. Surely stock / lab / fxr costs are going to be significant, but that's your choice. (The last true low-budget indie I worked on that used film was in 2002... it was a very short piece, and the experienced videographer specifically wanted to shoot 35mm for the film experience.)

    So let's address the pull-up / pull-down: it's necessary when 24 fps film gets transferred to NTSC video, which runs just a tiny bit slower than 30 fps. (The 24<>30 conversion introduces its own strangeness to motion, but at least that ratio is integral, so audio sample rates don't change. It's the 30<>29.97 fps difference that means you have to slow down the sound.)

    1. It's not absolutely necessary that you compensate. If you're doing short takes, you can sync each one; the drift at the end of 30 seconds will barely be noticeable. If you then have cuts so individual camera angles are even shorter, you can nudge to compensate. I've worked for major post houses where, in the era when spots were shot S16 and finished on NTSC, they didn't even bother with pull-down.

    2. Using FCP can be an issue. Years ago, FCP had a known, uh, feature where it would apply 0.1% speed changes whether you wanted them or not, depending on the drop-frame settings. (Drop-frame rates have nothing to do with pulldown. The frames are the same length; it's how those equal-length frames are numbered that makes the difference.) Many productions were brought up short -- with complaints of "your mix is out of sync" -- until we got a handle on this. I don't know if it still exists in the version you're using, or whether other NLEs have had this problem. FWIW, just about every professional DAW and most audio editors have a function to compensate for pull-up or -down.

    3. If you do need to compensate, you don't have to do it during production. You can do it in post, before editing.

    4. If you're going fully old-school, shooting film and then editing 16mm workprint against fullcoat, it's a don't-care. So long as the production track playback speed when you transfer to fullcoat is the same as what you recorded, you'll be fine.
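    The pull-down arithmetic above is easy to sanity-check yourself. A minimal sketch (the frame rates and sample rate are the standard NTSC values, not anything project-specific):

```python
# The 30/29.97 ratio is the whole story: audio must slow down by the
# same factor the picture does, or it drifts out of sync.

PULLDOWN = 30 / 29.97          # ~1.001, i.e. a 0.1% speed change

# 48 kHz audio, pulled down to stay in sync with film slowed
# from 24 to 23.976 fps:
pulled_down_rate = 48000 / PULLDOWN      # exactly 47952.0 Hz

# Drift if you DON'T compensate, over a 30-second take:
take_seconds = 30
drift_ms = take_seconds * (PULLDOWN - 1) * 1000   # ~30 ms, just under a video frame

print(pulled_down_rate, round(drift_ms, 2))
```

That ~30 ms at the end of a 30-second take is why short takes can get away without compensation, and why anything longer can't.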
  7. Neural Networks for Audio: how they work

    Mike, Do you mean it sorts character A's voice from character B's? Or that it sorts dialog from other noises like footsteps and bird calls (which are usually immune to conventional noise reduction)? Both are theoretically possible with NN, but I haven't heard of anyone doing the former. It would take a lot of training. The latter is commercially available in a few products now.
  8. Audionamix's TRAX Pro SP and the Dialog Isolate module in iZotope RX6 are kind of amazing: they use neural networks to clean up production tracks in ways we've never been able to before, and they can even give you a stem with the stuff they took away from dialog (like a clean bg track, or just the environmental music). Far better than any of the multiband expansion noise reducers or other algorithmic techniques we've been using for a couple of decades.

    They can also seriously screw up your track. Just like any other processing.

    Both manufacturers graciously gave me large chunks of Skype time with their senior developers, so I could write an article about the techniques for CAS Quarterly. The article appears online today, and will be mailed soon. We've also posted a web page with downloadable samples of actual processing of production elements. (If you go to the web page, please download the AIFFs. The posted mp3 files are just for orientation, and distort some of what the processors are doing.)

    Fellow CAS member Stephen Fitzmaurice added a sidebar with initial impressions of the Audionamix in his mix theater. Detailed reviews will be coming in a future issue. The article is in the Quarterly at cinemaudiosociety.org, or as a downloadable pdf at jayrose.com.

    This stuff has been blowing my mind. Please comment. (On the technique, not on my mind; that's a lost cause.)
  9. iPhone 7 mic

    I picked up a Focusrite iTrack ($130) so I could use good mics with my phone and iPad. Works brilliantly... just be aware it has to be running before you launch the audio app.
  10. Neural Networks for Audio: how they work

    NewEndian, thanks for the link. That's incredible stuff. Off the top of my head, I suspect iZotope and Audionamix didn't use GAN because 1) it's bleeding edge and these products have been in the works for a year, 2) the infrastructure for commercial development -- like easily purchased AWS training -- isn't there yet (I'm sure it'll be available soon), 3) the challenges of time-variant audio are so different from the xy arrays of image processing, and 4) the immediate market for image manipulation is so much bigger than that for audio manipulation. Visual bias strikes again!
  11. 96khz to 48khz converter

    Sound Grinder is a fast and powerful batch conversion utility.
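    For anyone who'd rather script the conversion than use a utility, 96k→48k is an exact 2:1 ratio, so the job reduces to anti-alias filtering plus decimation. A hedged sketch using SciPy's polyphase resampler (this is a generic illustration, not how Sound Grinder works internally):

```python
# 2:1 downsample from 96 kHz to 48 kHz. resample_poly applies an
# anti-aliasing lowpass before decimating, which is the part a naive
# "take every other sample" approach would get wrong.
import numpy as np
from scipy.signal import resample_poly

sr_in, sr_out = 96000, 48000
t = np.arange(sr_in) / sr_in                 # one second of audio
x = np.sin(2 * np.pi * 1000 * t)             # 1 kHz test tone at 96k

y = resample_poly(x, up=1, down=2)           # exact 2:1 ratio
assert len(y) == sr_out                      # one second at 48 kHz
```

For batch work you'd wrap this in a loop over files; the resampling itself is the same call regardless of content.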
  12. Another make-an-actor-say-anything app

    Add me to that list. I first proposed it about a dozen years ago, in a DV Magazine column. Cleaning up bad production dialog is one thing. But these apps are (or soon will be) capable of making a convincing recording of anybody saying anything you want to type as an input. All you need is enough samples of their voice to use as training material. And as I reported about a month ago, a different app can take video of someone, and make an absolutely convincing new lipsync video of them saying a new audio input. Demos already online, and of course that's also still in Beta. "Oh brave new world, that has such [non existent] people in it!" -- Shakespeare
  13. Lyrebird.ai is a Canadian company doing AI creation of new speech from sample recordings. They have an online demo, where you record a minimum of one minute (they guide you through sentences, so the samples have a key), it runs the samples through a neural net, and then it'll create your voice saying anything you type. Pros: almost real-time, with a web interface. Cons: still somewhat artificial sounding, but a lot better than previous while-you-wait examples. ... and this is just a first-gen beta demo, with a really small training set and no ability to tweak the results. The company isn't posting anything about their technique, so I'm just guessing (from the operation and from the principals' bios) that it's NN. They seem to be interested in selling their sdk to other developers, rather than offering a service to filmmakers... but that's also just a guess.
  14. Another make-an-actor-say-anything app

    Bab414, Other artificial speech apps -- some I've posted here -- have controls for prosody and inflection. [I'm assuming you weren't being sarcastic in your post, but actually looking for information...] What hasn't been done ... or rather, what hasn't been published yet ... is training a neural net to generate those wrinkles automatically. It still has to be done by a human operator. But as soon as someone comes up with a training library that's properly keyed for these elements, it'll happen. The mass market is there, for digital assistants with an edge. And building the library won't be hard, as soon as someone develops a consumer app that provides a benefit for users tagging the subtext.
  15. Reverb in exteriors

    It easily could have been. They were so far away, you couldn't see their mouths. Chances are they were recorded wild (I won't guess whether it was during production or post, but post would certainly have been cheaper). The verb was definitely added in post. I'd prefer to believe the director requested its wetness and wasn't happy until it sounded that way, because the rerecording mixer probably knew better.
  16. I was watching one of this year's screeners last night, on a calibrated system in a good room. In one scene of this action/drama, the protagonists are walking through a clearing in a large forest. There's a lot of snow covering everything. There are no mountains or large buildings in the scene; presumably from the plot, there aren't any nearby. One of our guys hears the enemy's voices, coming from the side. They turn around and spot the enemy party - maybe half a dozen men - a great distance away. We hear the enemy soldiers' voices at a reduced volume but clearly, and with a lot of complex interior reverb.

    If any exterior shouldn't have reverb, it's this one. Snow sucks up reflections, and the only things that could have been reflecting sound were tree trunks. Long distances in air cause high-frequency attenuation from friction, which is why very distant thunder rumbles rather than claps. Perhaps the attenuation wouldn't have been as great as usual because the cold air was denser than normal... but there'd be some.

    This wasn't a case of an unrealistic effect being needed because reality sounds strange, like the necessity to sometimes put a 'whoosh' under a rocket ship in a vacuum. (Or to ignore the speed of sound [in a vacuum?] when blowing up a planet.) Level and eq could have sold the distant dialog, just like they do in a lot of other films.

    So, soundies:

    1. Is exterior wet reverb (as opposed to a few distinct slaps from buildings) becoming the new normal? Are we back to the early days of talkies, when outdoor dialog was pushed through the studio's echo chamber because "everybody knows there are echoes outdoors"? Are there other current examples?

    2. Has this come up when you're mixing a film? If so, what was the discussion? What arguments did the director have other than "just do it"?

    3. Or am I a curmudgeon for still believing in physical laws?
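    The "level and eq" argument is simple physics. Back-of-the-envelope numbers for a scene like this, assuming free-field spreading (the 200 m figure is my illustrative guess at "a great distance", not anything stated in the film):

```python
# Straight inverse-square loss: 6 dB per doubling of distance.
# Air absorption comes on top of this and rises steeply with
# frequency (roughly with f^2 in the classical term), which is why
# distant thunder rumbles instead of cracking: the highs arrive
# attenuated, the lows mostly don't.
import math

def spreading_loss_db(distance_m, ref_m=1.0):
    """Free-field level drop relative to the reference distance."""
    return 20 * math.log10(distance_m / ref_m)

distance = 200.0                              # assumed, for illustration
print(round(spreading_loss_db(distance), 1))  # ~46 dB down from 1 m
```

In other words: heavy level reduction plus a high-frequency rolloff would have sold the distance on their own, with no reverb needed.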
  17. Reverb in exteriors

    I don't know if the snow was supposed to be soft or not. But it sure looked fluffy. And we didn't hear the protagonists' footsteps crunching. Not even cornstarch. If there was a layer of ice, I'd expect it to reflect highs and absorb lows. Unless the ice were a couple of cm thick... in which case the scene would be about skating rather than walking.
  18. That can also spark a discussion of language that has survived the obsolescence of the technology it refers to: Does anybody still "dial" a phone? Do kids even know where that word comes from? How about "dial tone"... which doesn't exist on the small phones most people carry. And those "telephones" do a lot of things that aren't related to hearing things over a distance... I've heard kids saying "repeating like a broken record". What's one of those?

    My favorite is when I discovered the derivation of "wired", when it means "energetic, charged, hyper". It appears to come from the early history of our craft, when theaters were bragging about now being "wired for sound". That three-word phrase is almost a hundred years old -- and now doesn't refer to copper or loudspeakers -- but the new usage of both versions is found in Urban Dictionary.

    And coming up in a few years: Algorithm, which Wikipedia defines as an unambiguous specification of how to solve a class of problems. How could that possibly apply to what AI engines do in their hidden layers?

    Other examples? Branch to a different thread?
  19. Headphone

    +1 to 7506 in the field for spotting problems, if that's what you've gotten used to. But +99 for the equalization warning. It's not just that the curve is different from speakers in air; it's that we listen differently to speakers in air! Dynamics are affected a lot because of how we attend to the sound; and while the stereo soundstage is more-or-less the same, the brain misses the localization cues from tiny head movements. The only time I've found quality decisions on headphones to be accurate is when I'm sure the viewer will be watching with headphones. Or perhaps in a closed, nearfield kiosk.
  20. Larry, they were used when you subscribed to a non-AT&T long distance service (I used one for MCI) and the phone company charged stiff fees for touchtone, so your actual desk phone had a dial. This even though tone dialing was cheaper for them to support.
  21. Sound is no longer respected on set?

    Edward, thanks for the compliment. Writing a book that's readable and useful is a serious undertaking; trying to do it as a background task can take a year. I developed the "Half the Movie" proposal with Randy because 1) he's got insights that I think need to be shared with a lot of people in every craft (but particularly producing and directing), and 2) the publisher we approached would market and distribute it widely enough that it would reach those people... and net us a few bucks for our efforts. Since that project doesn't seem to be on the horizon, I'm paying it forward with consulting for young filmmakers, free tutorials and occasional custom software gadgets on my websites, contributing to an inner-city kids' media workshop, writing for the CAS Quarterly (detailed feature on neural nets coming next issue), developing a course for one of the large Boston-area universities, and -- of course -- my other books. Those latter two activities also get me a few bucks... which lets me pay my mortgage and do a couple of other necessary things. Besides, I've talked to people in our industry who've self-published. Not for me. The two websites are plenty.
  22. Fascinating article in today's NYT about neural networks generating still images of faces with no 'uncanny valley'. But buried in that article is a reference to work at the University of Washington last summer... that automatically edits lips to match a different track! Literally puts new words in someone's mouth. On a computer screen, sync looks absolutely realistic. Resolution might not be enough for a big screen... but these things tend to leap forward quickly. Here's a link just to the UW demo. They took some real Obama speeches, and put them into multiple other Obama faces. Same speech, many different visual deliveries. The article doesn't mention what could happen when you edit the source speech to say something new. But heck, good dialog editors have always been able to change what someone says, on the track. Now a computer can make the target individual appear on-camera, saying the edited version! NYTimes full article link.
  23. Sound is no longer respected on set?

    About the 'moment passing' for Randy's and my book: I'd love to pick it up again. But it needs a big time slice from both of us. I can't think of anyone else with Randy's combination of practical experience, creative and tech chops, historic perspective, and writing/explaining ability. Besides, he owns half the proposal. I'm semi-retired now, doing only the things that come in over the transom, and a few low-budget indies for small money when the film and filmmaker interest me. Plus some contract writing -- mostly manuals for high-end broadcast gear -- which pays pretty well (the way I do it). So I could conceivably work on a book now. But Randy's still very busy with film commitments. And I wouldn't do this project without him. I don't even want to waste his time asking about it now. But if you're in the SF area and see him regularly, you might drop a few hints... -- I think it's Larry Niven who said something like "A book collaboration is a partnership where each party does 75% of the work." So it takes a real commitment.
  24. Sound is no longer respected on set?

    Six years ago, Randy Thom and I developed an idea for a book called Half the Movie. Aside from the in-joke of the title, it was to be about practical--as opposed to film-theory--sound design: what a director / producer / sound designer should be considering, at each step of the process; techniques for thinking creatively about sound; and examples from big and little films. We wrote a seven page proposal and outline. My publisher liked the idea and gave us a contract. Unfortunately neither Randy nor I had time to actually write the thing, so we bagged the project. I think now its moment may have passed. -- On the other hand, I've been really impressed with the creative design in a few of this year's awards screeners. Some people are still doing good work, even if it's an uphill battle.
  25. A question on MS recording for film

    On a few commercial gigs in difficult rooms, I've used a setup with a hyper very close to a group talking, and an omni at some distance. Then shuffled them in post as if the omni were a side mic in a proper m/s pair. Worked great, and since this was broadcast, I knew the LR info would disappear on mono receivers. ...with two gotchas: 1. I was also doing the post, so there was no question how to handle the tracks I'd recorded. 2. These were radio spots. I wanted a sense of real-world stereo in that awkward space. But with a large portion of the FM audience listening in mono, I didn't want anything that could sum awkwardly. Never tried it on TV. And wouldn't dare try it on a film.
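    The shuffle above is just a standard M/S decode with unconventional sources. A sketch of the matrix (treating the close hyper as Mid and the distant omni as Side; the signal names and the 0.5 side gain are my assumptions, not the exact settings used on those spots):

```python
# M/S decode: L = M + g*S, R = M - g*S. The side gain g sets stereo
# width. On a mono receiver L and R sum, the side terms cancel, and
# you're left with just the close hyper -- which is why the trick is
# mono-safe for FM broadcast.
import numpy as np

def ms_decode(mid, side, side_gain=0.5):
    left  = mid + side_gain * side
    right = mid - side_gain * side
    return left, right

mid  = np.array([0.5, -0.2, 0.3])    # close hypercardioid
side = np.array([0.1,  0.4, -0.2])   # distant omni, used as "side"
L, R = ms_decode(mid, side)
mono = (L + R) / 2
assert np.allclose(mono, mid)        # side info disappears in mono
```

Note this isn't a true M/S pair (the omni isn't a figure-8, and it isn't coincident), which is exactly why it only works when the person decoding knows what the tracks are.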