96kHz -- Pros and Cons?


Just in the last couple of weeks, my post house received a high-budget 35mm commercial for dailies, with the sound recorded at 96kHz (24-bit).

My eyebrows were raised by this. Internally, our plant is 100% 48kHz AES/EBU digital audio: the router, every DA, every VTR, every server, every Avid, and every mixing console in the building is only capable of handling 48kHz. Technically, the Fostex DV-40s and 824s we use can play back 96k, but I'm not sure there's a point to it. (We do have one or two newer Yamaha consoles that can handle 96kHz, but we'd still have to SRC down to 48k before the signal left the room.)

My question is two-fold:

1) how many mixers are using 96kHz to record dialog in the field?

2) is there a real-world difference between 96kHz and 48kHz?

Personally, my experience is that anything above 48kHz for dialog makes no sense. I can understand using 96kHz or even 192kHz for an orchestral recording or for gathering sound effects, and I readily accept the advantages of 24-bit depth for dynamic range, noise floor, and other issues. But I'm not convinced going to 96kHz buys you anything except a bigger file. And in terms of a post workflow, it forces us to go analog, and everything still goes down to videotape at 48k (even in HD). Two re-recording mixers told me they've never had to deal with 96k for features or TV in Pro Tools. It's one of those things that's "theoretically possible" (as one put it to me tonight), but "as rare as a dog that speaks Norwegian."

The spots in question, BTW, were the usual combination of booms and lavs, under hectic conditions, background traffic noise, echoey interiors, etc. -- nothing out of the ordinary. Maybe three lines of dialog in :30 seconds, tops.

If this becomes a trend, we're going to have to re-think how we set up our facility. Changing over from 48kHz to 96kHz will be a daunting task.

--Marc W.

I agree with you--I think 96k has some interest for A: sfx recordists who want to be able to slow down their recordings and still maintain full-fi, and B: in music recording where ALL the other parameters (score, player, instrument, recording chain, room) are ultra high quality.  For location dialog--I will be frank and say that using 96k for that is just showboating, ignorance or both.  Besides, for anything other than a spot like you mention w/ just a few lines, making the deliverables would be a pain in the ass--very time consuming.  I don't think I get a vote about whether 96k will become a trend, but I hope it doesn't.

Philip Perkins

For location dialog--I will be frank and say that using 96k for that is just showboating, ignorance or both.

Philip Perkins

This is beautiful --- can I quote you on this (oh, I guess I just did).

Seldom do the people proposing these things consider the whole chain, what its purpose is, the release format and so on. I remember talking to a Canadian mixer who was still using his modified Nagra 4-STC with the Bryston DolbySR kit (several large car batteries on a follow cart to keep it all running) who said he was refusing to "go digital" (DAT) because he didn't like the sound. I talked to the post people on the production and all the dailies were being transferred to DAT and they were never going back to the original 1/4". So, he was using analog Dolby SR as a big filter on the front end of his recordings, but going digital or not wasn't really up to the sound mixer.

-  Jeff Wexler

Most people know where I stand on this issue.  I believe that anything above 48k is strictly for marketing purposes. Today's A to D converters over-sample and do 1 bit delta sigma conversion so the old excuses for higher sample rates (allows for less severe anti-alias filter) are moot.

Humans can only hear on average up to about 17 kHz, and a very few people can hear some small amount of sound up to 20 kHz. So a 44.1 kHz sample rate covers the entire range of human hearing. Dialogue is even more severely limited in bandwidth, with the majority of the intelligible bandwidth in the 5 to 6 kHz range. Most of the sounds that exist in the dialogue recording chain above 14 kHz are unwanted and typically removed before final distribution.
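As a rough illustration of the bandwidth argument above, here is a minimal Python sketch of the Nyquist limit (half the sample rate) for the common rates. The 20 kHz reference is the upper hearing figure quoted in the post; the function name is made up for this example.

```python
# Nyquist limit: a sampled system can represent content up to half its sample rate.
SAMPLE_RATES_HZ = [44_100, 48_000, 96_000, 192_000]
HEARING_LIMIT_HZ = 20_000  # upper hearing figure quoted in the post (assumption for comparison)


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


for rate in SAMPLE_RATES_HZ:
    limit = nyquist_hz(rate)
    margin = limit - HEARING_LIMIT_HZ
    print(f"{rate / 1000:g} kHz sampling -> {limit / 1000:g} kHz bandwidth "
          f"({margin / 1000:+g} kHz relative to 20 kHz)")
```

Even 44.1 kHz leaves a little room above 20 kHz; everything 96 kHz adds sits above that.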

What do you think Dolby noise reduction was all about?  What about de-essers that are used on dialogue to remove unwanted sibilance?  All of these tools are used to remove higher frequency components that are unwanted and not conducive to delivering intelligible dialogue.

Now if you are doing scientific recording of ultrasonic events, you may want a mic and recording chain that can handle up to 48 kHz of bandwidth (96 kHz sample rate): things like recording dolphin sounds or bat communications for computer analysis of waveforms. However, for all film and television audio workflows, a sample rate of 96 kHz or above is just a waste of storage space and a waste of time. Many people parrot the excuse "Storage is so cheap, why not record at the highest sample rate the machine can record?" Well, the hard drives may be cheap, but time is not. What they all overlook is that the time it takes to copy those bigger files around a facility doubles when you double the sample rate. So it takes twice as long to burn a disk or copy the project from one server or medium to another. In the old analog world of re-recording, making copies (adding a generation) was avoided at all costs. However, in the digital world, where an audio track can be duplicated and cloned hundreds of times without appreciable loss of quality, the time it takes to move that data is more important than any theoretical gain in what is essentially an inaudible improvement in sound quality.
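To put rough numbers on the storage and copy-time point, here is a back-of-envelope sketch in Python. The track count and the 20 MB/s copy throughput are hypothetical values picked only for illustration; the ratio between the two rates is what matters.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed PCM audio data, ignoring file headers and metadata."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


def copy_seconds(num_bytes, throughput_bytes_per_s):
    """Time to move the data at a fixed sustained throughput."""
    return num_bytes / throughput_bytes_per_s


HOUR_S = 3600
TRACKS = 8                   # hypothetical track count
THROUGHPUT_BPS = 20_000_000  # hypothetical 20 MB/s sustained copy speed

for rate in (48_000, 96_000):
    size = pcm_bytes(rate, 24, TRACKS, HOUR_S)
    minutes = copy_seconds(size, THROUGHPUT_BPS) / 60
    print(f"{rate} Hz, 24-bit, {TRACKS} tracks, 1 hour: "
          f"{size / 1e9:.2f} GB, about {minutes:.1f} min to copy")
```

Whatever throughput you plug in, the 96 kHz copy takes twice as long as the 48 kHz one, because the data size scales linearly with sample rate.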

I love to hear the elitist production mixer proclaim that he is recording everything at 24-bit 192 kHz for the best quality. Then look at his setup and note that he is using analog FM wireless links for everything, including the boom. So even if the mic could capture sounds that would need 96 or 192 kHz to record, the wireless link limits the bandwidth and dynamic range to something that can be completely faithfully reproduced with 16-bit 44.1 kHz A to D conversion. It's all about the "Spinal Tap effect". You know, "Our band is better because the amps go up to 11". ---Nigel Tufnel

And then there are all the sound effects recordists who claim they need the extra sample rate for slowing down sounds. While that may seem like a perfectly valid excuse, in reality most workstations do not simply slow the clock down to slow down a sound. The original sound is re-sampled and SRC'd to fit the timeline and then interpolated to generate the slowed-down sound. So unless they are re-stamping files recorded at 96k as 48k (cutting speed and frequency by 1/2) before importing the track to their edit session, any advantage of recording at 96k will be largely eliminated.
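The re-stamping trick mentioned above can be sketched with Python's standard wave module: the samples are copied untouched and only the sample rate declared in the header changes. The helper names and the 30 kHz test tone are made up for this illustration; it is not how any particular workstation does it.

```python
import io
import math
import struct
import wave


def make_tone_wav(freq_hz, sample_rate_hz, num_frames):
    """Return the bytes of a mono 16-bit WAV file containing a sine tone."""
    frames = b"".join(
        struct.pack("<h", int(16000 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)))
        for i in range(num_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate_hz)
        w.writeframes(frames)
    return buf.getvalue()


def restamp(wav_bytes, new_rate_hz):
    """Copy the samples unchanged but declare a different sample rate in the header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        params = r.getparams()
        frames = r.readframes(r.getnframes())
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setparams(params._replace(framerate=new_rate_hz))
        w.writeframes(frames)
    return buf.getvalue()


def duration_s(wav_bytes):
    """Playback duration implied by the frame count and the declared rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        return r.getnframes() / r.getframerate()


def restamped_pitch_hz(freq_hz, recorded_rate_hz, stamped_rate_hz):
    """Cycles per sample are unchanged, so pitch scales with the declared rate."""
    return freq_hz * stamped_rate_hz / recorded_rate_hz


original = make_tone_wav(30_000, 96_000, 9_600)  # 0.1 s of a 30 kHz tone at 96k
slowed = restamp(original, 48_000)               # same samples, header now says 48k

print(duration_s(original), duration_s(slowed))
print(restamped_pitch_hz(30_000, 96_000, 48_000))
```

Played at the re-stamped rate, the clip lasts twice as long and the ultrasonic 30 kHz tone lands at 15 kHz, with no interpolation involved.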

And even if they did have a recording chain (mic capsule, preamp, mixer and A to D) that passed frequencies up to 48 kHz, unwanted sounds like system self-noise, RF carriers from the Sennheiser RF condenser series, etc. will be lowered into the audible range by slowing the recording down. So you probably want to eliminate those unwanted high-frequency elements before conversion to digital, making the higher sample rate of the recorder unnecessary and unwanted.

-----Courtney

I remember talking to a Canadian mixer who was still using his modified Nagra 4-STC with the Bryston DolbySR kit (several large car batteries on a follow cart to keep it all running) who said he was refusing to "go digital" (DAT) because he didn't like the sound.

I remember in the early days of DAT (circa 1991-1992), some guys were using DBX -- excuse me, dbx (lower-case) -- noise-reduction on the tracks. This would drive us nuts in post, plus I could hear the noise-envelope and gating and so on on sharp dynamic peaks. SR was a lot cleaner, no question. But I think the S/N of a straight Nagra 4S was already pretty damned good.

I think it might be true that some of the early DAT decks' A/D converters were kind of crappy. One machine that I know was widely hated was the original Panasonic SV-3500. The later machines, like the 3700 and 3800, were a lot better. I always thought the Fostex PD4's converters sounded fine, though; not sure when that was introduced.

Glad to see I'm not going crazy on this. I thought 96k for dialog seemed a little extreme, especially for a somewhat-noisy exterior. I hadn't even thought of the extra time needed to mirror the files -- that's gotta be a nightmare as well.

BTW, the timecode on these 96kHz tracks was totally screwed-up, forcing us to sync them all up by hand, adding hours to the session. The DVD-RAMs were hours off from the slate (not just a simple drop/non-drop issue), so this was a definite fubar situation.

--Marc W.

Well, as a relatively ignorant n00b I'm going to stick my neck on the line here and say that although there is no single compelling reason to ever record dialogue at 96kHz, I do it anyway. Perhaps I have succumbed to the marketing, and I certainly couldn't tell the difference between 96 and 48 in a blind test. Dumping my files at the end of a day takes about 5 minutes, usually on to the editor's laptop (or external storage that will be given to the editor). I have read the endless debates about 96 vs 48, and I always ensure the director and editor are happy that 24/96 won't cause anyone any workflow headaches (typically everything is done in FCP so it's a non-issue, facilities houses are typically beyond the budget). The fact there are debates means there must be some doubt (or, a bunch of full-time marketing shills) so I err on the side of too many bits (I read somewhere that 64kHz would be optimal) until someone beats some sense into me or my showboat runs aground.

Anyhow, a couple of years ago I came up with 5 reasons to use 96kHz. If each reason can be plausibly shot down I'll gladly readjust my outlook.

http://web.mac.com/miker71/HARP28/Slog/Entries/2006/8/20_Top_5_Reasons_To_Record_88.2kHz_and_Beyond.html

There is actually a 6th reason - there are some "unconfirmed" scientific tests that suggest 96kHz provides a more accurate stereo image, and I've just started experimenting with M/S dialogue recording ... I haven't actually done the maths but apparently

"... some recent research suggests that the human brain can discern a difference in a sound's arrival time between the two ears of better than 15 microseconds – around the time between samples at 96 kHz sampling – and some people can even discern a 5 µs difference! So while super-high sample rates are probably unnecessary for frequency response, they may be justified for stereo and surround imaging accuracy. However, it should be noted that many authorities dispute this conclusion."
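For anyone who wants "the maths" on the sample spacing in that quote, here is a small Python sketch. It only computes the time between consecutive samples at each rate; whether that spacing actually limits interaural timing accuracy is exactly the part the quote says is disputed, and this arithmetic does not settle it.

```python
def sample_period_us(sample_rate_hz):
    """Time between consecutive samples, in microseconds."""
    return 1e6 / sample_rate_hz


for rate in (44_100, 48_000, 96_000, 192_000):
    print(f"{rate} Hz -> {sample_period_us(rate):.2f} microseconds between samples")
```

At 96 kHz the spacing comes out a bit over 10 microseconds, in the same ballpark as the 15 microsecond figure quoted, and at 48 kHz it is roughly double that.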

I don't think any of those reasons for recording location production dialog at 96K are worthwhile, accurate or provable in any sense, with the possible exception of the SFX recordings, which is a hotly debated topic.  What I CAN prove is that doing so is:

A: non-standard, and a pain for people downstream (as for the OP).  If not for the first people in line, certainly farther on (as in audio post).  It is the audio equivalent of deciding you will roll your TC at 24 fps.  A valid known rate, but not one anyone uses and is set up for, and would require an extra effort and more time.

B: A storage and deliverable issue--the footprint of the files is much larger, copying takes longer, backups are more of a deal.

C: Any increase in audio fidelity will be nullified by background noise on location, anomalies and noise introduced by wireless mics and other systemic audio issues in or around the sound cart.

D: Ultimately, the audio will have to be SRCed down to a rate that can be dealt with in post and distro.  SRC is another much debated topic, but I think there is agreement that NOT doing it is better than doing it.  There is a point of view that SRCing 96k down to 48k compromises whatever increase in audio quality was gained by recording 96k in the first place, especially if dither is used.

E: Are you sure you are recording on equipment that is really up to 24/96?  Many recorders and recording chains really can't take advantage of the higher sample rates (and bit widths)--their analog audio electronics (pres, summing, converters) aren't up to the task.

Philip Perkins

Guest klingklang

Sorry not true. We shoot 24P in PAL land and record 24fps TC to match AVID project requirements. So 24fps TC is quite alive here.

But having said this it still is a good analogy because there are exceptions for everything even for 96KHz ;-)

This is a really good analogy Philip.. 

Perhaps I have succumbed to the marketing, and I certainly couldn't tell the difference between 96 and 48 in a blind test.

So, you've proved it to yourself, yet you continue to use 96 kHz. Now that seems a bit odd doesn't it?

I do a lot (or used to) of on-location live music recording. Only when I'm doing acoustical sets will I even record at 96 kHz, and then I know that 90% of the audio above 48 kHz ends up in the bit bucket because I have to dither it down and change the sampling rate so it all goes on a CD.

For dialog recording 48 kHz is smaller, easier to mirror at the end of the day, and as pointed out is much easier on most post production people.

For audible differences, there was a huge jump between 16-bit DAT and the current 24-bit recorders. That makes perfect sense, but increasing the sampling rate doesn't. I know AES has presented several technical papers on higher sampling rates, and they do in a pure scientific world make sense, but outside the lab, there are limitations of both equipment, and people. And when our job is providing audio for people and 99% of the population can't tell the difference why do something non-standard?
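The 16-bit versus 24-bit jump can be put in numbers with the standard textbook formula for an ideal quantizer driven by a full-scale sine, roughly 6.02 dB per bit plus 1.76 dB. This is a theoretical ceiling only; real converters and analog front ends fall short of it, and the function name is made up for this sketch.

```python
import math


def ideal_snr_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer with a full-scale sine input."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(f"{bits}-bit: about {ideal_snr_db(bits):.1f} dB theoretical dynamic range")
```

That works out to a difference of about 48 dB on paper between 16 and 24 bits, which is a change in noise floor. Raising the sample rate does not change this figure; it only extends bandwidth.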

Wayne

I don't think I proved anything. I can't tell the difference between 4:2:2 and 4:4:4 but video folk tell me it's worthwhile and go mad if I suggest otherwise. Perhaps I should just make myself captain of the showboat. Picture in Showscan, naturally.

I don't think I proved anything. I can't tell the difference between 4:2:2 and 4:4:4 but video folk tell me it's worthwhile and go mad if I suggest otherwise. Perhaps I should just make myself captain of the showboat. Picture in Showscan, naturally.

Yeah but this is an apples and oranges thing.  There IS a diff between 48 and 96k, but what most of us record won't allow that diff to be noticeable, so it doesn't add anything to the show.  There is a visible diff between 4:4:4 and 4:2:2, and it is visible no matter how crappy the production is.

Philip Perkins

Guest klingklang

It might simply become a requirement of the media you record for. Has anyone ever heard a difference between 44.1 and 48k? So why are we recording at 48K then? Simply because it has become the standard for film. Might happen with 96k too.

Most people are fine with mp3 on one hand but won't buy any crappy $100 handheld recorder unless it records 192k/24bit.

Yeah but this is an apples and oranges thing.  There IS a diff between 48 and 96k, but what most of us record won't allow that diff to be noticeable, so it doesn't add anything to the show.  There is a visible diff between 4:4:4 and 4:2:2, and it is visible no matter how crappy the production is.

Philip Perkins

Guest klingklang

Anyhow, a couple of years ago I came up with 5 reasons to use 96kHz. If each reason can be plausibly shot down I'll gladly readjust my outlook.

http://web.mac.com/miker71/HARP28/Slog/Entries/2006/8/20_Top_5_Reasons_To_Record_88.2kHz_and_Beyond.html

It's funny that you propagate 192k/24bit but use some of the noisiest mics in the industry.

It's funny that you propagate 192k/24bit but use some of the noisiest mics in the industry.

And probably work on the noisiest sets in the industry. I don't preclude hiring quieter mics with flatter response but at this stage in my career it's hardly worth it.

I certainly don't want to propagate this debate since there is nothing left to debate. 48kHz for dialogue is more than adequate by all accounts thus far.

I can't tell the difference between 4:2:2 and 4:4:4 but video folk tell me it's worthwhile and go mad if I suggest otherwise.

I can demonstrate the difference to you in less than five minutes. That's the video equivalent of 16 bits and 24 bits. If nothing else, you get vastly better blue screen keys from 4:4:4 than in 4:2:2. Every little bit helps when you're doing D.I.'s made from HDCam-SR tape.

Talk to any VFX guy, and he'll explain the advantages of 4:4:4. The differences are easy to grasp and are not subtle. It also helps that the most widely-used 4:4:4 VTRs are Sony HDCam-SR machines, which have less compression, more audio options, and better pictures than the older D5 and HDCam HD VTRs.

Addition: I forgot to also mention that every high-end video post house in the world can handle 4:4:4 HD video with no problem. But I don't know any that can handle audio above 48kHz.

--Marc

To take that analogy further - I am fully aware of the technical superiority of 4:4:4 - however I can't "see" the difference unless I get up close (or start compositing), just like I can't "hear" a recording is 16 or 24 bit. But you're right, it's probably more analogous to bit depth rather than sample rate. Perhaps a better analogy to sample rate is fps ... I'm watching Cameron with interest as he proposes stereoscopic 2K/48fps.

  • 3 months later...

The BBC are now experimenting with 100 frames/s (progressive scan). I saw their setup at IBC.

The difference between 100 frames/s and 50 frames/s (normal TV is 50 fields/s) is quite remarkable. It makes a moving picture much sharper, far more than I imagined.

As movie theater projectors run the film at 48 frames/s anyway (doubling each frame), I don't think the change to higher frame rates is far fetched.

Movie theaters will have to compete with what people have at home, so they will need to go for higher frame rates and higher resolution.

Japan already wants to roll out Super Hi-Vision in 2015, which is 8K@10bit@60p with 22.2 channels of sound. Do any of you already have a boom pole with 22.2 channels of audio?

I have one - it's called a SoundField!!

With their software plugin I believe you can matrix out as many points as you wish!!

Kindest regards,

Simon B

Movie theaters will have to compete with what people have at home, so they will need to go for higher frame rates and higher resolution.

Japan already wants to roll out Super Hi-Vision in 2015, which is 8K@10bit@60p with 22.2 channels of sound. Do any of you already have a boom pole with 22.2 channels of audio?

  • 2 weeks later...

1. SOUND DESIGN

It’s obvious when you think about it. Time stretching, pitch shifting etc. More samples gives you more real data to play with before the computer has to start making stuff up to fill in the gaps.

This is the only reason that makes any sense.  And despite a post above it does work better in stretching and manipulating and it is not unusual for FX recordists to record at 96.  I don't, well very rarely but a number do regularly.

2. ERROR CORRECTION

This is pretty much B.S.  Just because you have more bits doesn't give you more error correction.  Component and design quality count a lot more.  And there is also some evidence that not pushing the converters lets them be "more accurate", so that would argue against 96 et al.

3. MASTERING HEADROOM

A higher sample rate doesn't give you more headroom; it gives you higher frequency response.

4. BECAUSE THE PRODUCTION DICTATES IT

Well yes that would seal the deal.

5. BECAUSE YOU CAN

You can also record with a wire recorder.

I'm mostly in post and the first thing I would do with your 96k files is SRC down to 48k.  96k FX files are used in design sessions and bounced down to 48 for editing and mixing.  From a post perspective 96 makes no sense unless someone demands it.

  • 8 years later...

I have a very good friend who is an eminent music engineer and he calls higher sample rate recording "cork sniffing".  I don't hear any diffs in the work I do 48/96 or 44.1/88.2, but I sure feel the inconveniences involved.  But....many people who make decisions about who gets to do what in the music world ARE convinced that 88.2/96 is better, AND that it is a marketing tool.   Serious music buyers, esp in the classical etc world are a SNOOTY bunch, they do not want music that just anyone can buy ( I guess) so the extra added attraction of "recorded at 96k" (or higher) is kind of like "extra-bitchen virgin vinyl" in LPs etc.  I decided that if my peeps thought it would sell more albums I'd do it, so when they do I go 96.

Don't record 96 kHz unless specifically asked.  Don't even ask permission to if you think it's a good idea.  There are a few good reasons to record 96 kHz for certain projects: if it's going to end up being a Blu-ray music special or a special audio disc format like SACD (in which case you'll either record 192 or DSD) or DVD-A (and even then this is more for marketing purposes than actual technical need).  96 kHz latency is a fraction of 48 kHz latency, approaching but not quite half I suppose, but I'm not sure this is an advantage, since it doesn't really help anyone unless there is a very, very picky IEM listener.  I believe I could hear the difference on certain material, in certain rooms, on certain types of systems, but it is very subtle, like how the splash of a cymbal crash sounds tonally, or how some abstract trait of its image decays in a very controlled playback environment. It is not anything that I'd be able to pick out in a single playback take (or 10), only after lengthy, intense study and analysis in a context that is totally outside of anything that resembles music enjoyment or dialogue intelligibility trials.
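On the latency remark, a minimal Python sketch of why the figure approaches but does not reach half. It assumes a simple model: a fixed-size sample buffer whose delay scales with sample rate, plus a fixed overhead that does not. The 256-frame buffer and the 0.5 ms overhead are hypothetical numbers, not measurements of any device.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Delay contributed by a buffer of a fixed number of sample frames."""
    return 1000 * buffer_frames / sample_rate_hz


def total_latency_ms(buffer_frames, sample_rate_hz, fixed_overhead_ms):
    """Buffer delay plus a rate-independent overhead (hypothetical model)."""
    return buffer_latency_ms(buffer_frames, sample_rate_hz) + fixed_overhead_ms


BUFFER_FRAMES = 256      # hypothetical buffer size
FIXED_OVERHEAD_MS = 0.5  # hypothetical rate-independent delay

lat_48 = total_latency_ms(BUFFER_FRAMES, 48_000, FIXED_OVERHEAD_MS)
lat_96 = total_latency_ms(BUFFER_FRAMES, 96_000, FIXED_OVERHEAD_MS)
print(f"48 kHz: {lat_48:.2f} ms, 96 kHz: {lat_96:.2f} ms, ratio {lat_96 / lat_48:.2f}")
```

Under these assumptions the buffer part halves exactly, while the fixed part keeps the overall ratio a little above one half.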
