
Sampling rates


Tony Johnson


Even if the advantages of higher sample rates may not be immediately obvious, what would be the downside? 

 

Depends on the production (and budget). Many sub-$20mil productions are ingesting production audio into the editor, cutting* there across all tracks (even though they're listening only to the mixed track), and delivering the whole package to audio post... meaning absolutely no time wasted on conforming.

 

So if the NLE can't handle 96k and downsamples on import, then the downside is that we're wasting media and editorial time. The semi- upside is that those 96k files will still be available for post if they rebuild the track... but remember, we defined this production as not doing a full conform.

 

I say "semi- upside" because we haven't yet determined if there's anything useful in the production dialog tracks > 20 kHz. Just a lot of assertions on both sides.  

 

Which is fine... people of good will can and should discuss this kind of thing, and make contributions to the discussion based on their training and experience. That's what a professional group is for... and jwsound is as close to a professional online discussion group as exists anywhere.

 

---

 

* Caveat: NLE can do cuts only. So they're bit-for-bit copies in the OMF/AAF. Any volume changes, fades, or (gasp) processing has to be turned off before delivery to audio post.

Link to comment
Share on other sites


So the only reasons against higher sample rates provided in this thread are:

1) we need more digital storage space

2) there aren't sufficient reasons

3) we've always done it this way and it's better not to change a standard, where there finally is a standard.

4) the NLE issue mentioned by Jay above

1) is a non-issue, in my opinion, as in this day and age digital storage space is cheap and readily available and the amounts we need are still puny compared to digital video.

2) maybe true, but there are some valid reasons as outlined in this thread, such as allowing better manipulation capabilities for post and capturing a wider frequency spectrum, for better or for worse, but certainly more accurate.

3) is pathetic, in my opinion, as the breaking free from set standards is important to progress, develop and improve. If accepted as a paradigm, we may never have progressed into the digital realm in the first place.

It's important to have standards, though, but they must be continually adapted, and so far I have not seen a valid reason why we should not move to 96kHz, except for 4), which is a good reason, but of course software needs to, and will, adapt to evolving requirements.

Link to comment
Share on other sites

1. Agree, non issue.

2. It depends (ugh, sorry), if more accurate is more accurate to our ears. Maybe.

3. Standards evolve, like I said in my first post, I'm onboard and happy to do it if people ask.

4. I wouldn't think that is the only issue for post, just the only one mentioned in this thread.

 

In some ways it's not really a location call at all. The guys handling transfer, sync, workflow and final compression, the whole post chain right the way to delivery, are the people who need to ask for it, not us.

 

 

I'd be interested to know what others think.

 

Best

 

Nick

Link to comment
Share on other sites

Also I think most IMAX films are done in 96

I might be wrong about that though

 

I can't speak to "most" IMAX films. I worked two IMAX projects in recent years and I don't recall any discussion of sample rates or specifications at all. I may have said something along the lines of "I'll be supplying the audio as BWF files at 24-bit and 48 kHz. Is that OK?" Or some variation on that theme. They were OK with that.

 

David

 

The pictures were:

 

Time, the Fourth Dimension (http://www.imdb.com/title/tt1734489/)

and

Air Racers 3D (http://www.imdb.com/title/tt1535421/)

 

I've never seen either of them so couldn't say if they were great or stinkers. But they seemed pretty good at the time.

Link to comment
Share on other sites

The biggest reason for not using 96khz is because every piece of equipment in your signal path would have to have that sample rate and frequency response in order to take advantage of it.

 

The vast majority of microphones and standard analog gear have a high-frequency rolloff and disregard any noise in that frequency range. And that includes the 96khz recorder (at least the portable recorders that we use).

 

In order to take full advantage of 96khz, every piece of equipment in your signal path must have a higher rolloff as well as electronics clean enough to not add noise to that frequency range. The higher frequency range not only requires higher quality electronics which add to the cost, but it also means that quality engineering is not focused on the frequency range we actually care about.

 

Remember too that dealing with 2x the amount of data is not just about hard drive space. It means double the processing power which increases heat dissipation and power draw (decreasing the battery life if applicable). It also means double the memory bandwidth which means fewer channels and/or higher quality solid state (with less capacity than a hard drive).
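To put rough numbers on the 2x figure, here is a minimal sketch of the raw PCM data rate. The 8-track count and 24-bit depth are assumptions picked for illustration, the function name is made up, and file-header overhead is ignored.

```python
def pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels):
    """Raw, uncompressed PCM payload for one hour of recording."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_second * 3600


# Assumed setup for illustration: 8 tracks at 24-bit.
TRACKS = 8
BIT_DEPTH = 24

hour_at_48k = pcm_bytes_per_hour(48_000, BIT_DEPTH, TRACKS)
hour_at_96k = pcm_bytes_per_hour(96_000, BIT_DEPTH, TRACKS)

print(f"48 kHz: {hour_at_48k / 1e9:.2f} GB per hour")
print(f"96 kHz: {hour_at_96k / 1e9:.2f} GB per hour")
print(f"ratio:  {hour_at_96k / hour_at_48k:.1f}x")
```

The same factor of two applies to the sustained write rate and to every buffer in the chain, which is the bandwidth point made above.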

 

Though I can understand upsampling after the fact for processing purposes, I honestly do not believe humans can hear the difference between 48khz and 96khz. That is up for debate of course, but for me the cost/benefit just doesn't make sense.

Link to comment
Share on other sites

You need to get your mind off of bandwidth and frequency response. Higher sample rates have benefits other than this.

Benefits I've already mentioned.

 

Please point out if I'm missing something?  Jay's point still stands. Unless you record dialogue in stereo. What are the benefits to increased localisation from a mono source, that may come at 96k?

Link to comment
Share on other sites

Your previous posts have been centered around 3 things:

- Interaction between frequencies generating other frequencies

- Spatial awareness coming from the higher frequencies

- Post processing

 

Interaction between frequencies into the audible range either occurs inside the hardware of the system in which case it's actually distortion, or it occurs before it hits the microphone and would therefore be recorded by a normal 48khz system.

 

Anything mentioning higher frequencies, you say I should get my mind off of.

 

Post processing I said I can understand, but again, unless you are interested in the higher frequency content (which requires a complete system tuned for that purpose), it suffices to take lower frequency content and oversample it for processing purposes.

 

None of this addresses my main point: whatever benefits you say the higher sample rate provides, I question whether their magnitude can be perceived by human ears. There is plenty of reasoning to be offered on these topics, but I would like to see a published listening test on higher sampling rates.

Link to comment
Share on other sites

http://people.xiph.org/~xiphmont/demo/neil-young.html

There's a good case here against higher sampling rates. If I may simplify the argument for those who want the CliffsNotes, it boils down to two ideas:

 

- A sampling rate captures perfectly any audio content whose frequency lies entirely below its Nyquist limit (half the sampling rate). Any distortion made by the quantization method exists above the Nyquist limit. This is similar to how a square wave's frequency content consists of the fundamental frequency and of odd harmonics above that frequency.

 

- The higher frequencies represented by a higher sampling rate do mix with each other into the audible range when hardware not designed for the higher frequency content overloads and causes intermodulation distortion.

The article's writer, Monty, is responsible for, among other audio compression algorithms, Ogg Vorbis and Speex: two very popular open-source codecs for music compression and VoIP. He knows what he's talking about.

Link to comment
Share on other sites

Again I type long winded responses, and again safari crashes and deletes all my work...

Ok, first let me reiterate that the majority of the benefit of 96 will be found in post and for those who record things other than dialog in the field.

For dialog only, yes there are benefits, but they are negligible.

Frequencies outside the range of human hearing interact with frequencies we can hear. We agree on this it seems. The results of these frequencies are not distortion as long as they are being accurately and naturally reproduced. This is no different than the combination of frequencies you can hear.

When it comes to 96, more samples and higher frequencies result in better-quality, more accurate processing and higher-quality, more natural summing.

Better manipulation especially when it comes to things involving time. More samples = more stretch with fewer artifacts.

There is a big difference playing back a single track with extended frequency range and playing back 200 tracks all interacting with each other with an extended frequency range.

Could you up sample your location tracks to 96 from 48 to support a 96 work flow? Sure. But why not record at 96 with everything else?

Big picture.

It's very difficult to discuss these topics in text. I think an interactive real-time video discussion is much more appropriate. If people are interested maybe we should make that happen.

Then we can easily have pictures and audio comparisons and so on...

Link to comment
Share on other sites

I think an interactive real-time video discussion is much more appropriate. If people are interested maybe we should make that happen.

Then we can easily have pictures and audio comparisons and so on...

You mean something like a Google Hangout? I'd love to listen in on that discussion.
Link to comment
Share on other sites

Dr. Rose: " anything useful in the production dialog tracks > 20 kHz. "

I'm in the "camp" that agrees, however CC has some points, even if he is not completely clear on them.  If the higher rate sampling is done well, it can increase the accuracy of the potential reproduction of the analog waveform as there is less interpretation of what is the actual waveform between the individual samples.

I'm not sure there is much, if any, practical value to routinely using higher sampling rates, but there are no practical downsides if the bandwidth is going to be available.

 

NE: " Nyquist limit (half the sampling rate). "

actually Nyquist specifies greater than at least double the highest frequency, but does not specify how much greater; also technical issues strongly suggest some value greater as well...

CC: " Frequencies outside the range of human hearing interact with frequencies we can hear. "

however those and other technical issues arise when reaching above about 20kHz, thus audio sampled at higher sampling rates still does not actually include much, if any, information above about 20kHz, it just contains more samples of the audio below about 20kHz.

 

 

CC: " again safari crashes and deletes all my work... "

maybe a change in browser?

Edited by studiomprd
Link to comment
Share on other sites

NE: " Nyquist limit (half the sampling rate). "

actually Nyquist specifies greater than double the highest frequency, but does not specify how much greater; also technical issues strongly suggest this as well...

 

Hi senator. I'd like to ask you to state your source on this, because my understanding of Nyquist is that double the highest frequency is exactly the required sample rate to reproduce that signal. Wikipedia entry:

http://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem

 

Note the phrases "completely determined" and "perfectly reconstructed".

Link to comment
Share on other sites

OK:

says: " the sampling frequency should be at least (equal to or greater than) twice the highest frequency contained in the signal "

and says: " the sampling rate must be at least 2fmax, or twice the highest analog frequency component. "

thus the requirement that it be greater than twice the highest frequency is only a practical requirement to account for aliasing and other technical issues

 

in practice, exactly double is not typically possible

Link to comment
Share on other sites

The theorem itself: "If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart."
 

Put another way: any sampling rate higher than 2x the highest frequency of a waveform will 100% completely reproduce that waveform.

 

You can use higher sampling rates, but if your sampling rate is higher than 2x the highest frequency, it does represent that waveform completely.
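The claim can be checked numerically. Below is a minimal sketch, not a rigorous proof: it samples a 20 kHz tone at 48 kHz and rebuilds the values between the samples with the Whittaker-Shannon (sinc) interpolation formula. The tone frequency, phase, window size and function names are all assumptions chosen for illustration. Because the infinite sum is truncated to a finite window, a small residual error is expected.

```python
import math

FS = 48_000.0       # sampling rate in Hz
F_TONE = 20_000.0   # test tone below the 24 kHz Nyquist limit (assumed)
PHASE = 0.3         # arbitrary phase so samples don't land on zero crossings


def tone(t):
    """The 'analog' signal, evaluated at any time t in seconds."""
    return math.sin(2 * math.pi * F_TONE * t + PHASE)


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(t, half_width):
    """Sinc-interpolate the sampled tone at time t from nearby samples."""
    center = round(t * FS)
    total = 0.0
    for n in range(center - half_width, center + half_width + 1):
        total += tone(n / FS) * sinc(t * FS - n)
    return total


def max_reconstruction_error(half_width=20_000, points=10):
    """Worst error over a few instants that fall between sample times."""
    worst = 0.0
    for k in range(1, points + 1):
        t = k * 1.37 / FS
        worst = max(worst, abs(reconstruct(t, half_width) - tone(t)))
    return worst


worst = max_reconstruction_error()
print(f"worst reconstruction error: {worst:.2e}")
```

The residual shrinks as the window grows, which is the practical face of the theorem: the samples determine the in-between values, and the only error in this sketch comes from cutting the sum short.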

Link to comment
Share on other sites

NewEnd put it: " any sampling rate higher than 2x the highest frequency of a waveform will 100% completely reproduce that waveform " and " you only need a sampling rate higher than the 2x the highest frequency. "

exactly in theory (or in a theorem)

but as a practical / technical matter, the sampling rate needs to be somewhat higher than that to actually produce usable results

thus 48kHz was selected for attaining 20kHz response, (and the 44.1 compromise rate for audio CD's)...and in the early days of audio CD's "oversampling" was a huge deal, and a way to deal with the technical aspects of aliasing, etc.

Edited by studiomprd
Link to comment
Share on other sites

NewEnd put it: " any sampling rate higher than 2x the highest frequency of a waveform will 100% completely reproduce that waveform "  and " you only need a sampling rate higher than the 2x the highest frequency. "

exactly in theory (aka theorem)

but as a practical / technical matter, the sampling rate needs to be somewhat higher than that to actually produce usable results

 

thus 48kHz was selected for attaining 20kHz response, (and the 44.1 compromise rate for audio CD's)...and in the early days of audio CD's "oversampling" was a huge deal, and a way to deal with the technical aspects of aliasing, etc.

 

"Theory" doesn't equal "theorem". They have very different meanings. Theorems are proved based on postulates. Theories are just ideas.

Sampling rates higher than 40kHz were chosen to account for filters that were not perfect enough to limit the frequency response of hardware to 20kHz: then there is higher frequency content than 20kHz, that's why they chose a higher sampling rate. Not because a 41kHz sample rate wouldn't perfectly represent a 20kHz waveform.

Yes "oversampling" is a way to deal with aliasing. Like was described in this article: http://people.xiph.org/~xiphmont/demo/neil-young.html

The majority of A/D converters actually oversample and down convert to the chosen sample rate, filtering out higher frequencies and eliminating aliasing artifacts. Everything below the Nyquist frequency is left 100% intact.
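To illustrate why that filtering matters, here is a small sketch of aliasing with assumed values: a 30 kHz cosine sampled at 48 kHz with no anti-aliasing filter produces the same sample values as an 18 kHz cosine, so after sampling the two cannot be told apart. The function names are made up for this example.

```python
import math

FS = 48_000  # sampling rate in Hz


def sampled_cosine(freq_hz, n_samples, fs=FS):
    """Sample a unit-amplitude cosine with no anti-aliasing filter."""
    return [math.cos(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]


def max_abs_difference(a, b):
    """Largest sample-by-sample difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


ultrasonic = sampled_cosine(30_000, 480)       # above the 24 kHz Nyquist limit
alias = sampled_cosine(FS - 30_000, 480)       # 18 kHz, inside the audio band

difference = max_abs_difference(ultrasonic, alias)
print(f"largest difference between the two sample sets: {difference:.2e}")
```

The difference is at the level of floating-point rounding, which is why content above Nyquist has to be removed before the final sample rate is reached.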

Link to comment
Share on other sites

NewEnd put it: " any sampling rate higher than 2x the highest frequency of a waveform will 100% completely reproduce that waveform "  and " you only need a sampling rate higher than the 2x the highest frequency. "

exactly in theory (aka theorem)

but as a practical / technical matter, the sampling rate needs to be somewhat higher than that to actually produce usable results

 

thus 48kHz was selected for attaining 20kHz response, (and the 44.1 compromise rate for audio CD's)...and in the early days of audio CD's "oversampling" was a huge deal, and a way to deal with the technical aspects of aliasing, etc.

Yes it does, anti-aliasing/imaging (I assume that's what you're talking about, no?) does add to the requirements of converter design but as you know, it isn't an additional 48K, so it isn't really at question here.

Link to comment
Share on other sites

Frequencies outside the range of human hearing interact with frequencies we can hear. We agree on this it seems.

I do not agree with that statement. But I did not have an explanation for certain phenomena until I looked it up, and so I did not want to dismiss the idea entirely.

But now, I submit that frequencies do not interact naturally. Take two sine waves and sum them without any nonlinearity and your result is a waveform of those two frequencies only.

 

The only way that two frequencies produce more frequencies is when you put them through a nonlinear (i.e. overdriven) system. The resulting output has different frequency content than the input: it is therefore distortion.
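A quick numerical sketch of that point, with assumed values: two ultrasonic tones (24 kHz and 26 kHz) are summed, and the level at the 2 kHz difference frequency is measured with a single-bin DFT. The `x + 0.1 * x**2` stage is a made-up stand-in for nonlinear hardware, not a model of any real device.

```python
import math

FS = 192_000             # sampling rate high enough to hold both tones
N = 1_920                # 10 ms of signal, so every tone sits on a 100 Hz bin
F1, F2 = 24_000, 26_000  # two ultrasonic tones (assumed values)


def two_tones():
    """Purely linear sum of two unit-amplitude sines."""
    return [
        math.sin(2 * math.pi * F1 * n / FS) + math.sin(2 * math.pi * F2 * n / FS)
        for n in range(N)
    ]


def amplitude_at(signal, freq_hz):
    """Amplitude of one frequency component via a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / FS) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / FS) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


def mildly_nonlinear(signal, k=0.1):
    """Hypothetical nonlinear stage: adds a small squared term."""
    return [s + k * s * s for s in signal]


linear = two_tones()
distorted = mildly_nonlinear(linear)

diff_linear = amplitude_at(linear, F2 - F1)
diff_distorted = amplitude_at(distorted, F2 - F1)

print(f"2 kHz level, linear sum:       {diff_linear:.2e}")
print(f"2 kHz level, nonlinear stage:  {diff_distorted:.2e}")
```

In the linear sum the 2 kHz component is zero apart from rounding; after the squared term it appears with an amplitude equal to `k`, since the cross term 2·sin(a)·sin(b) expands to cos(a−b) − cos(a+b).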

There is a phenomenon that occurs when two close frequencies are heard together. The result is what is described as "beating" at another lower frequency. Piano tuners and the like use it to help tune their instruments. This effect is psychoacoustic: 

http://en.wikipedia.org/wiki/Combination_tone

http://en.wikipedia.org/wiki/Binaural_beats

One might suggest that frequency interaction is important even if it occurs psychoacoustically. However, the ear cannot detect frequencies higher than 20kHz, therefore the only place where ultrasonic frequencies interact with auditory frequencies is in nonlinear hardware and that would be considered distortion.

Link to comment
Share on other sites

 

 

The only way that two frequencies produce more frequencies is when you put them through a nonlinear (i.e. overdriven) system.

 

You've never heard a concert pipe organ play two low notes (or pedals) a half-step apart? No overdriven electronics involved. I'd be hard-pressed to even call it distortion, if it's what the composer intended. Though that's a semantic argument, since the waves coming from the pipes are intermodulating...

 

But I'd be much harder pressed to call beating a psychoacoustic phenomenon. A superhet radio - which depends on beat frequencies - doesn't have ears or a brain.

 

There are some purely psychoacoustic beats, like the ones that can happen intra-aurally. But it's definitely also a physical phenomenon.

 

This is far afield from the original topic, though.

Link to comment
Share on other sites

A pipe organ and a superheterodyne radio are two different things.

 

The pipe organ is the psychoacoustic beating effect.

The superheterodyne radio uses intermodulation via nonlinear circuitry.

Air is a passive system with extremely large dynamic range; it does not cause intermodulation.
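For reference, the standard trigonometric identity for a linear sum of two equal-amplitude tones, with f_1 and f_2 as generic symbols for the two frequencies, is sketched below. It is offered as background to the beating discussion, not as a ruling on it.

```latex
\sin(2\pi f_1 t) + \sin(2\pi f_2 t)
  = 2 \cos\!\left(2\pi \frac{f_1 - f_2}{2}\, t\right)
      \sin\!\left(2\pi \frac{f_1 + f_2}{2}\, t\right)
```

The right-hand side is a tone at the average frequency whose envelope rises and falls |f_1 − f_2| times per second, yet the left-hand side contains only the two original frequencies. The amplitude fluctuation exists in a purely linear sum, while a separate spectral component at f_1 − f_2 needs a nonlinearity somewhere, whether in hardware or in the ear.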

Link to comment
Share on other sites
