
Unveil - 'unreverberator'


nhaudio



Here is another plug-in from SPL (De-Verb) that does a similar thing, and is available for PT.

It's just not as good as the UNVEIL.

http://spl.info/soft...gs/de-verb.html

While De-Verb is a very useful plug-in, it does something very different, even if the name suggests it does something similar. De-Verb is a dynamics processor: it effectively "rides the fader" down when it detects the sustain phase of a sound. This can work very well for individual sounds with time between the events, and with mainly late/diffuse reflections, but not on mixed signals, and it doesn't remove reverberation components from *within* the individual events. But for one dialog voice in a large-ish room, you can use it pretty effectively.
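If you want to picture the general mechanism, here's a minimal Python sketch of that "fader riding" - the textbook differential-envelope idea behind transient designers, not SPL's actual implementation, and all parameter values are made up:

```python
import numpy as np

def one_pole(x, sr, ms):
    """One-pole smoother with time constant `ms`, initialized to x[0]."""
    a = np.exp(-1.0 / (sr * ms / 1000.0))
    y = np.empty_like(x)
    e = x[0]
    for n, s in enumerate(x):
        e = a * e + (1.0 - a) * s
        y[n] = e
    return y

def duck_sustain(x, sr, sustain_cut_db=-10.0):
    """Attenuate the sustain/decay phase of a signal (concept sketch).

    A fast and a slow envelope follow the rectified signal; while the
    fast one sits below the slow one the sound is decaying, so the
    "fader" is ridden down by `sustain_cut_db`.
    """
    x_abs = np.abs(x)
    fast = one_pole(x_abs, sr, ms=5.0)
    slow = one_pole(x_abs, sr, ms=120.0)
    cut = 10.0 ** (sustain_cut_db / 20.0)
    gain = np.where(fast < slow, cut, 1.0)
    # Smooth the gain switching to avoid zipper noise.
    return x * one_pole(gain, sr, ms=10.0)
```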

On a side-note: De-Verb is a spin-off of the SPL Transient Designer (basically, it's the "Sustain" function from that device), which was invented by Ruben Tilgner, who now runs his own company, Elysia. Anything that man does is very, very useful and good. A friend of mine has recently been processing weaponry SFX for an AAA game title, and was using an Elysia compressor called mPressor to get an insanely tight initial snap on the shots, while still retaining the fullness/body.


Started playing with Unveil on real-world dialog tracks... the more marginal ones, like from documentary interviews where the producer couldn't use his "A" crew.

It's going to be a learning curve (and I'm considered pretty good with DSP) before I consider it much more useful than a well-tuned 3- or 5-band expander.

If their documentation actually described what the algorithm was looking at, and what each knob was responsible for tuning - maybe with audio demos or graphic analogies - it would help.

Part of the problem is the incredibly complex nature of real-world echoes. You can turn on Unveil's audition feature (listens only to what the software is removing, pretty standard on any NR software) and tune for more and more verb... but when you go back to the fully processed signal, it feels almost as wet as it was before. I haven't yet sussed whether it's because certain frequencies of verb remain untouched, or timing differences, or what...

but I'll definitely take more time experimenting.


Sorry to hear you're not getting results yet. It's hard to tell from a distance what settings you're trying.

You could send me a part of your dialog file and I'll send you back a setting that suits, to help get you started. I sent you an email recently, so you should have my email address.

Here are a couple of considerations.

The ADAPTATION parameter should be set to approximately the reverb time in your signal. Basically, just tweak the control until the slope of the adaptation curve in the main display looks similar to the decay slope of the signal level display.
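If you want a seconds figure to aim for, the standard Schroeder backward-integration estimate gives a ballpark reverb time from any reasonably clean decay tail in the clip. This is generic room-acoustics math, not anything UNVEIL does internally, and it assumes you've isolated a segment that is mostly tail:

```python
import numpy as np

def estimate_rt60(decay, sr):
    """Ballpark RT60 from an isolated decay tail (Schroeder integration).

    `decay`: mono samples starting right after a word/impulse stops,
    containing mostly reverb tail. Extrapolates from the -5 dB to
    -25 dB span of the energy decay curve (assumes the tail actually
    reaches -25 dB before the segment ends).
    """
    energy = decay.astype(np.float64) ** 2
    edc = np.cumsum(energy[::-1])[::-1]          # backward integration
    edc_db = 10.0 * np.log10(edc / edc[0] + 1e-12)
    t = np.arange(len(decay)) / sr
    span = (edc_db <= -5.0) & (edc_db >= -25.0)
    slope, _ = np.polyfit(t[span], edc_db[span], 1)  # dB per second
    return 60.0 / -slope
```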

When you have set that, increase FOCUS to de-reverberate. If you're not getting enough de-reverberation at max value, try setting LOCALIZE to its minimum or maximum value, to see at which extreme you're getting the most reverb reduction. If LOCALIZE is set to a high value, you may get some warbling/artificialness. To counteract this, raise REFRACT and/or PRESENCE. A good balance between these parameters for most signal types is when they're at their default values.

If you're working with a signal that has a lot of verb from a very short room, disable the transient bypass function (move the TRANSIENT THRESH slider all the way to the right), as some of the reverb may be passed through that. Also note that room *resonances* will not necessarily be removed, as these are usually interpreted as discrete echoes, which we leave in place. To get as much resonance removal as possible, you will typically want to be using high LOCALIZE values and low REFRACT values. I say "typically" as this is not a linear process like an expander or FFT-gate.

WRT the documentation... well, it may not be as well-written as your books ;D - it is so far only a quick-start guide - but it DOES tell the user what the controls do. What particular control description are you unclear about?

Please note that it is not possible to describe what a pattern recognition system does in classic DSP terms (no thresholds, RMS-values, FFT-amplitude-thresholds etc involved).

HTH,

Denis


Thanks for the extra tips. I'll have a chance to try them out later tonight. I'd rather learn the system that way than send you a file and get back a cookbook recipe.

I'm not asking for a description in classic DSP terms, or in analog expander terms... but those knobs all do something concrete, modifying specific parameters. You and the other members of your team must have conceptualized those functions, long before you came up with front panel labels.

How about sharing those concepts? They might be too involved for some users, but it'll help folks who want to really learn your software. Those "power users" will be the ones who get the best results, convince others to buy the plugin, and eventually teach them from-the-trenches tips.


Well, we actually do write what they technically do in the manual. The thing with neural network based technology is that you don't actually explicitly write the function it will later perform. You explicitly write the structure of the network itself, but from there on, it is somewhat of a black box. You train it to recognize a particular feature that you want it to recognize, then breed it through various mutating generations until it does what it is supposed to do. So when we describe a parameter as

This determines the refractory time for the analysis network. Shorter values will cause the analysis to be more sensitive to signal changes, but might affect short term oscillations such as noise in a way that sounds artificial. If you increase the refractory time the overall sound of the de-reverberated signal usually improves and it sounds more natural, even though some of the reverb reflections might return.

...that is pretty much what we implemented, really.

Let me write a short overview of how the process works. Basically, we use pattern recognition and a perceptive model to identify signal components that the human auditory system and the human "analysis logic" perceive as "foreground", or "significant" components, and we then consider everything else to be "background" or "insignificant". The pattern detection has been pre-trained to focus on reverb-like background components, so the "background" is pretty close to being the reverb only. But - hence we speak of "FOCUS" and not "(De-)Reverb Amount" - this also grabs other signal components, such as background ambience, some types of noise, or - when using musical signals - "mud" in a mix.

This differentiation is then used to de-mix those two elements. By setting the relative amounts of these two "layers" in the output mix, you can attenuate or boost reverb amount.
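The classifier itself is the proprietary part, but the split-then-remix structure is generic enough to sketch. Note that UNVEIL is explicitly not FFT-based (see below), so take the STFT soft mask here purely as a stand-in for "some foreground/background weighting"; `fg_mask_fn` is a placeholder for whatever does the classification:

```python
import numpy as np
from scipy.signal import stft, istft

def demix(x, sr, fg_mask_fn, nperseg=2048):
    """Split a signal into "foreground" and "background" layers.

    `fg_mask_fn` maps STFT magnitudes to a 0..1 foreground weight per
    time-frequency bin. In UNVEIL that weighting comes from a trained
    pattern recognizer (and not from an FFT at all); here it is just
    a user-supplied placeholder.
    """
    _, _, X = stft(x, fs=sr, nperseg=nperseg)
    m = np.clip(fg_mask_fn(np.abs(X)), 0.0, 1.0)
    _, fg = istft(X * m, fs=sr, nperseg=nperseg)
    _, bg = istft(X * (1.0 - m), fs=sr, nperseg=nperseg)
    return fg, bg  # fg + bg reconstructs the input (up to edge effects)
```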

Let me try to describe the controls with other words than the rather abstract ones used in describing pattern recognition parameters.

FOCUS: think of this as a cross-fader between the reverb/background signal components (fully CCW), the unprocessed input signal (12:00) and the direct/foreground signal components (fully CW). This analogy is pretty precise, actually.

FOCUS BIAS: these sliders offset the FOCUS value for 10 frequency bands. Kind of like having a RATIO control per band on a multiband compressor. With FOCUS at 12:00, these cover the entire range. With FOCUS at maximum, raising the BIAS sliders will have no effect, and setting them to their minimum values is like setting FOCUS to 12:00 for that particular band. These are useful if you would like to reduce reverb on one element in a mixed signal but not reduce background info on others (example: reduce reverb on dialog between 1 kHz and 5 kHz, but leave the "mumble/rumble" of a background ambience between 250 and 750 Hz in place).
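Those two descriptions pin the control law down pretty well. Here's my reading of them as plain arithmetic - an interpretation for illustration, not Zynaptiq's actual math:

```python
import numpy as np

def remix(fg, bg, focus):
    """FOCUS as a cross-fader: -1.0 = background/reverb layer only,
    0.0 = unprocessed (fg + bg), +1.0 = foreground layer only."""
    if focus >= 0.0:
        return fg + (1.0 - focus) * bg   # attenuate the background
    return bg + (1.0 + focus) * fg       # attenuate the foreground

def per_band_focus(global_focus, bias):
    """FOCUS BIAS: each slider offsets the global FOCUS for its band,
    clamped to the control range. With FOCUS at max, raising a slider
    does nothing (already clamped); pulling it to minimum lands that
    band back at 12:00 -- exactly the behavior described above."""
    return np.clip(global_focus + np.asarray(bias), -1.0, 1.0)
```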

t/f LOCALIZE: an analogy to this would be FFT frame size, where shorter sizes preserve more time detail, and longer sizes resolve the frequency more precisely (we do NOT use an FFT, though, and this is not a parameter for the transform but for the...erm...priorities within the pattern detection). So basically, low values for this parameter tend to give fewer artifacts and sound "crisper", but may not remove as much reverb as higher values when reverb and signal overlap. For example, in very small rooms with a low amount of direct signal compared to the reverb amount, this parameter will usually need to be set to higher values to catch the reflections that are "fused" to the direct components. Higher values may however start sounding unnatural when removing a lot of reverb. When you want to isolate the reverb for up-mixing or off-screen-placement purposes, set this high; otherwise set it as low as possible for starters.
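The FFT-frame-size analogy itself is easy to demonstrate (again: UNVEIL doesn't use an FFT, this just illustrates the time-vs-frequency trade-off the analogy refers to):

```python
import numpy as np
from scipy.signal import stft

sr = 48000
t = np.arange(sr) / sr
# A 1 kHz tone with a click in the middle: one feature needs frequency
# detail, the other needs time detail.
x = np.sin(2 * np.pi * 1000 * t)
x[sr // 2] += 5.0

for nperseg in (256, 4096):
    f, frames, X = stft(x, fs=sr, nperseg=nperseg)
    print(f"window {nperseg:4d}: bin width {f[1]:6.1f} Hz, "
          f"hop {(frames[1] - frames[0]) * 1000:5.1f} ms")
```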

t (REFRACT): oookay, now this one is pretty tricky to explain ;-) Essentially, you can think of this as the reaction time of the neural network - how long it thinks about what to do before deciding. This means that low values (= short reaction/decision time) will remove more early reflections, but the probability that short-time signal features get misinterpreted as reflections also rises. Higher values allow for a "better educated guess" at what parts of the signal are reflections, so fewer wanted signal components get removed, which results in a more natural sounding signal. The trade-off is that you also retain some more early reflections. Raising the value of this parameter can help counteract adverse effects caused by high LOCALIZE values.

ADAPTATION: this sets the length of reverb that the pattern detection is looking for. You're giving it a clue to help it do its job. For most applications, set this to a value that approximates the actual reverb length.

PRESENCE: this introduces some randomness into the pattern detection, which statistically results in less reverb reduction and fewer artifacts, but also tends to highlight the *presence* of a signal (as found on old broadcast equalizers). In a way it also changes the frequency response of the reverb, as the reduction in reverb removal caused by PRESENCE is a function of frequency. Raising this makes the removed reverb darker, and the remaining signal brighter (at least on average that's what it does). Also good for counteracting effects of high LOCALIZE values.

TRANSIENT THRESH sets a detection threshold for transients in the input signal, which are then bypassed. This allows more reverb reduction while keeping transients crisp. The threshold takes dynamics as well as statistical signal properties into account.
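For a feel of what a transient detector with a threshold looks like, here's a generic spectral-flux onset detector - a common stand-in, not UNVEIL's detector, which per the above also folds in statistical signal properties:

```python
import numpy as np
from scipy.signal import stft

def transient_frames(x, sr, thresh=2.0, nperseg=1024):
    """Flag analysis frames as transients via half-wave-rectified
    spectral flux: a frame counts as a transient when its flux
    exceeds `thresh` times the median flux."""
    _, _, X = stft(x, fs=sr, nperseg=nperseg)
    mag = np.abs(X)
    flux = np.maximum(np.diff(mag, axis=1), 0.0).sum(axis=0)
    return flux > thresh * (np.median(flux) + 1e-12)
```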

Hope that helps ;-)

--d


I had a few minutes to gather some production tracks that were professionally recorded, but in echoey surroundings, and then process them using the hints posted above.

Review in a nutshell: yes, it works. Nothing is perfect, and normally I wouldn't even try this kind of software before pulling down room modes with a parametric so it wouldn't have to work so hard... but I'm liking it so far. Seems much more benign than other ways to lower verb, and works during the vowels as well.

JackieWet and DiabetesDialog1 are the originals.

Note that the pauses have some noises... These aren't artifacts; they're on both the orig and proc versions. I didn't do any expansion or editing, so the noises come through untouched.

In fact, the only other processing was normalizing the originals before doing this.

Files are Apple Lossless in a QuickTime wrapper, for full quality.

I'm still learning what the software can do and feeling my way around the settings. I've posted screenshots of how I set up the plug-in in each case (the Diabetes dialog is on top). Feel free to experiment, or offer me tips on better ways to tune things.

--

Note that Bias Peak (Pro 6) rendered the plugin with a generic AU interface, instead of what's shown in the demo or in their standalone software. I haven't investigated why.

DiabetesDialog1.mov

DiabetesDialog1proc.mov

post-2900-0-03475000-1334259072.gif

JackieProc.mov

post-2900-0-28116100-1334259083.gif

JackieWet.mov


Hi Jay,

Yes, the reverb seems reduced, but the "gating" effect sounds somewhat unnatural, i.e. the words at times sound "squeezed".

I wonder if this "by-product" is objectionable to others.

Thanks Jay for your follow through and lending your expertise to the discussion.

I'm on the fence about this product, hoping to decide before the price goes up.

Jon K. Oh


Well, consider how 'gated' it would sound trying to remove that much verb with just a multiband expander.

I'm probably also overdoing the effect. And of course there'd be some parametric before it, as well as a little compression for this particular kind of dialog. Not sure yet whether the compression wants to be before or after Unveil, since compressing first would make the verb more prominent. And I didn't do any level riding on the tracks at all.

Obviously I still need to do some more tests.

One of the things you can set is how much echo removal there is in each band. (The bands are roughly octaves - those "focus" sliders - so not as precise as I'd want for dialog.)


Hi all!

Jay --- nice examples, thank you for sharing these and your findings! I'm on the road at the moment, so not near any decent monitoring setup, but I'll have a close listen & a go at the original files once I'm back in the studio/office on Monday.

WRT to the generic AU GUI: Peak (like a variety of other hosts) still uses the deprecated 32-bit Carbon framework. Our plug-ins use the newer Cocoa framework for native 64-bit support. Carbon hosts are not able to load the Cocoa GUI, so they default to the generic AU GUI.

We do, however, install a Carbon-compatible version in /Applications/Zynaptiq Plug-In Support/Legacy/, so for use with Peak, simply replace the UnveilAU.component in /Library/Audio/Plug-Ins/Components/ with the "legacy" version and you'll get the proper GUI. You'll lose the 64-bit AU support though, so it's probably best to not overwrite the component but move it somewhere else instead, for swapping it back in for 64-bit hosts as needed.
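The swap itself, scripted - a sketch using the paths as given above (exact folder names may differ on your install, and you'll likely need admin rights for /Library):

```python
import shutil
from pathlib import Path

components = Path("/Library/Audio/Plug-Ins/Components")
legacy = Path("/Applications/Zynaptiq Plug-In Support/Legacy/UnveilAU.component")
current = components / "UnveilAU.component"

# Park the 64-bit Cocoa build rather than overwriting it...
shutil.move(str(current), str(components / "UnveilAU.component.cocoa"))
# ...then drop in the Carbon-compatible build for Peak.
shutil.copytree(str(legacy), str(current))
```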

I see the point about wanting more control or a different set of center frequencies for the BIAS sliders. We'd probably need to stick to 10 as the number of sliders for the moment, but we may be able to have a different set of center frequencies selectable for dialog work in a future update. What set of 10 frequencies would you find most useful? Something like 200, 400, 750, 1k, 2k, 3k, 4k, 5k, 6k, 10k?

On a longer time-scale, we've been thinking about a transfer curve like you would find in a free-form parametric EQ, but that's neither something I could promise right now nor give an ETA on (but I would like to have that in there for sure!).

Have a nice week-end,

Denis


What set of 10 frequencies would you find most useful?

Off the top of my head:

100, 200, 300, 400, 800, 1k6, 2k4, 3k2, 6k4, 11k

Roughly octaves, except a little finer in the low hundreds, which is where I usually hear room modes when dialog is shot in real interiors; and also in the consonant range to help intelligibility.

Certainly open for discussion. And I have no idea how sharp these 'filters' are...

Any chance of giving the user a couple of different options in a lookup table, say for small interiors, large interiors, music...? Or even ten separate entries in an environment file that can be tweaked by power users?


I'll talk to my partner about what is possible. I do believe that the ideal solution would be a "rubberband" free-form transfer curve... kind of like the parametric EQ found in FLUX Epure or DMG EQuality... that way, you'd get the ability to change the "bands" without going into a text editor :-)
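Conceptually, the data side of such a curve is simple - nodes interpolated across a log-frequency axis, something like this (just a sketch of the idea, not a commitment to any implementation; node shapes, smoothing and so on are all open):

```python
import numpy as np

def curve_from_nodes(node_freqs_hz, node_values, eval_freqs_hz):
    """Free-form "rubberband" curve: linear interpolation between
    user-placed nodes on a log-frequency axis (nodes must be sorted
    by frequency). Splines or other segment shapes would work too."""
    return np.interp(np.log10(eval_freqs_hz),
                     np.log10(node_freqs_hz), node_values)

# e.g. pull the per-band value down around a 120 Hz room mode,
# neutral elsewhere (all numbers illustrative):
freqs = np.geomspace(20, 20000, 8)
print(curve_from_nodes([20, 80, 120, 200, 20000],
                       [0.0, 0.0, -0.6, 0.0, 0.0], freqs))
```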


"rubberband" free-form transfer curve

I like that a lot. I assume there'd be a way for users to save various curves for different purposes.

Pet peeve: some software lets you draw impossible curves, like absolutely vertical slopes* (there's a screenshot in one of my books). Please make sure yours accurately reflects what the program is doing... or at least, approximates it in graphic terms.

___

* Okay, as I understand it, an absolutely vertical slope is theoretically possible... but only if you give the filter an infinite amount of time to process the signal.


It's going to be a rubberband curve for FOCUS, and possibly also for ADAPTATION if that turns out to be helpful during testing.

You can obviously store the entire plug-in setting as a preset, but would you need additional presets just for the frequency nodes on the transfer curve? I'd think that most of the time, you'd be adjusting that on a case-by-case basis anyway, no?

We'll definitely not let the GUI draw curves that would require infinite amounts of time to process unless we find an application where that would make sense. You know, removing reverb when inside a black hole or something like that...*grin*


need additional presets just for the frequency nodes on the transfer curve? I'd think that most of the time, you'd be adjusting that on a case-by-case basis anyway, no?

You might want a particular curve for a room, or to favor some formants in a particular actor's voice, but still want to tweak the curve (and other settings) for different mic positions.

But no biggie. I'd just save an entire preset called "RoomXxx" with the curve set and the other controls in neutral positions. It's what I do now with equalizers. For instance, I have a parametric preset called "Tune" that maxes the Q in all the sections but leaves their gain at 0 and their freqs at 'typical' room modes and their harmonics. Then I can boost the gain of the section being tuned, sweep, notch, and move on to the next section.
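For anyone curious about the math behind that trick, a narrow peaking filter is all it is - this is the standard RBJ audio-EQ-cookbook biquad, with illustrative frequency and Q values, not any particular plug-in's code:

```python
import numpy as np
from scipy.signal import lfilter

def peaking_biquad(f0, sr, gain_db, q):
    """Peaking-EQ biquad coefficients (b, a), RBJ cookbook formulas."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / sr
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

# The "Tune" workflow: boost a narrow band, sweep f0 until the ring
# jumps out, then flip the gain negative to notch that room mode.
sr = 48000
x = np.random.default_rng(0).standard_normal(sr)  # stand-in for dialog
b, a = peaking_biquad(f0=137.0, sr=sr, gain_db=-9.0, q=16.0)
y = lfilter(b, a, x)
```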

unless we find an application where that [infinite processing time] would make sense

Definitely need that for when the producer says a show is locked. That way you'd still be rendering when the project is finally, really locked.


  • 2 weeks later...

Forget PT--- long live VST!

(reaper, sony vegas, nuendo, logic, etc)

This demo seems amazing. I am definitely going to try it out...

I just stacked up Jay's demo w/ Jackie against a pass on the Cedar DNS1000 (basically an expander).

The UNVEIL definitely does a slightly better job at first glance with the "de-reverb" (though it IS close)


  • 1 month later...

Hey all,

we've just updated UNVEIL to v 1.0.7, which implements the following changes:

  • Main controls are now in relative mode (for the "old" absolute mode: hold ALT before clicking on the controls)
  • There is now an output gain slider to compensate for any level changes
  • The Stand-Alone app will now use the sample rate of the source file instead of the system sample rate
  • The Stand-Alone app can now record its output to a new audio file
  • We've increased processing resolution a little further for fewer artifacts when using extreme settings
  • Added parameter smoothing to prevent rare zipper noise during automation
  • Various small fixes & enhancements

...so except for the rubber-band transfer function for the frequency-dependent FOCUS and for supporting more plug-in formats, we've implemented most user-requested changes, I guess ;-)

--d


  • 1 month later...

Hello Denis, I think you guys are on to something, but for "surgical" use one would really need a better translation of the parameters into standard audio speak. Most of all, "Adaptation" would benefit from a real-world parameter, say... seconds. That would make it much easier to take a rough guess in which direction to even turn the dial. What are the implemented min/max values? I think we audio people tend to think in orders of magnitude and such. But I don't want to be a smart-ass; it could always be that I totally misunderstand things about the app. Keep up the good work.


Hi!

I do understand that - I'm an advocate of sticking to naming conventions myself - but it is inherent to the nature of the beast that perceptually meaningful parameters are of a multi-layered, complex nature, and that there *is no* established terminology for naming them.

Adaptation would really be the only parameter that can be expressed in a standard metric like seconds --- in an approximate way. Looking into that is on our to-do list; I'm thinking we'll display an approximation in seconds on the main display, but that won't make it in until after we finish the VST and PT versions for Mac/Win, which is our #1 priority at the moment.

It's a similar thing with our de-mixing based re-composition tool, PITCHMAP, where we use parameter names like "PURIFY", "FEEL" and "ELECTRIFY" to describe complex, multi-layered parameter groups ;-)

As to the rough guess --- visually compare the ADAPTATION "envelope" shape to the shape of the input signal in the display. If they match approximately, they'll match sufficiently accurately under the hood to do the job. By taking the parameter to very low values, you can in some cases focus on removing early reflections, by the way. As a starting point, I usually go for around 10:30.

Cheers,

Denis

