Hush: an AI-powered macOS app for removing noise and reverb from dialogue


Hi everyone!

 

Just wanted to share an app I made that post-production folks might find useful. It’s called Hush, and it uses AI to automatically remove background noise and reverb from dialogue (and other spoken audio) — with minimal artifacts. I designed the model myself, and trained it on a large dataset of common noise types: ventilation, traffic, honking horns, barking dogs, chirping birds, etc. — as well as room reflections from a wide variety of indoor spaces. You can hear a quick demo over on the website.

 

The model continues to evolve as I add more samples to the training data. I’m always open to suggestions, and happy to fine-tune it for specific use cases wherever possible. For example, I have another module in the pipeline to handle lav-related noise: clothing rustle, muffled dialogue, etc.

 

The app itself is a batch processor with a simple drag-and-drop interface. It can handle single files or many at a time. On Apple Silicon Macs (highly recommended, if not strictly required), it runs entirely on the Neural Engine, which massively accelerates processing while leaving the CPU cool (and the fans off). It can also run on an external GPU.

 

It’s been in beta for two months, and I’ve gotten some great feedback from dialogue editors, voice actors, and other folks who’ve used it in their projects. I just released the first public version on the Mac App Store today, for an introductory price of $49.99 US. You can also download a 21-day free trial, without any other restrictions, at hushaudioapp.com.

 

As a solo developer, I put a lot of care into my work and really thrive on feedback — so if you have any questions or suggestions, please let me know :).

 

Thanks,
Ian

 

[Attached image: HeroDark.jpg]


I look forward to test-driving this app, but I would probably never "batch" process any audio for NR in a movie mix. I'd want something like this to work in real time on individual tracks or clips of audio, so I can hear what it is doing (or not doing) and tweak what it does per clip, or even per word if needed, with those tweaks being part of the mix automation. Will you make this into a real-time VST plugin that can work within a DAW?


Great question — it’s really helpful for me to hear how this might fit (or not fit) into existing workflows. And yes, I’d like to make it into a real-time plugin in the near future. I already have it working as a prototype AU plugin in Logic, but with really high latency (~500 ms), and only one instance can run at a time without freezing tracks (at least on a base-model M1 Mac mini). A possible solution would be to make a lightweight version of the AI model, which could run more efficiently in real time, and then switch to the full, high-quality version when you bounce.

 

Trying to prioritize whether to work on the plugin next, or add a spectrogram-based editing interface for offline work with individual clips. Eventually I’d like to do both, but I have to start somewhere, and the plugin may well turn out to be the more useful path.


21 minutes ago, ElanorR said:

Thank you for your hard work. It might be helpful to me in podcast editing. Some of these are not recorded in the best environment.

 

You’re very welcome! A few podcast editors tried out the beta over the last couple months, and they said it worked really well. It’s trained to handle common types of indoor noise (HVAC, fans, etc.) as well as room reflections, which are probably the main culprits in home podcasting setups. If the noise and reverb are moderate (e.g. using a mic in cardioid at a reasonable distance from the source), the reduction is typically really subtle, without any audible artifacts. That said, it’d probably struggle with audio recorded on, say, a cellphone, or with a laptop mic far away (which can happen, I suppose, with some remote podcast interviews). I don’t do podcasts myself, but I record voiceover and audiobook narration at home, and the app has been super useful in getting rid of all the room tone.


1 hour ago, berniebeaudry said:

Just listened to the demo on the website.  Very impressive!  I didn't hear any artifacts and the nuances of the recording were all intact.  I work with most of the noise reduction programs that are out there and I look forward to potentially adding this one to the arsenal.

 

Thanks for the kind words! That was my goal: to make the model subtle enough that you don’t lose any of the details of the original speech. The processing is designed to kick in only where needed: clean audio passes through unchanged, and moderately noisy audio gets processed pretty gently. (Very loud noise will still produce some artifacts, but hopefully less so than with traditional noise reduction algorithms.) Curious to hear what you think, if you get a chance to try it out :).
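Ian doesn’t say how this level-dependent behavior is implemented, but one common way to get "clean passes through unchanged, noisy gets processed" is an adaptive wet/dry blend driven by how much the model actually removed per frame. A purely illustrative sketch, not Hush’s actual algorithm — the function name, frame size, and the 4.0 scaling are all my own assumptions:

```python
import numpy as np

def adaptive_mix(original, denoised, frame=1024, floor=1e-8):
    """Hypothetical sketch: blend the processed audio back in only
    where the estimated noise is significant, so clean audio passes
    through unchanged. Both inputs are float arrays of equal length."""
    out = np.copy(original)
    for start in range(0, len(original), frame):
        o = original[start:start + frame]
        d = denoised[start:start + frame]
        residual = o - d                       # what the model removed
        noise_power = np.mean(residual ** 2)
        signal_power = np.mean(o ** 2) + floor
        # Blend factor grows with the noise-to-signal ratio, capped at 1
        w = min(1.0, noise_power / signal_power * 4.0)
        out[start:start + frame] = (1 - w) * o + w * d
    return out
```

With a scheme like this, frames where the model removed nothing are left untouched, which is one way to keep artifacts out of already-clean material.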


Ah, brilliant! I’m an old-timer on Mojave 10.14.6; any chance this might work on my system? I can run a trial when I get home. I love all things noise reduction and would love to give this a try. A pipe dream would be AAX for Pro Tools, even if AudioSuite-only, but a standalone app is just fine for me as well.


9 hours ago, osa said:

Ah, brilliant! I’m an old-timer on Mojave 10.14.6; any chance this might work on my system? I can run a trial when I get home. I love all things noise reduction and would love to give this a try. A pipe dream would be AAX for Pro Tools, even if AudioSuite-only, but a standalone app is just fine for me as well.

 

Sadly, the minimum macOS version is 12.0 (Monterey). If you’re curious to hear what it sounds like on some of your audio, though — and if you have something you don’t mind sharing — I’d be happy to process it and send it back. Yes, AAX would be cool for sure, and going the AudioSuite route would make it easier to get things working without the real-time constraint. Not on the immediate horizon, but I’ll look into it!


7 hours ago, Philip Perkins said:

If you decide to make a plugin (which is how I think most post-pros would use it), please consider making a VST version so non PT and Windows users can use it!

 

VST and Windows support would be great for sure! At the moment, Hush is pretty heavily optimized for Mac (both at the software level, with Core ML, and at the hardware level, with the Neural Engine). Getting the AI to perform with any reasonable efficiency was only really possible by targeting a specific architecture and taking advantage of the massive acceleration for machine learning on M1 and M2 Macs. I imagine you could get similar performance on PC using a discrete GPU, but that’d also mean rewriting the app from the ground up. Not impossible, of course! But not likely to happen soon without help from other developers who know Windows better than I do :p


21 hours ago, Mark LeBlanc said:

If you could come down to the deep south and train this to insects!... This demo does sound impressive

 

Haha, I just may! I actually have a massive library of insect recordings from another sound design project — if getting rid of cicadas and crickets is desirable, I could probably train a model to do that :p



Amazing bit of software! I just used it to recover audio from an interview where my lav mic failed, so all I had was the in-camera audio, which was really poor. But that poor audio was transformed beyond what I thought was possible. Amazing.


On 9/19/2023 at 1:22 PM, Ian Sampson said:

Sorry for not replying before! This one slipped through the cracks. But yes, exactly — I take dry recordings and convolve them with impulse responses, then train the model to predict the dry signal from the wet one.

 

Also just released an AudioSuite version of Hush — covered in a new thread.

Quite interesting! I was curious what approach would work. I don’t work directly with ML models, but I’m generally interested in how they work. Thanks for sharing!
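The data-generation approach Ian describes — convolving dry recordings with impulse responses, then training the model to predict the dry signal from the wet one — can be sketched in a few lines of NumPy. This is a minimal illustration under my own assumptions (the function name and the synthetic impulse response are hypothetical, not from Hush):

```python
import numpy as np

def make_training_pair(dry, impulse_response):
    """Convolve a dry recording with a room impulse response (IR) to
    synthesize a reverberant "wet" version. A model is then trained
    to predict `dry` (the target) from `wet` (the input)."""
    wet = np.convolve(dry, impulse_response)[: len(dry)]
    # Match the wet peak level to the dry peak level, so the model
    # learns reverb removal rather than gain changes
    peak = np.max(np.abs(wet))
    if peak > 0:
        wet *= np.max(np.abs(dry)) / peak
    return wet, dry

# Toy example: a single click gets smeared in time by a decaying IR
sr = 16000
dry = np.zeros(sr)
dry[0] = 1.0
ir = np.exp(-np.linspace(0.0, 8.0, sr // 2))  # synthetic exponential decay
wet, target = make_training_pair(dry, ir)
```

Repeating this over many dry recordings and many measured IRs yields a large supervised dataset of (wet, dry) pairs without ever recording in a reverberant room.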


This plugin is truly fabulous.

The way it can clean up scenes that are heavily saturated with noise is impressive.

I have been involved in audio post-production for over 20 years; I am an Italian sound editor for films and TV series.

I now work with Pro Tools 2023.12.0 and a Mac Studio running macOS Sonoma, and I can say that the plugin is perfectly compatible with my system.

After testing the trial version for a few days, I purchased it immediately. Ian, you are a genius. Thanks to this plugin I can now save many scenes that I would otherwise have had to dub. I use it mainly in the "mix" mode, precisely to fix ruined takes.

For a good result, however, the dialogue must be sufficiently present — that is, it must have good frequency content; otherwise the outcome is not satisfactory.

My workflow is therefore one pass with "Hush Pro" and then one with RX 10 to eliminate any clicks left in the recording. It will be interesting to see how this plugin evolves in the future.

It would be nice to add a spectrogram so I could analyze the waveform and correct clicks in the dialogue more precisely.
I will wait patiently for the next update.


On 2/2/2024 at 12:17 AM, Enrico66 said:

It would be nice to add a spectrogram so I could analyze the waveform and correct clicks in the dialogue more precisely.
I will wait patiently for the next update.

 

Sorry for the slow reply — I really should check this forum more often! I really appreciate the kind words about the plugin. And yes, I’m working on some updates: I’ve got a prototype VST that can run inside RX, so you can take advantage of the spectrogram view there. Hopefully it’ll be ready sometime this fall. I’m also experimenting with another plugin (built on the same engine) to handle mouth clicks and the like.

 

On 5/26/2024 at 9:12 AM, andrewtb4 said:

@Ian Sampson, I think you should look at making this work on a live mic feed and advertise it to users of meeting software like Zoom or Google Meet. Since it runs locally, it could be an excellent alternative to something like https://krisp.ai/, which is cloud-based.

 

Cool idea! That’s certainly a possibility — the new VST algorithm works in real time, albeit with high latency, so maybe a live mic feed would work. To compete with Krisp, though, I’d definitely need a pretty big marketing and support team. Right now it’s still just me, and I’m pretty happy for the time being keeping things small and focusing on tools for audio post. We’ll see :)


On 5/29/2024 at 5:57 PM, Philip Perkins said:

That's great about being able to have a VST that can work within RX.   Will it run on Windows RX?

 

Sadly no — the underlying AI/ML engine is very dependent on Apple frameworks & hardware (using Metal for GPU acceleration, etc.) and would need a total rewrite to work on Windows. Maybe one day!

