
Karaoke Lovers Rejoice! Croonify is here!


NewEndian


I built a small passion project over the last month, and you all might find it interesting even if you're not a karaoke lover like me.

 

https://croonify.com

 

You submit a song's audio file and its lyrics, and it removes the vocals and generates a karaoke lyric video.

 

Apparently, I overestimated how excited even karaoke lovers would be about it, so I'm keeping it free for a little bit while it takes hold in the zeitgeist. Feel free to check it out if you're curious


Well I cheated somewhat, submitting 'Daisy (Bell) - Bicycle Built for Two' as programmed/performed by the IBM 704 ... and it struggled both with the audio split and with the timing.

 

But since so much pop has so much synthesis built in I'm wondering how much this approach would affect your programming to recognise, time and split anyway?

 

Should I try Cher 'Believe' (a bit of verse, a bit of chorus) next?

 

Thanks for posting!

 

Jez

 

 

 

 


5 hours ago, The Immoral Mr Teas said:

Well I cheated somewhat, submitting 'Daisy (Bell) - Bicycle Built for Two' as programmed/performed by the IBM 704 ... and it struggled both with the audio split and with the timing.

 

But since so much pop has so much synthesis built in I'm wondering how much this approach would affect your programming to recognise, time and split anyway?

 

Should I try Cher 'Believe' (a bit of verse, a bit of chorus) next?

 

Thanks for posting!

 

Jez

 

I actually did encounter this with a Broadway song called Space Age Bachelor Man. But it's just not a human voice.


Here's Cher!

 

By the way, I used a pretty high fidelity track here (legally purchased from iTunes). But a very interesting thing happens when vocals are removed: both the dynamic range compression and the lossy format compression come through. If you're not sure what they sound like, it's worth your while to take a listen

 

Brief explanation: dynamic range compression, which is applied across the entire song during mastering, pushes the instrumental content down whenever the foreground vocal is present. Lossy format compression removes audio information that can't be heard due to masking, and it also makes approximations where it knows they will be masked. All of it can be heard once you remove the audio that's supposed to be covering it all up.
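To put rough numbers on the compression effect described above, here is a minimal sketch of a static, hard-knee compressor. The threshold, ratio, and source levels are made-up illustrative values, not measurements from any real master, and a real mastering chain also has attack/release behaviour that is ignored here.

```python
import math

def compressor_gain_db(level_db, threshold_db=-10.0, ratio=4.0):
    """Gain in dB applied by a simple hard-knee compressor at a given input level.

    threshold_db and ratio are illustrative assumptions, not real mastering settings.
    """
    if level_db <= threshold_db:
        return 0.0  # below the threshold the signal passes unchanged
    over = level_db - threshold_db
    # Output sits at threshold + over / ratio, so the gain is the difference.
    return over / ratio - over

def mix_level_db(*levels_db):
    """Approximate level of several uncorrelated sources summed (power sum)."""
    power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(power)

# Hypothetical levels: instrumental bed on its own, then with a louder lead vocal.
instrumental_db = -12.0
vocal_db = -6.0

gain_without_vocal = compressor_gain_db(instrumental_db)
gain_with_vocal = compressor_gain_db(mix_level_db(instrumental_db, vocal_db))

print(f"gain on the mix, instrumental only: {gain_without_vocal:.2f} dB")
print(f"gain on the mix, vocal present:     {gain_with_vocal:.2f} dB")
```

Because the compressor acts on the whole mix, the same negative gain lands on the instrumental whenever the vocal is present. Take the vocal away afterwards and that dip in the backing track is left exposed.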

 

 

 


Impressive, but not perfect.  I tried it on two songs:  an old aac encode of Bad, Bad Leroy Brown by Jim Croce, and a flac / mp3 encode of the Bird Song by The Wailin' Jennys.

Both successfully removed the vocal track and more-or-less correctly timed the lyrics.  Both would have been quite usable for karaoke.

 

Bad, Bad Leroy Brown suffered from obvious quality degradation in the vocal patches, which I attributed to the old 128kbps AAC encode (originally an iTunes rip from a CD circa 2001).  There may have been some dynamic compression artifacts, but there was far more missing from the signal, so I assumed that some of the missing signal was due to AAC compression.

Bird Song was a flac rip from the CD, so codec compression shouldn't have been an issue.  The difference between the vocal and non-vocal sections was much more subtle, but there seemed to be an overall degradation in quality, even in the instrumental sections where no vocals were removed.  The degradation sounded like the "smeary" sound of a low bitrate codec.  In addition to the vocals, Croonify also successfully removed the violin backing track, which made it a bit hard to keep time.  The violin did come back in the non-vocal sections.

 

I also tested a 128kbps mp3 transcode of Bird Song (transcoded with lame / ffmpeg, with the flac as the source) to see if I could notice any additional degradation due to the mp3 codec.  It sounded close enough to the flac version that I couldn't pinpoint any difference in a single listen.
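For anyone wanting to repeat that kind of test, a transcode along these lines can be sketched as below. The exact flags used in the test above weren't given, so this is just one plausible ffmpeg invocation using its libmp3lame encoder; the file names and the helper function are hypothetical.

```python
import shlex

def mp3_transcode_command(src, dst, bitrate_kbps=128):
    """Build an ffmpeg command that encodes src to MP3 with the LAME encoder.

    The helper name and file names are hypothetical; the flags are standard
    ffmpeg options (-codec:a selects the audio encoder, -b:a the audio bitrate).
    """
    return [
        "ffmpeg",
        "-i", src,                    # input file, e.g. a flac rip
        "-codec:a", "libmp3lame",     # encode audio with LAME
        "-b:a", f"{bitrate_kbps}k",   # target bitrate
        dst,
    ]

cmd = mp3_transcode_command("bird_song.flac", "bird_song_128.mp3")
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed.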


I'm not sure how much better LAME is at encoding in 2023 than Apple's AAC encoder was in the early 2000s, but I would have expected the mp3 to have more audible artifacts than an aac file at the same bitrate.  On that basis, I'm inclined to attribute the degradation in Bad, Bad Leroy Brown more to the AI algorithm being overly aggressive, and not to audio information missing due to the codec.  I suppose it would be fairer to compare the same song though...

So, my overall impression is that it creates usable karaoke files, but is perhaps overly aggressive in removing more than just the vocals.  It's better than other vocal removers I've played with by quite a margin, but it's definitely not flawless.

I also ran into quite a number of bugs:  It refused to read a couple of other flac files and a wav file that I fed it ("metadata could not be read"), and I also ran into a "song is over 6 minutes" error a couple of times on files that were definitely not longer than 6 minutes.

 

-------------

 

Thanks to The Immoral Mr Teas, I was curious whether synthetic voices were inherently more difficult for the algorithm, so I tried Croonify on Stephen Hawking singing The Galaxy Song.

This was actually the most transparent removal of the songs I've tried (though perhaps it was less effective at completely obliterating the vocals, as there were occasional hints of vocals that could be heard).

However, the timing was completely wrong from the start, and it never really recovered.


11 hours ago, The Documentary Sound Guy said:

Impressive, but not perfect.  I tried it on two songs:  an old aac encode of Bad, Bad Leroy Brown by Jim Croce, and a flac / mp3 encode of the Bird Song by The Wailin' Jennys.

Both successfully removed the vocal track and more-or-less correctly timed the lyrics.  Both would have been quite usable for karaoke.

 

Bad, Bad Leroy Brown suffered from obvious quality degradation in the vocal patches, which I attributed to the old 128kbps AAC encode (originally an iTunes rip from a CD circa 2001).  There may have been some dynamic compression artifacts, but there was far more missing from the signal, so I assumed that some of the missing signal was due to AAC compression...

 

Thanks for the review! Yep, it is definitely an aggressive algorithm, but works for karaoke, I think. There wasn't a way to get it to only remove the lead vocal, sadly.

 

Hahaha, it really does not like the 100% computer generated voices, it seems

 

Would you let me know which song file got rejected for length?


This is awesome! About 20 years ago I was hanging out in Austin and there was a little bar that had punk rock karaoke. Obviously there were no actual karaoke tracks, they just played the regular tracks and performers sang over them. This app would have catapulted that club into a massive hangout.

 

I tried it on The Manges' "I Will Always Do" and another one.  The backing vocals are untouched, they come through clearly.  The main vocals are gone and the sound quality goes down a little.  It's a little "digital" sounding: not exactly muffled, but with a kind of 8-bit compression artifact.  I think it sounds great though!  That little club would have created a million tracks on this.


  • 2 months later...

First off, quality joke Jim Feeley. Maybe I'm not in the right circles but I can't say I hear John Cage jokes all that much

 

Secondly, in case people are curious, I have added a second algorithm to my Auto-Karaoke page, croonify.com. This one is geared toward keeping more background vocals in the resulting video. I'm almost certain it still won't like robot voices though.


  • 1 month later...

Hi! 
My daughter is trying to get into music school, and she was tasked with bringing a karaoke version of a song to perform with. There was none to be found, so I tried Croonify. It worked pretty much flawlessly, even though the song is in Swedish. The timing of the lyrics was a bit off, but that's moot anyway since she won't be looking at them.

VERY COOL! We were all in total awe in the family when the video was done. Very surprised to see the Like counter was at like 60 something!? This is an awesome thing. 
