Separating the different parts of a song without access to the original stems is difficult, but a tool called LALAL.AI handles the process well. It splits songs into vocals and instrumentals with minimal effort and no audio engineering skills required.

LALAL.AI was already quite solid, and it recently took a big step forward with the introduction of a new neural network architecture called Cassiopeia. It builds on Rocknet, the service's previous-generation neural network, and improves on it in most respects.

What Does LALAL.AI's Cassiopeia Bring to the Table?

Put simply, Cassiopeia provides improved splitting results with significantly fewer audio artifacts. The whole purpose of LALAL.AI is to separate vocals and instruments from a track, so anything that improves the quality of that separation is a meaningful upgrade.

With the new neural network, LALAL.AI will take a little longer to generate the split tracks, but that's a small tradeoff for the vast improvement in quality.

So what's different? Rocknet, which is still available on LALAL.AI, considers only the amplitude component of the audio signal and ignores the phase component. The newer Cassiopeia neural network also considers the input signal's phase component and generates the phase for the output signal. As a result, the split tracks contain fewer audio artifacts.

To put all that in simple terms, the new algorithm analyzes more of the information in the song to create a better split.
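
To see why phase matters, here is a minimal sketch in Python using NumPy. It is not LALAL.AI's code and involves no neural network. It assumes two synthetic sine tones standing in for a vocal and an instrumental, and the names (split_magnitude_phase, rebuild, and so on) are made up for illustration. It rebuilds the "vocal" twice from the correct amplitudes: once borrowing the phase of the full mix, as an amplitude-only approach would have to, and once with the vocal's true phase, which stands in for a model that estimates output phase well.

```python
import numpy as np

def split_magnitude_phase(signal):
    """Return the amplitude and phase of each frequency in the signal."""
    spectrum = np.fft.rfft(signal)
    return np.abs(spectrum), np.angle(spectrum)

def rebuild(magnitude, phase, length):
    """Turn amplitude and phase back into a waveform."""
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=length)

def rms_error(estimate, reference):
    """Root-mean-square difference between two waveforms."""
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))

# Hypothetical one-second test signals (assumed, not real music).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
vocal = np.sin(2 * np.pi * 440 * t)
# The instrumental overlaps the vocal at 440 Hz, but with a different phase.
instrumental = 0.8 * np.sin(2 * np.pi * 440 * t + 1.0) + 0.5 * np.sin(2 * np.pi * 220 * t)
mixture = vocal + instrumental

vocal_magnitude, vocal_phase = split_magnitude_phase(vocal)
_, mixture_phase = split_magnitude_phase(mixture)

# Amplitude-only: correct amplitudes, but the phase is borrowed from the mix.
amplitude_only = rebuild(vocal_magnitude, mixture_phase, len(mixture))
# Phase-aware: correct amplitudes and the vocal's own phase.
phase_aware = rebuild(vocal_magnitude, vocal_phase, len(mixture))

error_amplitude_only = rms_error(amplitude_only, vocal)
error_phase_aware = rms_error(phase_aware, vocal)

print(f"amplitude-only error: {error_amplitude_only:.4f}")
print(f"phase-aware error:    {error_phase_aware:.2e}")
```

Even with perfect amplitudes, the version that borrows the mix's phase does not match the original vocal, while the phase-aware version matches it almost exactly. In real audio, that kind of mismatch is one plausible source of the artifacts described above.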

To show that the new network works more effectively, LALAL.AI tested it against Spleeter, OpenUnmix, and Extended Unmix. It also compared the results to its own Rocknet neural network. You can view the full results of the test on LALAL.AI's blog, but in short, Cassiopeia outperformed all the others in most categories across randomly selected genres such as jazz, soft rock, and pop.

Interestingly, Rocknet still scores better in the vocal channel, where Cassiopeia shows slightly more infiltration from the instrumentals into the vocals. However, LALAL.AI pointed out that numbers don't always tell the whole story, and the perceived sound quality can differ from what the tests show.

Here's what the company said on the matter:

Although Cassiopeia lags behind Rocknet in terms of formal metrics for vocals, both the instrumental part and especially the vocal stem separated by Cassiopeia sound much more natural and softer than Rocknet's, without the metallic-sounding artifacts that are so characteristic of the other solutions.

I tested it for myself, and I found that the Cassiopeia neural network did produce cleaner audio splits. The vocal track had almost no perceptible infiltration from the instrumentals, which is exactly what you want from a tool like LALAL.AI.

With that said, the results from Rocknet were still quite good, and they were absolutely usable for isolating the vocal track from the instrumentals.

How Do You Try LALAL.AI's New Cassiopeia Feature?

If you want to give the new neural network a shot, go to LALAL.AI and make sure the "Use the new algorithm" box near the bottom of the screen is checked when you upload a song.

You can also choose how aggressively the algorithm splits the tracks. Normal is good for most tracks, but you can experiment with Mild and Aggressive to see which produces a better result for your song.