Artificial Intelligence applied to music generation

Just as AI tools such as ChatGPT have emerged for generating natural-language text in a chat format, and tools such as Stable Diffusion, DALL-E 3, and Midjourney for generating images, similar tools have been developed for audio, voice, and music production.

For AI-generated tracks and music, the following technologies emerged around the beginning of 2023 and in the months that followed:

* MusicLM
* MusicGen
* Riffusion
* Mousai
* JEN-1
* Stable Audio

Music generation has attracted increasing interest with the advancement of deep generative models. However, generating music conditioned on textual descriptions, known as text-to-music, remains challenging because of the complexity of musical structures and the high sampling rates that audio requires. Despite the importance of the task, prevailing generative models are limited in music quality, computational efficiency, and generalization.

The most notable recent ones are:

JEN-1

JEN-1 stands out for being a diffusion model that combines autoregressive and non-autoregressive training. Through in-context learning, it performs various generation tasks, including text-guided music generation, music inpainting, and continuation. For now the model exists only as a research project, and it is not yet possible to test it.

Stable Audio

Stable Audio is a model developed by Stability AI and is the company's first product for generating music and sound effects. It can be tried for free by creating an account; the free plan allows 20 tracks per month, each 45 seconds long, for non-commercial projects. The professional plan costs $11.99 per month and allows 500 tracks per month, each 90 seconds long. There is also a third plan for businesses, with custom pricing.
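To put those plan figures in perspective, here is a small sketch that turns them into minutes of audio per month and cost per track. The numbers are the ones quoted above and may have changed since; the `Plan` class and its field names are made up for this illustration and are not part of any Stable Audio API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A subscription plan as described in the article (hypothetical helper)."""
    name: str
    monthly_usd: float
    tracks_per_month: int
    seconds_per_track: int  # treated here as the length of every track

    @property
    def audio_minutes(self) -> float:
        # Total audio per month if every track uses the full duration.
        return self.tracks_per_month * self.seconds_per_track / 60

    @property
    def usd_per_track(self) -> float:
        return self.monthly_usd / self.tracks_per_month

    @property
    def usd_per_minute(self) -> float:
        return self.monthly_usd / self.audio_minutes


# Figures taken from the article; check the current pricing page before relying on them.
FREE = Plan("Free", 0.0, 20, 45)
PRO = Plan("Professional", 11.99, 500, 90)

for plan in (FREE, PRO):
    print(
        f"{plan.name}: {plan.audio_minutes:.0f} min/month, "
        f"${plan.usd_per_track:.4f} per track, "
        f"${plan.usd_per_minute:.4f} per minute"
    )
```

Under these assumptions, the free plan yields 15 minutes of audio per month, while the professional plan yields 750 minutes at roughly 2.4 cents per track.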

Conclusion

These technologies are still under development. For now, using them in a production environment requires the guidance of an audiovisual production professional, who can judge whether the output of these models is good enough and whether a monthly subscription is worth paying for. There are also open-source alternatives that can be run in local data centers or on workstations without a subscription, although they require a certain amount of computing capacity.
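As an illustration of the open-source route, the sketch below generates a short clip locally with MusicGen, one of the models listed earlier, through the Hugging Face `transformers` library. It assumes that `transformers`, `torch`, and `scipy` are installed and that the public `facebook/musicgen-small` checkpoint can be downloaded. The function names `tokens_for_seconds` and `generate_clip` are invented for this example, and this is only one possible way to run such a model, not a recommended setup.

```python
import math

# Assumption: the MusicGen audio codec produces about 50 token frames per
# second of audio; the real value is read from the model config below.
DEFAULT_FRAME_RATE = 50


def tokens_for_seconds(seconds: float, frame_rate: int = DEFAULT_FRAME_RATE) -> int:
    """Convert a clip length in seconds into a number of tokens to generate."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return math.ceil(seconds * frame_rate)


def generate_clip(
    prompt: str,
    seconds: float = 5.0,
    out_path: str = "clip.wav",
    checkpoint: str = "facebook/musicgen-small",
) -> str:
    """Generate a clip from a text prompt and save it as a WAV file."""
    # Heavy imports stay inside the function so the helper above can be used
    # without loading the model libraries.
    import scipy.io.wavfile
    from transformers import AutoProcessor, MusicgenForConditionalGeneration

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = MusicgenForConditionalGeneration.from_pretrained(checkpoint)

    inputs = processor(text=[prompt], padding=True, return_tensors="pt")
    frame_rate = model.config.audio_encoder.frame_rate

    # Longer clips need more tokens and therefore more compute; clips beyond
    # roughly 30 seconds are outside what these checkpoints are meant for.
    audio = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=tokens_for_seconds(seconds, frame_rate),
    )

    sampling_rate = model.config.audio_encoder.sampling_rate
    # Output shape is (batch, channels, samples); save the first channel.
    scipy.io.wavfile.write(out_path, rate=sampling_rate, data=audio[0, 0].cpu().numpy())
    return out_path
```

Calling `generate_clip("lo-fi hip hop beat with warm piano", seconds=5)` would download the checkpoint on first use and write `clip.wav`. On a machine without a GPU this can be slow, which is the computing requirement mentioned above.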

Companies like Adobe are also integrating AI capabilities into programs such as Photoshop and Premiere; Google is doing the same on YouTube, and Meta on Facebook and Instagram.

In the coming months and years, we will see how these tools progress and improve as new hardware and software allow them to run faster and more efficiently.
