Deep Learning Lecture

Generative Music/Song

11 Sep 2025 • Richard Kuo

An introduction to text-to-music and text-to-song generation.

Text to Music

MusicLM

Paper: MusicLM: Generating Music From Text
Code: https://github.com/lucidrains/musiclm-pytorch
Code: https://github.com/zhvng/open-musiclm

AudioCraft

Paper: Simple and Controllable Music Generation
Code: https://github.com/facebookresearch/audiocraft

LLM2Vec

Paper: LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

YuE

Paper: YuE: Scaling Open Foundation Models for Long-Form Music Generation

Instruct-MusicGen

Paper: Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Code: https://github.com/ldzhangyx/instruct-MusicGen
Demo: https://bit.ly/instruct-musicgen

Stable Audio Open

Paper: Stable Audio Open
Code: https://github.com/Stability-AI/stable-audio-tools

Seed-Music

Paper: Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Text to Song

SongCreator

Paper: SongCreator: Lyrics-based Universal Song Generation

Dual-sequence language model (DSLM)

The DSLM switches between song generation tasks by applying a task-specific attention mask strategy.
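
To make the idea of task-specific masking concrete, here is a minimal sketch in plain Python. It is not SongCreator's implementation: the task names ("joint", "conditioned", "independent"), the helper functions, and the choice of three mask patterns are all assumptions made for illustration. It only shows how swapping the mask that governs how one sequence attends to the other can change what a two-sequence model is allowed to see.

```python
from typing import Callable, Dict, List

# mask[i][j] is True when query position i may attend to key position j.
Mask = List[List[bool]]


def causal_mask(n: int) -> Mask:
    """Query i sees keys 0..i only (standard autoregressive decoding)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def full_mask(n_query: int, n_key: int) -> Mask:
    """Every query sees every key (the other sequence is given in full)."""
    return [[True] * n_key for _ in range(n_query)]


def blocked_mask(n_query: int, n_key: int) -> Mask:
    """No query sees any key (the other sequence is ignored)."""
    return [[False] * n_key for _ in range(n_query)]


# Hypothetical task table: how one sequence attends to the OTHER sequence.
# These names are illustrative assumptions, not taken from the paper.
CROSS_MASKS: Dict[str, Callable[[int], Mask]] = {
    "joint": lambda n: causal_mask(n),            # both sequences generated together
    "conditioned": lambda n: full_mask(n, n),     # other sequence fully available
    "independent": lambda n: blocked_mask(n, n),  # other sequence unused
}


def cross_mask_for(task: str, n: int) -> Mask:
    """Return the cross-sequence mask for a task and a sequence length n."""
    return CROSS_MASKS[task](n)


if __name__ == "__main__":
    for task in CROSS_MASKS:
        print(task)
        for row in cross_mask_for(task, 4):
            print("  " + " ".join("1" if allowed else "0" for allowed in row))
```

In a real model the boolean mask would be turned into additive attention biases (a large negative value where the entry is False) before the softmax; the model weights stay the same and only the mask changes per task.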

SongGen

Paper: SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
Code: https://github.com/LiuZH-19/SongGen

DiffRhythm

Paper: DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion
Code: https://github.com/ASLP-lab/DiffRhythm

DiffRhythm+

Paper: DiffRhythm+: Controllable and Flexible Full-Length Song Generation with Preference Optimization

ComfyUI DiffRhythm

JAM

Paper: JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
Code: https://github.com/declare-lab/jamify

ACE-Step

Paper: ACE-Step: A Step Towards Music Generation Foundation Model
Code: https://github.com/ace-step/ACE-Step

ComfyUI: https://github.com/billwuhao/ComfyUI_ACE-Step

This site was last updated October 02, 2025.