Vocal / Instrumental AI Separation
State-of-the-art BS-Roformer ensemble · fast · free to start
How it works
- Upload: Drag in an audio file such as MP3, WAV, or M4A (up to 100 MB, up to 15 minutes)
- Wait for AI: The Studio pipeline (BS-Roformer + Mel-Roformer + MDX23C ensemble) takes about 5 to 6 minutes
- Download: You get vocals.wav and instrumental.wav (karaoke) as separate files
Free 1 song/day · join Patreon → Pro 20/day · audio stays in Thailand · local GPU
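For readers curious what "ensemble" means in practice, here is a minimal sketch of one common way to combine several models' estimates of the same stem: a weighted, sample-by-sample average. How the Studio pipeline actually blends its three models is not stated on this page, so the function name, the plain-list audio representation, and the weights are all illustrative assumptions.

```python
def ensemble_average(estimates, weights=None):
    """Weighted sample-wise average of several equal-length stem estimates.

    `estimates` is a list of sample lists, one per model (hypothetical
    representation; a real pipeline would likely use arrays or tensors).
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    length = len(estimates[0])
    if any(len(e) != length for e in estimates):
        raise ValueError("all estimates must have the same length")
    if weights is None:
        weights = [1.0] * len(estimates)  # equal weighting by default
    if len(weights) != len(estimates):
        raise ValueError("need exactly one weight per estimate")
    total = sum(weights)
    return [
        sum(w * e[i] for w, e in zip(weights, estimates)) / total
        for i in range(length)
    ]


# Three toy "vocal" estimates of a two-sample signal
vocals = ensemble_average([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(vocals)  # [3.0, 4.0]
```

Averaging tends to cancel errors that the models do not share, which is why an ensemble can score higher than any single model.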
Frequently asked questions
Is AIVoiceSeparator really free?
Yes. Anonymous users get 1 song per day at full Studio quality (3-model AI ensemble + loudness normalization). Patreon Pro raises the limit to 20/day.
How does AIVoiceSeparator compare to LALAL.AI or vocalremover.org?
Our 3-model ensemble (BS-Roformer + Mel-Roformer + MDX23C) measures 12.97 dB SDR (signal-to-distortion ratio), roughly 3 dB better than the open Demucs baseline. Output is loudness-normalized to EBU R 128 so the stems sit naturally in any mix. Audio is processed on a private GPU in Thailand and is never sent to a third-party cloud.
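To make the SDR figure concrete, the sketch below computes the plain textbook signal-to-distortion ratio between a reference stem and an estimate. It is only an illustration: published benchmark numbers are usually computed with a specific evaluation protocol (for example, per-chunk statistics), and this page does not say which one was used for the 12.97 dB figure.

```python
import math


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB: 10 * log10(signal power / error power)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_power == 0:
        return math.inf   # perfect reconstruction
    if signal_power == 0:
        return -math.inf  # silent reference, non-silent estimate
    return 10.0 * math.log10(signal_power / error_power)


# Toy example: one second of a 440 Hz tone, estimated with a 10% amplitude error
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 2))  # 20.0
```

Because the scale is logarithmic, a 3 dB improvement corresponds to roughly half the residual error power for the same signal.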
Do you store my uploaded songs?
Every job (input + outputs) is automatically deleted after 24 hours. We never use your audio to train AI models, and we never share results between users — access is gated by an opaque job_id.
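As a rough illustration of the two ideas in this answer, an unguessable job identifier and a 24-hour retention window, here is a minimal sketch using only the Python standard library. The function names and the ID length are assumptions made for the example; the service's real scheme is not documented here.

```python
import secrets
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # from the FAQ: jobs are deleted after 24 hours


def new_job_id() -> str:
    """Return a random, URL-safe identifier (hypothetical scheme)."""
    # 16 random bytes encode to 22 URL-safe characters
    return secrets.token_urlsafe(16)


def is_expired(created_at: datetime, now: datetime) -> bool:
    """True once a job has reached the retention window and should be deleted."""
    return now - created_at >= RETENTION


created = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_expired(created, created + timedelta(hours=23)))  # False
print(is_expired(created, created + timedelta(hours=24)))  # True
```

An identifier like this carries no information about the user or the song, so a result can be reached only by someone who already holds the ID.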
What audio formats do you support?
Input: MP3, WAV, M4A, FLAC, OGG, WebM, Opus. Max 100 MB, max 15 minutes. Output: MP3 320 kbps (default), WAV, or FLAC (lossless).
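If you want to check a file before uploading, the limits above can be expressed as a small pre-flight check. This is a sketch under stated assumptions: the function name is made up, "100 MB" is interpreted as 100 MiB, and the duration is supplied by the caller rather than read from the file.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".opus"}
MAX_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" means 100 MiB
MAX_SECONDS = 15 * 60          # 15 minutes


def validate_upload(filename, size_bytes, duration_seconds):
    """Return a list of problems; an empty list means the file fits the limits."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 100 MB")
    if duration_seconds > MAX_SECONDS:
        problems.append("audio longer than 15 minutes")
    return problems


print(validate_upload("song.MP3", 5_000_000, 200))  # []
print(validate_upload("clip.aiff", 200 * 1024 * 1024, 1000))
```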
Can I separate a song directly from a YouTube link?
Yes. Paste a YouTube, SoundCloud, TikTok, Bandcamp, or Vimeo URL and the server downloads the audio for you. You are responsible for having the rights to the content you process.
Can I get lyrics / subtitles from the song?
Yes. Enable the 'Generate lyrics' toggle before processing. We run Whisper on the isolated vocal stem and return SRT (subtitle), LRC (karaoke), and TXT (plain) files. This adds about 30 seconds to the run.
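The SRT and LRC files differ mainly in how they write timestamps. The sketch below shows both formats, assuming the transcription is available as (start, end, text) tuples in seconds. That tuple layout and the function names are assumptions for illustration, not the service's actual output code.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT style)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def lrc_timestamp(seconds: float) -> str:
    """Format seconds as [mm:ss.xx] (LRC style, hundredths of a second)."""
    cs = round(seconds * 100)
    minutes, cs = divmod(cs, 6000)
    secs, cs = divmod(cs, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def to_srt(segments):
    """Numbered cues with a start --> end line, separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


def to_lrc(segments):
    """One line per segment, prefixed with its start time."""
    return "\n".join(f"{lrc_timestamp(start)}{text}" for start, _end, text in segments)


segments = [(0.0, 2.5, "Hello"), (83.5, 86.0, "world")]
print(to_srt(segments))
print(to_lrc(segments))
```

SRT carries both start and end times per cue, while LRC records only when each line begins, which suits karaoke players that highlight one line at a time.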