AIVoiceSeparator

Vocal / Instrumental AI Separation

SOTA model — BS-Roformer + ensemble · fast · free to start

SDR 12.97 dB · ~3 dB better than Demucs · 🔒 No cloud · GPU in Thailand · 🎯 hi-freq retention on par with vocalremover.org
🎚️ Studio quality
3-model ensemble (BS-Roformer + Mel-Roformer + MDX23C) + EBU-R128 loudnorm — ~5-6 min / song
⚡ Standard

How it works

  1. Upload: drag in an MP3 / WAV / M4A file (up to 100 MB, up to 15 minutes)
  2. Wait for AI: the Studio pipeline (BS-Roformer + Mel-Roformer + MDX23C ensemble) takes ~5-6 minutes
  3. Download: you get vocals.wav and instrumental.wav (karaoke) as separate files

Free 1 song/day · join Patreon → Pro 20/day · audio stays in Thailand · local GPU

Frequently asked questions

Is AIVoiceSeparator really free?

Yes. Anonymous users get 1 song per day at full Studio quality (3-model AI ensemble + loudnorm). Patreon Pro raises the limit to 20/day.
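
For readers curious how a per-day limit like this can be enforced, here is a minimal in-memory sketch. The limits (1 and 20) come from the answer above; the tier names, the `DailyQuota` class, and the idea of keying on a user string are assumptions made for illustration, not a description of the actual service.

```python
from collections import defaultdict
from datetime import date

# Daily limits taken from the text; the tier names are hypothetical labels.
DAILY_LIMITS = {"anonymous": 1, "pro": 20}


class DailyQuota:
    """In-memory per-user, per-day job counter (illustrative only)."""

    def __init__(self):
        # Maps (user_key, day) to the number of jobs already started.
        self._used = defaultdict(int)

    def try_consume(self, user_key: str, tier: str, today: date) -> bool:
        """Record one job and return True if the user is still under the limit."""
        key = (user_key, today)
        if self._used[key] >= DAILY_LIMITS[tier]:
            return False
        self._used[key] += 1
        return True
```

Because the counter is keyed on the calendar day, the allowance resets automatically when the date changes. A real deployment would need persistent storage and a reliable way to identify anonymous users, neither of which is shown here.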

How does AIVoiceSeparator compare to LALAL.AI or vocalremover.org?

Our 3-model ensemble (BS-Roformer + Mel-Roformer + MDX23C) measures SDR 12.97 dB — roughly 3 dB better than the open Demucs baseline. Output is EBU-R128 loudness-normalized so the stems sit naturally in any mix. Audio is processed on a private GPU in Thailand — never sent to a third-party cloud.
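
To make the SDR figure concrete: signal-to-distortion ratio compares the energy of a reference stem with the energy of the error in the separated stem, on a decibel scale. The sketch below uses the plain energy-ratio definition, 10·log10(‖s‖² / ‖s − ŝ‖²). The answer above does not say which SDR variant or test set the 12.97 dB figure refers to, so treat this as an illustration of the metric rather than the exact evaluation procedure.

```python
import math


def sdr_db(reference, estimate, eps=1e-10):
    """Signal-to-distortion ratio in dB between two equal-length sample sequences."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the clean reference signal.
    signal = sum(r * r for r in reference)
    # Energy of the difference between reference and estimate.
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps guards against division by zero and log of zero.
    return 10.0 * math.log10((signal + eps) / (error + eps))
```

On this scale a gap of about 3 dB corresponds to roughly half the error energy, since 10^(3/10) ≈ 2.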

Do you store my uploaded songs?

Every job (input + outputs) is automatically deleted after 24 hours. We never use your audio to train AI models, and we never share results between users — access is gated by an opaque job_id.
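
A minimal sketch of the two ideas in this answer, an unguessable job identifier and a 24-hour retention check, using only the Python standard library. The function names and the 32-byte token length are assumptions chosen for illustration; the answer above does not describe how job_id values are actually generated.

```python
import secrets
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # retention window stated in the text


def new_job_id() -> str:
    """Return an unguessable, URL-safe identifier built from 32 random bytes."""
    return secrets.token_urlsafe(32)


def is_expired(created_at: datetime, now: datetime) -> bool:
    """True once a job is at least 24 hours old and is due for deletion."""
    return now - created_at >= RETENTION
```

A periodic cleanup task could call `is_expired(job_created_at, datetime.now(timezone.utc))` for each stored job and delete the input and outputs of those that return True.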

What audio formats do you support?

Input: MP3, WAV, M4A, FLAC, OGG, WebM, Opus. Max 100 MB, max 15 minutes. Output: MP3 320 kbps (default), WAV, or FLAC (lossless).
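
The input limits above can be expressed as a small pre-upload check. This is a sketch under stated assumptions: it judges the format by file extension only, reads "100 MB" as 100 MiB, and expects the caller to supply the duration (obtained from some audio decoder, not shown). The function name `validate_upload` is hypothetical.

```python
from pathlib import Path

# Formats and limits taken from the text above.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".opus"}
MAX_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" interpreted as 100 MiB
MAX_SECONDS = 15 * 60


def validate_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems with the upload; an empty list means it is acceptable."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 100 MB")
    if duration_seconds > MAX_SECONDS:
        problems.append("audio longer than 15 minutes")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a page show the user everything that needs fixing in one pass.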

Can I separate a song directly from a YouTube link?

Yes. Paste a YouTube, SoundCloud, TikTok, Bandcamp, or Vimeo URL and the server downloads the audio for you. You are responsible for having the rights to the content you process.

Can I get lyrics / subtitles from the song?

Yes. Enable the 'Generate lyrics' toggle before processing. We run Whisper on the isolated vocal stem and return SRT (subtitle), LRC (karaoke), and TXT (plain) files. This adds ~30 seconds to the run.
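
For reference, SRT and LRC differ mainly in how they write timestamps: SRT uses numbered cues with `HH:MM:SS,mmm --> HH:MM:SS,mmm` ranges, while LRC prefixes each line with a `[mm:ss.xx]` start time. The sketch below converts timed segments into both formats. It assumes segments arrive as `(start_seconds, end_seconds, text)` tuples, which is a hypothetical shape chosen for the example; the transcription step itself is not shown.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT style)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def lrc_timestamp(seconds: float) -> str:
    """Format seconds as [mm:ss.xx] (LRC style, centisecond precision)."""
    cs = round(seconds * 100)
    m, cs = divmod(cs, 6000)
    s, cs = divmod(cs, 100)
    return f"[{m:02d}:{s:02d}.{cs:02d}]"


def to_srt(segments) -> str:
    """Build numbered SRT cues, separated by blank lines."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


def to_lrc(segments) -> str:
    """Build one LRC line per segment; LRC keeps only the start time."""
    return "\n".join(
        f"{lrc_timestamp(start)}{text.strip()}" for start, _end, text in segments
    )
```

The plain TXT output would simply be the segment texts joined by newlines, with no timing information.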