AIVoiceSeparator

Free YouTube Vocal Remover — Isolate Vocals from Any YouTube Video

Paste any YouTube link. Our AI separates the vocals from the instrumental in about 6 minutes — studio-grade quality, no signup, no watermark, free for one song every day.

SDR 12.97 dB · ~3 dB better than Demucs
🎥 youtu.be · youtube.com · YouTube Music
🎚️ MP3 / WAV / FLAC output

Paste a YouTube URL and we'll do the rest

🔗 Open the YouTube vocal remover

Free 1 song/day · no signup · Patreon Pro = 20 songs/day

How to remove vocals from a YouTube video — 4 steps

  1. Copy the YouTube URL. Open the video in your browser or the YouTube app and copy the link.
  2. Open AIVoiceSeparator and switch to the "Paste YouTube / SoundCloud / TikTok link" tab.
  3. Paste and click "Separate audio". Our server downloads the audio with yt-dlp and queues it for the AI ensemble.
  4. Wait ~6 minutes, then download the isolated vocals.wav and instrumental.wav. You can also pick MP3 320 kbps or FLAC.
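
For readers curious what the download in step 3 might look like on a server, here is a minimal Python sketch that builds a yt-dlp command line for extracting a video's audio as WAV. The helper names, the `jobs` output directory, and the choice of WAV as the intermediate format are illustrative assumptions. The flags are standard yt-dlp options, but this is not a description of the service's actual invocation.

```python
import subprocess
from pathlib import Path


def build_ytdlp_command(url: str, out_dir: str = "jobs") -> list[str]:
    """Build a yt-dlp command that extracts one video's audio as WAV.

    Hypothetical helper: the output layout and audio format are assumptions.
    """
    output_template = str(Path(out_dir) / "%(id)s.%(ext)s")
    return [
        "yt-dlp",
        "--no-playlist",          # fetch only the linked video, not a whole playlist
        "-x",                     # extract audio only
        "--audio-format", "wav",  # convert to WAV (yt-dlp needs ffmpeg for this)
        "-o", output_template,    # name the file after the video id
        url,
    ]


def download_audio(url: str, out_dir: str = "jobs") -> None:
    """Run yt-dlp; raises CalledProcessError if the download fails."""
    subprocess.run(build_ytdlp_command(url, out_dir), check=True)
```

Keeping the command builder separate from the function that runs it makes the arguments easy to inspect without touching the network.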

The whole flow runs on a private GPU in Thailand — your audio is never sent to a third-party cloud, and every job is deleted automatically after 24 hours.

Why use AIVoiceSeparator for YouTube videos

🎚️ Studio-grade quality

A three-model ensemble (BS-Roformer, Mel-Band Roformer, and MDX23C) measured at SDR 12.97 dB, about 3 dB better than the open-source Demucs baseline.
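
SDR (signal-to-distortion ratio) compares the energy of the true stem with the energy of the error in the separated stem, in decibels. The sketch below uses the basic textbook definition. Benchmark tooling often uses refined variants, so treat it as an illustration of the metric rather than the exact procedure behind the figure quoted above. The function name is hypothetical.

```python
import math


def sdr_db(reference: list[float], estimate: list[float]) -> float:
    """Basic SDR in dB: 10 * log10(signal energy / error energy)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    if signal_energy == 0:
        raise ValueError("reference is silent; SDR is undefined")
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)
```

Because the scale is logarithmic, a 3 dB improvement corresponds to roughly half the error energy for the same reference signal.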

🔗 Direct YouTube link support

No need to download MP3s with sketchy third-party converters first. Just paste the URL — we handle yt-dlp on the server.

🎤 Lyrics transcription

Enable the "Generate lyrics" toggle to get SRT, LRC, and TXT subtitle files from the isolated vocal stem (Whisper-powered).

🥁 BPM and key detection

Every output ships with detected tempo and musical key — useful for remixing, DJing, and music production.

🔒 Privacy first

Inputs and outputs are auto-deleted after 24 h. We never train models on your audio, and there is no upload-to-share feature.

💸 Genuinely free

1 song every 24 hours, anonymous, at full Studio quality. No watermark, no email signup, no time-limit preview.

YouTube vocal remover comparison

Feature | AIVoiceSeparator | LALAL.AI | vocalremover.org
Quality (separation SDR) | 12.97 dB · 3-model ensemble | ~11 dB · Phoenix model | ~9 dB · single Spleeter model
YouTube link support | Yes (paste and go) | No (download first) | No (download first)
Free tier | 1 song/day, full quality | 10 min preview only | 1 song free, low quality
Output format | MP3 320 / WAV / FLAC | MP3 / WAV (paid) | MP3 only
Lyrics / subtitle export | SRT + LRC + TXT | No | No
Signup required | No | Yes (paid features) | No

Common uses for a YouTube vocal remover

Frequently asked questions

Is this YouTube vocal remover really free?

Yes. Anonymous users get 1 song per day at full Studio quality. Patreon Pro raises it to 20 songs per day and adds priority queueing.

How long does it take to process a 5-minute YouTube video?

About 5–6 minutes end-to-end — that includes downloading the audio via yt-dlp and running the three-model AI ensemble.

What YouTube URL formats do you accept?

Standard youtube.com/watch?v=…, short youtu.be/… links, YouTube Music URLs, and Shorts. Also SoundCloud, TikTok, Bandcamp, and Vimeo.
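
As an illustration of how these YouTube link shapes differ, here is a small Python sketch that pulls the video id out of each one. The function name and the exact set of accepted hosts are assumptions made for the example. The real service may accept more variants or validate differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts taken from the formats listed above (after stripping a leading "www.").
WATCH_HOSTS = {"youtube.com", "music.youtube.com"}


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video id for a recognised YouTube URL, else None."""
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]

    if host == "youtu.be":                       # short link: youtu.be/<id>
        return parts[0] if parts else None
    if host in WATCH_HOSTS:
        if parsed.path == "/watch":              # standard and YouTube Music
            ids = parse_qs(parsed.query).get("v")
            return ids[0] if ids else None
        if len(parts) == 2 and parts[0] == "shorts":  # Shorts: /shorts/<id>
            return parts[1]
    return None
```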

Are there length limits?

Yes. Each source can be at most 15 minutes long, and the downloaded audio can be at most 100 MB. Most full songs are well under both limits.
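
If you want to check a file against these limits before submitting it, the rule can be sketched in a few lines of Python. The constant and function names are hypothetical, and reading "100 MB" as decimal megabytes (100,000,000 bytes) is an assumption.

```python
MAX_DURATION_SECONDS = 15 * 60        # 15 minutes per source
MAX_FILE_BYTES = 100 * 1000 * 1000    # assumption: "100 MB" means decimal megabytes


def check_limits(duration_seconds: float, file_bytes: int) -> list[str]:
    """Return a list of limit violations; an empty list means the audio is accepted."""
    problems = []
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 15 minutes")
    if file_bytes > MAX_FILE_BYTES:
        problems.append("downloaded audio is larger than 100 MB")
    return problems
```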

Can I get the lyrics as a subtitle file?

Yes — toggle "Generate lyrics" before processing and we'll run Whisper on the isolated vocal stem. You get SRT (video subtitle), LRC (karaoke), and TXT (plain) files.
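
The two timed formats differ mainly in how they write timestamps: SRT uses numbered cues with `HH:MM:SS,mmm --> HH:MM:SS,mmm`, while LRC prefixes each line with `[mm:ss.xx]`. The Python sketch below shows the conversion, assuming the transcript is available as `(start_seconds, end_seconds, text)` tuples. That input shape and the function names are illustrative, not a description of how the service formats its files.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 00:01:23,500."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def lrc_timestamp(seconds: float) -> str:
    """Format seconds as an LRC tag with centiseconds, e.g. [01:23.50]."""
    total_cs = round(seconds * 100)
    minutes, rem = divmod(total_cs, 6000)
    secs, cs = divmod(rem, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render numbered SRT cues separated by blank lines."""
    cues = [
        f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        for i, (start, end, text) in enumerate(segments, start=1)
    ]
    return "\n".join(cues)


def to_lrc(segments: list[tuple[float, float, str]]) -> str:
    """Render one LRC line per segment; LRC only records the start time."""
    return "\n".join(f"{lrc_timestamp(start)}{text}" for start, _end, text in segments)
```

For example, a line starting at 83.5 seconds is written as `00:01:23,500` in SRT and `[01:23.50]` in LRC.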

Do you store my YouTube downloads?

No. Every job (input audio + the separated stems) is deleted automatically after 24 hours. We never use your audio for AI training and never share outputs between users.
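
A 24-hour retention rule like this is simple to express in code. The Python sketch below is purely illustrative: the names are hypothetical, and it assumes each job records a creation time in UTC.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # from the policy above


def is_expired(created_at: datetime, now: datetime) -> bool:
    """True once a job is at least 24 hours old and should be deleted."""
    return now - created_at >= RETENTION
```

A cleanup task would call `is_expired(job_created_at, datetime.now(timezone.utc))` for each stored job and delete those that return True.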

Is it legal to remove vocals from a YouTube video?

You are responsible for having the rights. Personal use such as karaoke practice or transcription is generally considered fair; redistribution or commercial use of someone else's track is not. See our terms of use.

Which AI models do you use?

A weighted ensemble of three state-of-the-art models: BS-Roformer (40%), Mel-Band Roformer (35%), and MDX23C InstVoc (25%). Outputs are EBU-R128 loudness-normalized so the stems sit naturally in any mix.
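
To make the weighting concrete, here is a minimal Python sketch of a weighted blend using the percentages above. It assumes the three models' vocal stems are time-aligned lists of samples and averages them sample by sample. That is one simple way to combine outputs, and the service may blend differently (for example in the spectral domain). The dictionary keys and function name are hypothetical.

```python
# Weights from the answer above; the key names are illustrative.
WEIGHTS = {"bs_roformer": 0.40, "mel_band_roformer": 0.35, "mdx23c_instvoc": 0.25}


def blend_stems(stems: dict[str, list[float]],
                weights: dict[str, float] = WEIGHTS) -> list[float]:
    """Weighted sample-wise average of time-aligned stems from several models."""
    if set(stems) != set(weights):
        raise ValueError("stems and weights must cover the same models")
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same number of samples")
    n = lengths.pop()
    total = sum(weights.values())  # normalise in case weights do not sum to 1
    return [
        sum(weights[name] * stems[name][i] for name in stems) / total
        for i in range(n)
    ]
```

Giving the largest weight to one model lets it dominate the result while the others smooth out its individual errors.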

Related free tools

Ready to try it on your YouTube link?

🔗 Open the YouTube vocal remover

Free, no signup, no watermark — 1 song every 24 hours