Which YouTube URLs work?

Public youtube.com, youtu.be, and YouTube Music links work. We also support SoundCloud, TikTok, Bandcamp, and Vimeo. Private or age-restricted videos may fail.

How long does it take to process a YouTube video?

About 5 to 6 minutes for a typical 5-minute song, including the audio download, on our RTX 5060 GPU in Thailand.

What is the maximum YouTube video length?

15 minutes. The downloader also enforces a 100 MB cap on the source audio file.

Do you keep my downloaded files?

Every job is deleted after 24 hours. We never use your audio to train AI models.

Free YouTube Vocal Remover — Isolate Vocals from Any YouTube Video

Paste any YouTube link. Our AI separates the vocals from the instrumental in about 6 minutes, with studio-grade quality, no signup, no watermark, and 3 free songs every month.

⚡ SDR 12.97 dB · ~3 dB better than Demucs
🎥 youtu.be · youtube.com · YouTube Music
🎚️ MP3 / WAV / FLAC output

Paste a YouTube URL and we'll do the rest

🔗 Open the YouTube vocal remover

Free 3 songs/month · no signup · Patreon Pro = 2 songs/day

How to remove vocals from a YouTube video — 4 steps

1. Copy the YouTube URL. Open the video in your browser or the YouTube app and copy the link.
2. Open AIVoiceSeparator and switch to the "Paste YouTube / SoundCloud / TikTok link" tab.
3. Paste and click "Separate audio". Our server downloads the audio with yt-dlp and queues it for the AI ensemble.
4. Wait ~6 minutes, then download the isolated vocals.wav and instrumental.wav. You can also pick MP3 320 kbps or FLAC.
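
For the curious, the download step can be pictured roughly like this. It is an illustrative sketch only, and the actual server-side command may differ: the function name `build_ytdlp_command`, the `jobs` output folder, and the choice of WAV as the working format are assumptions, while the 15-minute and 100 MB limits are the ones stated in the FAQ. The code only builds the command line and does not run it.

```python
import shlex

MAX_DURATION_SECONDS = 15 * 60   # 15-minute limit stated in the FAQ
MAX_FILESIZE = "100M"            # 100 MB cap stated in the FAQ


def build_ytdlp_command(url: str, out_dir: str = "jobs") -> list[str]:
    """Return an argv list for a hypothetical audio-only yt-dlp download."""
    return [
        "yt-dlp",
        "--no-playlist",                  # fetch one video, not a whole playlist
        "--extract-audio",                # keep only the audio track (needs ffmpeg)
        "--audio-format", "wav",          # assumed working format for separation
        "--max-filesize", MAX_FILESIZE,   # abort if the download exceeds the size cap
        "--match-filters", f"duration<={MAX_DURATION_SECONDS}",  # skip long videos
        "--output", f"{out_dir}/%(id)s.%(ext)s",
        url,
    ]


if __name__ == "__main__":
    # VIDEO_ID is a placeholder, not a real video.
    cmd = build_ytdlp_command("https://youtu.be/VIDEO_ID")
    print(shlex.join(cmd))
```

Passing the arguments as a list (rather than one shell string) keeps a pasted URL from being interpreted by the shell.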

The whole flow runs on a private GPU in Thailand — your audio is never sent to a third-party cloud, and every job is deleted automatically after 24 hours.

Why use AIVoiceSeparator for YouTube videos

🎚️ Studio-grade quality

A three-model ensemble (BS-Roformer + Mel-Band Roformer + MDX23C) measured at SDR 12.97 dB, about 3 dB better than the open-source Demucs baseline.
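
To put the SDR figure in context, here is a small sketch of the basic signal-to-distortion ratio and of what a 3 dB difference means in linear terms. It assumes the simplest textbook definition (signal power over error power); published benchmarks typically use more elaborate evaluation toolkits, and the toy sample values below are made up for illustration.

```python
import math


def sdr_db(reference: list[float], estimate: list[float]) -> float:
    """Signal-to-distortion ratio in dB: 10 * log10(signal power / error power)."""
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10(signal_power / error_power)


def power_ratio(db_gain: float) -> float:
    """Convert a dB difference into a linear power ratio."""
    return 10 ** (db_gain / 10)


if __name__ == "__main__":
    # Toy samples only; real audio has tens of thousands of samples per second.
    reference = [0.5, -0.25, 0.75, -1.0]
    estimate = [0.45, -0.2, 0.7, -0.9]
    print(f"toy SDR: {sdr_db(reference, estimate):.2f} dB")
    print(f"3 dB corresponds to a power ratio of about {power_ratio(3):.2f}")
```

Under this definition, a 3 dB improvement means the leftover error (bleed and artifacts) carries roughly half the power.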

🔗 Direct YouTube link support

No need to download MP3s with sketchy third-party converters first. Just paste the URL and we run yt-dlp on the server for you.

🎤 Lyrics transcription

Enable the "Generate lyrics" toggle to get SRT, LRC, and TXT subtitle files from the isolated vocal stem (Whisper-powered).

🥁 BPM and key detection

Every output ships with detected tempo and musical key — useful for remixing, DJing, and music production.

🔒 Privacy first

Inputs and outputs are auto-deleted after 24 h. We never train models on your audio, and there is no upload-to-share feature.

💸 Genuinely free

3 songs a month, anonymous, at full Studio quality. No watermark, no email signup, no time-limit preview.

YouTube vocal remover comparison

Feature	AIVoiceSeparator	LALAL.AI	vocalremover.org
Quality (separation SDR)	12.97 dB · 3-model ensemble	~11 dB · Phoenix model	~9 dB · single Spleeter model
YouTube link support	Yes — paste and go	No (download first)	No (download first)
Free tier	3 songs/month, full quality	10 min preview only	1 song free, low quality
Output format	MP3 320 / WAV / FLAC	MP3 / WAV (paid)	MP3 only
Lyrics / subtitle export	SRT + LRC + TXT	No	No
Signup required	No	Yes (paid features)	No

Common uses for a YouTube vocal remover

Karaoke practice. Strip the vocal track from your favorite song and sing along to the instrumental.
Cover versions. Use the instrumental as a backing track for your own vocal recording or AI voice cover.
Remixing and sampling. Extract clean acapellas to build a remix, mashup, or beat from existing material you own the rights to.
Music transcription. Isolating the vocals makes it much easier to transcribe lyrics, harmonies, and melody lines.
DJ stem-mixing. Use vocals.wav and instrumental.wav as live-mixable stems in Serato, Rekordbox, or VirtualDJ.
Language learning. Listen to the vocal track in isolation to catch every word, then cross-check against the auto-generated lyrics.

Frequently asked questions

Is this YouTube vocal remover really free?

Yes. Anonymous users get 3 songs per month at full Studio quality. Patreon Pro raises it to 2 songs per day and adds priority queueing.

How long does it take to process a 5-minute YouTube video?

About 5–6 minutes end-to-end — that includes downloading the audio via yt-dlp and running the three-model AI ensemble.

What YouTube URL formats do you accept?

Standard youtube.com/watch?v=…, short youtu.be/… links, YouTube Music URLs, and Shorts. Also SoundCloud, TikTok, Bandcamp, and Vimeo.
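
If you want to pre-check a link before pasting it, a host allow-list along these lines would cover the formats listed above. This is a minimal sketch, not the site's actual validation: the set `SUPPORTED_HOSTS` and the function `is_supported_url` are hypothetical, and a real check would also have to deal with private or age-restricted videos.

```python
from urllib.parse import urlparse

# Hosts taken from the answer above; the exact allow-list is an assumption.
SUPPORTED_HOSTS = {
    "youtube.com", "youtu.be", "music.youtube.com",
    "soundcloud.com", "tiktok.com", "bandcamp.com", "vimeo.com",
}


def is_supported_url(url: str) -> bool:
    """Return True if the URL is http(s) and its host is on the allow-list."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname  # urlparse already lowercases the hostname
    if host.startswith("www."):
        host = host[4:]
    # Accept exact matches and subdomains such as m.youtube.com or artist.bandcamp.com.
    return any(host == h or host.endswith("." + h) for h in SUPPORTED_HOSTS)


if __name__ == "__main__":
    for candidate in (
        "https://youtu.be/VIDEO_ID",
        "https://www.youtube.com/shorts/VIDEO_ID",
        "https://example.com/watch?v=VIDEO_ID",
    ):
        print(candidate, is_supported_url(candidate))
```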

Are there length limits?

15 minutes maximum per source, 100 MB maximum file size after the audio is downloaded. Most full songs are well under both limits.

Can I get the lyrics as a subtitle file?

Yes — toggle "Generate lyrics" before processing and we'll run Whisper on the isolated vocal stem. You get SRT (video subtitle), LRC (karaoke), and TXT (plain) files.
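
The main difference between the SRT and LRC outputs is the timestamp layout. The sketch below shows both, assuming transcription yields `(start_seconds, end_seconds, text)` tuples; that tuple shape and the helper names are illustrative assumptions, and no Whisper call is made here.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def lrc_timestamp(seconds: float) -> str:
    """Format seconds as an LRC tag: [mm:ss.xx] (hundredths of a second)."""
    total_cs = round(seconds * 100)
    minutes, rest = divmod(total_cs, 6000)
    secs, cs = divmod(rest, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def to_srt(lines) -> str:
    """Build SRT text: numbered cues with a start --> end time range."""
    blocks = []
    for index, (start, end, text) in enumerate(lines, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


def to_lrc(lines) -> str:
    """Build LRC text: one line per lyric, tagged with its start time only."""
    return "\n".join(f"{lrc_timestamp(start)}{text}" for start, _end, text in lines)


if __name__ == "__main__":
    # Made-up lyric lines for illustration.
    demo = [(0.0, 2.5, "First line"), (75.5, 78.0, "Second line")]
    print(to_srt(demo))
    print(to_lrc(demo))
```

SRT carries both a start and an end time per cue, which suits video players; LRC keeps only the start time, which is what most karaoke and music players expect.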

Do you store my YouTube downloads?

Only temporarily. Every job (input audio + the separated stems) is deleted automatically after 24 hours. We never use your audio for AI training and never share outputs between users.

Is it legal to remove vocals from a YouTube video?

You are responsible for having the rights. Personal use such as karaoke practice or transcription is generally considered fair; redistribution or commercial use of someone else's track is not. See our terms of use.

Which AI models do you use?

A weighted ensemble of three state-of-the-art models: BS-Roformer (40%), Mel-Band Roformer (35%), and MDX23C InstVoc (25%). Outputs are loudness-normalized to EBU R 128 so the stems sit naturally in any mix.
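
As a rough picture of what a weighted ensemble means, the sketch below blends three vocal estimates sample by sample using the percentages quoted above. It assumes the blend is a plain weighted average of time-domain waveforms, which may not match how the production pipeline combines the models; the dictionary keys and the function `blend_stems` are hypothetical names.

```python
# Weights from the answer above; the key names are illustrative.
WEIGHTS = {"bs_roformer": 0.40, "mel_band_roformer": 0.35, "mdx23c_instvoc": 0.25}


def blend_stems(stems: dict[str, list[float]],
                weights: dict[str, float] = WEIGHTS) -> list[float]:
    """Weighted sample-by-sample average of equal-length vocal estimates."""
    if set(stems) != set(weights):
        raise ValueError("stems and weights must cover the same models")
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same number of samples")
    total = sum(weights.values())  # normalize in case weights do not sum to 1
    n = lengths.pop()
    return [
        sum(weights[name] * stems[name][i] for name in stems) / total
        for i in range(n)
    ]


if __name__ == "__main__":
    # Three made-up two-sample "waveforms", one per model.
    demo = {
        "bs_roformer": [1.0, 0.2],
        "mel_band_roformer": [0.0, 0.2],
        "mdx23c_instvoc": [0.0, 0.2],
    }
    print(blend_stems(demo))
```

Where the three models agree, the blend leaves the sample unchanged; where they disagree, the higher-weighted model pulls the result toward its own estimate.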

Ready to try it on your YouTube link?

🔗 Open the YouTube vocal remover

Free, no signup, no watermark — 3 songs a month