
How I made an AI Song using my own Voice (in less than 24 hours)

Aug 30, 2024

#SunoAI #KitsAI #ClaudeAI

I recently participated in an org-wide competition to create a 60-second YouTube Shorts video explaining why I'm the best candidate to attend Made on YouTube, where YouTube releases its newest products and features to the world. My first thought was to create a music video with the help of AI. Since I had less than 24 hours to complete it (I learned about the contest late), I had to be very efficient in my content creation, so I followed this 3-step process:

Step 1: Generate Lyrics

I first asked Claude AI for help to generate lyrics using a prompt like this:

"I want to make a song using Suno. This song is for [purpose] and it needs to contain [specific information] in the lyrics. What should the lyrics be and what style of song would you recommend?"

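If you plan to reuse this prompt, a small helper can fill in the two blanks. This is only a minimal sketch: the function name and the example values are made up for illustration, and it just builds the text to paste into a chat with Claude.

```python
# Prompt wording taken from the post; the two placeholders are the bracketed blanks.
PROMPT_TEMPLATE = (
    "I want to make a song using Suno. This song is for {purpose} and it "
    "needs to contain {details} in the lyrics. What should the lyrics be "
    "and what style of song would you recommend?"
)


def build_lyrics_prompt(purpose: str, details: str) -> str:
    """Fill the two placeholders in the prompt template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(purpose=purpose.strip(), details=details.strip())


# Example values are illustrative only.
print(build_lyrics_prompt(
    purpose="a 60-second contest video",
    details="my name and my data science work",
))
```

Keeping the wording in one template makes it easy to change a single blank between iterations without retyping the rest.
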
After a few iterations, I was pretty happy with this version:

Style: Upbeat pop with electronic elements

Verse 1:
Data flows like streams, I'm decoding dreams
YouTube's my scene, where creators gleam
Crunching numbers, finding trends
Helping channels reach new ends

Chorus:
I'm Ying Wang, data's my game
YouTube Data Science, that's my fame
Send me to NYC, I'll represent with pride
Made On YouTube, let's take this ride

Verse 2:
Creator by night, analyst by day
I make videos too, I know the way
From views to likes, I analyze it all
Turning insights into a creator's call

(Repeat Chorus)

Bridge:
Mingling with creators, that's where I shine
Sharing stories, yours and mine
Let me show the world what our platform can do
With numbers and passion, I'll see it through

(Repeat Chorus)

Outro:
YouTube's the stage, data sets the pace
Send me to NYC, I'll make our case!

Step 2: Create a Song

After finalizing the lyrics, the next step was to turn them into a song. My favorite tool is Suno, which I've used many times to create children's songs with my 3-year-old son ([video link]). I recently heard that it had added a powerful feature called audio inputs, which lets you upload your own voice, so I was excited to try it out.

First, without using my own voice, I added the lyrics and set the style of music to what Claude had recommended. I also added a BPM (beats per minute) setting to keep the song fast-paced, some instructions on instruments (a mix of electronic and pop instruments), and a voice description (clear, energetic female voice). In hindsight, the voice description might be better left unspecified, because I later replace the voice with my own.

The typical length of a song is 2-3 minutes, but I wanted to keep mine under 1 minute, so I later added [Big Finish] [Fade] [End] after the outro so the song would wrap up with a proper ending. After creating several songs, I was quite happy with one of them and moved on to adding my own vocals. Note that I could have used "Upload Audio" and sung the melody myself, but that didn't work very well, and I couldn't replicate a particular melody that I really liked. I wonder if Suno has some kind of "seed" value to keep the melody of a song consistent - maybe an idea for a future product release :)
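
The lyric formatting described above can be sketched in a few lines of Python. This is a rough sketch and not a Suno API: the bracketed section labels such as [Verse 1] are an assumption about how to mark sections, the function name is hypothetical, and only [Big Finish] [Fade] [End] come from the text above. The output is plain text meant to be pasted into the lyrics box.

```python
# Ending tags quoted in the post; everything else here is illustrative.
ENDING_TAGS = ("[Big Finish]", "[Fade]", "[End]")


def format_lyrics(sections, ending_tags=ENDING_TAGS):
    """Join (section_name, lines) pairs into one lyrics text.

    Each section becomes a bracketed label followed by its lines
    (the bracket style is an assumption), and the ending tags are
    appended after the last section.
    """
    blocks = []
    for name, lines in sections:
        blocks.append("\n".join([f"[{name}]", *lines]))
    blocks.append("\n".join(ending_tags))
    return "\n\n".join(blocks)


song = format_lyrics([
    ("Outro", [
        "YouTube's the stage, data sets the pace",
        "Send me to NYC, I'll make our case!",
    ]),
])
print(song)
```

Putting the ending tags in one constant keeps every regenerated version of the song consistent while you experiment with the rest of the lyrics.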

Step 3: Incorporate My Own Vocals

For this step, I used kits.ai. I first needed to train my own voice model by uploading at least 10 minutes of my singing. (The recommended length is actually 30 minutes, but I was too lazy to do that much singing in one sitting!)
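
Before uploading, it can help to check how much singing you've actually recorded. Here's a minimal sketch, assuming the recordings are uncompressed WAV files sitting in one folder. The folder layout and function names are hypothetical, and nothing here talks to kits.ai; the 10- and 30-minute thresholds are the numbers mentioned above.

```python
import wave
from pathlib import Path

MIN_MINUTES = 10          # minimum mentioned in the post
RECOMMENDED_MINUTES = 30  # recommended length mentioned in the post


def wav_seconds(path) -> float:
    """Duration of one WAV file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_minutes(folder) -> float:
    """Total duration, in minutes, of all .wav files directly in `folder`."""
    return sum(wav_seconds(p) for p in sorted(Path(folder).glob("*.wav"))) / 60


def dataset_status(minutes: float) -> str:
    """Compare a total duration against the two thresholds."""
    if minutes < MIN_MINUTES:
        return "too short"
    if minutes < RECOMMENDED_MINUTES:
        return "usable, below recommended"
    return "recommended length reached"
```

For example, calling dataset_status(total_minutes("my_singing_samples")) on a hypothetical folder of takes tells you whether to record more before starting a training run that takes hours.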

Here's how I did it:

I went to the Voices tab in kits.ai.
I chose the "Clone a Voice" option (clone a unique, custom voice from your dataset).
I uploaded my voice samples.
Training the model took a couple of hours.

After training my voice model, the next step was to use it in my song. Here's how I did it:

I went to the Convert tab in kits.ai and clicked on Song input.
The AI first separated the vocal from the backing track, then replaced the original vocal with my trained voice.

However, I noticed that my voice sounded too masculine - probably because I only uploaded 10 minutes of my singing, and I don't have a very feminine voice to begin with. To address this:

I created a variant of the cloned voice and made sure to set it as female.
You can also further tune other aspects of the voice using the voice triangle, including breath, power, and warmth. I didn't play around with these settings this time.

It took a couple of minutes to generate the new music, and the result was quite good (song). However, the voice wasn't as energetic as I expected, so I made one more adjustment:

I used kits.ai's AI mastering feature.
I chose the preset "Light & Bright".
The result was even better!

Conclusion

I had a lot of fun making this song and will probably train a better voice model by uploading more minutes of my singing. This experience has opened up some interesting possibilities:

I wonder if I can use this cloned voice for other purposes, such as voiceovers, using kits.ai or other AI voice cloning services like ElevenLabs.
The potential for creators to generate custom music with their own voices quickly is exciting and could revolutionize content creation.
