AI Heroes Dev Diary #1

Crafting a Music Maker – Turning Speech into Song

Welcome to the first entry in our dev diary, where we’re excited to share the journey behind our latest project—a speech-to-song generator. This tool is all about turning your spoken words into unique songs, blending AI’s technical capabilities with a creative touch. If you’ve ever wondered how AI can make music or what it’s like to develop a project like this, you’re in the right place.

Where It All Began: From Idea to Action

It started with a question that got us curious: What if we could take someone’s words and turn them into music? The idea felt full of potential, but fitting the technical pieces together was clearly going to take time. We knew AI could process language and create melodies. The challenge was blending the two into something people would actually want to use, as an experience rather than a gimmick.

The idea quickly evolved into something bigger: Could we make a tool that feels personalized and creative, while still using AI’s strengths? This became the driving force behind our project—using AI to co-create music with the user, not just for them.

Designing an Experience, Not Just a Product

Once we had the concept, the next step was designing a user experience that felt fun and intuitive, not complicated or cold. Our designer, Demi, brought this to life by focusing on how the process feels. We wanted it to be colorful, playful, and easy to use—something that invites you in and keeps you engaged.

Here’s what we came up with:

A bright, rainbow-colored interface that reflects the creative energy of making music
A friendly AI bot to guide users through, so it feels like a collaboration
Visual effects that let you see the song being made in real time, turning the process into part of the fun

Screenshots: Starting Page, Loading Page, Preview Page

The flow is simple:

You’re greeted with a warm welcome from our AI mascot.
You answer a few fun, interactive questions about what kind of song you want.
As your song generates, you see dynamic visuals that bring the process to life.
Finally, you get to listen to your custom song and share it if you want.
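
For readers who think in code, the four steps above can be read as a small linear state machine. The Python sketch below is only an illustration under our own assumptions: the names `Screen`, `NEXT_SCREEN`, `advance`, and `walk_flow` are hypothetical and chosen for this example, and the real app may well handle navigation differently.

```python
from enum import Enum, auto


class Screen(Enum):
    """Hypothetical screen names, one per step of the flow."""
    WELCOME = auto()      # warm welcome from the AI mascot
    QUESTIONS = auto()    # interactive questions about the song
    GENERATING = auto()   # dynamic visuals while the song is generated
    PREVIEW = auto()      # listen to the song and optionally share it


# Forward transitions, in the order the flow describes them (assumed linear).
NEXT_SCREEN = {
    Screen.WELCOME: Screen.QUESTIONS,
    Screen.QUESTIONS: Screen.GENERATING,
    Screen.GENERATING: Screen.PREVIEW,
}


def advance(current: Screen) -> Screen:
    """Return the screen that follows `current`; PREVIEW is the final screen."""
    return NEXT_SCREEN.get(current, current)


def walk_flow(start: Screen = Screen.WELCOME) -> list[Screen]:
    """List every screen visited from `start` until the flow stops advancing."""
    path = [start]
    while (nxt := advance(path[-1])) is not path[-1]:
        path.append(nxt)
    return path
```

Calling `walk_flow()` returns the screens in the same order as the list above, and `advance(Screen.PREVIEW)` stays on the preview screen because nothing follows it. Keeping the transitions in one table makes it easy to insert an extra step later without touching the rest of the logic.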

Screenshots: Starting Page, Recording Page, Preview Page

How AI Helps Us Build Smarter

Behind the scenes, we used AI not just to create the final product but to shape how we built it. We relied on AI models to help us think through potential roadblocks and explore new ideas for features, which made the whole process faster and more flexible. Instead of spending hours debating what the user experience might be like, we could simulate it and see how it felt.

This approach let us:

Catch potential issues early, so we didn’t waste time fixing things later.
Experiment with a ton of ideas before settling on the ones that worked best.
Keep things moving quickly, which meant we could test, tweak, and improve faster.

For example, we asked an AI model to identify all the pages we would need to design. Our prompt:

PROJECT CASE: We’re creating a speech to song generator app, where users will answer a predefined number of questions, and then the model will generate an accompanying song. What screens and intermediate pages do we need to develop for this tool?

The model’s response:

To create a speech to song generator app, we’ll need several screens and intermediate pages to guide users through the process. Here’s a suggested flow for the app:

Welcome/Home Screen
Question Sequence Pages
Generation Loading Screen
Results Page
Song Playback and Controls
Save/Share Options
History/Library Page

Let’s break these down:

Welcome/Home Screen:
- App title and brief description
- “Start New Song” button
- Access to History/Library
- Settings or user account (if applicable)
Question Sequence Pages:
- Series of pages, each containing one question
- Progress indicator (e.g., “Question 3 of 7”)
- Input field for user’s answer (text, multiple choice, etc.)
- “Next” and “Back” buttons
- Option to skip questions
Generation Loading Screen:
- Animated loading indicator
- Status updates (e.g., “Analyzing responses”, “Composing melody”, “Adding lyrics”)
- Estimated time remaining
Results Page:
- Generated song title
- Brief summary of the song’s theme or style
- Option to play the song
- Lyrics display (if applicable)
Song Playback and Controls:
- Play/Pause button
- Seek bar
- Volume control
- Repeat option
Save/Share Options:
- Save to library button
- Share buttons (social media, messaging, email)
- Download options (e.g., MP3, lyrics as text)
History/Library Page:
- List of previously generated songs
- Search and filter options
- Ability to replay, edit, or delete saved songs

Additional considerations:

Tutorial or onboarding screens for first-time users
Error handling pages (e.g., if generation fails)
Feedback/rating system for generated songs
Settings page for app preferences

Would you like me to elaborate on any of these screens or discuss the user flow in more detail?
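
To make one of those suggestions concrete, here is a minimal Python sketch of the Question Sequence Pages: a progress label, "Next" and "Back" navigation, and the option to skip. It is an illustration under assumptions, not a description of how the app is built. The class `QuestionSequence` and all of its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionSequence:
    """Hypothetical model of a one-question-per-page sequence."""
    questions: list[str]
    index: int = 0                                   # zero-based current question
    answers: dict[int, str] = field(default_factory=dict)

    @property
    def finished(self) -> bool:
        """True once the user has moved past the last question."""
        return self.index >= len(self.questions)

    def progress_label(self) -> str:
        """Text for the progress indicator, e.g. 'Question 3 of 7'."""
        if self.finished:
            return "Ready to generate"
        return f"Question {self.index + 1} of {len(self.questions)}"

    def forward(self) -> None:
        """Move to the next page without going beyond the end."""
        self.index = min(self.index + 1, len(self.questions))

    def back(self) -> None:
        """Move to the previous page without going before the start."""
        self.index = max(self.index - 1, 0)

    def answer(self, text: str) -> None:
        """Store the answer for the current question, then move on."""
        if not self.finished:
            self.answers[self.index] = text
        self.forward()

    def skip(self) -> None:
        """Discard any stored answer for the current question, then move on."""
        self.answers.pop(self.index, None)
        self.forward()
```

A sequence of seven questions would start at `"Question 1 of 7"` and, after two calls to `answer` or `skip`, show `"Question 3 of 7"`, matching the example in the response. Skipped questions simply have no entry in `answers`, so whatever generates the song can tell which ones were left blank.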

What’s Next?

Right now, we’re working on fine-tuning the AI models to make the songs even more nuanced and varied. We’re diving deeper into natural language processing to better understand the subtleties of what users want. We’re also improving the music generation itself, so the songs feel more dynamic, and upgrading the text-to-speech system so the vocals sound more natural.

On the frontend, we’re refining the design to make sure the interface works smoothly across devices, and that the experience is as fast and responsive as we want it to be.

Be Part of the Journey

We’re excited about where this project is headed, and we’d love for you to be part of it. Keep an eye out for updates in future dev diaries, where we’ll share more about the technical side of things, new features we’re testing, and how you can get involved in beta testing.

At AI Heroes, we’re not just building AI tools—we’re exploring new ways to create with them. If you’re curious about what we’re doing or want to chat about how AI can be part of your next idea, reach out to us.


Ralitsa Todorova
