Hume AI, the startup specializing in emotionally intelligent voice interfaces, has launched Voice Control, an experimental feature that lets developers and users create custom AI voices through precise modulation of vocal characteristics, with no coding, AI prompt engineering, or sound design skills required.
This release builds on the foundation laid by the company’s earlier Empathic Voice Interface 2 (EVI 2), which introduced advanced capabilities in naturalness, emotional responsiveness, and customization.
Both EVI 2 and Voice Control avoid the risks of voice cloning, a practice that Hume co-founder Alan Cowen has said carries ethical and practical challenges.
Instead, Hume focuses on providing tools for creating unique, expressive voices that align with user needs, such as customer service chatbots, digital assistants, tutors, guides, or accessibility features.
Moving beyond preset AI voices toward custom, bespoke solutions
Voice Control lets developers adjust voices along 10 distinct dimensions, which Hume describes as follows:
“Masculine/Feminine: The vocalization of gender, ranging between more masculine and more feminine.
Assertiveness: The firmness of the voice, ranging between timid and bold.
Buoyancy: The density of the voice, ranging between deflated and buoyant.
Confidence: The assuredness of the voice, ranging between shy and confident.
Enthusiasm: The excitement within the voice, ranging between calm and enthusiastic.
Nasality: The openness of the voice, ranging between clear and nasal.
Relaxedness: The stress within the voice, ranging between tense and relaxed.
Smoothness: The texture of the voice, ranging between smooth and staccato.
Tepidity: The liveliness behind the voice, ranging between tepid and vigorous.
Tightness: The containment of the voice, ranging between tight and breathy.”
This no-code tool allows users to fine-tune voice attributes in real time through virtual onscreen sliders. It’s currently available in Hume’s virtual playground, which requires a free user sign-up to access.
The release addresses a key pain point in the AI industry: the choice between relying on preset voices, which often fail to meet the specific needs of brands or applications, and accepting the risks associated with voice cloning.
This focus on customization aligns with Hume’s broader goal of developing emotionally nuanced voice AI.
The company’s efforts to advance voice AI were highlighted in September 2024 with the launch of EVI 2, which the company described as a significant upgrade to its predecessor.
EVI 2 improved latency by 40%, reduced costs by 30%, and expanded voice modulation features, offering developers a safer alternative to voice cloning.
Sliders > text prompts
Hume’s research-driven approach plays a central role in its product development. The company, co-founded by former Google DeepMinder Alan Cowen, uses a proprietary model based on cross-cultural voice recordings paired with emotional survey data.
This methodology, rooted in emotion science, forms the backbone of both EVI 2 and the newly launched Voice Control.
Voice Control extends these principles by addressing the granular, often ineffable ways humans perceive voices.
The tool’s slider-based interface reflects common perceptual qualities of voice, such as buoyancy or assertiveness, without attempting to oversimplify these attributes through text-based prompts.
Voice Control is immediately available in beta and integrates with Hume’s Empathic Voice Interface (EVI), making it accessible for a wide range of applications.
Developers can select a base voice, adjust its characteristics, and preview the results in real time. This process ensures reproducibility and stability across sessions, key features for real-time applications like customer service bots or virtual assistants.
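To make the select-adjust-reuse workflow concrete, here is a minimal Python sketch of how an application might store a base voice plus slider positions and reload the same configuration in a later session. It is not Hume’s API: the `VoiceSettings` class, the `from_json` helper, the snake_case dimension keys, the base-voice name, and the -1.0 to 1.0 slider range are all assumptions made for illustration. Only the ten dimension names come from the list above.

```python
import json
from dataclasses import dataclass, field

# Dimension names follow the article's list; the key spelling is hypothetical.
DIMENSIONS = (
    "masculine_feminine", "assertiveness", "buoyancy", "confidence",
    "enthusiasm", "nasality", "relaxedness", "smoothness",
    "tepidity", "tightness",
)

# Assumed slider range; the article does not state the real one.
SLIDER_MIN, SLIDER_MAX = -1.0, 1.0


@dataclass
class VoiceSettings:
    """Hypothetical container for a base voice and its slider positions."""
    base_voice: str
    sliders: dict = field(default_factory=dict)

    def adjust(self, dimension: str, value: float) -> None:
        """Set one slider, clamping the value to the assumed range."""
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        self.sliders[dimension] = max(SLIDER_MIN, min(SLIDER_MAX, value))

    def to_json(self) -> str:
        """Serialize with every dimension present (untouched ones at 0.0)
        and sorted keys, so equal settings always produce equal text."""
        full = {name: self.sliders.get(name, 0.0) for name in DIMENSIONS}
        return json.dumps(
            {"base_voice": self.base_voice, "sliders": full}, sort_keys=True
        )


def from_json(text: str) -> VoiceSettings:
    """Rebuild settings from saved text, re-validating each slider."""
    data = json.loads(text)
    settings = VoiceSettings(base_voice=data["base_voice"])
    for name, value in data["sliders"].items():
        settings.adjust(name, value)
    return settings


if __name__ == "__main__":
    voice = VoiceSettings(base_voice="example-base-voice")  # made-up name
    voice.adjust("assertiveness", 0.6)
    voice.adjust("tightness", -2.0)  # out of range, stored as -1.0
    saved = voice.to_json()
    # Reloading and re-saving yields identical text.
    assert from_json(saved).to_json() == saved
```

Serializing every dimension with sorted keys means two sessions that hold the same settings produce byte-identical text, which is one simple way to get the kind of reproducibility the article describes.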
EVI 2’s influence is evident in Voice Control’s capabilities. The earlier model introduced features like in-conversation prompts and multilingual capabilities, which have broadened the scope of voice AI applications.
For example, EVI 2 supports sub-second response times, enabling natural and immediate conversations. It also allows dynamic adjustments to speaking style during interactions, making it a versatile tool for businesses.
Differentiating in a competitive market
Hume’s focus on voice customization and emotional intelligence positions it as a strong competitor in the voice AI space, even against well-funded rivals such as OpenAI with its Advanced Voice Mode and ElevenLabs, both of which offer libraries of preset voices.
Hume continues to build on its approach to voice AI. Plans for expanding Voice Control include introducing more modifiable dimensions, refining voice quality under extreme adjustments, and increasing the range of base voices available.
With the launch of Voice Control, Hume reinforces its position as a leader in voice AI innovation, offering tools that prioritize customization, emotional intelligence, and real-time adaptability. Developers can access Voice Control today via Hume’s platform, marking another step forward in the evolution of AI-driven voice solutions.