Groq and PlayAI announced a partnership today to bring Dialog, an advanced text-to-speech model, to market through Groq’s high-speed inference platform.
The partnership combines PlayAI’s expertise in voice AI with Groq’s specialized processing infrastructure, creating what the companies claim is one of the most natural-sounding and responsive text-to-speech systems available.
“Groq provides a complete, low-latency system for automatic speech recognition (ASR), GenAI, and text-to-speech, all in one place,” said Ian Andrews, Chief Revenue Officer at Groq, in an exclusive interview with VentureBeat. “With Dialog now running on GroqCloud, this means customers won’t have to use multiple providers for a single use case: Groq is a one-stop solution.”
Groq powers first Arabic voice AI, expanding Middle East tech presence
Dialog is notable for being available in both English and Arabic, with the Arabic version representing the first voice AI specifically designed for the Middle East region. The inclusion of Arabic as one of the initial offerings was strategic for both companies.
“Arabic is the fourth most spoken language globally. By partnering with PlayAI to offer an Arabic TTS model, Groq is unlocking a key global market and enabling broader access to fast AI inference,” Andrews told VentureBeat.
The companies claim their solution addresses key shortcomings in existing voice AI technologies, particularly around natural speech patterns and response speed. According to benchmark testing conducted by third-party evaluator Podonos, Dialog was preferred by users at a rate of 10:1 versus ElevenLabs v2.5 Turbo and over 3:1 against ElevenLabs Multilingual v2.0.
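To put those ratios in more familiar terms, a preference ratio can be converted into a share of head-to-head comparisons. The sketch below assumes each ratio describes simple pairwise votes with no ties, which the article does not confirm; the function name `preference_share` is illustrative and not part of any Podonos tooling.

```python
from fractions import Fraction


def preference_share(wins: int, losses: int) -> Fraction:
    """Convert a wins:losses preference ratio into a share of all votes.

    Assumes every comparison produced a clear winner (no ties), which is
    an assumption made here for illustration only.
    """
    return Fraction(wins, wins + losses)


# Ratios quoted in the article; "over 3:1" is treated as a lower bound.
quoted_ratios = {
    "vs. ElevenLabs v2.5 Turbo": (10, 1),
    "vs. ElevenLabs Multilingual v2.0 (at least)": (3, 1),
}

for label, (wins, losses) in quoted_ratios.items():
    share = preference_share(wins, losses)
    print(f"Dialog preferred {label}: {float(share):.1%}")
```

Under that assumption, 10:1 corresponds to roughly 91% of comparisons and 3:1 to 75%.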
Innovative ‘adaptive speech contextualizer’ transforms conversational AI
What sets Dialog apart is its sophisticated approach to context. Rather than treating each utterance as an isolated event, the system maintains awareness of the entire conversation flow.
“We built a novel architecture that we call an ‘adaptive speech contextualizer’ (ASC), which allows the model to use the full context and history of a conversation,” said Mahmoud Felfel, co-founder and CEO of PlayAI, in an interview with VentureBeat. “This means that every response isn’t just a standalone output; it’s enriched with appropriate prosody, tone, and emotion that reflect the flow of the conversation.”
For enterprises looking to implement conversational AI, latency, the delay between request and response, has been a persistent challenge. Groq’s specialized Language Processing Units (LPUs) appear to offer a significant advantage in this area.
“Based on initial internal testing, Groq is delivering up to 140 characters per second on PlayAI’s Dialog model, a significant boost compared to the same model running on GPUs at 86 characters per second,” explained Andrews. “That means that Dialog generates text up to 10 times faster than real-time.”
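The arithmetic behind those figures is easy to check. The sketch below uses only the numbers Andrews quotes; the helper names and the 700-character sample length are illustrative assumptions, and the "implied speaking rate" is simply what the 10x claim works out to, not a figure either company published.

```python
# Throughput figures quoted in the article (characters per second).
GROQ_CHARS_PER_SEC = 140
GPU_CHARS_PER_SEC = 86
REALTIME_MULTIPLE = 10  # "up to 10 times faster than real-time"


def speedup(new_rate: float, old_rate: float) -> float:
    """Ratio of two throughput figures."""
    return new_rate / old_rate


def implied_realtime_rate(rate: float, multiple: float) -> float:
    """Speaking rate implied by a 'multiple of real-time' claim."""
    return rate / multiple


def seconds_to_generate(num_chars: int, rate: float) -> float:
    """Time needed to process num_chars at a given characters-per-second rate."""
    return num_chars / rate


lpu_vs_gpu = speedup(GROQ_CHARS_PER_SEC, GPU_CHARS_PER_SEC)
speech_rate = implied_realtime_rate(GROQ_CHARS_PER_SEC, REALTIME_MULTIPLE)

print(f"LPU vs. GPU speedup: {lpu_vs_gpu:.2f}x")
print(f"Implied speaking rate: {speech_rate:.1f} characters per second")
# 700 characters is an arbitrary sample length chosen for illustration.
print(f"700 characters on Groq: {seconds_to_generate(700, GROQ_CHARS_PER_SEC):.1f} s")
print(f"700 characters on GPU:  {seconds_to_generate(700, GPU_CHARS_PER_SEC):.1f} s")
```

On these numbers, the LPU figure is about 1.6 times the GPU figure, and the "10 times faster than real-time" claim implies a spoken rate of about 14 characters per second.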
Groq secures $1.5 billion Saudi investment to build world-class AI infrastructure
The partnership comes at a time of significant expansion for Groq, which recently secured a $1.5 billion commitment from Saudi Arabia to fund additional infrastructure. The company has established a data center in Dammam, which it describes as “the region’s largest inference cluster.”
“Partnering with Groq was a no-brainer; they’re the industry leader in advanced AI inference infrastructure,” said Felfel. “With TTS and agents, low latency is critical. We’ve already optimized Dialog for these real-time applications, but partnering with Groq allows us to deliver the lowest latency voice model on the market.”
The voice AI market has seen rapid growth as businesses look to automate customer interactions while maintaining a natural, human-like experience. Applications range from customer service and sales automation to voice-overs and accessibility features for the visually impaired.
Enterprise applications extend beyond traditional customer service use cases
“Beyond customer service, other enterprise use cases include automating sales and appointment scheduling, onboarding and personal assistants, creating voice-overs to present content, translating English audio and video content into Arabic, increasing website and static content accessibility for the visually impaired, and more,” Andrews said.
For PlayAI, which was founded by entrepreneurs from the Middle East and North Africa region, the inclusion of Arabic language capabilities was particularly meaningful.
“As MENA founders, we know the region is heavily investing in AI capabilities and infrastructure, as reflected in investments like Groq, but also world-leading adoption,” said Felfel. “Arabic is a global business language and one that we grew up speaking, so it was a natural choice as one of our core languages.”
The companies have made the Dialog technology available through GroqCloud’s tiered service model, which includes both free and paid options. This approach allows developers to experiment with the technology before committing to larger implementations.
“GroqCloud offers both free and paid plans. Anyone can create an account and create an API key for free,” Andrews explained. “Our paid Developer Tier is self-serve, meaning anyone with a credit card can sign up themselves.”
As voice becomes an increasingly important interface for AI systems, this partnership positions both companies to capitalize on the growing demand for more natural and responsive conversational experiences. By addressing the technical challenges of latency and natural speech patterns, Groq and PlayAI may have removed significant barriers to wider adoption of voice AI in enterprise settings.