Qodo, an AI-driven code quality platform formerly known as Codium, has announced the release of Qodo-Embed-1-1.5B, a new open source code embedding model that delivers state-of-the-art performance while being significantly smaller and more efficient than competing solutions.
Designed to enhance code search, retrieval, and understanding, the 1.5-billion parameter model achieves top-tier results on industry benchmarks, outperforming larger models from OpenAI and Salesforce.
For enterprise development teams managing vast and complex codebases, Qodo’s innovation represents a leap forward in AI-driven software engineering workflows. By enabling more accurate and efficient code retrieval, Qodo-Embed-1-1.5B addresses a critical challenge in AI-assisted development: context awareness in large-scale software systems.
Why code embedding models matter for enterprise AI
AI-powered coding solutions have traditionally focused on code generation, with large language models (LLMs) gaining attention for their ability to write new code.
However, as Itamar Friedman, CEO and co-founder of Qodo, explained in a video call interview earlier this week: “Enterprise software can have tens of millions, if not hundreds of millions, of lines of code. Code generation alone isn’t enough; you need to ensure the code is high quality, works correctly, and integrates with the rest of the system.”
Code embedding models play a crucial role in AI-assisted development by allowing systems to search and retrieve relevant code snippets efficiently. This is particularly important for large organizations where software projects span millions of lines of code across multiple teams, repositories, and programming languages.
“Context is king for anything right now related to building software with models,” Friedman said. “Specifically, for fetching the right context from a really large codebase, you have to go through some search mechanism.”
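The search mechanism Friedman describes can be sketched in a few lines. The example below is a minimal illustration, not Qodo's implementation: `toy_embed` is a hypothetical stand-in that counts identifier tokens, where a real system would call a learned code embedding model, and the snippets are invented for the example. What carries over is the shape of the pipeline: embed the query, embed each snippet, rank by cosine similarity.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedding (assumption): a bag of lowercase word tokens.

    A real system would call a learned code embedding model here.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, snippets: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank snippets by similarity to the query; return the top_k (name, score) pairs."""
    q = toy_embed(query)
    scored = [(name, cosine(q, toy_embed(code))) for name, code in snippets.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical snippets standing in for an indexed codebase.
SNIPPETS = {
    "parse_config": "def parse_config(path): return json.load(open(path))",
    "send_email": "def send_email(to, subject, body): smtp.sendmail(sender, to, body)",
    "retry_request": "def retry_request(url, attempts): return http.get(url)",
}

results = search("load json config from path", SNIPPETS)
```

Swapping `toy_embed` for a trained model changes the quality of the ranking but not the structure of the search.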
Qodo-Embed-1-1.5B offers performance and efficiency
Qodo-Embed-1-1.5B stands out for its balance of efficiency and accuracy. While many state-of-the-art models rely on billions of parameters (OpenAI’s text-embedding-3-large has 7 billion, for instance), Qodo’s model achieves superior results with just 1.5 billion parameters.
On the Code Information Retrieval Benchmark (CoIR), an industry-standard test for code retrieval across multiple languages and tasks, Qodo-Embed-1-1.5B scored 70.06, outperforming Salesforce’s SFR-Embedding-2_R (67.41) and OpenAI’s text-embedding-3-large (65.17).
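For readers who want the gaps spelled out, the short calculation below derives the point margins from the three CoIR scores reported above. It uses only those figures; the dictionary keys are the model names as given in the article.

```python
# CoIR scores as reported in the article.
scores = {
    "Qodo-Embed-1-1.5B": 70.06,
    "SFR-Embedding-2_R": 67.41,
    "text-embedding-3-large": 65.17,
}

baseline = scores["Qodo-Embed-1-1.5B"]

# Point margin of Qodo's model over each competitor, rounded to two decimals
# to avoid floating-point noise.
margins = {
    name: round(baseline - score, 2)
    for name, score in scores.items()
    if name != "Qodo-Embed-1-1.5B"
}
```

That works out to a lead of 2.65 points over SFR-Embedding-2_R and 4.89 points over text-embedding-3-large.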

This level of performance is critical for enterprises seeking cost-effective AI solutions. With the ability to run on low-cost GPUs, the model makes advanced code retrieval accessible to a wider range of development teams, reducing infrastructure costs while improving software quality and productivity.
Addressing the complexity, nuance and specificity of different code snippets
One of the biggest challenges in AI-powered software development is that similar-looking code can have vastly different functions. Friedman illustrates this with a simple but impactful example:
“One of the biggest challenges in embedding code is that two nearly identical functions, like ‘withdraw’ and ‘deposit,’ may differ only by a plus or minus sign. They need to be close in vector space but also clearly distinct.”
A key issue in embedding models is ensuring that functionally distinct code is not incorrectly grouped together, which could cause major software errors. “You need an embedding model that understands code well enough to fetch the right context without bringing in similar but incorrect functions, which could cause serious issues.”
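The withdraw and deposit example is easy to reproduce. The sketch below is an illustration only: the two functions are hypothetical, and `difflib.SequenceMatcher` measures surface text similarity, not anything an embedding model computes. It shows why matching on appearance alone is risky: the sources are almost identical, yet the behavior is opposite.

```python
import difflib


def deposit(balance, amount):
    return balance + amount


def withdraw(balance, amount):
    return balance - amount


# Source text of the two functions above, kept as strings for comparison.
DEPOSIT_SRC = "def deposit(balance, amount):\n    return balance + amount\n"
WITHDRAW_SRC = "def withdraw(balance, amount):\n    return balance - amount\n"

# Surface similarity: share of matching characters, between 0.0 and 1.0.
surface_similarity = difflib.SequenceMatcher(None, DEPOSIT_SRC, WITHDRAW_SRC).ratio()

# Behavior on the same inputs diverges despite the near-identical text.
deposit_result = deposit(100, 30)
withdraw_result = withdraw(100, 30)
```

A retrieval system that ranks purely on surface overlap would treat these two as interchangeable, which is the failure mode Friedman warns about.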
To solve this, Qodo developed a unique training approach, combining high-quality synthetic data with real-world code samples. The model was trained to recognize nuanced differences in functionally similar code, ensuring that when a developer searches for relevant code, the system retrieves the right results, not just similar-looking ones.
Friedman notes that this training process was refined in collaboration with NVIDIA and AWS, both of which are writing technical blogs about Qodo’s methodology. “We collected a unique dataset that simulates the delicate properties of software development and fine-tuned a model to recognize these nuances. That’s why our model outperforms generic embedding models for code.”
Multi-programming language support and plans for future expansion
The Qodo-Embed-1-1.5B model has been optimized for the top 10 most commonly used programming languages, including Python, JavaScript, and Java, with additional support for a long tail of other languages and frameworks.
Future iterations of the model will expand on this foundation, offering deeper integration with enterprise development tools and additional language support.
“Many embedding models struggle to differentiate between programming languages, sometimes mixing up snippets from different languages,” Friedman said. “We’ve specifically trained our model to prevent that, focusing on the top 10 languages used in enterprise development.”
Enterprise deployment options and availability
Qodo is making its new model widely accessible through multiple channels.
The 1.5B parameter version is available on Hugging Face under the OpenRAIL++-M license, allowing developers to integrate it into their workflows freely. Enterprises needing additional capabilities can access larger versions under commercial licensing.
For companies seeking a fully managed solution, Qodo offers an enterprise-grade platform that automates embedding updates as codebases evolve. This addresses a key challenge in AI-driven development: ensuring that search and retrieval models remain accurate as code changes over time.
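One common way to keep an embedding index current, sketched here as an assumption and not as a description of Qodo's platform, is to store a content hash for each file and re-embed only the files whose hash has changed. The file names and the `plan_updates` helper below are hypothetical.

```python
import hashlib


def content_hash(source: str) -> str:
    """Fingerprint a file's contents so unchanged files can be skipped."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def plan_updates(files: dict[str, str], index: dict[str, str]) -> tuple[list[str], list[str]]:
    """Compare current files against the hashes stored at the last indexing run.

    Returns (paths to re-embed, paths to delete from the index).
    """
    stale = sorted(p for p, src in files.items() if index.get(p) != content_hash(src))
    removed = sorted(p for p in index if p not in files)
    return stale, removed


# Hypothetical repository state at the time of the last indexing run.
old_files = {"billing.py": "def charge(): ...", "users.py": "def get_user(): ..."}
index = {path: content_hash(src) for path, src in old_files.items()}

# Later: one file edited, one added, one deleted.
new_files = {"billing.py": "def charge(amount): ...", "reports.py": "def build(): ..."}
to_embed, to_delete = plan_updates(new_files, index)
```

Here only `billing.py` and `reports.py` would be sent to the embedding model, and the stale entry for `users.py` would be dropped, so the cost of each update scales with the size of the change, not the size of the codebase.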
Friedman sees this as a natural step in Qodo’s mission. “We’re releasing Qodo Embed One as the first step. Our goal is to continuously improve across three dimensions: accuracy, support for more languages, and better handling of specific frameworks and libraries.”
Beyond Hugging Face, the model will also be available through NVIDIA’s NIM platform and AWS SageMaker JumpStart, making it even easier for enterprises to deploy and integrate into their existing development environments.
The future of AI in enterprise software dev
AI-powered coding tools are rapidly evolving, but the focus is shifting beyond code generation toward code understanding, retrieval, and quality assurance. As enterprises move to integrate AI deeper into their software engineering processes, tools like Qodo-Embed-1-1.5B will play a crucial role in making AI systems more reliable, efficient, and cost-effective.
“If you’re a developer in a Fortune 15,000 company, you don’t just use Copilot or Cursor,” Friedman said. “You have workflows and internal initiatives that require deep understanding of large codebases. That’s where a high-quality code embedding model becomes essential.”
Qodo’s latest model is a step toward a future where AI isn’t just assisting developers with writing code; it’s helping them understand, manage, and optimize it across complex, large-scale software ecosystems.
For enterprise teams looking to leverage AI for more intelligent code search, retrieval, and quality control, Qodo’s new embedding model offers a compelling, high-performance alternative to larger, more resource-intensive solutions.