Microsoft is bringing even more database options into the Microsoft Fabric fold, alongside a series of initiatives aimed at helping enterprises tackle data complexity.

For generations of databases, compute and storage were tightly coupled, causing all kinds of scalability and data silo problems for enterprises. Microsoft Fabric was first launched in 2023 as a way to help overcome that challenge. The basic idea behind Microsoft Fabric is to serve as a common data layer across Microsoft’s data and analytics tools. In November 2024, Microsoft Fabric expanded with support for the Azure SQL transactional database platform.
Microsoft, just like its rivals at Google and Amazon, has plenty of different database platforms. While Azure SQL is widely used, when it comes to AI there is another, more influential database platform: CosmosDB. At the Build 2025 conference today, Microsoft is announcing that CosmosDB is finally coming to Microsoft Fabric. CosmosDB is among the most important databases in use today for AI, as it is the database at the foundation of OpenAI’s ChatGPT service. CosmosDB is also getting a boost via integration with Azure AI Foundry, giving agentic AI more direct access to data.

There are also a series of additional data updates, including support for Microsoft Copilot in the Power BI business intelligence platform. The SQL Server 2025 database is being previewed, and the DiskANN (Disk Approximate Nearest Neighbor) vector index is being open sourced.

These innovations directly address the integration complexity that plagues enterprise data teams when building AI applications. A key focus is eliminating the data fragmentation that hampers enterprise AI initiatives.
“When I talk to customers, the message I consistently get is, please unify. I’m Chief Data Officer, I don’t want to be the Chief Integration Officer helping translate AI into my competitive advantage,” Arun Ulag, Corporate Vice President for Azure Data at Microsoft, told VentureBeat.
Fabric accelerates enterprise AI by eliminating data silos
Microsoft Fabric, the company’s unified data platform, continues its rapid growth trajectory by bringing previously separate products together into a cohesive ecosystem.

“We’re bringing all of our products together and unifying them into a single product, which is Microsoft Fabric,” Ulag said. “In some ways, you can think about Fabric as almost like what we did with Office 30 years ago.”

The strategy has clearly resonated with enterprises. Ulag said that Microsoft Fabric now has over 21,000 organizations as paying customers worldwide, including 70% of the Fortune 500.

“It’s growing very, very quickly,” he said.
CosmosDB in Fabric eliminates NoSQL infrastructure overhead
The headline addition to Fabric is CosmosDB, Microsoft’s NoSQL document database that powers many high-profile AI applications.

“CosmosDB is, by far, generally becoming the database of choice for the world’s AI workloads,” Ulag said. “ChatGPT itself is built on CosmosDB… Walmart’s e-commerce store runs on CosmosDB as well.”

By bringing CosmosDB into Fabric, Microsoft enables organizations to deploy NoSQL databases without managing complex infrastructure. A key challenge of a disaggregated compute and storage approach is maintaining performance without introducing latency.

Microsoft has taken specific technical steps to maintain performance through an innovative caching system.

“Inside Fabric, we maintain a highly performant cache, which handles all the fast updates that CosmosDB does,” Ulag explained. “We have a very fast synchronization mechanism that’s completely transparent to the customer, where the data is replicated in near real-time into OneLake.”

This approach delivers the millisecond response times required for AI applications while eliminating infrastructure management tasks.
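For readers who want a sense of what this looks like from the application side, here is a minimal sketch of writing a document with the azure-cosmos Python SDK. The account endpoint, key, database, container and document fields are all placeholders, and the OneLake mirroring Ulag describes happens transparently on the service side rather than in application code.

```python
# Minimal sketch: writing a document to CosmosDB with the Python SDK.
# Endpoint, key, and names below are placeholders, not a real deployment.
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    "https://<your-account>.documents.azure.com:443/",
    credential="<your-key>",
)
database = client.create_database_if_not_exists(id="appdata")
container = database.create_container_if_not_exists(
    id="chat_sessions",
    partition_key=PartitionKey(path="/userId"),
)

# The write lands in CosmosDB with millisecond latency; per the article,
# Fabric's cache and sync mechanism mirror it into OneLake in near real-time.
container.upsert_item({
    "id": "session-001",
    "userId": "user-42",
    "messages": [{"role": "user", "content": "Hello"}],
})
```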
Why open source data formats are key to Fabric’s success
While the Fabric strategy is what connects all of Microsoft’s data products, it is the OneLake technology that actually stores the data.

There is tremendous complexity in having a unified data lake that handles many different data types and formats across SQL, NoSQL and unstructured data. It’s a challenge Microsoft is solving with an open source approach.

“Microsoft has completely embraced open source data formats, so everything in Fabric, regardless of which workload it is, by default, is always in Apache Parquet and Delta Lake,” Ulag said. “It’s really a unified product, with a unified architecture and a unified business model, with all of the data sitting in a global SaaS data lake, which is OneLake, in open source data formats.”

This optimization means all Fabric services, from SQL to Power BI to CosmosDB, can access the same underlying data without conversion or duplication, eliminating the performance penalty typically associated with open formats.
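The portability argument is easiest to see in code. The sketch below uses pyarrow to write and read an Apache Parquet file; a local path stands in for a OneLake location, so treat it as an illustration of the open-format claim rather than a Fabric-specific workflow.

```python
# Conceptual sketch: any engine that speaks Parquet can read what another wrote.
# The local path is a stand-in for a OneLake location.
import pyarrow as pa
import pyarrow.parquet as pq

# One engine writes the table...
table = pa.table({"order_id": [1, 2, 3], "amount": [9.99, 24.50, 5.00]})
pq.write_table(table, "orders.parquet")

# ...and any other Parquet-aware engine reads the same bytes back without
# conversion or duplication, which is the point of standardizing on open formats.
print(pq.read_table("orders.parquet").to_pandas())
```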
DiskANN open source release brings enterprise-grade vector search to all
Microsoft isn’t just using open source for data formats; it’s contributing its own code too.

At Build, Microsoft is announcing that it is open sourcing its DiskANN vector search technology. The decision represents a significant contribution to the AI ecosystem, making enterprise-grade vector search capabilities available to all developers.

“We have a very, very strong vector capability called DiskANN. It was originally created in Microsoft Research, and it’s used in Bing… built into CosmosDB and built into Fabric,” said Ulag.

DiskANN implements approximate nearest neighbor (ANN) search algorithms optimized for disk-based operation, making it well suited to large-scale vector databases that exceed memory limits. By open sourcing DiskANN, Microsoft enables developers to implement the same high-performance vector search used by ChatGPT and other leading AI applications. This helps address one of the key challenges in building retrieval-augmented generation (RAG) systems, where finding semantically similar content quickly is essential for grounding AI responses in enterprise data.
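To make the RAG connection concrete, the sketch below shows the retrieval step a vector index accelerates, using exact brute-force cosine similarity over synthetic embeddings. It is a conceptual illustration only and does not use DiskANN’s actual API; DiskANN’s contribution is making an approximate version of this lookup fast at billion-vector, disk-resident scale.

```python
# Conceptual illustration: exact brute-force vector search over toy embeddings.
# An ANN index like DiskANN approximates this lookup far faster at large scale.
import numpy as np

rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(10_000, 384))            # stand-in document embeddings
doc_vectors /= np.linalg.norm(doc_vectors, axis=1, keepdims=True)

def top_k(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k most similar documents by cosine similarity."""
    query = query / np.linalg.norm(query)
    scores = doc_vectors @ query                         # cosine similarity (unit vectors)
    return np.argsort(-scores)[:k]

query_vec = rng.normal(size=384)
print(top_k(query_vec))   # candidate documents used to ground a RAG prompt
```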
“We’re allowing everybody to be able to get the benefits of the vector store that we’re using internally,” Ulag said.
Why it matters for enterprise data leaders
For enterprises leading in AI adoption, these announcements enable more sophisticated applications that seamlessly integrate multiple data types.

The complexity and challenges of dealing with data silos aren’t just about different locations but about different formats too. The continued evolution of Microsoft Fabric directly addresses that concern in a way no other hyperscaler is doing today.

The focus on, and commitment to, open source standards at the core also matters for enterprises, since it removes some of the lock-in risk that would exist if the data were stuck in proprietary formats.

As enterprises increasingly compete on AI capabilities, Microsoft’s unified approach removes a significant barrier to innovation. Organizations that embrace this integration can shift their focus from maintaining complex data pipelines to developing AI applications that deliver tangible business value, potentially outpacing competitors still struggling with fragmented architectures.