InterviewAlly

Databricks CEO: AI Infra Is Most Valuable Layer

Databricks CEO Ali Ghodsi makes the case that the data infrastructure layer will capture more long-term value than AI model makers or application developers in the emerging AI economy.

February 26, 2026 · 6 min read · Source: TechCrunch

Ghodsi's Infrastructure Thesis

In a wide-ranging interview published February 26, 2026, Databricks CEO Ali Ghodsi laid out his vision for where lasting value will accrue in the AI technology stack. His argument is provocative and direct: the data infrastructure layer -- not the model providers, not the application builders -- will ultimately capture the most economic value in the AI era.

Ghodsi's reasoning centers on a simple observation: models are commoditizing rapidly, and applications are easy to replicate, but the infrastructure that manages, processes, and serves the data underlying all AI systems creates deep and durable competitive advantages.

"Everyone is focused on who has the best model this week. But models change every quarter. The company that controls the data layer -- how data is stored, governed, transformed, and fed to any model -- that company wins for decades."

-- Ali Ghodsi, CEO, Databricks

The Three-Layer AI Stack

Ghodsi presented a framework that divides the AI technology stack into three layers, each with different value capture dynamics. The bottom layer is compute infrastructure -- GPU clouds, chip makers, and hardware providers. The middle layer is data infrastructure -- data lakes, warehouses, governance platforms, feature stores, and vector databases. The top layer is models and applications -- foundation models, fine-tuned models, and the products built on top of them.

His argument is that the compute layer is capital-intensive but increasingly commoditized as more providers (AWS, Azure, GCP, Oracle, CoreWeave, Lambda) compete on price. The model layer is seeing rapid commoditization as open-source models approach proprietary model performance. But the data layer has natural lock-in effects: once an enterprise builds its data infrastructure on a particular platform, migration costs are enormous.

To support this thesis, Ghodsi cited Databricks' own financials. The company reached $2.4 billion in annualized revenue in Q4 2025, growing at 55% year-over-year. Gross retention (the percentage of revenue retained from existing customers, excluding expansion) stands at 97%, indicating extremely low churn. Net revenue retention exceeds 140%, meaning existing customers as a group spend over 40% more than they did a year earlier, even after accounting for churn.
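For readers less familiar with the two retention metrics cited above, the arithmetic can be sketched as follows. This is a minimal illustration using the common SaaS definitions; the cohort figures are hypothetical, chosen only to reproduce a 97% gross and 140% net result, and are not Databricks data. Databricks' own calculation method may differ.

```python
def gross_retention(start_revenue, churned, contracted):
    """Share of starting revenue kept from existing customers, ignoring expansion."""
    return (start_revenue - churned - contracted) / start_revenue


def net_revenue_retention(start_revenue, churned, contracted, expansion):
    """Same cohort, but counting additional spend (expansion) from retained customers."""
    return (start_revenue - churned - contracted + expansion) / start_revenue


# Hypothetical cohort, in millions of dollars of annualized revenue (assumed values).
start = 100.0      # revenue from the cohort one year ago
churned = 2.0      # revenue lost to customers who left entirely
contracted = 1.0   # revenue lost to customers who downsized
expansion = 43.0   # additional revenue from customers who grew their usage

grr = gross_retention(start, churned, contracted)
nrr = net_revenue_retention(start, churned, contracted, expansion)

print(f"Gross retention: {grr:.0%}")
print(f"Net revenue retention: {nrr:.0%}")
```

Gross retention can never exceed 100% because it ignores expansion, while net revenue retention can, which is why the two figures are usually reported together.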

Why Models Are Commoditizing

Central to Ghodsi's thesis is the belief that AI models are on an inevitable path toward commoditization. He pointed to benchmark trends as evidence: the gap between the best open-source model (currently Llama 4 from Meta) and the best proprietary model (Anthropic's Claude or OpenAI's GPT-5) on standard benchmarks has narrowed from roughly 25 percentage points in 2023 to less than 5 points in 2026.

Furthermore, the proliferation of model training frameworks, training data, and compute access means that the barriers to building competitive models are falling. Ghodsi estimates that within 18 months, the top 10 foundation models will be functionally equivalent for 90% of enterprise use cases.

"When every model can do everything, the differentiator becomes the data you feed it. That's our layer. That's where the moat is."

-- Ali Ghodsi, CEO, Databricks

Not everyone agrees with this view. Several prominent AI researchers have pushed back, arguing that model architecture innovations and training techniques still create meaningful differentiation. Anthropic CEO Dario Amodei has repeatedly stated that safety and alignment research creates durable advantages that can't be commoditized.

Enterprise AI Adoption Validates the Thesis

Enterprise AI adoption patterns lend some credibility to Ghodsi's position. According to McKinsey's 2026 Global AI Survey, the top barrier to AI adoption cited by enterprises is not model quality (ranked 5th) but data quality and data infrastructure readiness (ranked 1st, cited by 67% of respondents). Companies report that the most time-consuming and expensive part of any AI initiative is not building or fine-tuning models, but preparing, cleaning, governing, and serving the data those models need.

Databricks has capitalized on this dynamic with its Unity Catalog for data governance, Delta Lake for reliable data storage, and MosaicML (acquired in 2023) for enterprise model training. The combination allows enterprises to manage their entire AI lifecycle on a single platform, creating significant switching costs.

The company's latest product, announced in January 2026, is an AI agent platform that allows enterprises to build, deploy, and monitor AI agents that operate on their proprietary data. The platform has already attracted over 200 enterprise customers in private preview, according to Ghodsi.

Implications for AI Careers

If Ghodsi's thesis proves correct, the career implications are significant. Data engineering, data platform development, and infrastructure engineering would be among the most valuable and in-demand skill sets in the AI economy -- potentially more so than ML research or application development. Roles focused on data governance, data quality, and MLOps would command premium compensation as enterprises invest heavily in their data foundations.

This perspective aligns with what hiring managers at large enterprises are already reporting. A survey by Dice found that data engineer roles saw a 38% increase in job postings in 2025, outpacing the growth of ML engineer roles (28%) and data scientist roles (15%). For professionals preparing to enter or advance in the AI infrastructure space, practicing system design interviews that cover distributed data systems is essential -- tools like InterviewAlly provide targeted practice for these technical rounds.

Databricks vs. the Field

Databricks is not alone in pursuing the AI data infrastructure opportunity. Snowflake, its primary competitor, has made aggressive moves into AI with its Cortex platform and acquisition of Neeva. Google BigQuery, Amazon Redshift, and Microsoft Fabric are all adding AI-native features to their data platforms. Upstart competitors like MotherDuck, Firebolt, and StarRocks are targeting specific niches.

The competitive intensity underscores the value of the market Ghodsi is describing. With Databricks now valued at $62 billion following its December 2025 funding round, and Snowflake's market capitalization hovering around $55 billion, investors are clearly betting that the data infrastructure layer will be worth hundreds of billions of dollars in the AI era.

It will likely take several more years to learn whether Ghodsi's prediction that data infrastructure becomes the "most valuable layer" proves correct. But the early evidence -- enterprise spending patterns, vendor financials, and hiring trends -- points in the same direction: the data layer is where the real AI action is.