The AI industry is experiencing a seismic shift. Five years ago, the bottleneck was computing power and algorithms. Companies competed on who could build the best ML models. Today, that’s table stakes. The new bottleneck is data: specifically, high-quality, domain-specific data that trains AI systems to actually work on real problems.
This shift has created an explosion in demand for data operations expertise. Venture capital is pouring into data services companies. Margins are expanding, customer acquisition costs are dropping, and valuations are soaring. Surge AI reportedly reached $1 billion ARR with just 110 employees. Those aren’t ordinary services unit economics; they’re the economics of something genuinely valuable that’s impossible to scale with traditional approaches.
This trend is reshaping how organizations build AI. Instead of treating data as a commodity (“get whatever labels you can afford”), sophisticated organizations are treating it as strategic. They’re partnering with expert data operations firms that understand their domain, can maintain quality guarantees, and can adapt flexibly to their evolving needs.
For organizations serious about AI, the question isn’t whether to invest in data operations expertise. It’s who to partner with.
The Explosion of AI Consulting and Data Services
The numbers tell the story. Five years ago, data labeling and annotation were small, fragmented markets dominated by low-cost providers. Today, the market is consolidating around quality providers. Startups are raising Series A and B rounds specifically for “AI data infrastructure” or “labeling operations.”
What’s driving this explosion?
Frontier AI Requires Better Data: The companies training the largest language models and most advanced AI systems have discovered that data quality directly predicts model quality. Scaling to 70B-parameter models doesn’t help if training data quality is poor. Cutting corners on data preparation is false economy. Organizations training frontier models are willing to pay premium prices for expert data operations because the alternative, poor-quality models, is worse.
Domain-Specific AI Requires Domain Expertise: General-purpose AI models trained on internet-scale data work reasonably well for general tasks. But domain-specific AI, whether for medical imaging, financial forecasting, robotics, or regulatory compliance, requires training data that reflects real-world domain complexity. This data can’t be sourced from Mechanical Turk or generic crowdsourcing platforms; it requires annotators who understand the domain.
Quality Standards are Increasing: Early AI projects had modest quality requirements. 85% accuracy was acceptable. Today, organizations deploying AI in high-consequence domains (healthcare, finance, autonomous systems) require 95%+ accuracy. Maintaining this level of accuracy at scale requires expert infrastructure.
Regulatory Pressure: As AI is regulated more heavily, organizations need to document their data processes and be able to audit and explain their training data. This pushes them toward specialized partners who can provide compliance-ready operations.
Vertical Expertise Premium: Organizations will pay 2-3x more for a data partner who understands their specific vertical than for a generalist. An annotation provider who understands financial services terminology, regulatory requirements, and domain-specific nuances is worth far more than a generic annotation provider.
Why Surge AI Reached $1B ARR with 110 Employees: The Quality Over Volume Model
Surge AI is a data services company that reportedly reached $1 billion ARR with an extremely lean team. How?
The traditional annotation services business model is volume-based: hire lots of annotators, do lots of labeling, charge per label. Margins are thin because labor is commoditized. You compete on price. To grow, you need to hire more annotators proportionally.
Surge took a different approach: quality over volume. They positioned themselves as a premium data services provider for frontier AI organizations. Instead of doing millions of cheap labels, they did thousands of high-quality labels for organizations building the most advanced AI systems.
The economics transform:
- Premium pricing: Customers pay 5-10x more because the quality and specialization are so high
- Flexible engagement: Rather than contracts for millions of labels, engagements are for outcomes (e.g., “achieve 95%+ accuracy on your validation set”)
- High leverage: With expert annotators and rigorous processes, very few people can deliver what would take 100+ generalist annotators
- Defensible positioning: Competing on price is easy; competing on quality is hard. Surge’s positioning is defensible.
This model has profound implications: if Surge achieved $1B ARR with 110 employees, that implies $9M ARR per employee. In traditional annotation services, the norm is maybe $100-200K ARR per employee. The difference is massive and reflects the fundamental shift from commodity labor to expert operations.
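The arithmetic behind that comparison is worth making explicit. A quick back-of-the-envelope sketch (the traditional-services headcount and revenue figures are illustrative stand-ins for the $100-200K range cited above, not audited numbers):

```python
# Back-of-the-envelope ARR-per-employee comparison using the figures in the
# text above; the volume-model numbers are illustrative, not a real company's.
def arr_per_employee(arr_usd: float, headcount: int) -> float:
    """Annual recurring revenue attributable to each employee."""
    return arr_usd / headcount

quality_model = arr_per_employee(1_000_000_000, 110)  # Surge: ~$9.1M per employee
volume_model = arr_per_employee(20_000_000, 100)      # traditional: $200K per employee

print(f"Quality model:  ${quality_model:,.0f} per employee")
print(f"Volume model:   ${volume_model:,.0f} per employee")
print(f"Leverage ratio: {quality_model / volume_model:.0f}x")  # roughly 45x
```

Even taking the generous end of the traditional range, the leverage gap is more than an order of magnitude, which is the point: revenue per person, not headcount, is what the quality model scales.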
Other successful companies are following similar playbooks. Snorkel AI raised substantial funding for data labeling infrastructure, and Scale AI for data services. Companies are shifting from competing on cost to competing on quality and specialization.
The Shift from Commodity Annotation to Expert Consulting
This evolution is happening in real time. The old model is becoming untenable:
Phase 1: Commodity Annotation (2015-2019)
Annotation was simple: hire people, have them label data, deliver results. Companies competed entirely on price and speed. Profit margins were thin. Quality variability was high.
The value proposition was: “We can do a lot of labeling quickly and cheaply.”
Phase 2: Managed Services (2020-2022)
As companies realized that cheap labeling created poor models, they started buying managed services. Partners would handle annotation end-to-end, including quality assurance and process management. Pricing increased. Margins improved.
The value proposition was: “We’ll manage your annotation and guarantee quality.”
Phase 3: Expert Consulting (2023-Present)
Leading organizations now buy expert consulting on their data strategy. Rather than “do this annotation,” it’s “help us build high-quality training data for our specific domain and use case.”
Partners now:
- Advise on what data to collect and label
- Design annotation guidelines
- Manage quality assurance
- Provide domain-specific expertise
- Adapt to the customer’s evolving needs
The value proposition is: “We’ll help you build world-class training data.”
This shift from commodity to expertise is fundamental. It changes what customers want, how much they’ll pay, and what competitive advantages matter.
What Frontier AI Labs Actually Need from Data Partners
To understand where the market is heading, it helps to understand what the most sophisticated AI organizations actually need.
Domain Expertise: A company building AI for medical imaging doesn’t need generic annotation. They need partners who understand radiology, know what features radiologists care about, understand edge cases in medical imaging, and can catch errors that a generalist would miss.
Quality Guarantees: “We’ll achieve 95%+ accuracy on your test set” is a meaningful commitment. Organizations building frontier AI are willing to pay for this guarantee because the cost of poor-quality training data is so high.
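A guarantee like this only means something if it can be measured. A minimal sketch of the idea, assuming delivered labels are checked against a customer-held gold set; the item IDs, labels, and 95% threshold below are all illustrative, not any specific vendor’s process:

```python
# Illustrative gold-set check: compare delivered labels against reference
# labels held back by the customer. All data and thresholds are made up.
def gold_set_accuracy(delivered: dict, gold: dict) -> float:
    """Fraction of gold-set items whose delivered label matches the gold label."""
    matches = sum(delivered.get(item) == label for item, label in gold.items())
    return matches / len(gold)

gold = {"img_01": "cat", "img_02": "dog", "img_03": "cat", "img_04": "bird"}
delivered = {"img_01": "cat", "img_02": "dog", "img_03": "dog", "img_04": "bird"}

accuracy = gold_set_accuracy(delivered, gold)
print(f"Gold-set accuracy: {accuracy:.0%}")  # 3 of 4 correct -> 75%

TARGET = 0.95  # illustrative contractual target
print("SLA met" if accuracy >= TARGET else "SLA missed")  # prints "SLA missed"
```

In practice the gold set is kept blind to annotators and refreshed over time, so the measured accuracy reflects ongoing work rather than a one-time audit.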
Flexibility and Iteration: AI development is iterative. Today you need labels for one type of data. Next month, you’ve discovered a new edge case category and need labels for that. Data partners need to adapt quickly rather than requiring long procurement cycles.
Outcomes Alignment: Instead of “we’ll label 1M items,” it’s “we’ll help you achieve your accuracy target within your timeline and budget.” This alignment makes the data partner a true partner rather than a vendor.
Process Transparency: Organizations want visibility into how their data was labeled, what quality checks were applied, and what the accuracy actually is. Black-box annotation services are becoming unacceptable.
Regulatory Compliance: Depending on the domain, customers need partners who understand regulatory requirements and can ensure that data handling meets compliance standards.
Scaling Capability: As models scale, data requirements scale. Partners need the ability to grow with customers rather than hit capacity limits.
The Managed Outcomes Model: Outcome-Aligned SLAs vs. Hourly Billing
The old model: “Pay us $X per label.”
The new model: “We’ll achieve this outcome. You only pay if we succeed.”
Outcome-aligned SLAs are revolutionary because they:
- Align Incentives: The data partner only wins if the customer succeeds. This creates alignment that hourly billing never achieves.
- Reduce Risk for Customers: If a data partner fails to hit quality targets, they don’t get paid. Customers have downside protection.
- Reward Efficiency: A partner who can achieve outcomes with less work increases their margin without increasing customer costs. The incentive is to be efficient, not to maximize billable hours.
- Enable Flexible Engagement: Since payment is outcome-based, engagements can be more flexible. Need faster turnaround? You’ll pay more. Need to adjust scope? Easy to renegotiate based on new outcomes.
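The payout logic these four points describe can be sketched in a few lines. This is a hypothetical illustration of how such a contract might be encoded; the fee, accuracy target, and rush premium are invented numbers, not any firm’s actual terms:

```python
# Hypothetical outcome-aligned payout: the partner invoices the full fee only
# if the agreed accuracy target is met, with a premium tier for rush delivery.
def outcome_payout(accuracy: float, target: float, fee: float,
                   rush_delivered: bool = False, rush_premium: float = 0.25) -> float:
    """Fee owed under an outcome-aligned SLA."""
    if accuracy < target:
        return 0.0  # downside protection: miss the target, no payment
    payout = fee
    if rush_delivered:
        payout *= 1 + rush_premium  # faster turnaround commands a premium
    return payout

print(outcome_payout(0.96, 0.95, 100_000))                       # 100000.0
print(outcome_payout(0.92, 0.95, 100_000))                       # 0.0
print(outcome_payout(0.96, 0.95, 100_000, rush_delivered=True))  # 125000.0
```

Note how the incentives fall out of the structure: the partner is paid for the outcome, not the hours, so efficiency raises margin instead of reducing revenue.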
Outcome-aligned SLAs require partners to have:
- Confidence in their ability to deliver outcomes (which means having robust processes and experienced teams)
- Willingness to take on risk (if they fail, they don’t get paid)
- Deep understanding of their customer’s business (to set realistic, achievable outcomes)
This model is increasingly standard among leading companies. When BergLabs engages with frontier AI organizations, we typically propose outcome-aligned engagements rather than labor-based pricing.
How BergLabs Positions in This New Landscape as a Full-Stack Partner
As this market has evolved, we’ve positioned BergLabs not as an annotation provider but as a full-stack AI operations partner. This positioning reflects both the market evolution and our own evolution as an organization.
Rather than specializing in any single service (annotation, moderation, reconciliation), we position ourselves as partners across the entire data and operations lifecycle:
Upstream: We help customers think through data strategy, what to collect, how to prepare it, and what quality targets make sense.
Core Operations: We execute high-quality data operations (annotation, labeling, content moderation, quality assurance) at scale.
Downstream: We help customers operationalize their AI systems in production, including monitoring, quality assurance, and continuous improvement.
This full-stack positioning is valuable because it recognizes that annotation doesn’t exist in isolation. It’s part of a larger lifecycle. Customers want partners who understand that lifecycle and can optimize across the entire flow.
The Six Pillars: Scale, Speed, Savings, Quality, Specialization, Support
BergLabs differentiates on six dimensions that matter to sophisticated customers:
Scale: Ability to handle millions of labels, large teams of annotators, and global operations. When you need to accelerate your timeline, we can scale capacity in days, not months.
Speed: Beyond just scale, we can compress timelines. Most annotation vendors deliver in 8-16 weeks. We can often deliver in 4-8 weeks without sacrificing quality because of our infrastructure and experienced teams.
Savings: Our full-stack approach reduces overhead for customers. Instead of coordinating with multiple vendors, customers have one partnership. Instead of building internal annotation infrastructure, they leverage ours. The total cost is typically lower than the alternative.
Quality: Our quality assurance infrastructure (gold sets, multi-layer QA, real-time monitoring, expert feedback) maintains 95%+ accuracy at scale. We guarantee quality outcomes, not just effort.
Specialization: We’ve developed deep expertise in specific verticals (e-commerce, UGC, PropTech, BFSI, robotics). Vertical expertise enables us to deliver superior results because we understand domain-specific nuances.
Support: We operate 24/7 across global centers. We provide transparent communication, real-time dashboards showing progress, and direct access to expert teams who understand customer needs.
These six pillars differentiate us from both low-cost commodity providers and boutique specialists who excel in one area but lack scale or operational breadth.
The Future of Data Operations as Strategic Function
Looking ahead, data operations is becoming more strategic, more important, and more central to AI success. Organizations will increasingly:
- Invest in data as a core competency
- Partner with specialized firms for execution rather than building entirely in-house
- Demand outcome-aligned partnerships with clear quality guarantees
- Expect deep domain expertise from their partners
- Require full transparency and auditability
- Integrate data operations into their product development cycle
The winners in this market will be organizations that can offer full-stack capabilities, maintain high quality at scale, specialize in specific verticals, and align incentives with their customers.
