Sector Thesis · 4 min read · Week 26

Voice Tech Growth Stage: Team Structure That Survives Scaling

Voice tech in India is at an inflection point. 285 million vernacular internet users need voice-first products. Scaling from 20 to 100 people demands ruthless team structure choices. Culture drift kills more voice startups than bad tech.

By Amit Tyagi · Fitoor Capital

The Inflection Moment

285 million vernacular internet users exist in India today. Most use phones with ≤2GB RAM. Your architecture for Mumbai doesn't scale to Meerut. Team structure must reflect this reality, not San Francisco org charts.

Voice tech is uniquely local. A Hindi speaker evaluates your UX differently than an engineer in Bangalore. By the time you're at 50 people, your Tier-2 user feedback gets filtered through three layers. It arrives distorted.

The Span of Control Problem

At 20 people, your founder manages product, growth, and part of engineering. This works because decisions are synchronous. Everyone sits in one room or one Slack thread.

At 50 people, that model collapses. You need a VP of Product, VP of Eng, and a Growth Lead. But spans matter more than titles.

Keep engineering pods to 5-7 people maximum. One senior engineer leads acoustics. Another leads server-side ranking. Another owns mobile SDK. Each reports to a tech lead, not directly to your VP Eng. This sounds hierarchical. It isn't. It's clarity.

A VP with 15 direct reports becomes a bottleneck within six months. Your acoustic engineer waits two weeks for architecture guidance. Meanwhile, a competitor ships a vernacular ASR model trained on regional accents.

The VP Eng Inflection

You'll hire a VP Eng when you are between 30 and 40 people. This hire defines your next 3-4 years.

Wrong hire: Someone from a consumer app who optimizes for velocity. They flatten structure, move fast, break things. Voice tech breaks differently. An ASR model trained on bad data breaks quietly. Inference latency creeps up 40ms at a time. These aren't visible in sprint reviews.
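The quiet latency creep described above can be made visible with a check against a fixed baseline rather than against the previous release. The following is a minimal sketch, not a production monitor: the function name, the 50 ms per-release alert threshold, the 100 ms total budget, and the sample p95 figures are all illustrative assumptions, chosen so that each release adds 40 ms.

```python
from typing import Dict, Sequence


def flag_latency_creep(
    release_p95_ms: Sequence[float],
    step_alert_ms: float = 50.0,   # assumed per-release alert threshold
    budget_ms: float = 100.0,      # assumed total drift budget vs. baseline
) -> Dict[str, object]:
    """Compare per-release regressions with cumulative drift from the baseline.

    release_p95_ms[0] is treated as the baseline release.
    """
    # Release-over-release deltas: what a sprint review would typically see.
    steps = [b - a for a, b in zip(release_p95_ms, release_p95_ms[1:])]
    step_alerts = sum(1 for s in steps if s > step_alert_ms)

    # Drift of the latest release against the fixed baseline.
    cumulative = release_p95_ms[-1] - release_p95_ms[0]

    return {
        "step_alerts": step_alerts,
        "cumulative_ms": cumulative,
        "over_budget": cumulative > budget_ms,
    }


# Hypothetical p95 inference latency per release, creeping up 40 ms at a time.
history = [300.0, 340.0, 380.0, 420.0]
report = flag_latency_creep(history)
print(report)
```

No single release trips the per-release alert, yet the cumulative drift exceeds the budget. That is the kind of regression a velocity-focused review tends to miss.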

Right hire: Someone with systems thinking. Telecom background, ML infra experience, or voice-adjacent (think Jio's infrastructure builders). They ask uncomfortable questions: "What's our inference SLA?" "How do we version acoustic models?" "What's our fallback when the API times out in Punjab?"
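One possible answer to the timeout question is a hard deadline on the cloud call with a degraded on-device fallback. The sketch below assumes this design; `slow_cloud_asr`, `fast_cloud_asr`, and `tiny_on_device_asr` are hypothetical stand-ins, not real APIs, and the 0.8 second default deadline is an arbitrary placeholder for whatever inference SLA the team agrees on.

```python
import concurrent.futures
import time
from typing import Callable, Tuple


def transcribe_with_fallback(
    audio: bytes,
    cloud_fn: Callable[[bytes], str],
    local_fn: Callable[[bytes], str],
    timeout_s: float = 0.8,  # assumed SLA, not a recommendation
) -> Tuple[str, str]:
    """Try the cloud recognizer first; fall back on timeout or connection error."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(cloud_fn, audio)
    try:
        return future.result(timeout=timeout_s), "cloud"
    except (concurrent.futures.TimeoutError, ConnectionError):
        # The cloud call may still be running; we stop waiting for it.
        return local_fn(audio), "on_device"
    finally:
        # Do not block the caller on the abandoned cloud request.
        pool.shutdown(wait=False)


# Hypothetical stand-ins for real recognizers.
def slow_cloud_asr(audio: bytes) -> str:
    time.sleep(0.3)  # simulates a flaky connection
    return "cloud:" + audio.decode()


def fast_cloud_asr(audio: bytes) -> str:
    return "cloud:" + audio.decode()


def tiny_on_device_asr(audio: bytes) -> str:
    return "local:" + audio.decode()


degraded = transcribe_with_fallback(b"namaste", slow_cloud_asr, tiny_on_device_asr, timeout_s=0.05)
healthy = transcribe_with_fallback(b"namaste", fast_cloud_asr, tiny_on_device_asr)
print(degraded, healthy)
```

Returning the source label alongside the transcript lets the team count how often users in weak-network regions actually hit the fallback path.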

Your VP Eng will either embed data rigor into culture or enable cowboy engineering. There's no middle ground at this stage.

Culture Drift Is Real

Voice products require patience. Your Tier-2 user has a 4G connection that drops three times per call. Optimization feels pointless to a new hire from IIT who last used 5G in Hyderabad.

By the time you're at 80 people, half your team has never shipped a voice product. They optimize for metrics they understand: DAU, sessions, engagement. These are the wrong proxies for voice.

Your day-7 retention might be 25%. Your day-30 retention might be 40%. That gap isn't a product problem. It's users learning that your app works when the connection is stable.

You need ritualized founder presence in product calls. Not as oversight. As a grounding force. The founder asks: "Would a Lucknow user understand this error message?" This question is not scalable. Embed it anyway.

Practical Structure at 50-70 People

Think in pods, not departments:

Core Platform Pod: 5 people. ASR, NLP, ranking. Owned by your best engineer.

Mobile SDK Pod: 4 people. Integration, offline sync, battery optimization. Owned by someone who's shipped voice apps.

Backend Reliability Pod: 4 people. Inference serving, fallbacks, monitoring. Not exciting. Essential.

Product & Analytics: 3-4 people. One person owns Tier-2 user research (travels quarterly). One owns metrics rigor.

Growth: 4-6 people. But 50% of their time supports product discovery through voice usage data, not paid channels.

Notice: No department has more than 7 people reporting to one person. Each pod has a clear decision-maker.
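The span rule above is simple enough to check mechanically whenever the org chart changes. A rough sketch follows, assuming each pod's headcount is an upper bound on the direct reports of its owner and taking the upper end of each range given above; the names `MAX_SPAN` and `span_violations` are illustrative.

```python
from typing import Dict

MAX_SPAN = 7  # the ceiling stated in the article

# Upper-end headcounts from the pod list above.
pods: Dict[str, int] = {
    "Core Platform": 5,
    "Mobile SDK": 4,
    "Backend Reliability": 4,
    "Product & Analytics": 4,
    "Growth": 6,
}


def span_violations(groups: Dict[str, int], max_span: int = MAX_SPAN) -> Dict[str, int]:
    """Return every group whose headcount exceeds the allowed span."""
    return {name: size for name, size in groups.items() if size > max_span}


print(span_violations(pods))                 # the proposed pods
print(span_violations({"Flat VP Eng": 15}))  # the bottleneck case
```

The proposed pods all pass, while a flat structure with 15 people under one VP is flagged.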

The Founder's Remaining Job

Delegate product, not vernacular voice design.

Your VP Product can own roadmap. You own user research methodology. You decide how feedback from Indore changes a model trained on Mumbai data.

This seems inefficient. It's actually the constraint that keeps you honest.

The Hard Implication

If you want to scale voice tech to 10 million users, accept that your org chart at 70 people will look different from that of a fintech or B2B SaaS company at the same scale. Your VP Eng won't manage 15 people. Your product lead will spend 20% of their time on decisions that don't fit a spreadsheet.

This is not dysfunction. This is structure aligned to the problem.

Hire for it explicitly.

Amit Tyagi

Founder, AletheiaAI & GP, Fitoor Capital

Veteran of India's startup ecosystem. Writing about fundraising, investor psychology, and what it takes to build fundable startups in India.


#voice-tech #scaling #team-structure #india-startups
