I watched a user test last month. Banking app. Cairo.
Task: "Transfer 500 EGP to Ahmed."
The user opened the app. Navigated to transfers. Tapped the amount field. Stared at the keyboard.
Arabic keyboard on iOS. 28 keys. Numbers hidden behind a layer. English and Arabic mixed on contact names.
She switched to the English keyboard. Typed "500". Switched back to Arabic. Typed "أحمد" (Ahmed). Autocorrect changed it. Fixed it. Submitted.
38 seconds.
We gave her the same app with voice enabled.
"حول 500 جنيه لأحمد" ("Transfer 500 EGP to Ahmed").
4 seconds.
The Arabic Keyboard Problem
Typing in Arabic on mobile is objectively slower than English.
Why:
A study we ran with 200 users:
Voice isn't just faster. It's 4x faster.
When Voice Wins
1. Repeat Transactions
"Order my usual." "Recharge 100 EGP." "Book my regular ride."
These are high-frequency, low-variance actions. Users know what they want. Voice removes friction.
Example: A taxi app in Dubai added voice booking. 60% of rides on repeat routes are now booked by voice. Why? Because saying "home" is faster than typing your address.
2. Hands-Busy Scenarios
Cooking. Driving. Carrying groceries.
Users can't type. But they can talk.
Example: A recipe app added voice navigation. Users say "next step" while cooking. Engagement time went up 34%. Not because voice is magical—because typing with oily hands sucks.
3. Numeric Input
Phone numbers. Account numbers. OTP codes.
Arabic keyboards hide numbers behind a layer switch. English keyboards do too. Voice skips that.
Example: A telecom app added voice for recharge amounts. Users say "عشرين جنيه" ("twenty EGP"). No keyboard. No typos. Recharge completion rate went up 18%.
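A rough sketch of what the numeric case can look like in code, assuming a speech recognizer has already returned a text transcript. The `NUMBER_WORDS` table and the `parse_amount` function are hypothetical and deliberately tiny; a real app would need a much fuller number grammar.

```python
# Tiny, hypothetical lookup of spoken Egyptian Arabic number words.
# A real app would need a much fuller number grammar.
NUMBER_WORDS = {
    "عشرة": 10,    # ten
    "عشرين": 20,   # twenty
    "خمسين": 50,   # fifty
    "مية": 100,    # one hundred (colloquial)
}


def parse_amount(transcript):
    """Return the first amount found in a transcript, or None."""
    for token in transcript.split():
        if token in NUMBER_WORDS:
            return NUMBER_WORDS[token]
        # isdecimal() accepts both Western and Arabic-Indic digits.
        if token.isdecimal():
            return int(token)
    return None


print(parse_amount("عشرين جنيه"))  # 20
print(parse_amount("100 جنيه"))    # 100
print(parse_amount("جنيه"))        # None
```

When the function returns `None`, the app can fall back to the numeric keypad instead of guessing.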
4. Forms (With Caveats)
Multi-field forms are slow to type. Voice can help—if done right.
Bad: "Please say your full name, then your phone number, then your address."
Good: "Tell me your name and phone number."
Users don't want to recite a form. They want to state their intent and move on.
When Voice Fails
1. Exploration & Discovery
Browsing products. Scrolling feeds. Window shopping.
Voice is linear. You can't "scroll" with your voice. You can't casually explore.
Example: A fashion app tried adding voice search. "Show me blue dresses." Users saw 200 results. Then what? They still had to scroll and tap. Voice added friction, not speed.
2. Precise Editing
Changing one word in a sentence. Fixing a typo. Adjusting a number.
Voice is terrible at precision. Users end up repeating the entire input.
Example: A note-taking app added voice input. Users loved it for long paragraphs. Hated it for short edits. You can't say "change 'Tuesday' to 'Wednesday'" and expect it to work.
3. Private Contexts
Coffee shops. Offices. Public transit.
Users won't say sensitive info out loud.
Example: A banking app added voice transfers. Adoption in public places: 12%. Adoption at home: 68%. Context matters more than features.
4. Complex, Multi-Step Flows
Booking a flight with 6 filters. Configuring app settings. Anything with branching logic.
Voice requires users to hold too much state in their heads. Tapping is faster.
Example: A travel app tried voice booking. "Find me a flight to Dubai on Friday under $200 with one stop." Returned 40 flights. User still had to tap to compare. Voice didn't save time—it added a step.
The UX Trap
Most teams think: "Voice is cool. Let's add it everywhere."
That's wrong.
Voice isn't a feature. It's an input method. You don't add voice—you replace typing where typing sucks.
The test:
If the action takes more than 3 taps and involves typing, voice probably helps.
If it's exploration, editing, or context-sensitive, voice probably hurts.
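The test above can be sketched as a small decision function. The `Action` fields, the `voice_verdict` name, and the example tap counts are hypothetical; only the 3-tap threshold and the three "hurts" categories come from the test itself.

```python
from dataclasses import dataclass

# Categories the test flags as poor fits for voice.
VOICE_HOSTILE = {"exploration", "editing", "context-sensitive"}


@dataclass
class Action:
    """Hypothetical description of one user action in an app."""
    name: str
    taps: int              # taps needed to complete the action by hand
    involves_typing: bool  # does the user have to type anything?
    kind: str = "task"     # "task", "exploration", "editing", "context-sensitive"


def voice_verdict(action: Action) -> str:
    """Return 'probably hurts', 'probably helps', or 'unclear'."""
    # Check the hostile categories first: a long, typing-heavy flow
    # can still be a bad fit if it is exploratory or sensitive.
    if action.kind in VOICE_HOSTILE:
        return "probably hurts"
    if action.taps > 3 and action.involves_typing:
        return "probably helps"
    return "unclear"


# Illustrative inputs; the tap counts are assumed, not measured.
transfer = Action("money transfer", taps=5, involves_typing=True)
browse = Action("browse dresses", taps=1, involves_typing=False, kind="exploration")
print(voice_verdict(transfer))  # probably helps
print(voice_verdict(browse))    # probably hurts
```

Anything that matches neither rule comes back as "unclear", which is a prompt to test with users rather than a verdict.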
What We Got Wrong
We built Voqal thinking voice would replace 80% of app interactions.
It doesn't.
Best case: 30-40% of actions become voice-first. The rest stay tap-based.
And that's fine.
Users don't want voice everywhere. They want voice where it's faster than typing.
In Arabic apps, that's a big surface area. But it's not infinite.
The Data
We analyzed 40,000 voice sessions across 8 apps in MENA.
| Voice Adoption | Use Cases |
|---|---|
| **High (>50%)** | Money transfers, Bill payments, Taxi booking (repeat routes), Food ordering (repeat orders), Phone recharges |
| **Low (<15%)** | Product search, Account settings, Form editing, Browsing feeds, Reading content |
The pattern: Voice wins for known actions. Fails for discovery.
What To Build
If you're adding voice to your app:
1. Start with one high-frequency action. Don't add voice to 10 screens. Add it to the one action users do daily.
2. Make voice optional, not required. Users should be able to ignore it. If voice fails, typing should be one tap away.
3. Optimize for repeat users, not first-timers. New users will tap. Power users will talk. Design for retention, not activation.
4. Test in public contexts. If users won't use voice on the metro, don't ship it.
5. Measure completion rate, not adoption rate. 90% of users trying voice means nothing if 60% give up halfway.
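To make point 5 concrete, here is a minimal sketch of the two metrics side by side. The session record shape (`user_id`, `used_voice`, `completed`) is an assumption for illustration, not a real analytics schema, and the data is synthetic.

```python
def adoption_rate(sessions):
    """Share of users who started at least one voice session."""
    all_users = {s["user_id"] for s in sessions}
    voice_users = {s["user_id"] for s in sessions if s["used_voice"]}
    return len(voice_users) / len(all_users) if all_users else 0.0


def voice_completion_rate(sessions):
    """Share of voice sessions that ended with the task completed."""
    voice = [s for s in sessions if s["used_voice"]]
    if not voice:
        return 0.0
    return sum(s["completed"] for s in voice) / len(voice)


# Synthetic data matching the numbers in point 5: 100 users, 90 try voice,
# and 54 of those 90 (60%) give up halfway.
sessions = (
    [{"user_id": i, "used_voice": True, "completed": True} for i in range(36)]
    + [{"user_id": i, "used_voice": True, "completed": False} for i in range(36, 90)]
    + [{"user_id": i, "used_voice": False, "completed": True} for i in range(90, 100)]
)

print(adoption_rate(sessions))          # 0.9
print(voice_completion_rate(sessions))  # 0.4
```

The same dataset looks like a win at 90% adoption and a problem at 40% completion, which is why the second number is the one to track.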
The Bottom Line
68% of MENA users prefer voice to typing in Arabic.
But only 12% of apps offer it.
Not because it's hard to build—because teams don't know when to use it.
Voice isn't the future of UX. It's the present for high-frequency, low-variance actions in Arabic.
The rest? Still better with taps.