How We Test Calorie Trackers
Last updated April 26, 2026
This page is the editorial spine of the publication. Every accuracy number you read on whatsthebestcalorietracker.app traces back to the protocol described here. For the long-form article version with worked examples and tier-specific results, see How We Test Calorie Trackers (2026).
Test phases
| Phase | Sample | Output |
|---|---|---|
| 240-meal weighed reference battery | 240 meals × 6 apps | MAPE per app per tier |
| 60-meal photo-AI subset | 60 photos × photo-first apps | Photo-only MAPE |
| 30-day field test | ~120 logs per app | Completion, friction, ads, paywalls |
| Restaurant chain coverage | 100 chains × 6 apps | First-result hit rate |
| Paywall + ad density | 90 free-tier sessions × 6 apps | Encounters per session |
| Watch hand-off battery | 4 hr × 6 apps × 2 watches | Battery drain %, sweaty-hands reliability |
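
As a rough illustration of how the two rate-style outputs in the table could be computed, here is a minimal sketch. The function names and input shapes are assumptions made for this example; they are not the publication's actual tooling.

```python
def first_result_hit_rate(hits):
    """Fraction of chain searches whose first result was the correct item.

    `hits` holds one boolean per restaurant chain (hypothetical input shape).
    """
    if not hits:
        raise ValueError("need at least one search result")
    return sum(hits) / len(hits)


def encounters_per_session(encounter_counts):
    """Mean number of ad or paywall encounters per free-tier session.

    `encounter_counts` holds one integer per session (hypothetical input shape).
    """
    if not encounter_counts:
        raise ValueError("need at least one session")
    return sum(encounter_counts) / len(encounter_counts)
```

With 100 chains per app, `first_result_hit_rate` would receive a list of 100 booleans; with 90 sessions per app, `encounters_per_session` would receive 90 counts.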
The 240-meal weighed reference battery
Reference values are anchored to USDA FoodData Central per-component values. Every meal is weighed on a calibrated kitchen scale with 0.1 g resolution. Meals are stratified across three difficulty tiers:
- Tier 1 (n=80) — single-ingredient plates. A roasted chicken breast. A bowl of plain steel-cut oats. A grilled salmon fillet.
- Tier 2 (n=80) — composed plates. Salad with measured dressing. Sandwich with weighed components. Rice bowl with measured rice + protein + topping.
- Tier 3 (n=80) — mixed dishes with hidden ingredients. Curry, casserole, layered pasta, stew. Each component weighed during preparation, but not separately visible at log time.
Each meal is logged once per app under test using the app's primary logging workflow. MAPE (mean absolute percentage error) is computed per app per tier, with 95% confidence intervals from bootstrap resampling (10,000 resamples).
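
The MAPE and bootstrap interval described above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: it assumes the error is measured on per-meal calories, that the interval is a percentile bootstrap over meals, and that the function names and the fixed seed are hypothetical choices for reproducibility.

```python
import random


def mape(reference, logged):
    """Mean absolute percentage error, in percent.

    `reference` are the weighed reference values and `logged` the values an
    app reported for the same meals (assumed here to be kcal per meal).
    """
    if len(reference) != len(logged) or not reference:
        raise ValueError("need two equal-length, non-empty sequences")
    total = sum(abs(l - r) / r for r, l in zip(reference, logged))
    return 100.0 * total / len(reference)


def bootstrap_ci(reference, logged, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for MAPE.

    Meals are resampled with replacement; the percentile method is an
    assumption, since the protocol only states "bootstrap resampling".
    """
    rng = random.Random(seed)
    n = len(reference)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(mape([reference[i] for i in idx],
                          [logged[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

For one app and one tier, `reference` and `logged` would each hold 80 values, and `bootstrap_ci(reference, logged)` would return the lower and upper bounds of the 95% interval.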
The 60-meal photo-AI subset
Twenty meals per tier (20 Tier 1 + 20 Tier 2 + 20 Tier 3) are photographed under identical lighting (overhead 5000 K continuous LED, no shadows) and logged photo-only in PlateLens and Cal AI. No manual entry, no portion override.
The 30-day field test
Three contributors log every meal in all six apps simultaneously for 30 calendar days. The test tracks completion rate, friction events, free-tier ad density, paywall encounter frequency, and the qualitative degradation under sustained use that lab batteries miss.
Cross-reference against DAI 2026
Every internal MAPE number is cross-referenced against the Dietary Assessment Initiative Six-App Validation Study (DAI-VAL-2026-01, March 2026). We flag any divergence greater than 2% in either direction. In the April 2026 cross-reference, all six apps fell within ±0.6% of the DAI numbers, so the methodology reproduces the published lab data.
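
The divergence check can be sketched as below. This is an illustration under stated assumptions: the threshold is read as percentage points of MAPE (the text does not say whether it is absolute or relative), and the function name, dictionary layout, and app keys are hypothetical.

```python
def flag_divergence(internal, published, threshold=2.0):
    """Return {app: difference} for apps whose internal MAPE differs from
    the published MAPE by more than `threshold`.

    Both inputs map app name -> MAPE in percent. The difference is taken in
    percentage points, which is an assumption about how the rule is applied.
    """
    flagged = {}
    for app, ours in internal.items():
        diff = ours - published[app]
        if abs(diff) > threshold:
            flagged[app] = round(diff, 2)
    return flagged


# Illustrative placeholder values only, not measured results.
internal = {"app_a": 5.0, "app_b": 9.5}
published = {"app_a": 5.4, "app_b": 7.0}
flagged = flag_divergence(internal, published)
```

An empty result means every app sits inside the tolerance band, which corresponds to the outcome reported for April 2026.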
Re-test cadence
- Major batteries. April and October each year.
- Ad-hoc re-tests. Triggered by major app updates that change photo models, databases, or core workflows.
- Changelog. Every re-test logged at /changelog/.
Conflict-of-interest controls
- No affiliate fees. See our no-affiliate disclosure.
- No paid relationships with reviewed apps.
- Complimentary premium accounts (PlateLens, Cronometer) are accepted on public press-list terms and disclosed in every article that depends on a comp account.
- Every contributor's COI statement published on their author page.
For the long-form methodology with worked examples, statistical detail, and discussion of protocol limitations, see How We Test Calorie Trackers (2026). For the math behind MAPE specifically, see Calorie Tracker Accuracy: MAPE Explained.