Methodology — the Xodexa Humanoid Index (XHI)
v1.0.0
The XHI is a transparent, reproducible composite score from 0 to 100 for general-purpose humanoid robots. The design goal is to be defensible and hard to game: every input is an observable, published spec; every transform is documented here; and the pillar weights reflect where the field agrees value is actually created. Autonomy and manipulation dominate, while raw locomotion, now solved well enough to be table stakes, no longer wins on its own.
Pillar weights
| Pillar | Weight |
|---|---|
| Autonomy & Intelligence | 25% |
| Manipulation & Dexterity | 20% |
| Mobility & Locomotion | 15% |
| Hardware & Engineering | 15% |
| Commercial Readiness & Deployment | 15% |
| Ecosystem & Viability | 10% |
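As a rough illustration of how the six pillar weights could combine into one 0-100 score, here is a minimal sketch. It assumes the composite is a plain weighted average of pillar scores; the text does not spell out the aggregation rule, and the names `PILLAR_WEIGHTS` and `composite_score` are hypothetical, not taken from app/ranking.py.

```python
# Pillar weights copied from the table above, as fractions of the total.
PILLAR_WEIGHTS = {
    "autonomy": 0.25,
    "manipulation": 0.20,
    "mobility": 0.15,
    "hardware": 0.15,
    "commercial": 0.15,
    "ecosystem": 0.10,
}


def composite_score(pillar_scores, weights=PILLAR_WEIGHTS):
    """Weighted average of 0-100 pillar scores (assumed aggregation rule).

    Dividing by the total weight means a custom weight set does not
    have to sum to exactly 1.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * pillar_scores[name] for name in weights)
    return weighted_sum / total_weight


# Made-up pillar scores for a hypothetical robot, for illustration only.
example_scores = {
    "autonomy": 80,
    "manipulation": 60,
    "mobility": 70,
    "hardware": 50,
    "commercial": 40,
    "ecosystem": 30,
}

print(round(composite_score(example_scores), 1))
```

Passing a different `weights` dictionary re-weights the index without touching the pillar scores themselves.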
Why these weights
- Autonomy & Intelligence — 25%. The single biggest open problem and value driver. A teleoperated robot is a puppet; a task-autonomous one is a worker.
- Manipulation & Dexterity — 20%. Useful work in human spaces is bottlenecked on hands.
- Mobility & Locomotion — 15%. Necessary but no longer the differentiator it was in the ASIMO era.
- Hardware & Engineering — 15%. DoF, actuation modernity, payload-to-weight, onboard compute.
- Commercial Readiness — 15%. Real paid pilots beat announcement videos.
- Ecosystem & Viability — 10%. Capital decides who survives to iterate — weighted lowest because money ≠ capability.
The anti-hype rule: teleoperation is capped
The most common way humanoid demos mislead is by showing a human-piloted robot as if it were autonomous. XHI scores the demonstrated autonomy mode on an explicit ladder, and teleoperation sits near the floor. This is why a polished consumer robot that relies on remote VR operators ranks below a plainer machine that genuinely does its own work.
| Autonomy level | Base score (points) |
|---|---|
| research | 20 |
| teleoperated | 35 |
| supervised-autonomy | 70 |
| task-autonomous | 92 |
Bonuses: +8 if the robot ships a named end-to-end / vision-language-action (VLA) / foundation policy; +5 for dedicated onboard AI compute. Capped at 100.
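The ladder and bonuses above can be sketched as a small lookup plus a cap. The point values come from the table and the bonus rule; the function name, argument names, and dictionary name are hypothetical and only illustrate the arithmetic.

```python
# Base points per demonstrated autonomy level, from the table above.
AUTONOMY_BASE = {
    "research": 20,
    "teleoperated": 35,
    "supervised-autonomy": 70,
    "task-autonomous": 92,
}


def autonomy_score(level, has_named_policy=False, has_onboard_ai_compute=False):
    """Ladder base score plus bonuses, capped at 100."""
    score = AUTONOMY_BASE[level]
    if has_named_policy:
        score += 8  # named end-to-end / VLA / foundation policy
    if has_onboard_ai_compute:
        score += 5  # dedicated onboard AI compute
    return min(score, 100)


# A teleoperated robot with both bonuses still trails plain supervised autonomy.
print(autonomy_score("teleoperated", True, True))
print(autonomy_score("supervised-autonomy"))
```

Under these numbers, a teleoperated robot that collects both bonuses reaches 48, which is still below the 70-point base for supervised autonomy. That gap is what the cap on teleoperation amounts to in practice.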
How each pillar is computed
- Autonomy = autonomy-ladder base + AI/compute bonuses.
- Manipulation = hand-DoF (55%, vs a 22-DoF human-hand reference) + payload (30%) + dexterous-hands flag (15%).
- Mobility = walk speed (55%, vs 2.5 m/s) + runtime (45%, vs 5 h).
- Hardware = total DoF (50%, vs 60) + payload-to-weight efficiency (25%) + actuation modernity (electric > hybrid > hydraulic) + onboard compute.
- Commercial = maturity ladder + 5 pts per named deployment (max +20).
- Ecosystem = funding + valuation, with a viability floor for deep-pocketed corporate parents.
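To make one of these formulas concrete, here is a minimal sketch of the Mobility pillar, the only one above whose weights and reference values are all stated. It assumes each sub-metric is the spec divided by its reference, clipped to the range 0 to 1, and that both specs are published; the function names are hypothetical.

```python
WALK_SPEED_REF_MPS = 2.5  # reference walk speed from the Mobility formula
RUNTIME_REF_H = 5.0       # reference runtime from the Mobility formula


def fraction_of_reference(value, reference):
    """Share of the reference achieved, clipped to [0, 1] (assumed transform)."""
    return max(0.0, min(value / reference, 1.0))


def mobility_score(walk_speed_mps, runtime_h):
    """Mobility = walk speed (55%) + runtime (45%), on a 0-100 scale."""
    speed_part = 0.55 * fraction_of_reference(walk_speed_mps, WALK_SPEED_REF_MPS)
    runtime_part = 0.45 * fraction_of_reference(runtime_h, RUNTIME_REF_H)
    return 100.0 * (speed_part + runtime_part)


# Hypothetical robot: walks at 1.25 m/s and runs for 2.5 h on a charge.
print(mobility_score(1.25, 2.5))
```

The other pillars would presumably follow the same pattern, but several of their weights or reference values are not listed here, so they are left out of the sketch.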
Commercial maturity ladder
| Status | Base score (points) |
|---|---|
| research | 10 |
| retired | 15 |
| prototype | 28 |
| pilot | 52 |
| limited-production | 76 |
| commercial | 94 |
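The Commercial pillar (maturity ladder plus 5 points per named deployment, bonus capped at +20) might be computed along these lines. The final cap at 100 is an assumption made to keep the pillar on the same 0-100 scale; the text only states the cap on the deployment bonus. All names are hypothetical.

```python
# Base points per commercial status, from the maturity ladder above.
MATURITY_BASE = {
    "research": 10,
    "retired": 15,
    "prototype": 28,
    "pilot": 52,
    "limited-production": 76,
    "commercial": 94,
}


def commercial_score(status, named_deployments=0):
    """Maturity base plus 5 points per named deployment (bonus max +20)."""
    deployment_bonus = min(5 * named_deployments, 20)
    # Assumption: the pillar total is clipped to 100, like the autonomy pillar.
    return min(MATURITY_BASE[status] + deployment_bonus, 100)


# Hypothetical robot in pilot status with two named deployments.
print(commercial_score("pilot", named_deployments=2))
```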
Corporate-backed viability floor applies to: Boston Dynamics, Honda, Hyundai, LG, Samsung, Tesla, XPeng, Xiaomi.
Normalisation reference points
Sub-metrics are min-max normalised against frontier reference values, so a perfect 100 means "at or beyond the best demonstrated humanoid", not merely best in this list. A missing spec scores 0 for that sub-metric (absence is treated as informative, never imputed).
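A minimal sketch of that normalisation rule follows, assuming the lower bound of the min-max range is 0 unless stated otherwise (the text gives only the upper, frontier reference values). The function name `normalise` and its `floor` parameter are hypothetical.

```python
from typing import Optional


def normalise(value: Optional[float], reference: float, floor: float = 0.0) -> float:
    """Min-max normalise a spec to 0-100 against a frontier reference.

    A missing spec (None) scores 0 rather than being imputed, and values
    at or beyond the reference are clipped to 100.
    """
    if value is None:
        return 0.0  # absence is treated as informative
    scaled = (value - floor) / (reference - floor)
    return 100.0 * max(0.0, min(scaled, 1.0))


print(normalise(None, 2.5))  # unpublished walk speed
print(normalise(3.0, 2.5))   # beyond the frontier reference
print(normalise(1.25, 2.5))  # halfway to the reference
```

Scoring a missing spec as 0 means a vendor gains nothing by withholding an unflattering number.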
The scoring logic is available in app/ranking.py and via the methodology API, so anyone can re-weight and recompute.