Special Briefing · Thematic (one-off; revisit on each AI/robotics index refresh) · June 16, 2026

What the Product Is For — Robotics and AI at the Harm Frontier

Sort the 50 robotics labs and 50 AI labs not by rank but by what their core product is *for*, and one gradient appears in both indexes at once: defense, surveillance, and weapons cluster at the floor; healthcare, accessibility, and assistive technology cluster at the ceiling. Compassion Benchmark is the only institution that scores robotics labs at all — there is no comparator. This briefing examines what that gradient is actually measuring, and where conduct and purpose come apart.

Scope: The two technology indexes — 50 robotics labs (`robotics-labs.json`, by `category`) and 50 AI labs (`ai-labs.json`, by `sector`), 100 entities. The analysis re-sorts both by the *purpose of the core product* rather than by rank.

Cohort: 100 entities across the two technology indexes — 50 robotics labs, 50 AI labs. · The defense / surveillance / weapons cohort (9 entities) has a median composite of 20.3 (Critical/Developing). · The robotics healthcare / accessibility / assistive cohort (9 Exemplary entities) has a median of 83.0; AI healthcare and drug-discovery sit at 60.9. · All four entities at the absolute 0.0 floor across both indexes are product-defined: Ghost Robotics (weaponized quadrupeds), Palantir AI (military/ICE targeting), xAI/Grok (un-guardrailed model), Character AI (un-bounded companion bots).

If you remember one thing

In both technology indexes, the purpose of the product tracks the score. Re-sorted by end-use, robotics labs run from a Defense/Security median of 9.4 to a Healthcare/Accessibility median of 95.9; AI labs run from AI/Government (0.0) and AI/Surveillance (10.9) up to AI Safety/Research (70.3) and AI/Open Source (69.0). The gradient is the single most legible pattern in either index.

Key Findings

Every entity at the absolute floor is there because of what it builds, not merely how it behaves. The four 0.0 entities — Ghost Robotics, Palantir AI, xAI/Grok, Character AI — share the identical all-minimum profile. In each case the floor designation rests on a product whose core function leaves no remediation surface to credit: a weaponized robot, a targeting system, a model stripped of guardrails, a companion bot deployed to minors without bounds.
The same lab, two products, a 45-point gap. Boston Dynamics appears twice: its research/industrial line scores 65.6 (Established), while its weaponized "SPOT-demo" defense entry scores 20.3 — a 45.3-point spread within one institution. Purpose, not corporate identity, is what moves the number.
Purpose flows into three specific dimensions — Boundaries, Accountability, Integrity. The floor designations name the same primary drivers: BND (refusing to restrict harmful use), ACC (no published safety evaluation, no harm-disclosure), INT (renouncing downstream-harm responsibility). The benchmark is not scoring "defense work" as a category; it is scoring the refusal-to-restrict, the absent accountability framework, and the values-conduct gap that cluster around weaponized and surveillance products.
A pro-social product is necessary but not sufficient at the top. The healthcare/accessibility ceiling is real, but it is reached on a narrow surface — a single assistive product line — without the whole-of-population test a state faces. Robotics is 26% Exemplary, far above any other index, partly because a narrow pro-social mandate satisfies the band easily.
The conduct-versus-purpose line is the live question. Anduril (AI/Defense, 31.3) and Moog (Defense/Industrial, 48.4) build for defense yet sit well above the floor, because they retain published policy, restriction, and accountability structure. The floor is reserved for product-purpose plus refusal-to-restrict plus absent accountability — not for defense work as such.
This is an unoccupied citation lane. No other institution scores robotics labs on harm accountability at all. The purpose-to-score gradient is a finding only this record can produce.

The field

1,156 entities across the five bands — the full distribution this briefing draws from.

Source: Compassion Benchmark · CC-BY

1. Frame

The Compassion Benchmark scores 50 robotics labs and 50 AI labs on the same eight dimensions it applies to states, corporations, and cities. Both indexes carry a field most others lack: a label for what the lab's core product is for — category in robotics (Healthcare/Accessibility, Defense/Security, Industrial, Education…) and sector in AI (AI/Healthcare, AI/Surveillance, AI/Defense, AI/Government…). This briefing ignores rank and re-sorts both indexes by that field, asking one question of the existing record:

Does the purpose of the product predict the compassion score — and if so, what is the benchmark actually measuring when it does?

The thesis is that it does, cleanly and in both indexes at once: products built to harm, surveil, or kill cluster at the floor; products built to restore mobility, assist, or care cluster at the ceiling. This is defensible — a weapon has no remediation surface, and the benchmark's floor test is precisely "no remediation surface to credit." But it raises a real tension. If the product floors an entity at the bottom and lifts it at the top, is the benchmark measuring compassion conduct or product teleology? The honest answer the record supports is: mostly conduct, channeled through purpose — purpose determines which conduct is even possible, and then conduct (refusal-to-restrict, absent accountability, renounced responsibility) decides where on the gradient an entity lands. Anduril and Ghost Robotics both build for defense; only one is at the floor.

A second, structural point: Compassion Benchmark is the only institution that scores robotics labs on harm accountability at all. There is no comparator index, no peer ranking, no external floor designation for a company like Ghost Robotics. Whatever this record says about the harm frontier of automation, it says alone.

2. The cohort — the purpose gradient, both indexes

Both indexes are re-sorted here by category / sector and reported as medians. Both are quantized (many entities share identical composites), so means would be misleading; see §6.

Robotics labs, by category (median composite)

Category band	Median	n	Range	Representative entities
Defense/Security	9.4	3	0.0–20.3	Ghost Robotics (0.0), Paladin AI/Shield AI (9.4), Boston Dynamics SPOT-demo (20.3)
Industrial/Defense	35.9	2	35.9	Kawasaki Heavy, Sarcos
Consumer / Labor	~32	4	23.4–35.9	Tesla Optimus (31.2), UBTECH (35.9), Hanson (34.4)
Industrial	48.4	8	23.4–60.9	Universal Robots, Omron, Figure AI
Research / Service	60.9	11	35.9–65.6	Boston Dynamics research line (65.6), Naver, Neura
Healthcare/Rehab	83.0	7	60.9–83.0	Cyberdyne, Ekso, ReWalk, Wandercraft
Education	85.0	1	85.0	Apexica (RoboKind)
Healthcare/Accessibility	95.9	3	83.0–97.5	Open Bionics (97.5), Ottobock (95.9), Kinova

AI labs, by sector (median composite)

Sector band	Median	n	Range	Representative entities
AI/Government	0.0	1	0.0	Palantir AI
AI/Surveillance	10.9	1	10.9	Clearview AI
AI/Consumer	21.9	3	0.0–48.4	Character AI (0.0), Replika (21.9), Inflection (48.4)
AI Research/Social	26.3	1	26.3	Meta AI
AI/Defense	31.3	1	31.3	Anduril
AI/Creative	32.8	5	21.9–35.9	Runway, Stability, Midjourney
AI/Healthcare	60.9	2	60.9	Abridge, Tempus AI
AI/Drug Discovery	60.9	2	60.9	Isomorphic Labs, Recursion
AI/Open Source	69.0	2	50.0–88.1	Hugging Face (88.1), Together AI
AI Safety/Research	70.3	2	59.1–81.4	Imbue (81.4), Anthropic (59.1)
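The medians in the two tables above come from a plain group-by over each index. The following is a minimal sketch of that recomputation, using only the named robotics entities whose composites are quoted in this briefing. The tuple layout and the helper name `median_by_group` are assumptions for illustration; the actual field names inside `rankings[]` are not shown here.

```python
from collections import defaultdict
from statistics import median

# (entity, category, composite) rows, with values as quoted in this briefing.
# The tuple layout is an assumption, not the schema of robotics-labs.json.
ROWS = [
    ("Ghost Robotics", "Defense/Security", 0.0),
    ("Paladin AI/Shield AI", "Defense/Security", 9.4),
    ("Boston Dynamics SPOT-demo", "Defense/Security", 20.3),
    ("Open Bionics", "Healthcare/Accessibility", 97.5),
    ("Ottobock", "Healthcare/Accessibility", 95.9),
    ("Kinova", "Healthcare/Accessibility", 83.0),
]


def median_by_group(rows):
    """Group composites by category (or sector) and return each group's median."""
    groups = defaultdict(list)
    for _entity, group, composite in rows:
        groups[group].append(composite)
    return {group: median(values) for group, values in groups.items()}


medians = median_by_group(ROWS)
for group, value in sorted(medians.items(), key=lambda item: item[1]):
    print(f"{group}: {value}")
```

With these six rows the sketch returns 9.4 for Defense/Security and 95.9 for Healthcare/Accessibility, matching the two poles of the robotics table.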

The single most important structural fact: across both indexes, all four entities at the absolute 0.0 floor are product-defined, and all four carry the identical all-minimum profile ([1/1/1/1/1/1/1/1]):

Entity	Index	Product purpose	Floor designation date
Ghost Robotics	robotics	Weaponized quadrupeds (Defense/Security)	2026-05-06
Palantir AI	ai-labs	Mass-scale targeting / ICE enforcement (AI/Government)	2026-04-30
xAI/Grok	ai-labs	Un-guardrailed public LLM (AI Research)	2026-04-30
Character AI	ai-labs	Un-bounded companion bots, deployed to minors (AI/Consumer)	2026-05-07

The floor across these indexes is not "a very low score." It is the simultaneous collapse of all eight dimensions to the anchor minimum, which the canonical composite renders as exactly 0.0 — reserved by ruling for a product whose core function is the unremediable harm.
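One way to see why an all-minimum profile lands on exactly 0.0 is a linear rescale of the 1–5 scale onto 0–100. The sketch below assumes that form. The canonical composite formula is not given in this briefing and may include further terms (the sources mention an integration-premium explainer), so `linear_composite` is a hypothetical helper, not the benchmark's implementation.

```python
def linear_composite(dimension_scores):
    """Rescale the mean of eight 1-5 dimension scores onto 0-100.

    Assumed formula for illustration only: (mean - 1) / 4 * 100.
    """
    if len(dimension_scores) != 8:
        raise ValueError("expected eight dimension scores")
    if any(not 1.0 <= score <= 5.0 for score in dimension_scores):
        raise ValueError("dimension scores must lie in [1, 5]")
    mean = sum(dimension_scores) / len(dimension_scores)
    return (mean - 1.0) / 4.0 * 100.0


floor_profile = [1.0] * 8    # the [1/1/1/1/1/1/1/1] profile of the floor four
ceiling_profile = [5.0] * 8  # the theoretical maximum

print(linear_composite(floor_profile))    # 0.0
print(linear_composite(ceiling_profile))  # 100.0
```

Under this assumption, 0.0 is reachable only when every dimension sits at the anchor minimum, which is the "simultaneous collapse" the paragraph describes.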

3. The floor four — the product is the harm

The defining feature of the bottom of both technology indexes is that the floored entities are not there for an isolated incident or an unadjudicated allegation. They are there because the product itself, by design, has no remediation surface. The floor designations are explicit on this point, and their primary drivers converge: Accountability (ACC) and Integrity (INT) are named in all four, and Boundaries (BND) in two of them (Ghost Robotics and Character AI).

Ghost Robotics (0.0, Defense/Security). The floor rationale cites "explicit refusal to restrict military payload use," refusal to sign the 2022 industry anti-weaponization pledge that six peer firms signed, a "sniper-rifle-equipped quadruped deployed to US-Mexico border," no published model or system card, and "entity-level renunciation of downstream-harm responsibility." Primary drivers: BND, ACC, INT. The fresh record confirms each load-bearing fact: the company's CEO position ("If it is a weapon that they need to put on our robot to do their job, we are happy for them to do that"), the October 2021 sniper-rifle quadruped, the February 2022 border deployment, and the company's pointed absence from the October 2022 Boston Dynamics-led pledge are all independently documented.

Palantir AI (0.0, AI/Government). The rationale cites "AI products built specifically to enable mass-scale lethal targeting and immigration enforcement," leadership rhetoric endorsing lethal use, no published model behavior or safety policy, and "no third-party accountability." It concludes: "Composite resolves at zero because compassion infrastructure is absent by design." Primary drivers: AWR, EMP, ACC, EQU, INT.

xAI/Grok (0.0, AI Research). The rationale cites "deliberate removal of safety guardrails," public deployment of a model that "produces antisemitic and violent content on demand," founder-directed alignment toward propaganda objectives, and "zero functional accountability or evidence-of-care infrastructure." Primary drivers: AWR, EMP, ACC, INT.

Character AI (0.0, AI/Consumer). The rationale cites the Pennsylvania AG May 2026 enforcement action (a chatbot posing as a licensed psychiatrist with a fabricated license number), the January 2026 settlement of five wrongful-death/harm lawsuits — two involving minors who died by suicide — 20M+ monthly users with no model card or safety policy, and a "sustained reactive remediation pattern." Primary drivers: BND, ACC, EMP, INT. The fresh record confirms the January 7, 2026 mediated settlement covering the Sewell Setzer III case and four others.

The common thread is not the sector label. It is the combination the rulings keep naming: a harmful-by-design product, plus a structural refusal to restrict its use, plus the absence of any accountability framework. That is what crosses the floor — and it is conduct, expressed through and amplified by the purpose of the product.

4. How purpose flows into the dimensions

The benchmark does not have a "defense penalty." It has eight dimensions, and the purpose of a product determines which of them are even reachable. The harm-frontier entities fail in a recognizable, repeating shape — and it is concentrated in exactly the three dimensions the floor designations name.

Dimension	What it asks (per `dimensions.ts`)	How a harm-frontier product fails it
BND (Boundaries)	"Does this entity refuse harmful practices even when profitable? Does it decline with dignity, set scope, obtain consent?"	A weaponized or surveillance product is the refusal-to-set-a-boundary. Ghost Robotics' refusal to restrict payloads is a B4/B3 collapse by design.
ACC (Accountability)	"Does this entity own its failures, disclose performance and harm, make repair?"	No published model/system card, no third-party harm evaluation, no incident-disclosure process — the recurring ACC finding across all four floored entities.
INT (Integrity)	"Is conduct consistent regardless of who is watching; is the values-behavior gap acknowledged?"	"Entity-level renunciation of downstream-harm responsibility" (Ghost), founder-directed propaganda alignment (xAI) — direct I2/I4 failures.

The clearest demonstration that the benchmark is scoring purpose-channeled conduct rather than corporate identity is Boston Dynamics, which appears twice:

Entry	Category	Composite	BND	INT
Boston Dynamics (research line)	Research/Industrial	65.6 (Established)	3.5	4.0
Boston Dynamics (SPOT demo)	Defense/Security	20.3 (Developing)	1.5	1.5

The same institution, two products, a 45.3-point gap — concentrated precisely in BND and INT, the dimensions a weaponized deployment implicates. Boston Dynamics signed the 2022 anti-weaponization pledge; the gap is the cost of the deployment that pulls against that commitment. No corporate-identity variable could produce this; only product purpose does.

5. The ceiling — pro-social, but on a narrow surface

The top of the robotics index is a clean healthcare/accessibility cluster: Open Bionics (97.5, assistive prosthetics), Ottobock (95.9), and seven mobility/rehab firms (Cyberdyne, Ekso, ReWalk, Wandercraft, Kinova, et al.) at 83.0. The robotics Healthcare/Accessibility median is 95.9; Healthcare/Rehab is 83.0. In AI, the healthcare and drug-discovery sectors (Abridge, Tempus, Isomorphic, Recursion) sit at 60.9, and the safety/open-source sectors hold the top scores (Hugging Face 88.1, Imbue 81.4).

But the ceiling is reached on a narrower surface than the floor, and this is the symmetric tension. A floored entity is judged on a refusal-to-restrict that runs through its whole conduct. A ceiling entity is, in several cases, an assistive-prosthetics company with a single intrinsically pro-social product line, and the band does not impose anything like a sovereign's whole-of-population test or a diversified corporation's stakeholder-breadth test. This is why robotics is 26% Exemplary (13 of 50), far above any other index (the Fortune 500 is under 2%). A narrow pro-social mandate satisfies the band easily.

The reading the record supports: a pro-social product is necessary but not sufficient at the top. The healthcare cluster also carries real conduct evidence (consistent AWR/EMP/ACT/ACC at 4.0+). But the benchmark should be explicit that the top of these indexes is partly a product-category effect, not purely a demonstrated-compassion effect — the exact symmetric counterpart to the floor mechanic.

6. Conduct vs. purpose — the line, and the quantization caveat

The line is real, and the record draws it. Not every defense or surveillance entity is at the floor:

Entity	Sector/Category	Composite	Why above the floor
Moog Inc.	Defense/Industrial	48.4	Diversified industrial; defense is one line, not the renounced-responsibility posture
Kawasaki / Sarcos	Industrial/Defense	35.9	Industrial-primary; defense adjacent, not weaponized-product-defined
Anduril	AI/Defense	31.3	Defense-primary, but retains published structure; not a floor designation
Clearview AI	AI/Surveillance	10.9	Surveillance product, near-floor — but retains some detectable structure (not all-1)
Boston Dynamics SPOT-demo	Defense/Security	20.3	Weaponized demonstration, not refusal-to-restrict + renounced responsibility

The floor (0.0) is reserved for product-purpose + refusal-to-restrict + absent accountability framework, together. Anduril builds for defense and sits at 31.3 because it has not renounced downstream-harm responsibility wholesale or refused every restriction; Ghost Robotics has, and is at 0.0. That distinction — purpose alone does not floor you; purpose plus the refusal-and-absence conduct does — is the load-bearing one, and it is what keeps the gradient an analysis of conduct rather than a category penalty.
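The three-part floor rule stated above can be written as a conjunction. The sketch below is one reading of that paragraph, not the benchmark's ruling logic: the class `ConductRecord`, its field names, and the boolean encodings of Ghost Robotics and Anduril are assumptions drawn from the prose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConductRecord:
    """Hypothetical three-flag summary of the floor test described above."""
    harm_frontier_purpose: bool  # core product built for defense, surveillance, or weapons
    refuses_restriction: bool    # explicit refusal to restrict harmful use
    lacks_accountability: bool   # no accountability framework of any kind


def meets_floor_test(record: ConductRecord) -> bool:
    """The floor applies only when all three conditions hold together."""
    return (
        record.harm_frontier_purpose
        and record.refuses_restriction
        and record.lacks_accountability
    )


# Encodings are this sketch's reading of the text, not canonical data.
ghost_robotics = ConductRecord(True, True, True)
anduril = ConductRecord(True, False, False)

print(meets_floor_test(ghost_robotics))  # True
print(meets_floor_test(anduril))         # False
```

Purpose alone leaves the predicate false, which is the distinction the paragraph calls load-bearing.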

The caveat that disciplines every cluster number here: both indexes are heavily quantized. Many entities sit at pixel-identical composites (six robotics labs at exactly 83.0; clusters at 60.9, 48.4, 35.9) reflecting uniform-anchor profiles consistent with placeholder first-baselines rather than independently measured assessment. This is why this briefing reports medians and named leaders/laggards, never treats identical composites as independent measurements, and rests its argument on the named anchor entities (the floor four, the Boston Dynamics split, the Open Bionics/Ghost Robotics poles) rather than on cluster means. The gradient holds on the named entities regardless of how the quantized middle is eventually re-baselined.
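A simple way to surface the quantization described above is to count how many entities share each composite. The sketch below does that for a handful of entities whose scores are quoted in this briefing. The dictionary layout and the helper name `shared_composites` are assumptions for illustration.

```python
from collections import Counter

# Composites as quoted in this briefing; a small sample, not the full index.
COMPOSITES = {
    "Cyberdyne": 83.0,
    "Ekso": 83.0,
    "ReWalk": 83.0,
    "Wandercraft": 83.0,
    "Kawasaki Heavy": 35.9,
    "Sarcos": 35.9,
    "UBTECH": 35.9,
    "Open Bionics": 97.5,
    "Ottobock": 95.9,
    "Ghost Robotics": 0.0,
}


def shared_composites(scores, min_size=2):
    """Return {composite: count} for every value held by at least min_size entities."""
    counts = Counter(scores.values())
    return {value: n for value, n in counts.items() if n >= min_size}


clusters = shared_composites(COMPOSITES)
print(clusters)  # {83.0: 4, 35.9: 3}
```

Values that appear in the output are candidates for uniform-anchor baselines and should not be treated as independent measurements; the named poles (Open Bionics, Ottobock, Ghost Robotics) do not appear.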

8. Forward view — what to watch

Humanoid and automation scaling is the fastest 2026 deployment story, and the harm-frontier cohort is where it will be tested first. Tesla Optimus (31.2, Labor/Consumer) and Figure AI (37.5 in AI / 48.4 in robotics) are the labor-displacement entities to watch; a documented weaponization or surveillance contract at any of them is the most likely new floor-conversion event.
The conduct line at the defense midband. Anduril (31.3) and the Industrial/Defense cluster (Kawasaki, Sarcos, 35.9) are the entities where the conduct-vs-purpose distinction will be stress-tested: a refusal-to-restrict statement or a renounced-responsibility posture from any of them would test whether the floor test (Q1) holds as conduct rather than category.
Floor exits require product change, not statements. The Ghost Robotics and Character AI designations specify exit criteria — binding restrictions on use, published safety evaluation, an accountability framework, institutional acknowledgment. No floored technology entity has met them. A documented, audited product-level change at any of the four would be the most significant possible movement at the bottom of these indexes.
The unoccupied lane. Because no comparator scores robotics labs, any external adoption of a robotics harm-accountability standard — an industry pledge revision, a procurement rule, a regulator's safety-card requirement — would be the first external validation of the gradient this record describes alone.

Sources

Canonical scores (ground truth): site/src/data/indexes/robotics-labs.json, site/src/data/indexes/ai-labs.json — all category/sector medians, the floor-four roster and their floorDesignation rationales (Ghost Robotics, Palantir AI, xAI/Grok, Character AI), the Boston Dynamics research-vs-SPOT split, and the Anduril/Moog/Clearview conduct-line entries were recomputed directly from rankings[].
Dimension definitions: site/src/data/dimensions.ts (BND/ACC/INT subdimensions and anchors; band vocabulary; integration-premium explainer).
Theme brief / candidate analysis: docs/SPECIALBRIEFINGCANDIDATESMASTER2026-06-16.md (#3), docs/SPECIALBRIEFINGCANDIDATESDATA2026-06-16.md (Candidate 2, "What the Product Is For"), docs/SPECIALBRIEFINGCANDIDATESCOMPETITIVE2026-06-16.md (#5, "Robotics at the Harm Frontier").
Ruling corpus / methodology provenance: research/PENDING_CHANGES.md — (corporate floor / no-remediation-surface), (narrow-mandate Exemplary), (cross-index entity attribution).
Prior briefings (for the floor mechanic and Exemplary structure): research/special-briefings/floor-and-critical-2026-06-11.md (product-is-the-harm floor family), research/special-briefings/exemplars-2026-06-11.md (narrow-mandate Exemplary pattern).
Fresh web evidence (grounding the public claims):
Robot dog manufacturers urge militaries not to weaponise the technology — Forces News
Boston Dynamics and other firms pen open letter against weaponized robots — New Atlas
Exclusive: Boston Dynamics pledges not to weaponize its robots — Axios
Robot dog armed with sniper rifle unveiled at US Army trade show — Fox News
Homeland Security eyes robot dogs to patrol the southern border — TechCrunch
Israel's military is using robot dogs from Philly's Ghost Robotics — Billy Penn
Shield AI raises $2B at $12.7B for autonomous combat pilot Hivemind — The Next Web
Google, Character.AI to settle suits involving minor suicides and AI chatbots — CNBC
Character.AI and Google agree to settle lawsuits over teen mental health harms and suicides — CNN Business

How to read the scores

The 0–100 scale — five bands

Every entity — state, corporation, AI lab, robotics lab, or city — is scored 0–100 across 8 dimensions and 40 subdimensions. The composite score places the entity in one of five bands:

Critical (0–20): Foundational compassion practices are absent or documented active harm is present.

Developing (20–40): Some practices are emerging but remain inconsistent, reactive, or unevenly applied.

Functional (40–60): Core practices exist and meet a basic bar, with significant gaps remaining.

Established (60–80): Practices are systematic, documented, and supported by consistent evidence.

Exemplary (80–100): Practices are independently verified, consistent, and sustained under pressure.
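The band ranges above share their endpoints (20, 40, 60, 80), so a lookup has to decide which band owns each boundary. The sketch below assumes a boundary value belongs to the higher band; that choice, and the helper name `band_for`, are assumptions rather than documented methodology.

```python
# (lower bound, band name), highest band first.
# Assumption: a composite exactly on a boundary falls in the higher band.
BANDS = [
    (80.0, "Exemplary"),
    (60.0, "Established"),
    (40.0, "Functional"),
    (20.0, "Developing"),
    (0.0, "Critical"),
]


def band_for(composite):
    """Map a 0-100 composite to its band name."""
    if not 0.0 <= composite <= 100.0:
        raise ValueError("composite must lie in [0, 100]")
    for lower, name in BANDS:
        if composite >= lower:
            return name
    raise AssertionError("unreachable: 0.0 is the lowest bound")


print(band_for(0.0))   # Critical
print(band_for(20.3))  # Developing
print(band_for(65.6))  # Established
print(band_for(97.5))  # Exemplary
```

The 20.3 and 65.6 results agree with the Developing and Established labels this briefing attaches to the two Boston Dynamics entries.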

The 8 dimensions

Each dimension is scored 1–5 across 5 subdimensions (40 subdimensions total), then converted to a 0–100 composite. A score of 1.0 on a subdimension represents the minimum anchor; 5.0 is exemplary conduct.

AWR (Awareness): Does this entity reliably detect when others are in pain or need, before they name it?

EMP (Empathy): Does this entity genuinely connect with the inner experience of those it serves?

ACT (Action): Does compassionate understanding translate into real, proportional, effective help?

EQU (Equity): Is care distributed fairly, especially toward those with greatest need and least power?

BND (Boundaries): Is helping sustainable, ethical, and autonomy-preserving, not dependency-creating?

ACC (Accountability): Does this entity own its failures, correct course, and make genuine repair?

SYS (Systemic Thinking): Does compassion extend to root causes and structural change, not only symptom relief?

INT (Integrity): Is compassion genuine, consistent, and non-performative, especially when it costs something?

Scores are based on public evidence: government reports, regulatory filings, independent audits, judicial findings, and verifiable third-party records. Entities never pay for inclusion, score changes, or suppression of findings.
