AI Governance Under Pressure — What a Shutdown, a Subpoena, and a Union Vote Actually Tell the Benchmark
In a single fortnight, the US government forced Anthropic to pull its two most powerful models, 42 state attorneys general subpoenaed OpenAI, and Google DeepMind's UK staff voted to unionize over military AI. The benchmark scores how institutions recognize and reduce suffering — not how much external pressure they attract. This briefing examines what each of those events does, and does not, say about an AI lab's compassion score.
Scope: The 50-entity AI Labs index, plus the big-tech AI actors carried across two indexes (Fortune 500: Microsoft, Alphabet/Google, Amazon, Meta; ai-labs: OpenAI, Anthropic, DeepMind/Google, Meta AI, Amazon AWS AI). The analysis centers on the entities that took the cycle's heaviest external pressure: Anthropic (59.1), OpenAI (27.5), DeepMind/Google (56.9), and the Alphabet/Microsoft/Amazon big-tech cluster.
Cohort: 50 labs in the AI Labs index — mean composite 43.6; 4 Exemplary, 8 Established, 15 Functional, 18 Developing, 5 Critical (3 at the 0.0 floor: xAI/Grok, Palantir AI, Character AI). · The four pressure-tested labs: Anthropic 59.1 (Functional), DeepMind/Google 56.9 (Functional), OpenAI 27.5 (Developing), and the Alphabet/Google corporate entity 40.0 (Developing). · Big-tech AI cross-carry: Microsoft 65.3 (Established), Alphabet/Google 40.0, Amazon 12.8 (Critical), Meta Platforms 7.8 (Critical) — all Fortune-500 composites.
If you remember one thing
Pressure arrived faster than practice matured. In a single fortnight the four marquee labs absorbed a government-mandated model shutdown, a 42-state subpoena, and a worker union vote — none of which moved a published composite. External governance is now testing these labs faster than their internal compassion infrastructure is changing.
Key Findings
- A government-compelled shutdown is not self-inflicted harm. Anthropic disabling Fable 5 and Mythos 5 under a US Commerce export-control order is scored on Anthropic's *conduct in the event* (prompt compliance, public disclosure, an apology, and stated disagreement), which is mildly positive on Accountability. The composite held at 59.1, with one sub-threshold −0.3 day, and was not downgraded.
- A 42-state subpoena is an allegation, not a ruling. The coalition probe into OpenAI — naming model sycophancy a "design flaw" and targeting marketing to minors and seniors — is the cycle's broadest accountability signal, but it is pre-adjudication. It applied downward sub-dimension pressure (a conservative reconstruction of 25.9) that stayed below the scoring threshold; OpenAI held at 27.5.
- The same word — "pressure" — covers three different evidentiary bars. A signed export order (operative, compelled), a subpoena (alleged, unadjudicated), and a union vote (worker-voice signal, prospective) are not interchangeable evidence. The benchmark scores operative conduct, discounts allegations, and treats worker-voice as a forward indicator — and the record applies all three distinctions in this one cycle.
- Refusal that costs money reads as integrity; acceptance of the same terms reads as a gap. Anthropic's refusal of Pentagon terms (a documented +1 Integrity signal even as the Pentagon labeled it a supply-chain risk) and OpenAI's acceptance of terms Anthropic refused (a −1.7 Integrity-gap downgrade) are the clearest demonstration that the benchmark scores conduct under pressure, not the pressure itself.
- The compassion gap inside big tech is wide. Microsoft (65.3, Established) took a documented human-rights accountability step; Amazon (12.8) and Meta Platforms (7.8) sit in the Critical band of the Fortune 500. The label "AI lab" spans a 57.5-point range once the parent corporations are read alongside the model builders.
- Worker voice is now a named pressure channel, not yet a scored one. The DeepMind/Google union vote (a reported 98% of ~300 London staff) targets military AI and Project Nimbus. It is currently a forward indicator — DeepMind/Google holds at 56.9 — and the open question is what, if anything, converts worker-voice into a scorable accountability signal.
The field
1,156 entities across the five bands — the full distribution this briefing draws from.
1. Frame
The Compassion Benchmark scores how an institution recognizes, responds to, and reduces suffering. For an AI lab, that is a slow-moving thing to demonstrate: model cards, red-teaming, refusal ethics, harm-incident transparency, equitable access. In mid-June 2026, the external governance environment around the largest US labs moved much faster than any of that internal practice could. Inside a single fortnight (June 1–15):
- the US Commerce Department issued an export-control directive forcing Anthropic to suspend its two most powerful models for all foreign nationals, and Anthropic disabled them for all users to comply;
- a coalition of 42 state attorneys general served OpenAI with a subpoena naming model sycophancy a "design flaw" and targeting its marketing to minors and seniors — four days after OpenAI's confidential IPO filing;
- Google DeepMind's UK staff voted overwhelmingly to unionize over the company's military-AI contracts and the Project Nimbus deal with Israel;
- and the running backdrop — Pentagon contracting fights, the EFF's human-rights pressure on Google and Amazon, Microsoft's contrasting accountability step, and the EU AI Act's Digital Omnibus — kept the structural pressure on.
The central tension this briefing examines: what does a government-mandated shutdown, a 42-state subpoena, or a union vote actually say about a lab's compassion score? The answer the record gives is disciplined and counter-intuitive: in this cycle, none of these events moved a published composite. That is not the benchmark ignoring them. It is the benchmark applying three different evidentiary rules — conduct-vs-coercion, the pre-adjudication discount, and worker-voice-as-forward-indicator — to three different kinds of pressure. This briefing asks three questions of the existing record:
- Conduct vs. coercion — is a compelled shutdown (Anthropic) scored the same as self-inflicted harm? (No — and §3 shows why.)
- The pre-adjudication discount — does a 42-state subpoena (OpenAI) move a score? (Not yet — §4.)
- Worker-voice and military-AI accountability — does a union vote (DeepMind) or contracting posture register? (As a forward indicator and an integrity signal, not a composite move — §5.)
The thesis: the benchmark is currently absorbing a faster, more adversarial governance environment with rules that were already in place — and it is holding the line that pressure is evidence to be classified, not a verdict to be scored. The strain is showing not in the scores but in the questions the cycle raises about where the discounts end.
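The three evidentiary rules named in this section can be pictured as a small lookup from evidence kind to treatment. The sketch below is illustrative only: the enum names, the `PressureEvent` type, and the `classify` function are hypothetical labels introduced here, not identifiers from the benchmark's codebase.

```python
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    """Hypothetical labels for the three kinds of pressure in this cycle."""
    OPERATIVE_COMPELLED = "signed order, operative and compelled"
    ALLEGATION = "subpoena or filing, not yet adjudicated"
    WORKER_VOICE = "union vote, prospective"


class Treatment(Enum):
    """Hypothetical labels for how each kind of evidence is handled."""
    SCORE_CONDUCT = "score the entity's own conduct in the event"
    DISCOUNT = "hold as sub-dimension pressure until adjudicated"
    FORWARD_INDICATOR = "log as a watch item, no composite move"


# Assumed mapping, paraphrasing the three rules described in the Frame.
RULES = {
    Evidence.OPERATIVE_COMPELLED: Treatment.SCORE_CONDUCT,
    Evidence.ALLEGATION: Treatment.DISCOUNT,
    Evidence.WORKER_VOICE: Treatment.FORWARD_INDICATOR,
}


@dataclass(frozen=True)
class PressureEvent:
    entity: str
    description: str
    evidence: Evidence


def classify(event: PressureEvent) -> Treatment:
    """Return the treatment for an event based only on its evidence kind."""
    return RULES[event.evidence]


events = [
    PressureEvent("Anthropic", "export-control shutdown", Evidence.OPERATIVE_COMPELLED),
    PressureEvent("OpenAI", "42-state AG subpoena", Evidence.ALLEGATION),
    PressureEvent("DeepMind/Google", "UK staff union vote", Evidence.WORKER_VOICE),
]

for event in events:
    print(f"{event.entity}: {classify(event).name}")
```

The point of the lookup is that the treatment depends on the kind of evidence, not on how much attention the event attracts.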
2. The cohort
Recomputed directly from the index JSONs. The AI Labs index holds 50 entities, mean composite 43.6, distributed as: 4 Exemplary, 8 Established, 15 Functional, 18 Developing, 5 Critical — of which 3 are at the 0.0 floor (xAI/Grok, Palantir AI, Character AI). The pressure-tested labs and their big-tech parents:
| Entity | Index | Composite | Band | Weakest dimensions | Pressure event this cycle |
|---|---|---|---|---|---|
| Microsoft | fortune-500 | 65.3 | Established | EMP 3.2, EQU 3.2, INT 3.0 | Human-rights accountability step (suspended services) |
| Anthropic | ai-labs | 59.1 | Functional | EQU 3.1, SYS 3.2, INT 3.2 | Govt export-control shutdown (Fable 5 / Mythos 5) |
| DeepMind/Google | ai-labs | 56.9 | Functional | ACC 3.0, INT 3.0, BND 3.2 | UK staff union vote; Project Nimbus / Pentagon |
| Alphabet/Google | fortune-500 | 40.0 | Developing | INT 2.1, EQU 2.4, EMP 2.5 | EFF "acknowledged risks, ignored responsibilities" |
| Amazon AWS AI | ai-labs | 35.9 | Developing | EMP 2.0, BND 2.0, INT 2.0 | EFF Nimbus pressure (parent) |
| OpenAI | ai-labs | 27.5 | Developing | INT 1.7, ACC 1.9, BND 2.0 | 42-state AG subpoena |
| Meta AI | ai-labs | 26.3 | Developing | ACC 1.5, INT 1.8, EMP 2.1 | (parent Meta Platforms 7.8, Critical) |
| Amazon | fortune-500 | 12.8 | Critical | EMP 1.4, ACC 1.4, AWR 1.5 | EEOC findings (pre-adjudication) |
| Meta Platforms | fortune-500 | 7.8 | Critical | ACC 1.0, EMP 1.1, EQU 1.2 | Platform-harm cluster |
The single most important structural fact about this cohort: the four labs that took the heaviest external pressure this fortnight (Anthropic, OpenAI, DeepMind/Google, and the Alphabet parent) span a 31.6-point range (27.5 → 59.1), and not one of them moved a published composite on the pressure event. The amount of governance pressure an entity attracts is, on this record, uncorrelated with its compassion score — which is exactly what the methodology intends. A lab can be high-scoring and heavily pressured (Anthropic) or low-scoring and heavily pressured (OpenAI); the pressure is an input to be classified, not a score in itself.
A second structural fact: read alongside their parents, the big-tech AI actors span a 57.5-point range — Microsoft (65.3, Established) to Meta Platforms (7.8, Critical). "Big tech AI" is not a band; it is the full width of the scale.
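Both range figures can be checked from the composites in the table above. This is a minimal sketch: the dictionary is typed in by hand from the table, and the `spread` helper is a name introduced here for illustration.

```python
# Composites copied from the cohort table (entity -> composite score).
composites = {
    "Microsoft": 65.3,
    "Anthropic": 59.1,
    "DeepMind/Google": 56.9,
    "Alphabet/Google": 40.0,
    "Amazon AWS AI": 35.9,
    "OpenAI": 27.5,
    "Meta AI": 26.3,
    "Amazon": 12.8,
    "Meta Platforms": 7.8,
}


def spread(names):
    """Highest minus lowest composite among the named entities, to one decimal."""
    values = [composites[name] for name in names]
    return round(max(values) - min(values), 1)


pressure_tested = ["Anthropic", "OpenAI", "DeepMind/Google", "Alphabet/Google"]

print(spread(pressure_tested))  # range across the four pressure-tested entities
print(spread(composites))       # range across all nine rows of the table
```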
3. Conduct vs. coercion — the Anthropic shutdown
On June 12, 2026, Commerce Secretary Howard Lutnick wrote to Anthropic CEO Dario Amodei directing that Fable 5 and Mythos 5 be subject to export controls for all foreign persons, inside or outside the US, after another company reportedly demonstrated a jailbreak of Mythos. To comply, Anthropic disabled both flagship models for all customers on June 12–13. The naive read — "the government just shut down a lab's most powerful products" — sounds like a maximal negative event. The benchmark's read is the opposite of naive, and it is the cleanest live demonstration of the conduct-vs-coercion rule in the corpus.
The benchmark scores Anthropic's behavior in the event, not the government's action or market sentiment about it. Anthropic's conduct was, verbatim:
- Compliance: "We are complying with the government's legal directive and are removing access to Fable 5 and Mythos 5 for all users."
- Apology: "We apologize for this disruption to our customers."
- Stated disagreement (transparency): "We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people."
That conduct profile — same-day compliance, public disclosure of the order, an apology with no legal deflection, and a transparent statement of disagreement — is mildly positive on Accountability/Transparency (AB1 Harm Acknowledgment, AB3 Transparency), not negative. The June 13 assessment confirmed the composite at 59.1, nudging only ACC (3.5 → 3.6) without changing the composite. The history shows one sub-threshold day: a −0.3 documented on June 14 (58.8, still Functional), then a hold back at 60.0/59.1. A compelled shutdown that the entity handles transparently is net-neutral-to-mildly-positive, because the harm (loss of model access) was inflicted by the government, and the only scorable surface is how Anthropic responded.
This is the same logic the benchmark applies to states: OCHA evidence of harm in Gaza is attributed to Israel's conduct, not the Palestinian Authority's, and so it reinforces but does not lower Palestine's score. Harm caused to an entity by an external actor is not harm caused by that entity. The Anthropic case is the AI-lab instance of that attribution rule.
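The attribution rule can be sketched as a filter: only acts the entity itself performed form its scorable surface, and acts done to it by others are excluded. The `Act` type and `scorable_surface` function below are hypothetical illustrations, with the event summaries paraphrased from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Act:
    actor: str     # who performed the act
    affected: str  # who bore the consequence
    summary: str


def scorable_surface(entity, acts):
    """Keep only acts performed by the entity; harm done to it by others is excluded."""
    return [act for act in acts if act.actor == entity]


acts = [
    Act("US Commerce Department", "Anthropic", "export-control directive on Fable 5 and Mythos 5"),
    Act("Anthropic", "customers", "same-day compliance and public disclosure"),
    Act("Anthropic", "customers", "apology for the disruption"),
    Act("Anthropic", "public", "stated disagreement with the directive"),
]

for act in scorable_surface("Anthropic", acts):
    print(act.summary)
```

Under this reading the directive itself never enters Anthropic's scorable surface; only the three responses do.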
The longitudinal record makes the contrast sharper still. Anthropic has spent most of the last month at the exact Functional/Established boundary (60.0), held there by an unresolved DC Circuit appeal (Anthropic v. Hegseth) over the Pentagon's designation of Anthropic as a supply-chain risk for refusing to remove safety guardrails. On May 29 the record logged a +1 Integrity signal precisely for that refusal — the lab maintained a costly safety posture under direct government and commercial pressure. The export-control shutdown sits inside that same arc: a lab repeatedly absorbing external coercion while keeping its own conduct intact. The pressure is intense; the conduct, on the record, holds.
4. The pre-adjudication discount — the OpenAI subpoena
On June 12, 2026, a coalition of 42 state attorneys general, led by New York AG Letitia James, served OpenAI with a subpoena — four days after its confidential S-1 IPO filing (June 8). The subpoena's scope is broad and squarely compassion-relevant: consumer and health data handling, marketing to vulnerable populations (children and seniors), age verification, safety-testing policies, and the behavioral properties of OpenAI's models — including model sycophancy named as a design flaw. A separate Florida civil suit (June 1) alleges ChatGPT validated a 16-year-old's suicidal ideation and supplied self-harm methods.
This is the broadest accountability signal in OpenAI's record. It is also, by methodology, an allegation under investigation, not an adjudicated finding — and that distinction is load-bearing. The June 15 reassessment is the cleanest live application of the pre-adjudication (Tier) discount in the corpus:
- The probe applies genuine downward pressure on the right dimensions — a conservative reconstruction nudges ACC (1.9 → 1.7) for harm-acknowledgment / marketing-to-vulnerable-populations and AWR (2.2 → 2.0) for anticipatory awareness of harm to minors — yielding a reconstructed composite of 25.9.
- That is a −1.6 delta, below the 5-point scoring threshold. The 42-state probe is a sub-dimension intensifier within the Developing band, not a scorable composite move. OpenAI holds at 27.5.
The ruling tag in the assessment frontmatter states it plainly: ALLEGATION-NOT-ADJUDICATED — sub-dimension pressure (ACC/AWR) within band, no scorable composite move. The watch conditions are pre-registered: any AG enforcement action, settlement, or adjudicated finding — as distinct from the current investigative subpoena — converts to a scorable ACC/INT downgrade, as would the Florida suit reaching adjudication.
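The hold decision above reduces to one comparison between the reconstructed delta and the scoring threshold. The sketch below mirrors only that stated comparison, using the 5-point threshold quoted in the text; the function names and the trigger labels are assumptions made for illustration, and the source does not spell out how the threshold applies in other cases.

```python
SCORING_THRESHOLD = 5.0  # the "5-point scoring threshold" quoted in the text


def apply_reconstruction(published, reconstructed, threshold=SCORING_THRESHOLD):
    """Return (composite to publish, delta, decision) for a reconstructed score."""
    delta = round(reconstructed - published, 1)
    if abs(delta) < threshold:
        # Sub-threshold: the pressure is recorded, the published composite holds.
        return published, delta, "hold"
    return reconstructed, delta, "move"


# Hypothetical labels for the pre-registered watch conditions.
CONVERTING_EVENTS = {"enforcement_action", "settlement", "adjudicated_finding"}


def converts_to_scorable(event_kind):
    """True if the event kind is one of the pre-registered conversion triggers."""
    return event_kind in CONVERTING_EVENTS


print(apply_reconstruction(27.5, 25.9))              # OpenAI, June 15 reconstruction
print(converts_to_scorable("investigative_subpoena"))
print(converts_to_scorable("settlement"))
```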
This is consistent with how the benchmark treats corporate enforcement generally: Amazon's EEOC findings this same cycle (pregnant-worker February, disabled-worker April) were held as "pre-adjudication and consistent with the 17.8 Critical profile," and Oracle's 30,000-layoff coercive-severance event was scored only once the structure (sign-or-forfeit) was documented, not on allegation alone. The subpoena is the AI-lab instance of the same FILED-BUT-UNADJUDICATED discipline that keeps the Fortune-500 corporate cluster jammed just above the Critical line until a merits ruling lands.
The tension worth naming: OpenAI is already low (27.5, Developing, with INT at 1.7 — near the Critical band). Its score got there through adjudicated and structural conduct, not allegation — a −1.7 Integrity-gap downgrade (May 20) for accepting Pentagon terms Anthropic refused, and a −1.9 (May 27) for a documented failure-to-report leadership override. The subpoena is consistent with that trajectory but cannot, by itself, drive it lower. The pre-adjudication discount is doing real work here: it is the difference between an entity whose conduct is scored down and an entity that merely attracts a high-profile allegation. OpenAI is, on the record, both — but only the former moves the number.
5. Worker voice and military-AI accountability — DeepMind, Nimbus, and the big-tech split
The third pressure channel is the one the benchmark currently treats most cautiously: worker voice. Around 300 London-based Google DeepMind employees voted (a reported 98% in favor) to unionize with the CWU and Unite, demanding that DeepMind's Gemini models be blocked from military uses, that the Pentagon classified-network contracts end, and that the $1.2 billion Project Nimbus deal with Israel be terminated. The workers also sought an independent ethics oversight body and a right to refuse to contribute to projects on moral grounds. Management faced a 10-day window to respond voluntarily or face legal proceedings.
DeepMind/Google holds at 56.9 (Functional), with its weakest dimensions ACC (3.0), INT (3.0), and BND (3.2) — precisely the dimensions a worker revolt over military AI implicates (accountability for end-use, integrity of stated values, boundary-setting on harmful applications). The union vote is currently a forward indicator, not a scored event: it is a strong internal signal that workers perceive a values-conduct gap, but it is not, by itself, an adjudicated finding of harm or a change in the lab's operative conduct. The open question is what converts worker-voice into a scorable signal — a management refusal, a contract termination, or a documented retaliation would each register differently.
The military-AI accountability theme runs across the big-tech cluster and exposes a wide compassion split that the "AI lab" label conceals:
- Microsoft (65.3, Established) took a documented step the benchmark already logged as a positive-watch: per EFF, it "suspended certain services after initial investigations raised serious concerns" about misuse of its cloud/AI infrastructure, and its Israel chief departed "amid an ethical controversy." Note the score is held sub-threshold within Established under COMPELLED-REMEDY-NOT-SELF-CORRECTION — the remedy is credited, but as a compelled response, not a self-initiated one. Microsoft's INT (3.0) is its own weakest dimension; the human-rights step is the kind of conduct that, if sustained and self-initiated, would test the Established/Exemplary boundary.
- Alphabet/Google (40.0, Developing) and Amazon (12.8, Critical) are the EFF's named counter-example: the title of the April analysis is "Google and Amazon: Acknowledged Risks, and Ignored Responsibilities." Google's own internal assessments reportedly warned of Project Nimbus risks before signing; neither company responded substantively, and Amazon "fail[ed] to even acknowledge the request." That posture is consistent with their low scores — Alphabet's INT (2.1) is its weakest dimension; Amazon sits in the Critical band.
- The contrast is the finding. Under the same external pressure (military-AI contracting, a human-rights spotlight), Microsoft took a (compelled) remedial step and Google/Amazon did not — and the scores already encode that divergence. Worker voice (DeepMind) and civil-society voice (EFF) are pushing on the same INT/ACC dimensions across all three; the benchmark is registering the responses, not the pressure.
Forward view — what to watch
- The DC Circuit ruling (Anthropic v. Hegseth). This is the single highest-value forward trigger for the top of the AI-labs cohort. A favorable ruling credits Anthropic's costly refusal of Pentagon terms as an I1 governance signal and triggers the Functional/Established crossing (59.1 → 60.0+); an adverse ruling produces downward movement. It directly tests how compelled versus self-initiated conduct is scored at the top of the band.
- The OpenAI subpoena's adjudication arc. Any of the 42 AGs filing an enforcement action, or the Florida suit reaching settlement or a finding, converts the current sub-threshold pressure (reconstruction 25.9) into a scorable downgrade that would push OpenAI toward the Critical band (INT already 1.7). This is the live test of the pre-adjudication discount. IPO timing (the subpoena landed four days after the S-1) makes the disclosure environment more consequential.
- DeepMind management's response to the union vote. A voluntary recognition, a refusal, a contract change, or any retaliation would each register differently, and together they are the live test of what converts worker voice into a scorable signal. The 10-day response window makes this the fastest-moving of the three pressure channels.
- The big-tech human-rights split. Whether Microsoft sustains its (compelled) accountability step toward self-initiated conduct — and whether Google/Amazon move at all under continued EFF and worker pressure — is the medium-term arc that would test the Established/Exemplary boundary (Microsoft) or deepen the Developing/Critical positions (Alphabet/Amazon).
- The EU AI Act Digital Omnibus. A structural regulatory backdrop rather than an entity event: as enacted obligations land, they convert from prospective posture (currently net-neutral) into operative compliance conduct that the benchmark can score — the regulatory analog of the conduct-vs-coercion rule, applied at scale.
Sources
- Canonical scores (ground truth): site/src/data/indexes/ai-labs.json (50-entity roster, band distribution, Anthropic 59.1, DeepMind/Google 56.9, OpenAI 27.5, the three floor designations); site/src/data/indexes/fortune-500.json (Microsoft 65.3, Alphabet/Google 40.0, Amazon 12.8, Meta Platforms 7.8). All cohort counts and dimension vectors were recomputed directly from rankings[] and reconcile with the canonical composite formula (site/scripts/lib/scoring.mjs).
- Assessment corpus (provenance): research/assessments/anthropic-2026-06-13.md (conduct-vs-coercion treatment of the export-control shutdown; ACC 3.5→3.6, composite confirm 59.1); research/assessments/openai-2026-06-15.md (ALLEGATION-NOT-ADJUDICATED ruling; reconstruction 25.9, −1.6 sub-threshold).
- Daily-research artifacts: research/scans/2026-06-13-assessor-summary.json and research/scans/2026-06-15-assessor-summary.json (Anthropic government-compelled net-neutral note; OpenAI 42-state subpoena confirm; Amazon EEOC pre-adjudication; Oracle coercive-severance).
- Longitudinal context: site/public/data/history/anthropic.json (DC Circuit boundary-watch arc, May 29 +1 Integrity for guardrail refusal, June 14 sub-threshold −0.3); site/public/data/history/openai.json (May 20 −1.7 Integrity-gap on Pentagon terms, May 27 −1.9 failure-to-report).
- Ruling corpus: research/PENDING_CHANGES.md (COMPELLED-REMEDY-NOT-SELF-CORRECTION for Microsoft; MILITARY-AI-BY-CONTRACT-GOVERNANCE; FILED-BUT-UNADJUDICATED discipline; Microsoft positive-watch 2026-06-07).
- Fresh web evidence (fetched, verbatim quotes ≤50 words):
- Anthropic statement — anthropic.com/news/fable-mythos-access (compliance, apology, disagreement quotes).
- Export-control directive context — Axios, Time, Fortune.
- OpenAI 42-state subpoena — TechCrunch, TechTimes (sycophancy "design flaw").
- DeepMind/Google union vote & Project Nimbus — Fortune.
- Big-tech human-rights split — EFF: "Google and Amazon: Acknowledged Risks, and Ignored Responsibilities"; EFF: "Microsoft Took a Step Toward Human Rights Accountability".
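The recomputation from rankings[] mentioned under canonical scores might look roughly like the sketch below. The field names `name`, `composite`, and `band` are assumptions about the index JSON, which is not reproduced in this briefing, and the inline sample holds only the five ai-labs rows from the cohort table, so its mean will not match the full 50-entity figure of 43.6.

```python
import json
from collections import Counter

# Assumed shape of an index file; only five ai-labs rows from the cohort table.
SAMPLE_INDEX = """
{"rankings": [
  {"name": "Anthropic", "composite": 59.1, "band": "Functional"},
  {"name": "DeepMind/Google", "composite": 56.9, "band": "Functional"},
  {"name": "Amazon AWS AI", "composite": 35.9, "band": "Developing"},
  {"name": "OpenAI", "composite": 27.5, "band": "Developing"},
  {"name": "Meta AI", "composite": 26.3, "band": "Developing"}
]}
"""


def summarize(index):
    """Count entities per band and compute the mean composite, to one decimal."""
    rankings = index["rankings"]
    band_counts = Counter(entry["band"] for entry in rankings)
    mean = round(sum(entry["composite"] for entry in rankings) / len(rankings), 1)
    return band_counts, mean


band_counts, mean_composite = summarize(json.loads(SAMPLE_INDEX))
print(dict(band_counts), mean_composite)
```

Run against the full index file, the same two aggregates would be the ones to compare with the published band distribution and cohort mean.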
How to read the scores
The 0–100 scale — five bands
Every entity (state, corporation, AI lab, robotics lab, or city) is scored 0–100 across 8 dimensions and 40 subdimensions. The composite score places the entity in one of five bands: Exemplary, Established, Functional, Developing, or Critical.
The 8 dimensions
Each dimension is scored 1–5 across 5 subdimensions (40 subdimensions total), then converted to a 0–100 composite. A score of 1.0 on a subdimension represents the minimum anchor; 5.0 is exemplary conduct.
Scores are based on public evidence — government reports, regulatory filings, independent audits, judicial findings, and verifiable third-party records. Entities never pay for inclusion, score changes, or suppression of findings.
Cite this briefing
Copy-ready citation string for journalism, research, or academic use.
Compassion Benchmark. "AI Governance Under Pressure — What a Shutdown, a Subpoena, and a Union Vote Actually Tell the Benchmark." compassionbenchmark.com/updates/special/ai-governance-2026-06-15. Accessed [Month Year]. Independent — entities never pay for inclusion, score changes, or suppression of findings.
For methodology, see compassionbenchmark.com/methodology. Data terms: /data-licenses. Press resources: /media.