Compassion Benchmark

Daily evidence briefing · 2026-04-15

Daily Evidence Briefing

Evidence-linked score assessments, sector intelligence, and emerging risks from overnight research across all published benchmark indexes. Each finding is sourced from primary evidence — litigation records, regulatory filings, investigative reporting, and international legal instruments.

1,155 entities scanned
8 entities assessed
4 score changes
4 scores confirmed
3 band changes proposed tonight

How this works

Every night, research agents scan all 1,155 benchmarked entities for new evidence across litigation, regulatory filings, investigative reporting, and international legal instruments. Flagged entities receive full 40-subdimension assessments.

Score changes are proposals, not automatic updates. A human analyst reviews all proposals before published scores change. Confirmations — where research affirms the published score is accurate — are documented alongside changes.

Score movements

Entities with significant evidence-based score movement from overnight research. Each card is a dossier entry.

OpenAI

AI Labs · Applied · high confidence
60.8 → 40.6 (-20.2 pts)
established → functional

New Yorker investigation alleging 'pattern of lying,' dissolved superalignment team, FL AG probe, whistleblower suppression, and DEI removal collectively warrant significant downgrade from Functional to Developing

Evidence record
  1. New Yorker investigation (100+ sources, secret Sutskever memos) alleges CEO Sam Altman exhibits 'consistent pattern of lying' about safety commitments; superalignment team received 1-2% compute vs promised 20%
  2. Florida AG investigation into ChatGPT's alleged role in FSU mass shooting, examining platform safety, CSAM enablement, and foreign adversary data risks
  3. Whistleblowers filed SEC complaint alleging OpenAI illegally barred employees from raising safety risks with regulators; whistleblower Suchir Balaji died by suicide
  4. DEI commitment page scrubbed from website (Jan 2025); only 16% of technical roles held by women
  5. 70+ copyright lawsuits allege training on content without consent; ongoing litigation in Southern District of New York

xAI/Grok

AI Labs · Applied · high confidence
18.3 → 2.2 (-16.1 pts)
critical

Triple-front institutional failure: CSAM deepfake scandal (3M+ images, 23K minors), NAACP environmental racism lawsuit over unpermitted gas turbines in majority-Black communities, and mass departure of engineers/co-founders — the most acute harm profile in the AI Labs index

Evidence record
  1. Grok CSAM deepfake scandal: estimated 3M+ sexualized images generated Jan 2026, ~23K involving minors, triggering regulatory actions across EU, UK, California, and bans in Malaysia, Indonesia, Philippines
  2. NAACP + Earthjustice Clean Air Act lawsuit filed April 14, 2026 over 27 unpermitted gas turbines in majority-Black Memphis/Mississippi communities; potential 1,700 tons/year NOx, environmental racism pattern
  3. At least 11 engineers and 2 co-founders departed Feb 2026; Musk admitted company 'not built right from the foundations up'; TechCrunch reported safety is 'dead' at xAI
  4. No system cards, no transparency reports, no safety reports published; researchers from OpenAI and Anthropic publicly called safety culture 'reckless' and 'completely irresponsible'
  5. Five active civil lawsuits including Tennessee teen class action, Baltimore city suit, and Amsterdam court injunction; EU DSA formal proceedings opened

Johnson & Johnson

Fortune 500 · Applied · high confidence
48.4 → 27.5 (-20.9 pts)
functional → developing

Third bankruptcy attempt to suppress 90,000+ talc lawsuits collapsed; $1.5B single-plaintiff verdict; decades of product safety concealment; Accountability dimension collapses from 37.5 to 10.0 — warrants band change from Functional to Developing

Evidence record
  1. Third bankruptcy attempt to settle 90,000+ talc lawsuits collapsed in March 2025: court ruled proposed $9B settlement did not meet legal standards; J&J announced it would not appeal
  2. $1.5B verdict in December 2025 (Baltimore mesothelioma case), the largest single-plaintiff talc verdict; $966M California verdict; $40M California ovarian cancer verdict (2 plaintiffs)
  3. Pattern of product safety concealment: internal documents showed company aware of asbestos in talc products for decades; three bankruptcy attempts represent systematic avoidance of accountability
  4. Settlement talks scheduled April 2026 only after exhausting all bankruptcy strategies and losing at trial; 90,000+ claimants still uncompensated after decades
  5. J&J Credo states 'patients first' but three bankruptcy attempts to suppress cancer claims directly contradict stated values; most severe values-action gap in healthcare sector

Israel

Countries · Applied · high confidence
27.8 → 8.8 (-19 pts)
developing → critical

72,289 Palestinian deaths in Gaza, UNRWA banned since March 2025, ICJ declared occupation unlawful, ICC arrest warrants for war crimes, West Bank displacement accelerating — warrants band change from Developing to Critical

Evidence record
  1. 72,289 Palestinians killed in Gaza since Oct 7, 2023; 172,040 injured; 1,079 killed in West Bank including 235 children; scale of harm dramatically exceeds any prior assessment period
  2. UNRWA banned from bringing international staff or aid into Gaza since March 2025; 127 facilities within Israeli-militarized zones; Kerem Shalom sole crossing creating massive humanitarian bottleneck
  3. ICJ advisory opinion (July 2024) declared occupation, settlements, and annexation unlawful; ICC issued arrest warrants for PM Netanyahu on war crimes and crimes against humanity charges (Nov 2024)
  4. West Bank displacement accelerating: 1,697 Palestinians displaced in Q1 2026 alone, more than all of 2025; 38 communities emptied by settler violence
  5. Security cabinet ratified measures facilitating expanded settlement land registration in Feb 2026, accelerating annexation despite ICJ ruling; ICC investigation challenge rejected by appeals chamber Dec 2025

Source intelligence

Primary-source alerts from overnight scanning. Each alert is linked to original regulatory filings, court records, investigative reports, and international legal instruments.

AI Labs — xAI/Grok

xAI is now facing simultaneously: (1) global CSAM/deepfake regulatory actions and civil suits across US, EU, UK, and SE Asia; (2) an NAACP + Earthjustice Clean Air Act lawsuit filed April 14, 2026 over illegal gas turbines in majority-Black Memphis communities. This is the most acute multi-front harm profile of any AI lab in the index. Already scored Critical (18.3/100). Fresh evidence suggests score may need further downward revision.

xai-grok
  1. cnbc.com
  2. earthjustice.org
  3. cnbc.com

Countries — Active Humanitarian Emergencies

Four countries at maximum severity: Sudan (world's largest humanitarian catastrophe, entering year 4 of war, April 15 reporting), Myanmar (5-year coup anniversary, junta atrocities surging), Russia (ongoing war crimes documentation, 5th year of full-scale Ukraine invasion), and Israel/Gaza (72,265 Palestinian deaths confirmed, humanitarian access blocked). Ethiopia also deteriorating rapidly with US aid cuts accelerating hunger crisis. These are all critical-band entities with overwhelming current evidence.

sudan, myanmar, russia, israel, ethiopia, china
  1. npr.org
  2. hrw.org
  3. ohchr.org
  4. ochaopt.org

Fortune 500 — Corporate Accountability

Three new major Fortune 500 stories distinct from previous scan: (1) Boeing secured non-prosecution agreement avoiding criminal charges for 737 Max deaths, with Fifth Circuit sealing that outcome March 31, 2026; (2) J&J's third bankruptcy attempt to suppress 90,000 talc lawsuits collapsed, $1.5B verdict in December; (3) Wells Fargo's $85M settlement for fake diversity interviews heads to final approval May 2026. Exxon also faces landmark Supreme Court climate accountability case this fall.

boeing, johnson-amp-johnson, wells-fargo, exxon-mobil, chevron
  1. usnews.com
  2. lawsuit-information-center.com
  3. hrdive.com

Robotics Labs — Deployment Safety

Boston Dynamics Atlas entering commercial production (CES 2026 announcement). Company explicitly publishing safety blueprint for humanoid deployment with fenceless human-detection systems — potential positive model for sector. Sector still lacks ISO standard for dynamically balancing legged robots (years from finalization). EU AI Act classifies as high-risk with Aug 2026 enforcement. No acute safety incident found but baseline assessment overdue.

boston-dynamics, figure-ai
  1. automate.org
  2. digitimes.com

Scores confirmed

Entities where research found published scores remain accurate. Confirmations are documented evidence, not silence.

Entity | Index | Band | Published | Assessed | Delta | Date | Finding
Meta Platforms | Fortune 500 | critical | 12.2 | 10.9 | -1.3 | | Child safety verdicts ($375M+), content moderation rollback, and DEI elimination confirm Critical band; delta below 5-point threshold
Amazon | Fortune 500 | developing | 21.6 | 18.4 | -3.2 | | Worker injury rates remain 2x industry average for 7 years; strong environmental program offsets worker safety failures; delta below 5-point threshold but near band boundary (Developing/Critical)
Boeing | Fortune 500 | critical | 9.1 | 5.0 | -4.1 | | DOJ NPA sealed accountability for 346 deaths; Fifth Circuit closed families' last legal avenue March 2026; NTSB found same quality failures recurring; delta below 5-point threshold, both scores firmly Critical
Sudan | Countries | critical | 0 | 0 | 0 | | World's worst humanitarian crisis confirmed at absolute floor: 34M needing aid, up to 400K dead, 14M displaced, famine confirmed, entering fourth year of war; published score of 0 is accurate
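
The findings above separate confirmations from change proposals by a 5-point threshold. A minimal Python sketch of that rule follows. It assumes the delta is the assessed score minus the published score and that a delta of exactly 5.0 counts as a change; the briefing does not say how that edge case is handled. The names `Assessment`, `classify`, and `CHANGE_THRESHOLD` are illustrative and not part of the benchmark's published methodology.

```python
from dataclasses import dataclass

# Threshold taken from the findings above ("delta below 5-point threshold").
# Treating a delta of exactly 5.0 as a change is an assumption of this sketch.
CHANGE_THRESHOLD = 5.0


@dataclass
class Assessment:
    entity: str
    published: float
    assessed: float

    @property
    def delta(self) -> float:
        # Round to one decimal place to match the precision of the scores.
        return round(self.assessed - self.published, 1)


def classify(a: Assessment) -> str:
    """Return 'confirmed' when the delta is under the threshold, else 'change proposed'."""
    return "confirmed" if abs(a.delta) < CHANGE_THRESHOLD else "change proposed"


# Scores as reported in this briefing.
assessments = [
    Assessment("Meta Platforms", 12.2, 10.9),
    Assessment("Amazon", 21.6, 18.4),
    Assessment("Boeing", 9.1, 5.0),
    Assessment("Sudan", 0.0, 0.0),
    Assessment("OpenAI", 60.8, 40.6),
]

for a in assessments:
    print(f"{a.entity}: {a.delta:+.1f} -> {classify(a)}")
```

Under these assumptions the four confirmed entities fall under the threshold, while OpenAI's delta of -20.2 lands in the change-proposal branch, which still goes to a human analyst for review.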

Key highlights

Editorial-level findings from the Apr 15 research cycle.

01

Three band changes proposed in a single night. OpenAI (Functional to Developing), Johnson & Johnson (Functional to Developing), and Israel (Developing to Critical). This is the highest concentration of band-change proposals in a single assessment cycle. Each is independently high-confidence with large deltas.

02

The Accountability dimension is in structural collapse across multiple sectors. Boeing (5.0), xAI (2.5), J&J (10.0), Israel (2.5), Meta (7.5) — every entity assessed tonight except Sudan (absolute floor) and Amazon (12.5) scores Accountability in the Critical band or near it. In every case, the pattern is identical: harm acknowledged only under legal or regulatory compulsion, never voluntarily. This is not entity-specific; it is a structural feature of large institutional behavior in the current environment.

03

xAI/Grok is the most acute harm profile in the AI Labs index. A score of 2.2/100 places it near the absolute floor — four dimensions at zero. The convergence of CSAM harm to minors, environmental racism in a majority-Black community, and organizational collapse represents a simultaneous multi-front failure with no precedent in the index. The NAACP lawsuit filed just yesterday (April 14) was not yet reflected in the published score.

04

Johnson & Johnson's downgrade reframes what a healthcare company can look like. A published score of 48.4 placed J&J in the top half of the Fortune 500 index, reflecting genuine healthcare contributions. The research reveals that the talc litigation — three failed bankruptcy attempts, internal knowledge of asbestos contamination concealed for decades, $1.5B verdict — is not an isolated legal matter but a fundamental character statement about how the company relates to harm it causes. The -27.5 point collapse in Accountability and Integrity dimensions is the most severe values-action gap in the healthcare sector.

05

Sudan's score of 0 is more meaningful than it appears. Of the 193 countries in the index, Sudan and a small group of crisis states occupy the absolute floor. Confirmation that Sudan remains at 0 after assessment — with confidence rated high — reflects that the evidence base is overwhelming and unambiguous. The assessment process works: it confirms what is already published when that score is accurate.

Sector intelligence

Analyst-level observations on patterns emerging across indexed sectors from the Apr 15 research cycle.

AI Labs

  • Safety infrastructure is unraveling at scale, not in isolation. OpenAI dissolved its superalignment team and failed to deliver on safety compute commitments. xAI launched image generation without child safety filters. Tonight's scan also flagged Anthropic (RSP pledge dropped) and Character AI (child harm lawsuit settlement). This is a sector-wide retreat from safety-first positioning under commercial and political pressure.
  • The gap between published AI Labs scores and current evidence is likely the largest systematic gap in the benchmark. Published scores for top-tier labs were calibrated during the 2023-2024 safety-pledge era. The dismantling of those commitments in 2025-2026 has not yet been captured across the index. OpenAI's -20.2 delta may be representative, not exceptional.
  • Accountability and Equity are the two most persistently weak dimensions. OpenAI: Accountability 31.3, Equity 31.3. xAI: Accountability 2.5, Equity 0. The pattern: labs acknowledge harm only when regulators or courts compel them.
  • xAI's environmental harm is a new category of AI lab failure. No other AI lab in the index faces an environmental racism lawsuit for unpermitted infrastructure. The NAACP/Earthjustice Clean Air Act action introduces a dimension of harm — community environmental harm concentrated in communities of color — that the benchmark has not previously encountered in this sector.

Fortune 500

  • A new era of direct corporate liability is emerging. Tonight's confirmations (Boeing, Meta, Amazon) each reflect entities where legal accountability mechanisms have been the primary driver of disclosed harm. Boeing's $444.5M victim fund exists because regulators and courts demanded it; Meta's $375M child safety verdict came through civil litigation. Corporate voluntary disclosure of harm remains near-zero.
  • The healthcare sector has a systematic valuation problem. J&J's published score of 48.4 was among the highest in the Fortune 500 index for a company facing active large-scale litigation. ESG infrastructure, workforce diversity, and pharmaceutical R&D quality appear to be inflating scores for healthcare companies without adequately penalizing sustained concealment of product-caused harm. CVS (50.0 published, insulin pricing litigation) and UnitedHealth (published score; nH Predict AI denial algorithm) should be assessed with the same scrutiny applied to J&J.
  • Boeing represents a critical-band floor confirmation. At 5.0 assessed vs. 9.1 published, Boeing's delta of -4.1 falls just below the change proposal threshold. Both scores are firmly Critical. The confirmation is methodologically useful: the published score accurately placed Boeing near the bottom of the index, and the research confirms the Critical-band characterization is still accurate.
  • Amazon remains a band-boundary risk. Assessed at 18.4 vs. published 21.6, Amazon sits 1.6 points below the Developing/Critical band boundary (20.0). The FTC antitrust trial (late 2026) and ongoing congressional investigation into injury data manipulation each represent independent vectors that could push a formal change proposal.
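
The band-boundary arithmetic for Amazon can be checked with a short sketch. Only the Developing/Critical boundary (20.0) is stated in this briefing, so no other boundaries are modeled; the function name `margin_to_boundary` and the signed-margin convention are assumptions for illustration.

```python
# Only the Developing/Critical boundary (20.0) is stated in the briefing;
# other band boundaries are not given and are deliberately left out.
DEVELOPING_CRITICAL_BOUNDARY = 20.0


def margin_to_boundary(score: float, boundary: float = DEVELOPING_CRITICAL_BOUNDARY) -> float:
    """Signed distance from the boundary: positive is above it, negative is below."""
    return round(score - boundary, 1)


published_margin = margin_to_boundary(21.6)  # Amazon's published score
assessed_margin = margin_to_boundary(18.4)   # Amazon's assessed score
print(published_margin, assessed_margin)
```

The published score sits the same distance above the boundary as the assessed score sits below it, which is why a sub-threshold delta can still straddle a band line.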

Countries

  • The countries index has an urgent assessment backlog in active conflict zones. Tonight's scan flagged Sudan, Israel, Russia, Myanmar, Ethiopia, and China — all with overwhelming current evidence of humanitarian crises. Only Sudan and Israel were assessed tonight. Russia (5th year of Ukraine invasion, documented war crimes), Myanmar (5-year coup anniversary, junta atrocities surging), and Ethiopia (US aid cuts accelerating hunger crisis) are all overdue for full assessment.
  • Israel's downgrade is methodologically significant for the countries index. The assessment explicitly applies the benchmark's scope to ALL populations an entity controls or governs — not just its own citizens. This is consistent with the methodology but represents a principled clarification: a country that maintains domestic governance while governing an occupied population under military control cannot score on domestic institutions alone.
  • Sudan's confirmed 0 score reflects the absolute-floor cluster. Multiple countries in the index (Sudan, Myanmar, Russia at 0) are confirmed at or near the floor. The relevant question for the benchmark is whether the floor cluster accurately captures relative severity within that tier — a question that formal assessment of Russia and Myanmar would begin to answer.

Robotics Labs

  • The humanoid deployment safety gap is widening. Boston Dynamics (Atlas production launch, fenceless human-detection safety systems, ISO/TC 299 participation) is emerging as a potential safety leader. Tesla Optimus, Figure AI, and other entrants are scaling to high volumes without comparable safety infrastructure disclosures.
  • The benchmark has limited current evidence for robotics labs because the sector generates less regulatory and litigation-based evidence than AI labs or Fortune 500 corporations. Assessment confidence for robotics entities will remain lower until the EU AI Act enforcement regime generates primary-source safety data.

Emerging risks

Forward-looking risk signals from the Apr 15 research cycle. These are not current findings — they are early warning flags.

Risk

EU AI Act enforcement (August 2, 2026 — 109 days away). High-risk system obligations become enforceable across all AI labs and robotics labs in the index. This is the single most consequential regulatory date in the benchmark's near-term horizon. Scanner monitoring frequency for EU regulatory actions should increase immediately. Entities with the most exposure: OpenAI, xAI/Grok, Figure AI, Tesla Optimus, Boston Dynamics.
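
The countdown above can be reproduced with a small date calculation. This is a sketch that assumes the count runs from the briefing date (2026-04-15) to the cited enforcement date; the helper name `days_until` is hypothetical.

```python
from datetime import date

ASSESSMENT_DATE = date(2026, 4, 15)  # date of this briefing
EU_AI_ACT_DATE = date(2026, 8, 2)    # enforcement date cited above


def days_until(target: date, today: date = ASSESSMENT_DATE) -> int:
    """Whole days from `today` to `target` (negative if the target has passed)."""
    return (target - today).days


print(days_until(EU_AI_ACT_DATE))
```

The same helper could be pointed at any other dated risk signal, provided an exact date is known.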

Risk

Federal Take It Down Act (enforceable May 2026 — 15 days away). The Act makes it a federal crime to publish non-consensual intimate images, including AI-generated deepfakes. xAI/Grok's CSAM deepfake scandal makes it the highest-exposure entity for first enforcement actions under this statute. This could trigger additional regulatory action against xAI within weeks of this assessment.

Risk

Exxon v. Boulder (Supreme Court — Fall 2026). The Supreme Court agreed to hear Boulder's climate accountability lawsuit against Exxon and Suncor (February 23, 2026). If the Court finds state courts can hold fossil fuel companies liable for climate damages, it could trigger a wave of litigation against Exxon, Chevron, and other Critical-band energy companies. Both Exxon (9.1 published) and Chevron (9.1 published) are unassessed and high-priority for the next scanner cycle.

Risk

Israel-Gaza humanitarian collapse. UNRWA banned from Gaza since March 2025, the sole crossing (Kerem Shalom) creating bottlenecks, and West Bank displacement accelerating beyond all prior years. The evidence base continues to deepen. The proposed downgrade from 27.8 to 8.8 reflects conditions that are still actively deteriorating as of the assessment date.

Risk

J&J settlement talks (April 2026). Settlement talks began April 13, 2026 after the third bankruptcy attempt collapsed. If a settlement is reached, it could represent the first genuine reparative action in this case and might partially rehabilitate the Accountability dimension. Assessor should monitor for outcome — a completed, court-approved settlement with fair claimant compensation could warrant a re-assessment within 6 months.

Risk

Healthcare sector score inflation. Tonight's J&J assessment reveals a potential systematic pattern: healthcare companies with strong ESG infrastructure may be scoring significantly higher than current litigation evidence warrants. CVS Health (50.0 published, active insulin pricing RICO suit), UnitedHealth Group, and AbbVie should be prioritized for early assessment to test whether the J&J pattern is sector-wide.

Research insights

Analytical observations from the Apr 15 research cycle. These are assessor-level interpretations, not findings.

Note

The benchmark is detecting a structural accountability deficit, not isolated failures. Across all four sectors assessed tonight, the single most consistently weak dimension is Accountability. The pattern is not that these entities are unaware of harm — Boeing's quality failures were documented internally, J&J's contamination was known for decades, OpenAI's safety gaps were raised by internal whistleblowers, and xAI's founders admitted foundational problems. The failure is institutional unwillingness to acknowledge harm proactively, accept external accountability mechanisms, or take reparative action without legal compulsion. This is a systemic finding, not entity-specific.

Note

Published scores for prominent, well-resourced entities may be systematically overstated. Three of four entities receiving downgrade proposals tonight (OpenAI at 60.8, J&J at 48.4, Israel at 27.8) were scored significantly higher than the evidence warrants. The common factor: all three have robust public communications, ESG reports, or formal institutional infrastructure that generated positive scoring inputs during original assessment. Evidence from litigation, regulatory action, and investigative journalism — which tends to lag institutional self-reporting — consistently tells a more damaging story. The benchmark's research methodology is specifically designed to weight this primary-source evidence, which is why these deltas are large.

Note

International legal institutions are generating the benchmark's most reliable evidence. ICJ advisory opinions, ICC arrest warrants, OHCHR special rapporteur reports, and UN agency situation reports produced the most precise and independently verified evidence in tonight's assessments. For the countries index, these sources are particularly valuable because they apply legal standards to government actions rather than relying on self-reported data. The Israel assessment draws on six independent international bodies all documenting the same pattern.

Note

The AI Labs index is the most volatile in the benchmark. AI labs are moving faster — in both capability development and harm profile — than any other entity type in the index. OpenAI's -20.2 delta and xAI's -16.1 delta are the two largest proposed changes tonight. The sector's 24-month news cycle (safety pledges made in 2023-2024, then systematically dismantled in 2025-2026) is creating large, compounding divergences between published scores and current evidence. The AI Labs index needs a faster reassessment cycle than other indexes — quarterly rather than annual.

Note

Boeing and Sudan are floor-level confirmations with different implications. Boeing (5.0 confirmed vs. 9.1 published) represents a corporation that has reached a kind of stability in institutional dysfunction — same quality failures, same accountability avoidance, across multiple aircraft programs and multiple leadership regimes. Sudan (0.0 confirmed) represents a state government that has ceased to function as a compassionate institution entirely; the suffering is being inflicted, not merely failing to be mitigated. Both confirmations are methodologically valuable: they establish that the published scores accurately captured these entities' trajectories.
