Daily Briefing

Today's analysis

The most significant editorial findings in the May 14 briefing.

Tonight produced the richest single-night methodology harvest in the benchmark's history: nine new conduct categories across fourteen assessments. The medical-science-denial enforcement sub-category (Senegal charging undetectable-viral-load patients with 'voluntary HIV transmission') and the aid-withdrawal-induced EMP compounding (South Sudan's famine trajectory amplified by upstream US aid policy) are the most analytically novel contributions.

Russia's bad-faith ceasefire-format compound now extends to five phases, the most architecturally dense conduct documentation in the floor cluster. The May 14 deployment of 1,560+ drones in 24 hours is the largest single-day strike package since the ceasefire expired. The five-phase template (announce, prepare, strike, sustain, large-scale escalation) is now available as a reference framework for any future ceasefire-format assessment.

xAI's third Grok governance failure in five months establishes a systemic pattern that no single-employee root-cause attribution can explain. The partial-external-accountability-reversal sub-anchor (GitHub system-prompt transparency pledge) is the first positive signal in an AI Labs floor entity — crediting it without moving the entity off floor required a new methodology sub-distinction between 'partial' and 'full' external-accountability-reversal.

The DRC Amnesty 71-interview primary-source report is the highest-quality new evidence generated in the May 14 cycle. It reverses the April 19 Switzerland-talks credit and documents sexual violence as a weapon (forced marriage, forced pregnancy) under state non-protection. 'Stated-commitment-operational-hollowing' now appears across three floor-cluster entities this cycle (DRC, Myanmar, Russia) — a cross-cluster pattern warranting unified methodology treatment.

Nigeria's proposed 21.9 sits 0.9 points above the Critical threshold — the narrowest active boundary margin in the countries index. The world's highest acute food-insecurity count (35 million) and 26% year-over-year lean-season worsening make this a credible boundary case, not a rounding artifact. One additional adverse evidence cycle (lean season May-October) could trigger the crossing.

Signal stack

9 signals

Countries · high

Senegal Medical-Science-Denial Enforcement — New Conduct Category

Senegal is charging patients with undetectable viral loads for 'voluntary HIV transmission' — a scientifically impossible harm.

Countries · critical

Russia Five-Phase Bad-Faith Ceasefire Compound Complete

May 14's 1,560+ drones in 24 hours — 670 attack drones and 56 missiles in a single overnight strike — constitute the fourth and largest phase of the post-ceasefire offensive surge.

Countries · high

DRC Amnesty 71-Interview Primary Source — 'I'd Never Seen So Many Bodies'

Amnesty International's May 5 report documents ADF war crimes and crimes against humanity through 71 interviews: civilian massacres, forced marriage, forced pregnancy, child recruitment, abductions.

Countries · high

Nigeria Boundary Case — 0.9 Above Critical Threshold

Nigeria's proposed score of 21.9 sits 0.9 points above the Critical band floor (21.0).

medium

AI Labs — xAI Third Governance Failure (Same-Day Event)

May 14 3:15 AM PST: unauthorized Grok system-prompt modification directing politically biased responses.

medium

AI Labs — Musk v. Altman Closing Arguments

Closing arguments May 14-15 in Oakland.

medium

Countries — Sudan-South Sudan Compound Humanitarian Crisis

Sudan (880+ drone deaths Jan-Apr 2026; survival-infrastructure targeting) and South Sudan (full-scale famine warning from UN ERC; Fangak hospital strike May 3) both document escalating compound crises in adjacent territories.

medium

Countries — South Asia Legislative Regression Cluster

India (Transgender Amendment Bill removing NALSA rights; -3.9 proposed) and Senegal (medical-science-denial enforcement; -3.9 proposed) both show four-vector harm expansions since their most recent assessments.

medium

Countries — Active Conflict Zone Multi-Entity

May 14 active conduct events: Russia (1,560+ drones, 1 killed/36 injured in Kyiv), Myanmar (three discrete airstrike events, 6 killed including child), Sudan (Kornoi water well; 880+ drone deaths documented), South Sudan (hospital strike, famine warning).

Score change detail

Full evidence record for entities with score changes in this cycle.

Countries · medium confidence

34.4 → 30.5

-3.9 pts

developing

Evidence record

https://www.amnesty.org/en/latest/news/2026/03/india-presidential-approval-of-regressive-transgender-bill-a-major-step-backward-for-human-rights/
https://www.hrw.org/news/2026/04/17/india-proposed-rules-to-expand-online-censorship
https://eng.mizzima.com/2026/04/09/32997
https://scroll.in/article/1082411/how-india-allegedly-deported-40-rohingya-refugees-by-forcing-them-into-the-andaman-sea
scanner-aggregated

Boundary watch resolution

Composite at 30.5 sits 9.5 points above the Critical threshold (21.0) and 9.5 points below the Functional floor (40.0). Not a boundary case.

Countries · medium confidence

37.5 → 33.6

-3.9 pts

developing

Evidence record

https://www.thenewhumanitarian.org/feature/2026/05/05/senegal-anti-gay-law-criminalises-hiv-infection-hits-services
https://www.unaids.org/en/resources/presscentre/pressreleaseandstatementarchive/2026/march/20260318_Senegal_law_LGBTQ
https://www.hivjustice.net/news-from-other-sources/senegal-legal-and-human-rights-criminalisation-cases/
https://theworld.org/segments/2026/05/05/in-senegal-concerns-mount-over-impact-of-anti-lgbtq-laws-on-hiv-treatment

Boundary watch resolution

Composite at 33.6 sits 12.6 points above Critical threshold. Not a boundary case. Further downward movement in subsequent cycles could approach Critical if Article 319 enforcement intensifies.

Countries · medium confidence

4.4 → 2.3

-2.1 pts

critical

Evidence record

https://www.amnesty.org/en/latest/news/2026/05/drc-rampant-adf-abuses-against-civilians-war-crimes-which-the-world-must-not-continue-to-ignore/
https://www.amnesty.org/en/documents/afr62/0860/2026/en/
https://www.aljazeera.com/news/2026/5/4/extensive-brutality-rebel-attacks-reap-hell-on-congolese-civilians
https://www.unocha.org/publications/report/democratic-republic-congo/facing-critical-funding-gap-humanitarian-community-drc-forced-strictly-prioritize-its-response-2026

Boundary watch resolution

Composite at 2.3 sits 18.7 points below Critical-Developing threshold (21). Not a boundary case. Floor-proximate.

Countries · medium confidence

23.4 → 21.9

-1.5 pts

developing

Evidence record

https://news.un.org/en/story/2026/01/1166857
https://www.thenewhumanitarian.org/opinion/2026/04/27/deliver-better-humanitarian-response-fix-nigeria-state-dysfunction
https://www.hrw.org/world-report/2026/country-chapters/nigeria

Boundary watch resolution

BOUNDARY CASE — composite at 21.9 sits 0.9 points above Critical-Developing threshold (21.0). Logged per boundary-case protocol. Conservative interpretation preserves Developing-band placement. Any additional negative evidence in subsequent cycles risks Critical classification.

Next assessment triggers

lean-season-trajectory

Democratic Republic of Congo

Score movements

All entities assessed this cycle, with or without a score change.

14 assessed

AI Labs

xAI

critical

Countries

Russia

critical

Countries

Myanmar

critical

Countries

Sudan

critical

Countries

South Sudan

critical

Countries

Israel

critical

Countries

India

34.4 → 30.5 (-3.9 pts)

developing · medium

Countries

Senegal

37.5 → 33.6 (-3.9 pts)

developing · medium

Countries

Democratic Republic of Congo

4.4 → 2.3 (-2.1 pts)

critical · medium

Countries

Hungary

41.4

developing

Countries

Pakistan

20.3

developing

Countries

Ukraine

functional

Fortune500

UnitedHealth Group

11.4

critical

Countries

Nigeria

23.4 → 21.9 (-1.5 pts)

developing · medium

Boundary watch · 2 entities near a band threshold

Entities approaching band boundaries

Countries

Nigeria

21.9

—

BOUNDARY CASE — proposed composite 21.9 sits 0.9 points above Critical threshold (21.0). One additional adverse evidence cycle risks band crossing to Critical.

Countries

Hungary

41.4

—

CARRY-FORWARD PENDING — May 13 band-crossing proposal (Developing → Functional, 37.5 → 41.4) remains pending. Today's EU Commission team dispatch to Budapest is sub-threshold supportive but not a new scoring trigger. Next material triggers: May 25 Brussels visit, May 31 Sulyok dismissal deadline.
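
The boundary arithmetic in these entries can be sketched in a few lines. This is a minimal illustration, not the benchmark's own tooling. The 21.0 floor is taken from the Nigeria entry, and the 40.0 Functional floor is inferred from the briefing's statement that 39.1 sits 0.9 below it. The function names, the inclusive-floor rule, and the 1.0-point watch window are assumptions made for the example.

```python
import bisect

# Band floors as read from this briefing. Bands above Functional are omitted
# because their floors are not stated here.
THRESHOLDS = [21.0, 40.0]
BANDS = ["critical", "developing", "functional"]

# Assumption: the briefing does not define how close counts as a boundary case.
BOUNDARY_WINDOW = 1.0


def band_for(score: float) -> str:
    """Return the band for a composite, treating each floor as inclusive (assumption)."""
    return BANDS[bisect.bisect_right(THRESHOLDS, score)]


def margin_to_nearest(score: float) -> tuple:
    """Return (nearest threshold, signed margin); a positive margin means above it."""
    nearest = min(THRESHOLDS, key=lambda t: abs(score - t))
    return nearest, round(score - nearest, 1)


def is_boundary_case(score: float, window: float = BOUNDARY_WINDOW) -> bool:
    """Flag a composite whose distance to the nearest threshold is within the window."""
    _, margin = margin_to_nearest(score)
    return abs(margin) <= window


if __name__ == "__main__":
    for name, score in [("Nigeria", 21.9), ("Senegal", 33.6), ("Hungary", 41.4)]:
        threshold, margin = margin_to_nearest(score)
        print(name, band_for(score), threshold, margin, is_boundary_case(score))
```

Under the assumed window, only Nigeria is flagged by distance alone. Hungary appears on the watch list because of its pending band-crossing proposal, which a pure proximity check like this one would not capture.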

Risk signals

Developments that may affect future scores. Watch items from the May 14 briefing.

Risk

Nigeria lean-season boundary crossing

Nigeria's proposed composite of 21.9 sits 0.9 points above the Critical band threshold (21.0). The May-October lean season is the immediate risk window: 5.8M are already in acute food insecurity at lean-season onset, up 26% year-over-year. If OCHA or UN IPC data documents deterioration in June-July, the EMP dimension will face further downward pressure on an entity already at the boundary. Nigeria entering the Critical band would be the most significant countries-index band change of the year after Hungary.

nigeria

Window: 2026-05 to 2026-10 (lean season)

Risk

Senegal EQU dimension approaching floor

Senegal's EQU dimension is now at 1.25 — approaching the 1.0 floor. Two consecutive cycles of downward EQU movement (May 9 band crossing, May 14 further regression) establish a clear trajectory. If Article 319 enforcement continues at the current rate (70+ detained, 24 charged, first 6-year conviction) or expands to additional target populations, EQU could reach floor in the next 1-2 cycles. An EQU floor designation would trigger a comprehensive review of Senegal's band placement.

senegal

Window: 2026-06 to 2026-09 (Article 319 prosecutorial pipeline)

Risk

Open Bionics formula audit publication integrity

Open Bionics now at 12 cycles without formula audit resolution. Published 97.5 carries a -10.0 reconstruction discrepancy. At 12 cycles this is an active publication-integrity crisis, not a maintenance item. Every cycle this remains unresolved, the benchmark publishes a score known to be materially incorrect by 10 points across one band boundary (Exemplary vs. Established).

open-bionics

Window: Immediate (audit must begin this week)

Risk

Musk v. Altman verdict cascading AI Labs scoring event

Closing arguments begin May 14-15; advisory jury deliberates with Judge Gonzalez Rogers ruling expected mid-May. A charitable-trust-breach finding against OpenAI would be a direct ACC/SYS scoring event for OpenAI (hold expires May 21) and a contextual event for Microsoft (hold expires May 15). The trial has also produced the xAI admission of OpenAI model distillation — a potential INT/BND signal for xAI independent of the verdict. Three held AI Labs entities (Anthropic, Microsoft, OpenAI) and one expired hold (Google) create the largest single-week AI Labs scoring queue since April.

openai microsoft anthropic google xai-grok

Window: 2026-05-14 to 2026-05-21

Risk

Sudan-South Sudan compound regional collapse

Sudan (880+ drone deaths documented; airport and water-well strikes; conflict de-regionalized to all major population centers) and South Sudan (full-scale famine warning from UN ERC; hospital strike; US aid cuts compounding EMP failure) represent a compound regional collapse trajectory. Both entities are at floor; neither has a credible floor-exit pathway. The aid-withdrawal-induced EMP compounding in South Sudan introduces a new risk vector: external policy decisions (US aid policy) can amplify humanitarian failure in assessed entities independently of those entities' own conduct.

sudan south-sudan

Window: 2026-05 onwards

Failure modes in this briefing

Recurring patterns the ACB methodology tracks as structural barriers to institutional compassion. Detected from evidence documented in this cycle.

Failure mode

Stated commitment operational hollowing

Public commitments are maintained in language while the operational machinery to fulfill them is dismantled, under-resourced, or conditionally applied. The commitment becomes a rhetorical position rather than a behavioral constraint.

Detected in: Analysis

Methodology innovation · 9 new conduct categories

New analytical categories

The ACB framework is extended when conduct patterns appear that existing categories cannot capture. Each new category is dated and tied to its first-application entity, creating an auditable record of framework evolution.

Draft

compound-governance-failure-cluster-with-partial-external-accountability-reversal

A floor entity's third structurally distinct governance failure category combined with a same-day partial external-accountability-reversal sub-anchor positive (public disclosure of bad conduct under operational pressure with forward-looking remediation promise only). Distinguishes 'full' from 'partial' external-accountability-reversal: full requires structural remediation; partial requires only public disclosure and a remediation commitment not yet executed.

First applied to: xAI

Dated: 2026-05-14

Draft

bad-faith-ceasefire-format-compound-with-large-scale-escalation

A five-phase compound format: (1) announce ceasefire; (2) prepare under cover of ceasefire window; (3) strike night 1; (4) sustain night 2; (5) large-scale escalation day 4. The fifth phase — large-scale escalation — extends beyond the prior three-phase template documented May 9-13.

First applied to: Russia

Dated: 2026-05-14

Draft

floor-conduct-with-survival-infrastructure-targeting

Drone warfare targeting survival infrastructure (water wells, airports) in a single 10-day window, at national scale. Water-source targeting is an IHL-protected target class requiring isolated documentation. ...

First applied to: Sudan

Dated: 2026-05-14

Draft

aid-withdrawal-induced-EMP-compounding

EMP-dimension failure in an assessed entity amplified by an upstream-actor decision (US aid cut) that is outside the assessed entity's control. Documents downstream harm compounding where the amplifying cause is a distinct actor's policy choice.

First applied to: South Sudan

Dated: 2026-05-14

Draft

medical-science-denial-enforcement

A state's use of its criminal-justice system to enforce a charge under a scientifically-incorrect framework — specifically, prosecuting individuals with undetectable viral loads (medically untransmittable) for 'voluntary HIV transmission.' Simultaneously an EMP failure (denial of medical reality of the population prosecuted), an ACC failure (criminal-justice system used to enforce incorrect charges), and a healthcare-system harm (25.6% treatment-attendance drop from chilling effects).

First applied to: Senegal

Dated: 2026-05-14

Draft

accountability-decay

Non-accountability for a prior near-floor-conduct event treated as a contemporary ACC-dimension failure. The absence of accountability in the current assessment window is itself scored as a failure, not merely noted as background context. ...

First applied to: India

Dated: 2026-05-14

Draft

stated-commitment-operational-hollowing

A positive scoring event from a prior assessment cycle (April 19 Switzerland talks commitment to protect civilians and ease aid) is reversed when contemporaneous primary-source evidence documents that the commitment did not translate into operational change. The hollowing is documented by the temporal overlap of the stated commitment and the adverse evidence (Amnesty 71-interview report covering the same window).

First applied to: Democratic Republic of Congo

Dated: 2026-05-14

Draft

proactive-disclosure-under-pressure-before-legislative-compulsion

An institution discloses or implements a transparency reform in response to regulatory or legislative pressure, but before the compulsory mechanism is fully operationalized. Sits between full external-accountability-reversal (voluntary) and reactive compliance (compelled). ...

First applied to: UnitedHealth Group

Dated: 2026-05-14

Draft

coercive-diplomacy-under-ceasefire

A state uses the threat of military resumption as a diplomatic coercion tool during an active ceasefire, to compel a specified behavior from the opposing party. Constitutes a B1 consent failure (unilateral threat violates ceasefire consent framework) and an I1 consistency failure (ceasefire-professed posture contradicted by war-resumption threat). ...

First applied to: Israel

Dated: 2026-05-14

Confirmed positions

Entities reassessed for this briefing where published scores remain supported by current evidence.

Confirmed positions from the May 14 briefing.
Entity	Index	Band	Published	Assessed	Delta	Date	Finding
Pakistan (Watch)	Countries	developing	20.3	20.3	0	May 14	Pakistan MONSOON WATCH DAY 2: confirmed at 20.3 Developing. No mass-casualty flooding in 24h window. PDMA/NDMA active response. Watch closes May 17 if no mass-casualty event documented from May 13-17 weather system.
Ukraine	Countries	functional	50	50	0	May 14	Ukraine ASYMMETRIC-CONDUCT CONFIRMATION at 50.0 Functional. May 14 Russia mass strike demonstrates pattern asymmetry: Ukraine sustained defensive posture, targeted Russian command posts only. +3.1 post-ceasefire credit sustained. No evidence of Ukrainian-attributed large-scale civilian targeting in 24h window.
UnitedHealth Group	Fortune500	critical	10.9	11.4	+0.5	May 14	UnitedHealth Group DOCUMENTED: +0.5 sub-threshold movement (10.9 → 11.4). Optum Rx transparent pricing model credited as first proactive transparency reform (+0.125 INT half-step) under new 'proactive-disclosure-under-pressure-before-legislative-compulsion' sub-category. Methodology precedent under formalization. DOJ probe and court-ordered AI disclosure offset further credit.

Floor conduct record

Cycle-specific conduct documentation for entities at composite zero, recorded for the May 14 briefing.

Floor · Critical

xAI

Primary sources

1. x.com
2. justsecurity.org

Floor · Critical

Russia

Primary sources

Floor · Critical

Myanmar

Primary sources

1. moemaka.net
2. moemaka.net

Floor · Critical

Sudan

Primary sources

1. ohchr.org
2. euronews.com

Floor · Critical

South Sudan

Primary sources

1. aljazeera.com
2. hrw.org

Floor · Critical

Israel

Primary sources

1. aljazeera.com
2. un.org

Math hygiene

Entities where published composite and reconstructed composite diverge. Tracked openly as a publication-integrity obligation.

Open Bionics is at 12 cycles (incremented from 11). The formula audit is a CRITICAL BLOCKING item. 13 entities total unchanged. Math-hygiene flags from today's cycle: India composite math 30.47 → reported 30.5 (within 0.5pt tolerance, no flag); Senegal math 32.81 → reported 33.6 (0.8pt rounding gap, within tolerance but tracked); UnitedHealth math 11.33 → reported 11.4 (0.1pt tolerance, clean). No new flags added to cluster.
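
The gap check described here can be reproduced with a short script. It is a sketch under stated assumptions: the reconstructed and reported composites are the three pairs quoted above, while the function names and the idea of a single numeric tolerance are hypothetical. The briefing quotes more than one tolerance figure, so the threshold is left as a parameter rather than fixed.

```python
# (reconstructed, reported) composites quoted in this briefing.
CHECKS = {
    "India": (30.47, 30.5),
    "Senegal": (32.81, 33.6),
    "UnitedHealth Group": (11.33, 11.4),
}


def gap(reconstructed: float, reported: float) -> float:
    """Absolute difference between reported and reconstructed composite, to 2 decimals."""
    return round(abs(reported - reconstructed), 2)


def audit(checks: dict, tolerance: float) -> dict:
    """Map each entity to (gap, within_tolerance) for a caller-supplied tolerance."""
    report = {}
    for entity, (reconstructed, reported) in checks.items():
        g = gap(reconstructed, reported)
        report[entity] = (g, g <= tolerance)
    return report


if __name__ == "__main__":
    # 0.5 is the tolerance quoted for India; applying it to every entity is an assumption.
    for entity, (g, ok) in audit(CHECKS, tolerance=0.5).items():
        print(f"{entity}: gap {g:.2f}, within tolerance: {ok}")
```

With the tolerance set to 0.5, India and UnitedHealth Group pass and Senegal's 0.79 gap does not, so the "within tolerance" reading for Senegal presumably rests on a wider tolerance that this briefing does not state.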

Carry-forward dimensional credits

5 entities with documented pressure not yet reflected in composite

Hungary

41.4→—

Ukraine

50→—

Waymo

35.9→—

Vanuatu

35.9→—

Mongolia

48.4→—

Held this cycle

5 entities deferred with documented reason

AI Labs
Anthropic
Pentagon blacklist maintained / White House carve-out EO in drafting / Mythos dual-use cybersecurity deployment confirmed despite blacklist / Claude Opus 4.7 released May 13. Full assessment queued for tomorrow.
Fortune500
Microsoft
Musk v. Altman closing arguments proceeding May 14-15. Nadella testified May 11: Musk never raised concerns to him; Microsoft to spend $100B+ on OpenAI by June 2026. Hold expires tomorrow regardless of verdict timing.
AI Labs
Google
Hold EXPIRED today (May 14). No material compassion event found May 14. Queued for May 15 standard rotation assessment. Key evidence: DOJ remedies order (April 14), cross-appeal filed, EU DMA team dispatch, choice screen mandate.
AI Labs
OpenAI
Musk v. Altman trial closing arguments May 14-15. Advisory jury to deliberate. Judge Gonzalez Rogers ruling expected mid-May. Hold maintained through verdict.
Robotics Labs
Open Bionics
Math-hygiene formula audit hold — CRITICAL BLOCKING, 12 cycles open. Published 97.5 carries -10.0 discrepancy. Do NOT re-queue for assessment until formula audit is complete.

Forward signals

Calendar of upcoming scoring events the methodology pipeline is tracking.

May 14-15·1 signal

Musk v. Altman
Closing arguments in Oakland. Advisory jury deliberates. Charitable-trust-breach finding would be ACC/SYS scoring event for OpenAI and Microsoft. Pre-stage both for post-verdict assessment.

May 15·3 signals

Anthropic
Hold expires. Full assessment: Pentagon blacklist vs. safety red-lines vs. White House carve-out vs. Mythos dual-use deployment vs. Claude Opus 4.7 release vs. Claude for Small Business. Most complex AI labs scoring event in active pipeline.
Microsoft
Hold expires. Assess post-closing-argument. If verdict arrives before assessment: incorporate charitable trust ruling as ACC/SYS signal. Nadella testimony documented.
Google
Hold expired today. Standard rotation assessment May 15. Key evidence: DOJ remedies order (April 14), cross-appeal filed, EU DMA comment period closed May 13, choice screen mandate, no exclusive distribution contracts.

May 17·1 signal

Pakistan
Pakistan monsoon watch closes May 17. If NDMA sitreps document significant casualties or displacement from May 13-17 weather system, EMP dimension update required. Check ndma.gov.pk/sitrepm.

May 20·1 signal

Vanuatu
UNGA vote on Vanuatu ICJ climate resolution. Positive INT-dimension event if passed. Re-scan May 21 for Marshall Islands, Kiribati, Timor-Leste (all at 39.1 — 0.9 below Functional floor). Vote outcome could trigger band crossings.

May 21·1 signal

OpenAI
Hold expires on estimated verdict. Breach-of-charitable-trust finding is ACC/SYS scoring event. Pacific cluster reassessment same day.

May 25·1 signal

Hungary
Magyar-Brussels visit (von der Leyen summit). EU fund pathway discussion. First concrete legislative milestone signal expected. ACT/SYS dimension evidence generation.

Analytical notes

Observations on methodology, evidence quality, and structural patterns from the May 14 briefing.

Note

May 14 documented the first 'partial external-accountability-reversal' sub-anchor positive in an AI Labs floor entity (xAI's GitHub system-prompt pledge) and the first 'medical-science-denial enforcement' designation (Senegal charging undetectable-viral-load patients). Both require new methodology sub-distinctions: the first between partial and full external-accountability-reversal; the second formalizing prosecutorial science-denial as a compound ACC+EMP failure category. Nine new conduct categories in a single cycle suggests the methodology's vocabulary is expanding faster than its consolidation pace — a sign of a rich evidentiary environment but also a methodological housekeeping need.

Note

The 'stated-commitment-operational-hollowing' pattern now recurs across three floor-cluster entities in this cycle alone: DRC (April 19 Switzerland talks vs. May 5 Amnesty report), Myanmar (April 21 100-day peace plan vs. May 6-14 strike cluster), Russia (May 9-11 ceasefire format vs. May 9-14 offensive surge). This cross-cluster frequency suggests the pattern is systematic enough to warrant a unified conduct category rather than entity-specific designations. Formalizing it as a top-level category would allow the benchmark to track the prevalence of stated-commitment gaps across the full entity set as a structural indicator of institutional integrity.

Floor designations

8 entities at composite 0 with documented evidence pattern

Composite scores resolving at zero — methodology disclosure

These entities have all 8 dimensions resolving at the lowest behavioral anchor (1.0/5.0) across multiple assessment cycles. Read the methodology.
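
One way to see why all eight dimensions at the 1.0 anchor resolve to a composite of zero is a linear rescale of the mean dimension score from the 1.0-5.0 anchor range onto 0-100. The briefing does not publish its formula, so the mapping below is an assumption made for illustration, and the function name is hypothetical.

```python
ANCHOR_MIN = 1.0    # lowest behavioral anchor, from the text above
ANCHOR_MAX = 5.0    # highest behavioral anchor, from the text above
DIMENSION_COUNT = 8  # number of dimensions, from the text above


def composite(dimension_scores: list) -> float:
    """Rescale the mean dimension score linearly to 0-100 (assumed mapping)."""
    if len(dimension_scores) != DIMENSION_COUNT:
        raise ValueError(f"expected {DIMENSION_COUNT} dimension scores")
    if any(not ANCHOR_MIN <= s <= ANCHOR_MAX for s in dimension_scores):
        raise ValueError("dimension scores must lie between 1.0 and 5.0")
    mean = sum(dimension_scores) / DIMENSION_COUNT
    return round((mean - ANCHOR_MIN) / (ANCHOR_MAX - ANCHOR_MIN) * 100, 2)


if __name__ == "__main__":
    print(composite([1.0] * 8))            # every dimension at the lowest anchor
    print(composite([1.125] + [1.0] * 7))  # one dimension raised by 0.125
```

Under this assumed mapping, a 0.125 step on a single dimension moves the composite by about 0.39 points, which gives a sense of how small one dimensional credit is against the band thresholds.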

AI Labs

Character AI

BND ACC EMP INT

AI Labs

Palantir AI

AWR EMP ACC EQU INT

AI Labs

xAI/Grok

AWR EMP ACC INT

Countries

Israel

AWR EMP EQU BND ACC INT

Countries

Myanmar

AWR EMP EQU BND ACC INT

Countries

South Sudan

AWR ACT EQU ACC SYS INT

Countries

Sudan

AWR ACT EQU BND ACC INT

Robotics Labs

Ghost Robotics