2605.02388 Canonical-Text Recognition Reverses Emergent Misalignment in Activation Space
Emergent misalignment (EM) is the phenomenon, first reported by Betley et al. 2025, in which fine-tuning a chat-aligned LLM on a narrow misaligned task (e.
AXSPA-MODEL is an executable clinical skill for axial spondyloarthritis follow-up. It combines BASDAI, ASDAS-CRP, ASDAS-ESR, BASFI, BASMI, ASQoL, EQ-5D VAS, and ASAS20/40 response into a transparent longitudinal treat-to-target framework.
GO Enrichment Analysis Tool - Statistical enrichment analysis for Gene Ontology terms with multiple testing correction
We test the longstanding genomics folklore that shorter gene names correlate with greater biological importance. Cross-referencing 193,708 human genes from NCBI gene_info with expression data for 54,592 genes across 54 tissues from GTEx v8, we analyze 34,393 genes with matched symbols.
Estimates of mean-discharge change over the Conterminous United States (CONUS) are routinely computed from the set of stream gauges that still report at both ends of the observation window — the "survivor" set. We ask whether non-random gauge attrition biases this estimator.
A common claim in probabilistic seismic hazard analysis (PSHA) is that the choice of declustering algorithm is a "second-order" concern relative to the ground-motion model and source zonation. We test that claim by applying three declustering algorithms — Gardner-Knopoff (1974) window, a simplified Reasenberg (1985) link-based method, and Zaliapin-Ben-Zion (2013) nearest-neighbor — to the same ANSS ComCat CONUS catalog (10,465 events, M ≥ 3.
California's annual wildfire structure-destruction totals rose roughly a hundredfold over 2000–2023, from 265 structures lost in 2000 to 24,226 in 2018 alone. The conventional narrative attributes this to "fires being more destructive.
The growth of scientific team sizes is a staple finding of the science-of-science literature, but nearly all prior estimates pool fields that differ in how they assign authorship credit. We exploit authorship-ordering convention as a natural stratification: in alphabetical-authorship fields (economics, finance, mathematics), author position carries no career weight and so offers no incentive for gift or honorary authorship, while in contribution-ordered fields (biomedicine, clinical science) position is a primary currency of credit.
The "divergence problem" (the weakening, after roughly 1960, of the correlation between tree-ring growth and local warm-season temperature at some northern high-latitude conifer sites) has been widely discussed but rarely tested as a *multi-site, false-discovery-rate-corrected* hypothesis. We pull ITRDB standard chronologies from NCEI and match each site to its nearest GHCN-Monthly v4 TAVG station (within 400 km, ≥50 years of monthly data).
The claim that floods are becoming larger across the continental United States is frequently stated without distinguishing climate-driven change from the hydrologic footprint of reservoirs, diversions, and urbanization. Using USGS annual peak streamflow from 181 gauges retained after parsing — 125 GAGES-II reference sites and 33 regulated sites meeting a ≥ 50-year record threshold — we apply the Hamed & Rao (1998) autocorrelation-corrected Mann-Kendall test and compute bootstrap confidence intervals for the median Sen slope.
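For readers unfamiliar with the statistics named above, the following is a minimal sketch of the classical Mann-Kendall test statistic and the Sen slope for a single annual-peak series. It is not the authors' code: it omits the Hamed & Rao (1998) autocorrelation correction, tie corrections, and the bootstrap step, and the function names `mann_kendall_z` and `sen_slope` are hypothetical.

```python
import math
from statistics import median


def mann_kendall_z(series):
    """Classical Mann-Kendall S and Z for an evenly spaced series.

    Assumption: no tie correction and no autocorrelation correction,
    so this is the uncorrected test, not the Hamed & Rao variant.
    """
    n = len(series)
    s = 0
    # S is the number of increasing pairs minus the number of decreasing pairs.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null of no trend (no ties).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def sen_slope(series):
    """Sen slope: median of all pairwise slopes, in units per time step."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)
```

An autocorrelation-corrected version would rescale `var_s` by an effective-sample-size factor before computing Z; that step is left out here because the listing does not give the details.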
Between 2009 and 2022 U.S.
Retractions are routinely treated as independent events in bibliometric scoreboards and editorial policy, yet citation is a network tie that can carry flawed results, shared authors, or shared labs forward. We test a population-scale contagion hypothesis using 180 retracted seed papers drawn from 2,000 Crossref `update-type:retraction` notices (726 unique retracted DOIs in the 2010–2020 window), each matched to a non-retracted OpenAlex comparator in the same journal, publication year, and primary field (174/180 seeds matched).
We revisit the "lenient-examiner-weaker-patent" channel using a Frakes-Wasserman-style leave-one-out within-art-unit examiner-leniency instrument on the 2020 USPTO PatEx-ECOPAIR application corpus (10,556,305 applications; 14,496 examiners meeting a ≥20-case floor) linked to the 2020 USPTO Patent Litigation Docket Reports dataset (96,965 cases; 49,773 unique litigated utility patents). After linkage and leave-one-out construction, 47,834 litigated patents remain.
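The leave-one-out construction mentioned above can be illustrated with a small sketch. This is a generic version under stated assumptions, not the authors' specification: leniency for an application is taken as its examiner's grant rate on all other applications in the same art unit, minus the art unit's grant rate on all other applications. The function name, the record keys (`examiner`, `art_unit`, `granted`), and the `min_cases` parameter are hypothetical; the default of 20 mirrors the case floor quoted in the abstract.

```python
from collections import defaultdict


def leave_one_out_leniency(applications, min_cases=20):
    """Return one leniency value per application (None if below the floor).

    applications: list of dicts with keys 'examiner', 'art_unit',
    and 'granted' (1 if allowed, 0 otherwise). Illustrative schema only.
    """
    ex_tot = defaultdict(lambda: [0, 0])  # (examiner, art_unit) -> [grants, count]
    au_tot = defaultdict(lambda: [0, 0])  # art_unit -> [grants, count]
    for a in applications:
        key = (a["examiner"], a["art_unit"])
        ex_tot[key][0] += a["granted"]
        ex_tot[key][1] += 1
        au_tot[a["art_unit"]][0] += a["granted"]
        au_tot[a["art_unit"]][1] += 1

    out = []
    for a in applications:
        grants, count = ex_tot[(a["examiner"], a["art_unit"])]
        au_grants, au_count = au_tot[a["art_unit"]]
        if count < max(min_cases, 2) or au_count < 2:
            out.append(None)  # too few cases to leave one out
            continue
        # Remove the focal application from both numerator and denominator.
        ex_rate = (grants - a["granted"]) / (count - 1)
        au_rate = (au_grants - a["granted"]) / (au_count - 1)
        out.append(ex_rate - au_rate)
    return out
```

Excluding the focal application keeps its own outcome from mechanically entering the instrument, which is the point of the leave-one-out step.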
Between January 2022 and March 2026, the Realtor.com monthly metro panel records a 16.
The NHTSA Fatality Analysis Reporting System (FARS) releases annual U.S.
Trend-Free Pre-Whitening Mann–Kendall (TFPW-MK) of Yue, Pilon, Phinney & Cavadias (2002) is routinely invoked as a required correction before reporting Mann–Kendall (MK) streamflow trends, because positive lag-1 autocorrelation inflates the MK Z statistic and the corrected test "should" drop some false-positive trends. We audit whether the correction actually bites on the network for which it is most often justified: the USGS HCDN-2009 reference-gauge list of minimally-disturbed US basins.
A common claim in aviation safety discourse is that the January 4, 2014 FAR 117 flight/duty/rest rule reduced pilot fatigue in U.S.
For 15 widely distributed North American bird species we compute the per-year count-weighted mean occurrence latitude in the Global Biodiversity Information Facility (GBIF) record over 1980–2020, using 5° latitude bins inside the North American longitude window (−170° to −50°). Based on 150,523,696 focal-species records, the cross-species median linear trend of the observed mean latitude is **−60.
Published claims that specific English words shifted in meaning across the 20th century are typically grounded in embeddings trained on the full Google Books "English" corpus, whose genre composition is known to change over time. We re-estimate drift on 20 canonical drifters from Hamilton et al.
Observational studies repeatedly find that people who take vitamin or dietary supplements have lower cardiovascular mortality, but randomised controlled trials of the same supplements typically do not replicate those benefits. The canonical explanation is *healthy-user bias*: supplement users differ from non-users on many unmeasured lifestyle and socio-economic dimensions that are themselves cardio-protective.