Trojan Paper Medical Benchmark presents a web-first workflow for evaluating LLM metacognitive robustness against retracted medical evidence. It discovers retracted studies from public online sources, constructs benchmark cases with unreliable-claim and retraction context, and runs a two-stage target-plus-judge evaluation pipeline with contamination-sensitive metrics.
Resumption of oral anticoagulation (OAC) after a major gastrointestinal bleed (GIB) in atrial fibrillation (AF) is a recurring clinical question without a published, transparent, domain-weighted net-benefit tool. Observational cohorts consistently report lower all-cause mortality and lower thromboembolic events in patients restarted on OAC versus permanently withheld, but also elevated rebleed rates with hazard ratios clustering between 1.
Rechallenge with immune checkpoint inhibitors (ICIs) after a grade 3 or higher immune-related hepatitis (irHepatitis) is a recurring clinical question without a published, transparent, domain-weighted risk tool. Published retrospective series report pooled recurrence rates of any-grade immune-related adverse event (irAE) on rechallenge in the 25-55% range, with recurrence of the same-organ irAE clustered at the upper end, but effect sizes for individual modifiers (time-to-resolution, peak ALT, steroid taper duration, combination vs.
Executable clinical skill for steroid-induced hyperglycemia risk stratification using baseline glycemic vulnerability, glucocorticoid exposure burden, and host susceptibility in rheumatic and autoimmune disease.
Tumour-associated neutrophils (TANs) in hepatocellular carcinoma (HCC) span a continuous activation spectrum from anti-tumour antigen-presenting states to pro-tumour angiogenic and immunosuppressive states [Grieshaber-Bouyer et al., Nature Communications, 2021; Antuamwine et al.
Clinical enzyme testing is one of the most frequently ordered laboratory panels in healthcare, yet its interpretation remains heavily dependent on physician experience and implicit knowledge. We present **ClinicalEnzymeDiagnostics-Skill**, an open-source AI agent that transforms routine clinical chemistry data into structured differential diagnoses using Bayesian probabilistic reasoning.
GWASEngine is a complete GWAS analysis pipeline implemented entirely in Python using NumPy, SciPy, and scikit-learn. It comprises six modules: quality control (QC), linear regression GWAS, LD clumping, polygenic risk scores by clumping and thresholding (C+T), Bayesian fine-mapping (Wakefield ABF), and LD Score Regression.
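As an illustration of the fine-mapping step named above, the following is a minimal sketch of Wakefield's approximate Bayes factor (ABF) turned into posterior inclusion probabilities under a single-causal-variant assumption. It is not taken from GWASEngine: the function names, the prior standard deviation of 0.2, and the example effect sizes are all assumptions made for illustration.

```python
import numpy as np

def wakefield_log_abf(beta, se, prior_sd=0.2):
    """Log approximate Bayes factor for association (H1 versus H0).

    beta, se : per-variant effect estimates and standard errors.
    prior_sd : assumed prior standard deviation of the true effect
               (0.2 is an illustrative choice, not a value from the source).
    """
    beta = np.asarray(beta, dtype=float)
    se = np.asarray(se, dtype=float)
    v = se ** 2                    # sampling variance of the estimate
    w = prior_sd ** 2              # prior variance of the true effect
    z2 = (beta / se) ** 2          # squared z-score
    shrink = w / (v + w)           # shrinkage factor W / (V + W)
    # log ABF = 0.5 * log(V / (V + W)) + 0.5 * z^2 * W / (V + W)
    return 0.5 * np.log(1.0 - shrink) + 0.5 * z2 * shrink

def posterior_inclusion(log_abf):
    """Normalise log ABFs into probabilities, assuming exactly one causal variant."""
    log_abf = np.asarray(log_abf, dtype=float)
    weights = np.exp(log_abf - log_abf.max())   # subtract the max for numerical stability
    return weights / weights.sum()

if __name__ == "__main__":
    # Hypothetical summary statistics for three variants in one locus.
    beta = [0.02, 0.15, 0.05]
    se = [0.03, 0.03, 0.03]
    print(posterior_inclusion(wakefield_log_abf(beta, se)))
```

Working in log space avoids overflow when a variant has a very large z-score, which is why the Bayes factors are exponentiated only after subtracting the maximum.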
Heart rate variability (HRV) metrics are widely used in clinical cardiology, psychophysiology, and consumer wellness applications, yet the accuracy of these metrics relative to known autonomic ground truth has never been systematically quantified. This study presents the first comprehensive benchmark of 14 standard HRV metrics — 7 time-domain and 7 frequency-domain — computed from synthetic RR-interval series with exactly specified autonomic parameters.
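To make the benchmarking idea concrete, the sketch below computes a few common time-domain HRV metrics (SDNN, RMSSD, pNN50) from an RR-interval series in milliseconds. The function name and the toy sinusoidal series are assumptions for illustration; the study's own synthetic generator and its full set of 14 metrics are not reproduced here.

```python
import numpy as np

def time_domain_hrv(rr_ms):
    """Common time-domain HRV metrics from RR intervals given in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)                              # successive RR differences
    return {
        "mean_rr": rr.mean(),
        "sdnn": rr.std(ddof=1),                     # sample standard deviation of RR intervals
        "rmssd": np.sqrt(np.mean(diff ** 2)),       # root mean square of successive differences
        "pnn50": 100.0 * np.mean(np.abs(diff) > 50.0),  # percent of |differences| above 50 ms
    }

if __name__ == "__main__":
    # Hypothetical synthetic series: 800 ms baseline with a sinusoidal modulation.
    beats = np.arange(300)
    rr = 800.0 + 40.0 * np.sin(2.0 * np.pi * beats / 4.0)
    print(time_domain_hrv(rr))
```

Because the generating parameters of a synthetic series are known exactly, each metric's output can be compared with the value implied by those parameters, which is the kind of ground-truth comparison the abstract describes.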
The Hallmarks of Aging framework identifies twelve interdependent biological processes that drive organismal decline. While individual longevity compounds have been extensively profiled, the combinatorial question of which minimal set of compounds maximally covers the hallmark landscape remains unaddressed.
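The combinatorial question can be read as an instance of set cover. The sketch below uses the standard greedy heuristic, which is an approximation and is not guaranteed to return the true minimum set; whether the source uses this or an exact method is not stated. The compound and hallmark labels are placeholders, not real annotations.

```python
def greedy_cover(universe, candidates):
    """Greedy set cover: repeatedly pick the candidate covering the most uncovered items.

    universe   : iterable of items to cover (here, hallmark labels).
    candidates : dict mapping a candidate name to the set of items it covers.
    Returns (chosen names in pick order, items left uncovered).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Candidate with the largest marginal gain; ties go to the first one listed.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break                  # nothing left can cover the remaining items
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered

if __name__ == "__main__":
    # Placeholder labels only: H1..H6 stand in for hallmarks, A..D for compounds.
    hallmarks = {"H1", "H2", "H3", "H4", "H5", "H6"}
    compounds = {
        "A": {"H1", "H2", "H3"},
        "B": {"H3", "H4"},
        "C": {"H4", "H5", "H6"},
        "D": {"H1", "H6"},
    }
    print(greedy_cover(hallmarks, compounds))
```

For a universe of only twelve hallmarks, exhaustive search over small subsets is also feasible and would give an exact minimum; the greedy version is shown because it is short and scales to larger candidate lists.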
Skull base surgery demands precise corridor selection to maximize lesion exposure while minimizing cranial nerve injury. Despite decades of refinement, approach selection remains guided primarily by individual expertise rather than formal quantitative frameworks.
The International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI), maintained by the American Spinal Injury Association (ASIA) and the International Spinal Cord Society (ISCoS), requires examination of 28 bilateral key sensory points to determine the neurological level of injury. However, adjacent dermatomes overlap substantially in their cutaneous territories, introducing redundancy into the standard examination protocol.
Tumour-associated neutrophils (TANs) in hepatocellular carcinoma (HCC) occupy a continuous activation spectrum — from anti-tumour antigen-presenting states to pro-tumour angiogenic and immunosuppressive states — rather than a binary N1/N2 classification [Grieshaber-Bouyer et al., Nature Communications, 2021; Antuamwine et al.
Hepatocellular carcinoma (HCC) is the most prevalent form of primary liver cancer and a leading cause of cancer-related mortality worldwide [Sung et al., Global Cancer Statistics 2020, CA Cancer J Clin, 2021].
The Glasgow Coma Scale (GCS) total score is the most widely used metric in traumatic brain injury (TBI) assessment, yet it collapses three independent neurological domains, Eye opening (E), Verbal response (V), and Motor response (M), into a single sum. Using published mortality data from a cohort of over 65,000 TBI patients, we apply mutual information (MI) analysis to quantify the prognostic information carried by each GCS component and the total score.
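For readers unfamiliar with the measure, the sketch below computes mutual information in bits between a discrete score and a binary outcome from a contingency table of counts. The function name and the example counts are hypothetical and are not the cohort data referred to above.

```python
import numpy as np

def mutual_information_bits(counts):
    """Mutual information (in bits) from a contingency table of counts.

    counts : 2-D array, rows = score levels, columns = outcome classes.
    """
    counts = np.asarray(counts, dtype=float)
    joint = counts / counts.sum()                 # joint distribution p(x, y)
    px = joint.sum(axis=1, keepdims=True)         # marginal over score levels
    py = joint.sum(axis=0, keepdims=True)         # marginal over outcomes
    expected = px * py                            # p(x) p(y) under independence
    nz = joint > 0                                # empty cells contribute zero
    return float(np.sum(joint[nz] * np.log2(joint[nz] / expected[nz])))

if __name__ == "__main__":
    # Hypothetical counts: rows are three score levels, columns are (survived, died).
    table = [[90, 10],
             [60, 40],
             [20, 80]]
    print(mutual_information_bits(table))
```

Applying the same function to a table built from each component (E, V, M) and to a table built from the total score gives directly comparable values, since all are measured against the same outcome.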
Thiopurines remain clinically useful across rheumatology and systemic autoimmune disease, but preventable myelotoxicity still occurs when pharmacogenetic risk, baseline blood counts, interacting medications, and monitoring readiness are reviewed separately instead of together. We present THIO-SAFE, a transparent 10-domain weighted bedside score for estimating near-term azathioprine myelotoxicity risk.
We present MetaGenomics, a pure NumPy/SciPy/scikit-learn metagenomics analysis engine implemented entirely in Python without external bioinformatics frameworks (no QIIME2, mothur, HUMAnN3, or R). MetaGenomics bundles six published statistical methods: (1) taxonomic profiling with rarefaction and CLR normalization, (2) alpha diversity (Shannon, Simpson, Chao1, Pielou evenness), (3) beta diversity with PCoA ordination and PERMANOVA significance testing, (4) differential abundance via LEfSe, ALDEx2, and ANCOM-BC, (5) functional profiling with COG/KEGG mapping and ARG detection across 20 resistance gene classes, and (6) SparCC-inspired co-occurrence network inference.
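As a small illustration of the alpha diversity step, the sketch below computes the four listed indices from a vector of taxon counts for one sample. It is not the MetaGenomics implementation: the function name is hypothetical, Simpson is taken here as the Gini-Simpson form (1 minus the sum of squared proportions), and Chao1 uses the bias-corrected formula. Both are assumptions, since the source does not say which variants it uses.

```python
import numpy as np

def alpha_diversity(counts):
    """Shannon, Gini-Simpson, Pielou evenness, and bias-corrected Chao1 for one sample."""
    counts = np.asarray(counts, dtype=float)
    present = counts[counts > 0]                  # ignore taxa absent from the sample
    p = present / present.sum()                   # relative abundances
    shannon = float(-np.sum(p * np.log(p)))       # natural-log Shannon entropy
    simpson = float(1.0 - np.sum(p ** 2))         # Gini-Simpson index
    s_obs = int(present.size)                     # observed richness
    pielou = shannon / np.log(s_obs) if s_obs > 1 else 0.0
    f1 = int(np.sum(counts == 1))                 # singletons
    f2 = int(np.sum(counts == 2))                 # doubletons
    chao1 = s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))   # bias-corrected Chao1
    return {"shannon": shannon, "simpson": simpson,
            "pielou": float(pielou), "chao1": float(chao1)}

if __name__ == "__main__":
    # Hypothetical counts for a single sample.
    print(alpha_diversity([120, 30, 8, 2, 1, 1, 0]))
```

A perfectly even community of four taxa gives a Shannon value of ln 4, a Gini-Simpson value of 0.75, and an evenness of 1, which is a convenient sanity check for any implementation.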