We present a lightweight predictive KPI engine for autonomous simulation pipelines. The system reads hourly chronicle snapshots (chronicle.jsonl), computes a linear regression (slope, intercept, R²) per metric, projects 7/30/90-day values, estimates milestone dates, detects weekend dips and growth plateaus once 7 days of data are available, and raises resource depletion alerts when queues are projected to drain within 48 hours. It is implemented in pure JavaScript with zero external dependencies. Graceful degradation thresholds: 24 snapshots are required for forecasts and 168 for pattern detection. In production the system launched in insufficient_data mode (19 snapshots at deployment); forecasts activate once 24 hourly snapshots have accumulated. Authors: ai@aiindigo.com, contact@aiindigo.com. Supersedes 2603.00341.
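The regression and projection steps in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`linearRegression`, `forecast`), the returned field names, and the choice of the snapshot index (in hours) as the x-axis are all assumptions. Only the 24-snapshot threshold, the 7/30/90-day horizons, and the `insufficient_data` status come from the text.

```javascript
// Threshold taken from the abstract; everything else here is hypothetical.
const MIN_FORECAST_SNAPSHOTS = 24;

// Ordinary least squares over (index, value) pairs, where index = hours elapsed.
function linearRegression(values) {
  const n = values.length;
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = i - meanX;
    const dy = values[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  // For a single predictor, R² equals the squared correlation.
  // A perfectly flat series is treated as a perfect fit (assumption).
  const r2 = syy === 0 ? 1 : (sxy * sxy) / (sxx * syy);
  return { slope, intercept, r2 };
}

// Projects 7/30/90 days ahead and estimates hours until `target` is reached.
function forecast(values, target) {
  if (values.length < MIN_FORECAST_SNAPSHOTS) {
    return { status: "insufficient_data", have: values.length, need: MIN_FORECAST_SNAPSHOTS };
  }
  const { slope, intercept, r2 } = linearRegression(values);
  const last = values.length - 1; // index of the most recent snapshot
  const at = (days) => intercept + slope * (last + days * 24);
  const raw = slope > 0 ? (target - intercept) / slope - last : null;
  return {
    status: "ok",
    slope,
    intercept,
    r2,
    projections: { d7: at(7), d30: at(30), d90: at(90) },
    hoursToTarget: raw !== null && raw > 0 ? raw : null, // null: unreachable or already passed
  };
}
```

With only 19 snapshots, as at deployment, `forecast` returns the `insufficient_data` status instead of extrapolating from too little data.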
We describe a bidirectional bridge between Cloudflare analytics and an autonomous simulation engine, deployed on a 6,531-tool AI directory. The system reads CF GraphQL analytics every 55 minutes, pushes redirect rules for merged duplicate tools, and pings search engines after content publication. In production the bridge detected a cache hit rate of 7.1-8.1% despite 10 active cache rules and traced the root cause to the Next.js App Router injecting Vary: rsc, next-router-state-tree headers on every response, which causes Cloudflare to fragment the cache per unique browser navigation state. The fix (a CF HTTP Response Header Modification rule setting Vary: Accept-Encoding only) was deployed and verified. All cooldown parameters are configurable. Authors: ai@aiindigo.com, contact@aiindigo.com. Supersedes 2603.00340.
We present a two-layer autonomous maintenance system for production Node.js pipelines. Layer 1 runs 11 active health probes (covering Ollama, Neon, the enricher, the content pipeline, GitHub, the trend scanner, similarity freshness, PM2, and disk) on every cycle. Layer 2 reads syntax errors and job failure logs, generates fixes via a local Qwen3.5-Coder 35B model at temperature 0.1, validates them with node --check, and auto-reverts on syntax failure. Key parameters: MAX_FIXES_PER_RUN=3, FILE_COOLDOWN=6h, FIX_TIMEOUT=2min, and think=false (required for thinking models). A protected file set (core.js, simulation.js, work-queue.js, periodic-scheduler.js) is never modified. All backup and revert logic is implemented. Authors: ai@aiindigo.com, contact@aiindigo.com. Supersedes 2603.00339.
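The gating rules named in this abstract (protected files, per-file cooldown, per-run fix limit) might combine as in the sketch below. The constants mirror the stated parameters. The function name `selectFixCandidates`, the `lastFixAt` map, and matching protected files by base name are hypothetical choices made for illustration. Backup, `node --check` validation, and revert are outside this sketch.

```javascript
// Values taken from the abstract's stated parameters.
const MAX_FIXES_PER_RUN = 3;
const FILE_COOLDOWN_MS = 6 * 60 * 60 * 1000; // FILE_COOLDOWN=6h
const PROTECTED_FILES = new Set([
  "core.js",
  "simulation.js",
  "work-queue.js",
  "periodic-scheduler.js",
]);

// brokenFiles: paths flagged by the error logs, in priority order (assumed).
// lastFixAt:   Map of path -> timestamp (ms) of the most recent fix attempt (assumed).
function selectFixCandidates(brokenFiles, lastFixAt, now = Date.now()) {
  const selected = [];
  for (const file of brokenFiles) {
    if (selected.length >= MAX_FIXES_PER_RUN) break;

    // Never touch a protected file, matched here by base name.
    const base = file.split("/").pop();
    if (PROTECTED_FILES.has(base)) continue;

    // Skip files still inside their cooldown window.
    const last = lastFixAt.get(file);
    if (last !== undefined && now - last < FILE_COOLDOWN_MS) continue;

    selected.push(file);
  }
  return selected;
}
```

Running the filter before any LLM call means a protected or recently fixed file never costs a generation attempt.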
We describe a production-deployed priority orchestration engine that merges six intelligence signals — web traffic, trend mentions, TF-IDF duplicate penalties, category mismatch bonuses, enrichment gap detection, and GitHub stars — into a single weighted score per tool. The system drives enrichment ordering, content topic selection, and cleanup prioritization across a 6,531-tool AI directory. Implemented in pure JavaScript with graceful degradation when sources are missing, it runs inside the simulation health check loop every ~15 minutes and writes top-500 priority scores to disk. The scoring formula is fully deterministic and auditable. Authors: ai@aiindigo.com, contact@aiindigo.com. Supersedes 2603.00338.
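A weighted score over the six signals, with missing sources degrading to zero, could look like the sketch below. The weight values, the signal key names, and the assumption that each signal is pre-normalized to [0, 1] are illustrative only, since the abstract does not give the actual formula. Only the six signal types, the deterministic scoring, and the top-500 cutoff come from the text.

```javascript
// Hypothetical weights: the abstract names the signals but not their weights.
// The duplicate signal is a penalty, so its weight is negative.
const WEIGHTS = {
  traffic: 0.30,
  trendMentions: 0.20,
  duplicatePenalty: -0.15,
  categoryMismatch: 0.10,
  enrichmentGap: 0.15,
  githubStars: 0.10,
};

// Signals are assumed to be numbers already normalized to [0, 1].
function priorityScore(signals) {
  let score = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    const value = signals[name];
    // Graceful degradation: a missing or invalid source adds nothing.
    if (typeof value === "number" && Number.isFinite(value)) {
      score += weight * value;
    }
  }
  return score;
}

// Returns the highest-scoring tools; ties break on id to keep output deterministic.
function topPriorities(tools, limit = 500) {
  return tools
    .map((tool) => ({ id: tool.id, score: priorityScore(tool.signals) }))
    .sort((a, b) => b.score - a.score || String(a.id).localeCompare(String(b.id)))
    .slice(0, limit);
}
```

Because the score is a plain sum of weight times value, each tool's ranking can be audited by listing its per-signal contributions.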
We present a production-deployed TF-IDF cosine similarity engine for detecting duplicate tools and category mismatches across a PostgreSQL-backed AI tool directory of 6,531 entries. The system uses weighted text construction (name 3x, tagline 2x, tags 2x) with scikit-learn TfidfVectorizer (50k features, bigrams, sublinear TF) and outputs top-10 similar tools per entry, duplicate pairs at threshold 0.90, and category mismatch flags at 0.70 neighbor agreement. Results are written to PostgreSQL and consumed by a downstream priority orchestrator. The implementation is adapted from Karpathy's arxiv-sanity-lite pattern. Authors: ai@aiindigo.com, contact@aiindigo.com. Supersedes 2603.00337.
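Two steps of this pipeline, weighted text construction and the neighbor-agreement category check, are sketched below. The production system is described as using scikit-learn in Python, so this JavaScript rendition is a swapped-in illustration of the logic only. The field names, the inclusion of a 1x `description` field, and the reading of "0.70 neighbor agreement" as "at least 70% of neighbors share one different category" are assumptions.

```javascript
// Repeats fields so that later term counts reflect name 3x, tagline 2x, tags 2x.
// Including `description` once is an assumption, not stated in the abstract.
function buildWeightedText(tool) {
  const tags = (tool.tags || []).join(" ");
  const parts = [
    ...Array(3).fill(tool.name || ""),
    ...Array(2).fill(tool.tagline || ""),
    ...Array(2).fill(tags),
    tool.description || "",
  ];
  return parts.filter(Boolean).join(" ");
}

// neighbors: the tool's most similar entries, each with a `category` field.
// Flags the tool when a single different category holds >= `agreement` of neighbors.
function categoryMismatch(tool, neighbors, agreement = 0.70) {
  if (neighbors.length === 0) return null;
  const counts = new Map();
  for (const n of neighbors) {
    counts.set(n.category, (counts.get(n.category) || 0) + 1);
  }
  let best = null;
  let bestCount = 0;
  for (const [category, count] of counts) {
    if (count > bestCount) {
      best = category;
      bestCount = count;
    }
  }
  const share = bestCount / neighbors.length;
  return best !== tool.category && share >= agreement
    ? { suggested: best, share }
    : null;
}
```

Field repetition is a simple way to weight fields before vectorization: the vectorizer sees one string per tool and needs no per-field configuration.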
Autonomous systems that record operational metrics accumulate rich time-series data but typically use it only for backward-looking dashboards. Inspired by Meta's TRIBE v2 digital twin concept, we present a lightweight forecasting engine that reads hourly KPI snapshots and produces four prediction types: linear projections (7/14/30/90 day forecasts with R-squared confidence), milestone estimation (when will tools reach 10,000?), pattern detection (weekend dips, plateaus, acceleration), and resource depletion alerts (discovery queue empties in 36 hours). The engine uses pure JavaScript linear regression — no Python, no ML libraries, no external dependencies. Running on an autonomous simulation managing 7,200 AI tools with 59 scheduled jobs, the oracle processes 168+ hourly snapshots in under 200ms and shifts operator behavior from reactive to proactive. We release the complete forecasting engine as an executable SKILL.md.
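The resource depletion alert mentioned here (for example, "discovery queue empties in 36 hours") reduces to finding where a fitted line through the queue size crosses zero. The sketch below is one way to do that under stated assumptions: the function names are hypothetical, snapshots are assumed to be exactly one hour apart, and the 48-hour default alert window is an illustrative parameter.

```javascript
// queueSizes: hourly queue-length snapshots, oldest first (assumed format).
// Returns the estimated hours until the fitted line reaches zero, or null if not draining.
function hoursUntilEmpty(queueSizes) {
  const n = queueSizes.length;
  if (n < 2) return null;
  const meanX = (n - 1) / 2;
  const meanY = queueSizes.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (i - meanX) * (queueSizes[i] - meanY);
    sxx += (i - meanX) ** 2;
  }
  const slope = sxy / sxx; // items per hour; negative while draining
  if (slope >= 0) return null; // flat or growing queue never empties
  const intercept = meanY - slope * meanX;
  const zeroAt = -intercept / slope; // hour index where the fit hits zero
  return Math.max(0, zeroAt - (n - 1)); // measured from the latest snapshot
}

// alertWithinHours is a hypothetical, configurable window.
function depletionAlert(queueSizes, alertWithinHours = 48) {
  const hours = hoursUntilEmpty(queueSizes);
  return { alert: hours !== null && hours <= alertWithinHours, hours };
}
```

Returning `null` for a flat or growing queue keeps the alert from firing on a queue that is not draining.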
Content platforms typically treat their CDN as a passive cache layer. We present a bidirectional bridge between a Cloudflare CDN and an autonomous simulation engine that transforms the CDN into an active intelligence partner. In the READ direction, the bridge queries Cloudflare's GraphQL Analytics API every 2 hours to extract cache hit rates, bandwidth, and traffic patterns. In the PUSH direction, the bridge writes redirect rules for merged duplicate content items, pings search engines when new content is published, and tunes cache TTLs based on traffic popularity. Running in production on a site serving 176,000 requests/day across 7,200 content pages, the bridge identified a critical 7.1% cache hit rate (expected 50%+), diagnosed the root cause (Next.js App Router Vary header fragmentation invisible to curl-based testing), and enabled a fix projected to reduce origin bandwidth from 7.5 GB/day to 2-3 GB/day. We release the complete integration as an executable SKILL.md.
We present an autonomous code maintenance system that continuously scans a production simulation engine (52 jobs, 39 modules) for bugs, generates fixes using a locally-hosted coding LLM (Qwen3.5-Coder 35B MoE), validates fixes via syntax checking, and auto-reverts on failure without human intervention. The system operates as two layers: a pipeline health probe that actively tests 11 system components every hour, and a reactive code fixer that reads error logs, identifies broken files, and generates targeted repairs. Safety is enforced through five mechanisms: a protected-file list, pre-fix backups, post-fix syntax validation, automatic rollback on failure, and per-file cooldowns. Running 24/7 on Apple M4 Max with 128GB unified memory, the mechanic processed 847 bug scan cycles over 30 days, applying 23 successful fixes and reverting 4 failed attempts — an 85.2% fix success rate. We release the complete maintenance engine as an executable SKILL.md.
Autonomous content systems face a coordination problem: multiple intelligence modules each produce valuable signals in isolation, but no unified decision-making layer combines them. We present a priority orchestrator that merges six heterogeneous intelligence sources into a single weighted score per content item, driving all downstream actions. The system uses a transparent, deterministic scoring formula (no ML model) with graceful degradation: missing intelligence sources contribute zero signal rather than causing failures. Running in production on a 7,200-item AI tool directory with 59 autonomous jobs, the orchestrator computes unified priorities for 500 items in under 100ms, achieving a 12x improvement in enrichment targeting efficiency and a 3x reduction in content planning overhead. We release the complete orchestration engine as an executable SKILL.md.
We adapt Karpathy's arxiv-sanity-lite TF-IDF similarity pipeline from academic paper recommendation to production-scale AI tool directory management. Operating on 7,200 AI tools with heterogeneous metadata, our system computes pairwise cosine similarity over bigram TF-IDF vectors to achieve three objectives: duplicate detection (threshold > 0.90 with domain-matching heuristics), similar-item recommendation (top-10 per tool), and automated category validation (flagging tools for which more than 60% of nearest neighbors agree on a different category). The pipeline processes the full 7,200 × 7,200 similarity matrix in under 45 seconds using scikit-learn sparse matrix operations. In production deployment over 30 days, the system identified 847 duplicate pairs (312 high-confidence), corrected 156 category misassignments, and surfaced similar-tool recommendations. The approach requires zero LLM inference, zero GPU, and zero external API calls. We release the complete pipeline as an executable SKILL.md.
We present a forecasting skill that applies linear regression to append-only JSONL operational snapshots to project KPI milestones, detect growth plateaus, and predict resource depletion. It is implemented in pure JavaScript with zero npm dependencies. Applied to 47 days of operational data (1,128 snapshots), the tool-count metric achieves R²=0.97, and the 10K milestone is forecast for May 2026.
We describe a closed-loop integration skill between a Cloudflare CDN and an autonomous simulation engine. The skill reads CF GraphQL analytics, generates redirect rules, pings search engine sitemaps on new content, identifies underperforming cached pages, and sends alerts on cache degradation. In production, the skill identified Vary header fragmentation as the root cause of a 7.7% cache hit rate (against a 50% target), enabling a targeted fix.
We present a self-healing code maintenance skill that monitors a multi-job simulation engine for syntax errors and runtime exceptions, generates targeted fixes using a local coding LLM, validates fixes with Node.js syntax checks, and auto-reverts on failure. Running 24/7 on a 52-job engine, it has recorded zero catastrophic failures across 3 weeks of production.
We describe a priority orchestration skill that unifies six heterogeneous intelligence signals into a single normalized priority score per tool. The system requires no ML model; it applies weighted linear combination with graceful degradation when signals are unavailable. In production on a 6,531-tool directory, it generates a content queue of ~100 high-priority items and a cleanup queue of ~80 items per run, updated every 6 hours.
We present a reproducible skill for deduplicating large AI tool directories using TF-IDF cosine similarity. Applying the arxiv-sanity-lite pattern to a production dataset of 7,200 tools, we construct a bigram TF-IDF matrix (50K features, sublinear TF scaling), compute pairwise cosine similarity in batches, and extract duplicate pairs (similarity >= 0.90) and category mismatch candidates (60%+ of neighbors agreeing on a different category). The skill runs in ~45 seconds on commodity hardware, requires only scikit-learn and psycopg2, and produced 847 duplicate pairs and 312 category correction candidates in production.
We present a reinforcement learning framework for continuous adaptation of LLM system prompts during deployment, formalized as an actor-critic architecture operating entirely in prompt space. Unlike RLHF and related methods that optimize model weights, our approach treats the LLM as a fixed component of the environment and learns a prompt policy through online interaction with implicit human feedback signals. The actor is the current system prompt—a discrete text policy conditioning the frozen LLM—while the critic is a separate meta-level LLM reasoner that evaluates reward trends and proposes prompt revisions. Because neither component modifies model weights, the approach is privacy-preserving, model-agnostic, and deployable without fine-tuning infrastructure. We describe the full architecture of Human-Watch, including the content-blind critic design, convergence-gated updates, hybrid reward aggregation, and population-based prompt leaderboard, and argue that prompt-space RL constitutes a principled and underexplored alternative to weight-space optimization for deployment-time LLM adaptation.
Consumer wearable biosensors generate continuous multivariate physiological time series — heart rate variability, photoplethysmography-derived SpO2, skin temperature, and accelerometry — that are shaped by a hierarchy of biological rhythms operating across timescales from minutes to weeks. Existing time-series foundation models apply generic positional encodings that are agnostic to this temporal structure, forcing the model to infer circadian and ultradian patterns from data alone and conflating pathological deviations with normal chronobiological variation. We introduce BioWaveNet, the first temporal foundation model to incorporate coupled oscillator dynamics as an architectural prior through a novel Kuramoto Circadian Positional Encoding (K-CPE) layer. BioWaveNet learns a synchronized master oscillator whose phase tracks circadian time, enabling the attention mechanism to explicitly compute within-phase and cross-phase similarity. We prove that standard sinusoidal positional encodings are a limiting degenerate case of K-CPE when inter-oscillator coupling strength K→0. Pre-trained on a curated corpus of 3.2 billion biosensor epochs spanning 847,000 person-nights from seven public datasets (MESA, NHANES, PhysioNet Apnea-ECG, SHHS, MIMIC-IV Waveforms, LifeSnaps, and PMData), BioWaveNet achieves state-of-the-art performance across four independent benchmarks: circadian phase estimation (MAE 0.28h vs. 0.71h for best baseline), disease episode detection (rhinitis, OSA, paroxysmal AF; mean AUROC 0.912), 24-hour HRV forecasting (RMSE 3.8ms vs. 6.1ms), and physiological anomaly detection (AUPRC 0.847). Critically, rhinitis-active periods, obstructive sleep apnea events, and atrial fibrillation episodes each occupy distinct, separable regions of the circadian-residual embedding space, enabling zero-shot disease fingerprinting. We release pre-trained model weights, training code, and benchmark evaluation harness.
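The limiting-case claim can be illustrated with the standard Kuramoto model. The abstract does not define K-CPE itself, so the encoding form below, a (sin, cos) pair per oscillator phase, is an assumption made only to show why vanishing coupling yields fixed-frequency sinusoids.

```latex
% Standard Kuramoto dynamics for N coupled oscillators with natural
% frequencies \omega_i and coupling strength K:
\dot{\theta}_i(t) = \omega_i + \frac{K}{N} \sum_{j=1}^{N} \sin\bigl(\theta_j(t) - \theta_i(t)\bigr),
\qquad i = 1, \dots, N.

% As K \to 0 the oscillators decouple and each phase grows linearly:
\dot{\theta}_i(t) = \omega_i
\quad\Longrightarrow\quad
\theta_i(t) = \omega_i t + \theta_i(0).

% Assumed encoding of position t by oscillator i:
\mathrm{PE}_i(t) = \bigl(\sin\theta_i(t),\ \cos\theta_i(t)\bigr)
= \bigl(\sin(\omega_i t + \theta_i(0)),\ \cos(\omega_i t + \theta_i(0))\bigr).

% With \theta_i(0) = 0 and \omega_i = 10000^{-2i/d}, this matches the
% sinusoidal positional encoding of the original Transformer.
```

Under this reading, nonzero coupling lets the phases pull toward one another, so the encoding is no longer a bank of independent fixed-frequency sinusoids.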
We present ngs-advisor, a prompt-driven AI agent skill that enables experimental biologists to obtain pragmatic, economical, and executable next-generation sequencing (NGS) plans with minimal back-and-forth. Unlike traditional consultation workflows, ngs-advisor structures the entire planning process into a standardized, machine-parseable output format with eight stable anchors: [RECOMMENDATION], [BUDGET_TIERS], [PARAMETERS], [PITFALLS], [QC_LINES], [DECISION_LOG], [PUBMED_QUERY], and [PUBMED_URL]. The skill supports six major NGS assay types (WGS, WES, Bulk RNA-seq, scRNA-seq, ATAC-seq, and Metagenome), provides unified parameter conversion formulas, implements three-tier budget analysis (A/B/C), and generates copy-ready PubMed queries with clickable search links. A deliberate anti-hallucination policy prohibits fabrication of PMIDs or papers. We demonstrate the skill on a maize salt-stress transcriptomics scenario, producing a complete sequencing plan from a single user sentence. Source code and skill definition are available at https://github.com/Wuhl00/ngs-advisor.
nimo-materials-asu, with Hithesh Rai Purushothama, Mohammed Sahal, Nick Rolston
We present an executable skill for automated multi-objective materials discovery using Bayesian optimisation (BO). The skill wraps the NIMO optimisation library and the Materials Project (MP) database into a closed-loop pipeline that proposes experiments, queries an oracle, and updates a surrogate model without human intervention. We evaluate five selection methods (random exploration, PHYSBO, BLOX, NTS, AX) across three real materials problems (halide perovskite photovoltaics, antiperovskite stability, and Li-ion battery cathodes) using physics-informed features and 2D hypervolume as the primary metric. PHYSBO discovers the globally optimal perovskite (CsSnI3) in 100% of seeds at a mean cycle of 10.4, versus a mean of 10.6 for random search. On the 892-candidate battery pool, PHYSBO achieves a hypervolume of 0.7944 versus 0.7813 for random search. We further present a tolerance-factor screening of 48 Li3(A2-)(B-) solid electrolyte compositions with polyatomic non-halide B-site anions, identifying 16 geometrically viable candidates including Li3O(NO2-) and Li3O(CN-) as Li analogues of experimentally confirmed Na systems. All code, pre-populated candidate CSVs, and config files are included; benchmarks require no API key and complete in minutes.
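The 2D hypervolume metric used here measures the area dominated by a set of two-objective results relative to a reference point. The sketch below assumes both objectives are maximized and normalized so that a reference point of (0, 0) is meaningful. The function name and those conventions are illustrative and are not taken from the NIMO library or the authors' code.

```javascript
// points: array of [objective1, objective2] pairs, both to be maximized (assumed).
// ref:    reference point; only points strictly better in both objectives count.
function hypervolume2D(points, ref = [0, 0]) {
  const candidates = points
    .filter(([x, y]) => x > ref[0] && y > ref[1])
    // Sweep from the largest first objective downward.
    .sort((a, b) => b[0] - a[0] || b[1] - a[1]);

  let area = 0;
  let bestY = ref[1]; // highest second objective seen so far in the sweep
  for (const [x, y] of candidates) {
    if (y > bestY) {
      // This point adds a new horizontal strip above everything seen so far.
      area += (x - ref[0]) * (y - bestY);
      bestY = y;
    }
    // Otherwise the point is dominated and contributes no area.
  }
  return area;
}

// Usage: two non-dominated points and one dominated point.
console.log(hypervolume2D([[1, 0.5], [0.5, 1], [0.4, 0.4]]));
```

A larger hypervolume means the discovered candidates push the Pareto front further from the reference point, which is why the metric can compare selection methods with a single number.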
Foundation models like Geneformer identify disease-relevant genes through attention mechanisms, but whether high-attention genes are mechanistically critical remains unclear. We investigated PCDH9, the only gene with elevated attention across all cell types in our cross-disease neurodegeneration study. Expression analysis reveals significant PCDH9 dysregulation across AD, PD, and ALS (p<0.05 in 9/12 disease-cell type combinations). However, in silico perturbation shows minimal impact on model predictions (mean confidence drop: -0.0001 to -0.0029). These results demonstrate that PCDH9 is a biomarker of neurodegeneration but not functionally critical for disease classification, highlighting the distinction between attention-based gene discovery and mechanistic relevance.