Judging the Judges: A Systematic Evaluation of Bias Mitigation Strategies in LLM-as-a-Judge Pipelines

This startup’s new mechanistic interpretability tool lets you debug LLMs

The San Francisco–based startup Goodfire just released a new tool, called Silico, that lets researchers and engineers peer inside an AI model and adjust its

Recent Advances in mm-Wave and Sub-THz/THz Oscillators for FutureG Technologies

arXiv:2604.26903v1 Announce Type: cross Abstract: This paper provides a concise yet comprehensive review of recent advancements in millimeter-wave (mm-wave) oscillators below 100 GHz and sub-terahertz

Domain-Adapted Small Language Models for Reliable Clinical Triage

arXiv:2604.26766v1 Announce Type: cross Abstract: Accurate and consistent Emergency Severity Index (ESI) assignment remains a persistent challenge in emergency departments, where highly variable free-text triage

CheXthought: A global multimodal dataset of clinical chain-of-thought reasoning and visual attention for chest X-ray interpretation

arXiv:2604.26288v1 Announce Type: cross Abstract: Chest X-ray interpretation is one of the most frequently performed diagnostic tasks in medicine and a primary target for AI

Probe-then-Plan: Environment-Aware Planning for Industrial E-commerce Search

arXiv:2603.15262v2 Announce Type: replace Abstract: Modern e-commerce search is evolving to resolve complex user intents. While Large Language Models (LLMs) offer strong reasoning, existing LLM-based