arXiv:2510.06002v3 Announce Type: replace Abstract: In high-stakes legal domains, retrieval must preserve not only semantic relevance, but also the hierarchy, temporality, and causal provenance of legal norms. Standard Retrieval-Augmented Generation (RAG), based mainly on semantic similarity over text fragments, cannot reliably provide this level of control. Prior work on SAT-Graph RAG addressed the representation problem […]
A Dual Perspective on Synthetic Trajectory Generators: Utility Framework and Privacy Vulnerabilities
arXiv:2604.19653v2 Announce Type: replace Abstract: Human mobility data are used in numerous applications, ranging from public health to urban planning. Such data are inherently sensitive, as they can reveal information such as religious beliefs and political affiliations. Historically, it has been proposed to modify the information using techniques such as aggregation, obfuscation, or noise addition, […]
Resume-ing Control: (Mis)Perceptions of Agency Around GenAI Use in Recruiting Workflows
arXiv:2604.26851v1 Announce Type: cross Abstract: When generative AI (genAI) systems are used in high-stakes decision-making, their recommended role is to aid, rather than replace, human decision-making. However, there is little empirical exploration of how professionals making high-stakes decisions, such as those related to employment, perceive their agency and level of control when working with genAI […]
Semantic Error Correction and Decoding for Short Block Codes
arXiv:2604.22269v2 Announce Type: replace-cross Abstract: This paper presents a semantic-enhanced receiver framework for transmitting natural language sentences over noisy wireless channels using multiple short block codes. After ASCII encoding, the sentence is divided into segments, each independently encoded with a short block code and transmitted over an AWGN channel. At the receiver, segments are decoded […]
Recent Advances in mm-Wave and Sub-THz/THz Oscillators for FutureG Technologies
arXiv:2604.26903v1 Announce Type: cross Abstract: This paper provides a concise yet comprehensive review of recent advancements in millimeter-wave (mm-wave) oscillators below 100 GHz and sub-terahertz (sub-THz/THz) oscillators above 100 GHz for next-generation computing and communication systems, including 5G, 6G, and beyond. Various design approaches, including CMOS, SiGe, and III-V semiconductor technologies, are explored in terms […]
Frontier Coding Agents Can Now Implement an AlphaZero Self-Play Machine Learning Pipeline For Connect Four That Performs Comparably to an External Solver
arXiv:2604.25067v2 Announce Type: replace-cross Abstract: Forecasting when AI systems will become capable of meaningfully accelerating AI research is a central challenge for AI safety. Existing benchmarks measure broad capability growth, but may not provide ample early warning signals for recursive self-improvement. We propose measuring AI’s capability to autonomously implement end-to-end machine learning pipelines from past […]
The Role of Symmetry in Optimizing Overparameterized Networks
arXiv:2604.25150v2 Announce Type: replace-cross Abstract: Overparameterization is central to the success of deep learning, yet the mechanisms by which it improves optimization remain incompletely understood. We analyze weight-space symmetries in neural networks and show that overparameterization introduces additional symmetries that benefit optimization in two distinct ways. First, we prove that these symmetries act as a […]
ClawGym: A Scalable Framework for Building Effective Claw Agents
arXiv:2604.26904v1 Announce Type: cross Abstract: Claw-style environments support multi-step workflows over local files, tools, and persistent workspace states. However, scalable development around these environments remains constrained by the absence of a systematic framework, especially one for synthesizing verifiable training data and integrating it with agent training and diagnostic evaluation. To address this challenge, we present […]
Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models
arXiv:2604.25313v2 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) models frequently produce answers grounded in parametric memory rather than the retrieved context, undermining the core promise of retrieval augmentation. A fundamental obstacle to fixing this unfaithfulness is the lack of training data that explicitly requires models to prefer context over internal knowledge. We introduce Faithfulness-QA, a […]
ReLoop: Structured Modeling and Behavioral Verification for Reliable LLM-Based Optimization
arXiv:2602.15983v2 Announce Type: replace-cross Abstract: Large language models (LLMs) can translate natural language into optimization code, but silent failures pose a critical risk: code that executes and returns solver-feasible solutions may encode semantically incorrect formulations — a feasibility-correctness gap reaching 90 percentage points on compositional problems. We introduce ReLoop, which addresses this gap through two […]
ADE: Adaptive Dictionary Embeddings — Scaling Multi-Anchor Representations to Large Language Models
arXiv:2604.24940v2 Announce Type: replace-cross Abstract: Word embeddings are fundamental to natural language processing, yet traditional approaches represent each word with a single vector, creating representational bottlenecks for polysemous words and limiting semantic expressiveness. While multi-anchor representations have shown promise by representing words as combinations of multiple vectors, they have been limited to small-scale models due […]
A Self-Calibrating Framework for Analog Circuit Sizing Using LLM-Derived Analytical Equations
arXiv:2604.07387v2 Announce Type: replace-cross Abstract: We present a design automation framework for analog circuit sizing that produces calibrated, topology-specific analytical equations from raw circuit netlists. A large language model (LLM) derives a complete Python sizing function in which each device dimension is traceable to a specific design rationale – a form of interpretable output absent […]