FINER-SQL: Boosting Small Language Models for Text-to-SQL

arXiv:2605.03465v1 Announce Type: cross Abstract: Large language models have driven major advances in Text-to-SQL generation. However, they suffer from high computational cost, long latency, and data privacy concerns, which make them impractical for many real-world applications. A natural alternative is to use small language models (SLMs), which enable efficient and private on-premise deployment. Yet, SLMs […]

On the evolutionary cognitive pressure for experiential awareness: do machines need it?

arXiv:2510.20839v2 Announce Type: replace Abstract: The question of consciousness in artificial intelligence divides opinion across epistemological positions. Whether or not machines can be conscious, and whether we can ascertain the truth of such a proposition for any given case, has consequential ethical implications. This challenge is exacerbated by the lack of consensus on the nature of […]

Deep Interest Mining with Cross-Modal Alignment for SemanticID Generation in Generative Recommendation

arXiv:2604.20861v2 Announce Type: replace-cross Abstract: Generative Recommendation (GR) has demonstrated remarkable performance in next-token prediction paradigms, which relies on Semantic IDs (SIDs) to compress trillion-scale data into learnable vocabulary sequences. However, existing methods suffer from three critical limitations: (1) Information Degradation: the two-stage compression pipeline causes semantic loss and information degradation, with no posterior mechanism […]

Perturbation Dose Responses in Recursive LLM Loops: Raw Switching, Stochastic Floors, and Persistent Escape under Append, Replace, and Dialog Updates

arXiv:2605.02236v2 Announce Type: replace Abstract: Recursive language-model loops often settle into recognizable attractor-like patterns. The practical question is how much injected text is needed to move a settled loop somewhere else, and whether that move lasts. We study this in 30-step recursive loops by separating the model from the context-update rule: append, replace, and dialog […]
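The three context-update rules named in the abstract can be sketched in a few lines. This is an illustrative toy, not the paper's code: the echo "model" and the function names are assumptions standing in for a real LLM call inside a recursive loop.

```python
# Hedged sketch of append / replace / dialog context updates in a
# recursive loop. `echo_model` is a stand-in for an LLM call: it simply
# returns the last line of its context.

def echo_model(context: str) -> str:
    # Placeholder for model inference: echo the most recent line.
    return context.splitlines()[-1]

def run_loop(update: str, steps: int = 5, seed: str = "seed") -> str:
    context = seed
    history = []  # transcript used by the dialog rule
    for _ in range(steps):
        out = echo_model(context)
        if update == "append":      # context grows with every output
            context = context + "\n" + out
        elif update == "replace":   # context is only the latest output
            context = out
        elif update == "dialog":    # context is a turn-by-turn transcript
            history.append(out)
            context = "\n".join([seed] + history)
        else:
            raise ValueError(f"unknown update rule: {update}")
    return context
```

The interesting behavioral differences the paper studies (attractors, escape after perturbation) come from how a real model reacts to these differently shaped contexts; the skeleton only shows the update rules themselves.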

Learning Generalizable Action Representations via Pre-training AEMG

arXiv:2605.03462v1 Announce Type: cross Abstract: Electromyography (EMG) plays a fundamental role in decoding human motor intent and enabling intuitive human-computer interaction. However, its generalization capability across subjects, devices, and tasks remains substantially limited by data heterogeneity, label scarcity, and the lack of a unified representational framework. To bridge this gap, we propose Any […]

DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models

arXiv:2605.03877v1 Announce Type: cross Abstract: Dataset distillation enables efficient training by distilling the information of large-scale datasets into significantly smaller synthetic datasets. Diffusion-based paradigms have emerged in recent years, offering novel perspectives for dataset distillation. However, they typically necessitate additional fine-tuning stages, and effective guidance mechanisms remain underexplored. To address these limitations, we rethink […]

Can Explicit Physical Feasibility Benefit VLA Learning? An Empirical Study

arXiv:2604.17896v2 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models map multimodal inputs directly to robot actions and are typically trained through large-scale imitation learning. While this paradigm has shown strong performance, prevailing VLA training procedures do not explicitly supervise hard physical constraints such as obstacle avoidance or kinematic feasibility. As a result, the geometric structure underlying […]

Safety and accuracy follow different scaling laws in clinical large language models

arXiv:2605.04039v1 Announce Type: cross Abstract: Clinical LLMs are often scaled by increasing model size, context length, retrieval complexity, or inference-time compute, with the implicit expectation that higher accuracy implies safer behavior. This assumption is incomplete in medicine, where a few confident, high-risk, or evidence-contradicting errors can matter more than average benchmark performance. We introduce SaFE-Scale, […]

Exposing LLM Safety Gaps Through Mathematical Encoding: New Attacks and Systematic Analysis

arXiv:2605.03441v1 Announce Type: cross Abstract: Large language models (LLMs) employ safety mechanisms to prevent harmful outputs, yet these defenses primarily rely on semantic pattern matching. We show that encoding harmful prompts as coherent mathematical problems — using formalisms such as set theory, formal logic, and quantum mechanics — bypasses these filters at high rates, achieving […]

AI Agents for Inventory Control: Human-LLM-OR Complementarity

arXiv:2602.12631v2 Announce Type: replace Abstract: Inventory control is a fundamental operations problem in which ordering decisions are traditionally guided by theoretically grounded operations research (OR) algorithms. However, such algorithms often rely on rigid modeling assumptions and can perform poorly when demand distributions shift or relevant contextual information is unavailable. Recent advances in large language models […]
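As a concrete example of the "theoretically grounded OR algorithms" the abstract contrasts with LLM agents, a classic order-up-to (base-stock) policy fits in one function. This is illustrative background, not the paper's method.

```python
# A simple base-stock (order-up-to) policy: a standard OR baseline for
# inventory control. Each period, order enough to raise the inventory
# position back to the base-stock level S.

def order_quantity(inventory_position: int, base_stock: int) -> int:
    # Never order a negative amount; if stock already exceeds S, order zero.
    return max(base_stock - inventory_position, 0)
```

Such policies are provably optimal under classical assumptions (e.g. stationary demand, linear holding and backorder costs), which is exactly where the abstract notes they can break down when distributions shift.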

Learning to Forget — Hierarchical Episodic Memory for Lifelong Robot Deployment

arXiv:2604.11306v2 Announce Type: replace-cross Abstract: Robots must verbalize their past experiences when users ask “Where did you put my keys?” or “Why did the task fail?” Yet maintaining life-long episodic memory (EM) from continuous multimodal perception quickly exceeds storage limits and makes real-time querying impractical, calling for selective forgetting that adapts to users’ notions of […]

SCGNN: Semantic Consistency enhanced Graph Neural Network Guided by Granular-ball Computing

arXiv:2605.02617v2 Announce Type: replace Abstract: Capturing semantic consistency among nodes is crucial for effective graph representation learning. Existing approaches typically rely on $k$-nearest neighbors ($k$NN) or other node-level full search algorithms (FSA) to mine semantic relationships via exhaustive pairwise similarity computation, which suffer from high computational complexity and rigid neighbor selection, limiting scalability and introducing […]
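The node-level full search the abstract criticizes can be made concrete: exhaustive pairwise cosine similarity followed by $k$NN selection, which costs $O(n^2)$ comparisons. A minimal sketch (not SCGNN's granular-ball alternative, just the baseline it improves on):

```python
# Baseline kNN graph construction via exhaustive pairwise cosine
# similarity — the O(n^2) full search algorithm (FSA) the abstract
# describes as costly and rigid.
import math

def knn_graph(features: list[list[float]], k: int) -> dict[int, list[int]]:
    def cos(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    n = len(features)
    neighbors = {}
    for i in range(n):
        # Score node i against every other node: the exhaustive step.
        sims = [(cos(features[i], features[j]), j) for j in range(n) if j != i]
        sims.sort(reverse=True)
        neighbors[i] = [j for _, j in sims[:k]]  # rigid top-k selection
    return neighbors
```

The fixed top-$k$ cut is the "rigid neighbor selection" the abstract mentions: every node gets exactly $k$ neighbors regardless of how semantically close they actually are.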

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number 16808844.