arXiv:2604.05429v2 Announce Type: replace-cross Abstract: Addressing the critical need for intelligent, context-aware energy management in renewable systems, we introduce the OpenCEM Simulator and Dataset: the first open-source digital twin explicitly designed to integrate rich, unstructured contextual information with quantitative renewable energy dynamics. Traditional energy management relies heavily on numerical time series, thereby neglecting the significant […]
Towards Privacy-Preserving Large Language Model: Text-free Inference Through Alignment and Adaptation
arXiv:2604.06831v1 Announce Type: cross Abstract: Current LLM-based services typically require users to submit raw text regardless of its sensitivity. While intuitive, such practice introduces substantial privacy risks, as unauthorized access may expose personal, medical, or legal information. Although prior defenses strived to mitigate these risks, they often incur substantial computational overhead and degrade model performance. […]
On the Step Length Confounding in LLM Reasoning Data Selection
arXiv:2604.06834v1 Announce Type: cross Abstract: Large reasoning models have recently demonstrated strong performance on complex tasks that require long chain-of-thought reasoning, through supervised fine-tuning on large-scale and high-quality datasets. To construct such datasets, existing pipelines generate long reasoning data from more capable Large Language Models (LLMs) and apply manually heuristic or naturalness-based selection methods to […]
Governance and Regulation of Artificial Intelligence in Developing Countries: A Case Study of Nigeria
arXiv:2604.06018v2 Announce Type: replace-cross Abstract: This study examines the perception of legal professionals on the governance of AI in developing countries, using Nigeria as a case study. The study focused on ethical risks, regulatory gaps, and institutional readiness. The study adopted a qualitative case study design. Data were collected through 27 semi-structured interviews with legal […]
High-Precision Estimation of the State-Space Complexity of Shogi via the Monte Carlo Method
arXiv:2604.06189v1 Announce Type: new Abstract: Determining the state-space complexity of the game of Shogi (Japanese Chess) has been a challenging problem, with previous combinatorial estimates leaving a gap of five orders of magnitude ($10^{64}$ to $10^{69}$). This large gap arises from the difficulty of distinguishing Shogi positions legally reachable from the initial position among the […]
Self-Preference Bias in Rubric-Based Evaluation of Large Language Models
arXiv:2604.06996v1 Announce Type: cross Abstract: LLM-as-a-judge has become the de facto approach for evaluating LLM outputs. However, judges are known to exhibit self-preference bias (SPB): they tend to favor outputs produced by themselves or by models from their own family. This skews evaluations and, thus, hinders model development, especially in settings of recursive self-improvement. We […]
CSA-Graphs: A Privacy-Preserving Structural Dataset for Child Sexual Abuse Research
arXiv:2604.07132v1 Announce Type: cross Abstract: Child Sexual Abuse Imagery (CSAI) classification is an important yet challenging problem for computer vision research due to the strict legal and ethical restrictions that prevent the public sharing of CSAI datasets. This limitation hinders reproducibility and slows progress in developing automated methods. In this work, we introduce CSA-Graphs, a […]
Noise Immunity in In-Context Tabular Learning: An Empirical Robustness Analysis of TabPFN’s Attention Mechanisms
arXiv:2604.04868v2 Announce Type: replace-cross Abstract: Tabular foundation models (TFMs) such as TabPFN (Tabular Prior-Data Fitted Network) are designed to generalize across heterogeneous tabular datasets through in-context learning (ICL). They perform prediction in a single forward pass conditioned on labeled examples without dataset-specific parameter updates. This paradigm is particularly attractive in industrial domains (e.g., finance and […]
Unavailability of experimental 3D structural data on protein folding dynamics and necessity for a new generation of structure prediction methods in this context
arXiv:2507.08188v2 Announce Type: replace Abstract: Motivation: Protein folding is a dynamic process during which a protein’s amino acid sequence undergoes a series of 3-dimensional (3D) conformational changes en route to reaching a native 3D structure; the resulting 3D structural conformations are called folding intermediates. While data on native 3D structures are abundant, data on 3D […]
Environmental, Social and Governance Sentiment Analysis on Slovene News: A Novel Dataset and Models
arXiv:2604.06826v1 Announce Type: cross Abstract: Environmental, Social, and Governance (ESG) considerations are increasingly integral to assessing corporate performance, reputation, and long-term sustainability. Yet, reliable ESG ratings remain limited for smaller companies and emerging markets. We introduce the first publicly available Slovene ESG sentiment dataset and a suite of models for automatic ESG sentiment detection. The […]
From Exploration to Revelation: Detecting Dark Patterns in Mobile Apps
arXiv:2411.18084v2 Announce Type: replace-cross Abstract: Mobile apps are essential in daily life but frequently employ deceptive patterns, such as visual emphasis or linguistic nudging, to manipulate user behavior. Existing research largely relies on manual detection, which is time-consuming and cannot keep pace with rapidly evolving apps. Although recent work has explored automated approaches, these methods […]
Temporal Inversion for Learning Interval Change in Chest X-Rays
arXiv:2604.04563v2 Announce Type: replace-cross Abstract: Recent advances in vision–language pretraining have enabled strong medical foundation models, yet most analyze radiographs in isolation, overlooking the key clinical task of comparing prior and current images to assess interval change. For chest radiographs (CXRs), capturing interval change is essential, as radiologists must evaluate not only the static appearance […]