arXiv:2410.09134v2 Announce Type: replace-cross Abstract: The need for autonomous and adaptive defense mechanisms has become paramount in the rapidly evolving landscape of cyber threats. Multi-Agent Deep Reinforcement Learning (MADRL) presents a promising approach to enhancing the efficacy and resilience of autonomous cyber operations. This paper explores the application of Multi-Agent Actor-Critic algorithms, which provide a […]
EuraGovExam: A Multilingual Multimodal Benchmark from Real-World Civil Service Exams
arXiv:2603.27223v1 Announce Type: cross Abstract: We present EuraGovExam, a multilingual and multimodal benchmark sourced from real-world civil service examinations across five representative Eurasian regions: South Korea, Japan, Taiwan, India, and the European Union. Designed to reflect the authentic complexity of public-sector assessments, the dataset contains over 8,000 high-resolution scanned multiple-choice questions covering 17 diverse academic […]
Bitboard version of Tetris AI
arXiv:2603.26765v1 Announce Type: new Abstract: The efficiency of game engines and policy optimization algorithms is crucial for training reinforcement learning (RL) agents in complex sequential decision-making tasks, such as Tetris. Existing Tetris implementations suffer from low simulation speeds, suboptimal state evaluation, and inefficient training paradigms, limiting their utility for large-scale RL research. To address these […]
LLMs versus the Halting Problem: Revisiting Program Termination Prediction
arXiv:2601.18987v4 Announce Type: replace-cross Abstract: Determining whether a program terminates is a central problem in computer science. Turing’s foundational result established the Halting Problem as undecidable, showing that no algorithm can universally determine termination for all programs and inputs. Consequently, automatic verification tools approximate termination, sometimes failing to prove or disprove; these tools rely on […]
Training-Free Diffusion-Driven Modeling of Pareto Set Evolution for Dynamic Multiobjective Optimization
arXiv:2603.26749v1 Announce Type: cross Abstract: Dynamic multiobjective optimization problems (DMOPs) feature time-varying objectives, which cause the Pareto optimal solution (POS) set to drift over time and make it difficult to maintain both convergence and diversity under limited response time. Many existing prediction-based dynamic multiobjective evolutionary algorithms (DMOEAs) either depend on learned models with nontrivial training […]
Explaining, Verifying, and Aligning Semantic Hierarchies in Vision-Language Model Embeddings
arXiv:2603.26798v1 Announce Type: cross Abstract: Vision-language model (VLM) encoders such as CLIP enable strong retrieval and zero-shot classification in a shared image-text embedding space, yet the semantic organization of this space is rarely inspected. We present a post-hoc framework to explain, verify, and align the semantic hierarchies induced by a VLM over a given set […]
Real-World AI Evaluation: How FRAME Generates Systematic Evidence to Resolve the Decision-Maker’s Dilemma
arXiv:2603.13294v4 Announce Type: replace-cross Abstract: Organizational leaders are being asked to make high-stakes decisions about AI deployment without dependable evidence of what these systems actually do in the environments they oversee. The predominant AI evaluation ecosystem yields scalable but abstract metrics that reflect the priorities of model development. By smoothing over the heterogeneity of real-world […]
What-If Explanations Over Time: Counterfactuals for Time Series Classification
arXiv:2603.27792v1 Announce Type: cross Abstract: Counterfactual explanations emerge as a powerful approach in explainable AI, providing what-if scenarios that reveal how minimal changes to an input time series can alter the model’s prediction. This work presents a survey of recent algorithms for counterfactual explanations for time series classification. We review state-of-the-art methods, spanning instance-based nearest-neighbor […]
Recent advances in modeling and simulation of biological phenomena in crowded and cellular environments
arXiv:2603.26974v1 Announce Type: new Abstract: While experiments and computer simulations to study biological phenomena are usually performed in dilute in vitro conditions, such phenomena happen inside the cellular cytoplasm, an environment densely packed with diverse macromolecules. Here, we review recent computational methods to investigate crowded and cellular environments. Protein crowders, inert crowders and small molecules […]
Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic Data-Driven Comparative Evaluation
arXiv:2602.00665v2 Announce Type: replace-cross Abstract: Customer-service question answering (QA) systems increasingly rely on conversational language understanding. While Large Language Models (LLMs) achieve strong performance, their high computational cost and deployment constraints limit practical use in resource-constrained environments. Small Language Models (SLMs) provide a more efficient alternative, yet their effectiveness for multi-turn customer-service QA remains underexplored, […]
Coarse-Guided Visual Generation via Weighted h-Transform Sampling
arXiv:2603.12057v2 Announce Type: replace-cross Abstract: Coarse-guided visual generation, which synthesizes fine visual samples from degraded or low-fidelity coarse references, is essential for various real-world applications. While training-based approaches are effective, they are inherently limited by high training costs and restricted generalization due to paired data collection. Accordingly, recent training-free works propose to leverage pretrained diffusion […]
KVSculpt: KV Cache Compression as Distillation
arXiv:2603.27819v1 Announce Type: cross Abstract: KV cache compression is critical for efficient long-context LLM inference. Approaches that reduce the per-pair footprint — quantization and low-rank decomposition — are orthogonal to those that reduce the sequence length of the cache. Along the sequence-length dimension, existing methods range from pure eviction — selecting which KV pairs to […]