arXiv:2604.16207v1 Announce Type: cross Abstract: As forgery types continue to emerge consistently, Incremental Face Forgery Detection (IFFD) has become a crucial paradigm. However, existing methods typically rely on data replay or coarse binary supervision, which fails to explicitly constrain the feature space, leading to severe feature drift and catastrophic forgetting. To address this, we propose […]
Dynamic Sampling that Adapts: Self-Aware Iterative Data Persistent Optimization for Mathematical Reasoning
arXiv:2505.16176v2 Announce Type: replace Abstract: In mathematical reasoning, data selection strategies predominantly rely on static, externally defined metrics, which fail to adapt to the evolving capabilities of models during training. This misalignment limits the efficiency of Supervised Fine-Tuning and Reinforcement Learning. To bridge this gap, we introduce SAI-DPO (Self-Aware Iterative Data Persistent Optimization), a dynamic […]
Subjective and Objective Quality-of-Experience Evaluation Study for Live Video Streaming
arXiv:2409.17596v2 Announce Type: replace-cross Abstract: In recent years, live video streaming has gained widespread popularity across various social media platforms. Quality of experience (QoE), which reflects end-users’ satisfaction and overall experience, plays a critical role in helping media service providers optimize large-scale live compression and transmission strategies to achieve a perceptually optimal rate-distortion trade-off. Although many […]
vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models
arXiv:2603.13966v2 Announce Type: replace Abstract: Vision-Language-Action (VLA) models are increasingly evaluated across multiple simulation benchmarks, yet adding each benchmark to an evaluation pipeline requires resolving incompatible dependencies, matching underspecified evaluation protocols, and reverse-engineering undocumented preprocessing. This burden scales with the number of models and benchmarks, making comprehensive evaluation impractical for most teams. We present vla-eval, […]
Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants
arXiv:2510.24328v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are increasingly used to answer everyday questions, yet their performance on culturally grounded and dialectal content remains uneven across languages. We propose a comprehensive method that (i) translates Modern Standard Arabic (MSA) multiple-choice questions (MCQs) into English and several Arabic dialects, (ii) converts them into open-ended […]
The Illusion of Equivalence: Systematic FP16 Divergence in KV-Cached Autoregressive Inference
arXiv:2604.15409v1 Announce Type: cross Abstract: KV caching is a ubiquitous optimization in autoregressive transformer inference, long presumed to be numerically equivalent to cache-free computation. This assumption fails under standard FP16 precision: cache-ON and cache-OFF execution paths employ different floating-point accumulation orderings which, due to FP16 non-associativity, produce a deterministic divergence in decoded token sequences. Across […]
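The non-associativity this abstract refers to can be seen in a few lines. The sketch below is only an illustration of the general floating-point effect, not the paper's experiment or code: the helpers `to_fp16` and `fp16_add` are hypothetical names, and half-precision rounding is emulated with the standard-library `struct` format `"e"` (IEEE 754 binary16) rather than run on real FP16 hardware.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_add(a: float, b: float) -> float:
    """Add two values as binary16: round inputs, add, round the sum."""
    return to_fp16(to_fp16(a) + to_fp16(b))


big, one = 2048.0, 1.0

# Order 1: (big + one) + one. At 2048 the FP16 spacing is 2, so each
# "+ 1" lands on a tie and rounds back down to 2048.
left_to_right = fp16_add(fp16_add(big, one), one)

# Order 2: big + (one + one). The small terms combine to 2 first,
# which is large enough to survive the addition.
right_to_left = fp16_add(big, fp16_add(one, one))

print(left_to_right, right_to_left)  # 2048.0 2050.0
```

The same three operands give two different sums depending only on grouping. Any two code paths that accumulate the same terms in a different order can therefore disagree in the last bits, which is the general mechanism the abstract attributes to the cache-ON and cache-OFF paths.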
KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs
arXiv:2604.13226v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) rely heavily on Key-Value (KV) caching to minimize inference latency. However, standard KV caches are context-dependent: reusing a cached document in a new context requires recomputing KV states to account for shifts in attention distribution. Existing solutions such as CacheBlend, EPIC, and SAM-KV mitigate this issue […]
StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models
arXiv:2604.15416v1 Announce Type: cross Abstract: Sign-based optimization algorithms, such as SignSGD, have garnered significant attention for their remarkable performance in distributed learning and training large foundation models. Despite their empirical superiority, SignSGD is known to diverge on non-smooth objectives, which are ubiquitous in modern machine learning due to ReLUs, max-pools, and mixture-of-experts. To overcome this […]
Bureaucratic Silences: What the Canadian AI Register Reveals, Omits, and Obscures
arXiv:2604.15514v1 Announce Type: new Abstract: In November 2025, the Government of Canada operationalized its commitment to transparency by releasing its first Federal AI Register. In this paper, we argue that such registers are not neutral mirrors of government activity, but active instruments of ontological design that configure the boundaries of accountability. We analyzed the Register’s […]
How people use Copilot for Health
arXiv:2604.15331v1 Announce Type: cross Abstract: We analyze over 500,000 de-identified health-related conversations with Microsoft Copilot from January 2026 to characterize what people ask conversational AI about health. We develop a hierarchical intent taxonomy of 12 primary categories using privacy-preserving LLM-based classification validated against expert human annotation, and apply LLM-driven topic-clustering for prevalent themes within each […]
Sparse regression, classification, and microbial network estimation in QIIME2 with q2-classo and q2-gglasso
arXiv:2604.15520v1 Announce Type: new Abstract: Motivation: Statistical analysis of microbial count data derived from 16S rRNA or metagenomics sequencing poses unique challenges due to the sparse, compositional, and high-dimensional nature of the data. While QIIME 2 already provides many tools for data pre-processing and analysis, plugins for statistical regression, classification, and microbial network estimation tailored […]
A Comparative Study on the Impact of Traditional Learning and Interactive Learning on Students’ Academic Performance and Emotional Well-Being
arXiv:2604.15335v1 Announce Type: cross Abstract: The growing adoption of interactive learning tools in higher education offers new opportunities to enhance student performance and well-being. This study compares the effects of traditional and interactive learning methods on academic performance, engagement, motivation, and emotional well-being among 100 university students enrolled in a computer intrusion detection course. Participants […]