arXiv:2504.14569v5 Announce Type: replace-cross Abstract: Large language models (LLMs) exhibit remarkable performance across various natural language processing tasks but suffer from immense computational and memory demands, limiting their deployment in resource-constrained environments. To address this challenge, we propose NoWag (Normalized Weight and Activation Guided Compression), a unified framework for one-shot, shape-preserving compression algorithms. We […]
Field Matters: A Lightweight LLM-enhanced Method for CTR Prediction
arXiv:2505.14057v2 Announce Type: replace-cross Abstract: Click-through rate (CTR) prediction is a fundamental task in modern recommender systems. In recent years, the integration of large language models (LLMs) has been shown to effectively enhance the performance of traditional CTR methods. However, existing LLM-enhanced methods often require extensive processing of detailed textual descriptions for large-scale instances or […]
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning
arXiv:2506.04207v2 Announce Type: replace-cross Abstract: Inspired by the remarkable reasoning capabilities of Deepseek-R1 in complex textual tasks, many works attempt to incentivize similar capabilities in Multimodal Large Language Models (MLLMs) by directly applying reinforcement learning (RL). However, they still struggle to activate complex reasoning. In this paper, rather than examining multimodal RL in isolation, we […]
Multimodal Multi-Agent Ransomware Analysis Using AutoGen
arXiv:2601.20346v1 Announce Type: cross Abstract: Ransomware has become one of the most serious cybersecurity threats, causing major financial losses and operational disruptions worldwide. Traditional detection methods such as static analysis, heuristic scanning and behavioral analysis often fall short when used alone. To address these limitations, this paper presents a multimodal multi-agent ransomware analysis framework designed for […]
Order-Optimal Sample Complexity of Rectified Flows
arXiv:2601.20250v1 Announce Type: cross Abstract: Recently, flow-based generative models have shown superior efficiency compared to diffusion models. In this paper, we study rectified flow models, which constrain transport trajectories to be linear from the base distribution to the data distribution. This structural restriction greatly accelerates sampling, often enabling high-quality generation with a single Euler step. […]
Towards Compact and Robust DNNs via Compression-aware Sharpness Minimization
arXiv:2601.20301v1 Announce Type: cross Abstract: Sharpness-Aware Minimization (SAM) has recently emerged as an effective technique for improving DNN robustness to input variations. However, its interplay with the compactness requirements of on-device DNN deployments remains less explored. Simply pruning a SAM-trained model can undermine robustness, since flatness in the continuous parameter space does not necessarily translate […]
Optimal illness policy for an unethical daycare center
arXiv:2601.20123v1 Announce Type: cross Abstract: While businesses are typically more profitable if their workers and communities are minimally exposed to diseases, the same is not true for daycare centers. Here it is shown that a daycare center could maximize its profits by maintaining a population of sick children within the center, with the intention to […]
Do we really need Self-Attention for Streaming Automatic Speech Recognition?
arXiv:2601.19960v1 Announce Type: cross Abstract: Transformer-based architectures are the most widely used architectures in many deep learning fields, such as Natural Language Processing, Computer Vision, and Speech Processing. This may encourage the direct use of Transformers in constrained tasks without questioning whether they will yield the same benefits as in standard tasks. Given specific constraints, it […]
Size Matters: Reconstructing Real-Scale 3D Models from Monocular Images for Food Portion Estimation
arXiv:2601.20051v1 Announce Type: cross Abstract: The rise of chronic diseases related to diet, such as obesity and diabetes, emphasizes the need for accurate monitoring of food intake. While AI-driven dietary assessment has made strides in recent years, the ill-posed nature of recovering size (portion) information from monocular images for accurate estimation of “how much did […]
Meeting SLOs, Slashing Hours: Automated Enterprise LLM Optimization with OptiKIT
arXiv:2601.20408v1 Announce Type: cross Abstract: Enterprise LLM deployment faces a critical scalability challenge: organizations must optimize models systematically to scale AI initiatives within constrained compute budgets, yet the specialized expertise required for manual optimization remains a niche and scarce skillset. This challenge is particularly evident in managing GPU utilization across heterogeneous infrastructure while enabling teams […]
CCMamba: Selective State-Space Models for Higher-Order Graph Learning on Combinatorial Complexes
arXiv:2601.20518v1 Announce Type: cross Abstract: Topological deep learning has emerged for modeling higher-order relational structures beyond pairwise interactions that standard graph neural networks fail to capture. Although combinatorial complexes offer a unified topological framework, most existing topological deep learning methods rely on local message passing via attention mechanisms, which incur quadratic complexity and remain low-dimensional, […]
On the Effectiveness of LLM-Specific Fine-Tuning for Detecting AI-Generated Text
arXiv:2601.20006v1 Announce Type: cross Abstract: The rapid progress of large language models has enabled the generation of text that closely resembles human writing, creating challenges for authenticity verification in education, publishing, and digital security. Detecting AI-generated text has therefore become a crucial technical and ethical issue. This paper presents a comprehensive study of AI-generated text […]