CiMRAG: CiM-Aware Domain-Adaptive and Noise-Resilient Retrieval-Augmented Generation for Edge-Based LLMs

arXiv:2601.20041v1 Announce Type: cross Abstract: Personalized virtual assistants powered by large language models (LLMs) on edge devices are attracting growing attention, with Retrieval-Augmented Generation (RAG) emerging as a key method for personalization by retrieving relevant profile data and generating tailored responses. However, deploying RAG on edge devices faces efficiency hurdles due to the rapid growth […]
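For readers unfamiliar with the retrieve-then-generate loop that this kind of personalization builds on, the sketch below shows a generic RAG step over a small user profile. It is not the CiMRAG system; the profile entries, the toy hashing embedder, and the generate() stub are illustrative assumptions.

```python
# A minimal, generic sketch of retrieve-then-generate personalization,
# NOT the CiMRAG method. Profile facts, embed(), and generate() are toys.
import numpy as np

PROFILE = [
    "User prefers vegetarian recipes.",
    "User commutes by bicycle on weekdays.",
    "User is learning Spanish.",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy hashing embedder standing in for a real on-device encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k profile facts most similar to the query."""
    q = embed(query)
    scores = [float(q @ embed(doc)) for doc in PROFILE]
    top = np.argsort(scores)[::-1][:k]
    return [PROFILE[i] for i in top]

def generate(prompt: str) -> str:
    """Stub for an on-device LLM call; replace with a real model."""
    return f"[LLM response conditioned on]\n{prompt}"

query = "Suggest a dinner idea for tonight."
context = "\n".join(retrieve(query))
print(generate(f"Profile facts:\n{context}\n\nQuestion: {query}"))
```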

How Much Progress Has There Been in NVIDIA Datacenter GPUs?

arXiv:2601.20115v1 Announce Type: cross Abstract: Graphics Processing Units (GPUs) are the state-of-the-art architecture for essential tasks, ranging from rendering 2D/3D graphics to accelerating workloads in supercomputing centers and, of course, Artificial Intelligence (AI). As GPUs continue improving to satisfy ever-increasing performance demands, analyzing past and current progress becomes paramount in determining future constraints on scientific […]

BengaliSent140: A Large-Scale Bengali Binary Sentiment Dataset for Hate and Non-Hate Speech Classification

arXiv:2601.20129v1 Announce Type: cross Abstract: Sentiment analysis for the Bengali language has attracted increasing research interest in recent years. However, progress remains constrained by the scarcity of large-scale and diverse annotated datasets. Although several Bengali sentiment and hate speech datasets are publicly available, most are limited in size or confined to a single domain, such […]
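As a hedged illustration of the binary hate / non-hate task such a dataset supports, the baseline below fits a character n-gram TF-IDF model with logistic regression. The inline examples are placeholders, not BengaliSent140 rows; character n-grams are used only because they sidestep tokenization issues for Bengali script.

```python
# Hedged baseline sketch for binary hate / non-hate classification.
# Placeholder texts stand in for real dataset rows.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "example non-hate sentence one",
    "example non-hate sentence two",
    "example hate sentence one",
    "example hate sentence two",
]
labels = [0, 0, 1, 1]  # 0 = non-hate, 1 = hate

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # script-agnostic features
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)
print(model.predict(["example hate sentence three"]))
```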

What’s the plan? Metrics for implicit planning in LLMs and their application to rhyme generation and question answering

arXiv:2601.20164v1 Announce Type: cross Abstract: Prior work suggests that language models, while trained on next token prediction, show implicit planning behavior: they may select the next token in preparation for a predicted future token, such as a likely rhyming word, as supported by a prior qualitative study of Claude 3.5 Haiku using a cross-layer transcoder. […]
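One hedged way to operationalize such a metric, which is not necessarily the paper's method, is to compare the log-probabilities a model assigns to rhyme-consistent continuations when the rhyme target is predictable from the prompt versus when it is hidden. The toy numbers below stand in for real LM log-probs.

```python
# Hedged illustration of an implicit-planning score, not the paper's metric:
# mean log-prob boost on rhyme-consistent candidates when the rhyme cue is visible.
def planning_score(logp_with_cue: dict[str, float],
                   logp_without_cue: dict[str, float]) -> float:
    boosts = [logp_with_cue[t] - logp_without_cue[t] for t in logp_with_cue]
    return sum(boosts) / len(boosts)

# Toy candidate continuations for a line that should rhyme with "light".
with_cue = {"bright": -1.2, "night": -1.5, "table": -4.0}
without_cue = {"bright": -2.0, "night": -2.3, "table": -3.9}
print(planning_score(with_cue, without_cue))  # positive => planning-like boost
```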

ProFlow: Zero-Shot Physics-Consistent Sampling via Proximal Flow Guidance

arXiv:2601.20227v1 Announce Type: cross Abstract: Inferring physical fields from sparse observations while strictly satisfying partial differential equations (PDEs) is a fundamental challenge in computational physics. Recently, deep generative models offer powerful data-driven priors for such inverse problems, yet existing methods struggle to enforce hard physical constraints without costly retraining or disrupting the learned generative prior. […]
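To make the idea of proximal, constraint-aware guidance concrete, here is a generic sketch (not the ProFlow algorithm): a sample from some generative prior is nudged toward satisfying a discretized 1-D Poisson equation by solving a proximal subproblem with gradient descent. The source term, step sizes, and penalty weight are arbitrary assumptions.

```python
# Generic proximal physics-consistency step, not ProFlow itself:
# min_u 0.5*||u - u0||^2 + lam*||residual(u)||^2 for a 1-D Poisson equation.
import numpy as np

n, h = 64, 1.0 / 64
x = np.linspace(0, 1, n)
f = np.sin(2 * np.pi * x)          # assumed source term
u0 = np.random.randn(n) * 0.1      # stand-in for a prior sample

def residual(u):
    r = np.zeros_like(u)
    r[1:-1] = (u[:-2] - 2 * u[1:-1] + u[2:]) / h**2 - f[1:-1]
    return r

def prox_step(u0, lam=1e-4, lr=1e-5, iters=500):
    u = u0.copy()
    for _ in range(iters):
        r = residual(u)
        # J^T r for the Laplacian stencil (half the gradient of ||r||^2)
        grad_r = np.zeros_like(u)
        grad_r[:-2] += r[1:-1] / h**2
        grad_r[1:-1] += -2 * r[1:-1] / h**2
        grad_r[2:] += r[1:-1] / h**2
        u -= lr * ((u - u0) + 2 * lam * grad_r)
    return u

u = prox_step(u0)
print("residual norm before/after:",
      np.linalg.norm(residual(u0)), np.linalg.norm(residual(u)))
```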

How AI Impacts Skill Formation

arXiv:2601.20245v1 Announce Type: cross Abstract: AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively supervise AI remains unclear. Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition in the process. We […]

Eliciting Least-to-Most Reasoning for Phishing URL Detection

arXiv:2601.20270v1 Announce Type: cross Abstract: Phishing continues to be one of the most prevalent attack vectors, making accurate classification of phishing URLs essential. Recently, large language models (LLMs) have demonstrated promising results in phishing URL detection. However, their reasoning capabilities that enabled such performance remain underexplored. To this end, in this paper, we propose a […]
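As a hedged sketch of how least-to-most prompting can be applied to URL classification (the actual prompts in the paper may differ), the snippet below decomposes the decision into simpler sub-questions, answers them in order, and asks for a final verdict over the accumulated answers. ask_llm() is a hypothetical stand-in for any chat LLM client.

```python
# Hedged least-to-most prompting sketch for phishing URL detection.
def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real client."""
    return "(model answer)"

def classify_url(url: str) -> str:
    subquestions = [
        f"What is the registered domain of {url}, and does it impersonate a known brand?",
        f"Does the path or query of {url} contain obfuscation such as encoded characters?",
        f"Does {url} use an IP-address host, excessive subdomains, or a lookalike TLD?",
    ]
    context = ""
    for q in subquestions:                       # solve easy sub-problems first
        answer = ask_llm(context + q)
        context += f"Q: {q}\nA: {answer}\n"
    final = f"{context}Given the answers above, is {url} phishing or benign? Answer in one word."
    return ask_llm(final)

print(classify_url("http://paypa1-login.example-secure.com/update"))
```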

Truthfulness Despite Weak Supervision: Evaluating and Training LLMs Using Peer Prediction

arXiv:2601.20299v1 Announce Type: cross Abstract: The evaluation and post-training of large language models (LLMs) rely on supervision, but strong supervision for difficult tasks is often unavailable, especially when evaluating frontier models. In such cases, models have been shown to exploit evaluations built on such imperfect supervision, leading to deceptive results. However, underutilized in LLM research, a […]
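For intuition on peer prediction, here is a hedged, simplified agreement-based score, not the paper's mechanism: a model is rewarded for agreeing with a peer on the same question, minus its agreement on mismatched questions, so that constant or uninformative answers earn nothing. The answers are toy multiple-choice labels.

```python
# Hedged correlated-agreement style peer-prediction score, not the paper's mechanism.
import random

def peer_prediction_score(model_answers, peer_answers, trials=1000, seed=0):
    """Matched agreement minus shuffled-baseline agreement."""
    n = len(model_answers)
    matched = sum(a == b for a, b in zip(model_answers, peer_answers)) / n
    rng = random.Random(seed)
    baseline = 0.0
    for _ in range(trials):
        j, k = rng.randrange(n), rng.randrange(n)
        baseline += model_answers[j] == peer_answers[k]
    return matched - baseline / trials

model = ["A", "B", "A", "C", "D"]
peer = ["A", "B", "A", "C", "B"]
print(peer_prediction_score(model, peer))
```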

SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips

arXiv:2601.20309v1 Announce Type: cross Abstract: Large Language Model (LLM) serving faces a fundamental tension between stringent latency Service Level Objectives (SLOs) and limited GPU memory capacity. When high request rates exhaust the KV cache budget, existing LLM inference systems often suffer severe head-of-line (HOL) blocking. While prior work explored PCIe-based offloading, these approaches cannot sustain […]
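To illustrate why SLO-aware scheduling under a KV-cache budget avoids head-of-line blocking, here is a generic sketch that is not SuperInfer's rotary scheduler: requests with the tightest deadlines are admitted first, and requests that do not fit the memory budget are deferred rather than blocking the queue. All sizes and deadlines are made-up numbers.

```python
# Generic SLO-aware admission under a KV-cache budget, not SuperInfer's scheduler.
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Request:
    deadline_ms: int               # SLO deadline; tighter deadlines scheduled first
    rid: str = field(compare=False)
    kv_bytes: int = field(compare=False)

def admit(requests, kv_budget_bytes):
    heap = list(requests)
    heapq.heapify(heap)            # earliest-deadline-first order
    admitted, deferred, used = [], [], 0
    while heap:
        r = heapq.heappop(heap)
        if used + r.kv_bytes <= kv_budget_bytes:
            admitted.append(r.rid)
            used += r.kv_bytes
        else:
            deferred.append(r.rid)  # re-queued later instead of blocking the head
    return admitted, deferred

reqs = [Request(50, "a", 2 << 20), Request(10, "b", 6 << 20), Request(30, "c", 3 << 20)]
print(admit(reqs, kv_budget_bytes=8 << 20))
```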

IoT Device Identification with Machine Learning: Common Pitfalls and Best Practices

arXiv:2601.20548v1 Announce Type: cross Abstract: This paper critically examines the device identification process using machine learning, addressing common pitfalls in existing literature. We analyze the trade-offs between identification methods (unique vs. class based), data heterogeneity, feature extraction challenges, and evaluation metrics. By highlighting specific errors, such as improper data augmentation and misleading session identifiers, we […]
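One concrete best practice the listed pitfalls point at is splitting train and test data by session or device group, so that flows from the same session never appear on both sides and the model cannot shortcut through session identifiers. The sketch below uses synthetic placeholder features and labels, not the paper's data.

```python
# Hedged sketch of group-aware splitting to avoid session-identifier leakage.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

X = np.random.rand(100, 8)                  # placeholder flow features
y = np.random.randint(0, 4, size=100)       # placeholder device-class labels
sessions = np.repeat(np.arange(20), 5)      # 20 sessions, 5 flows each

splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=sessions))

# No session appears in both splits, so session IDs cannot be memorized.
assert set(sessions[train_idx]).isdisjoint(sessions[test_idx])
print(len(train_idx), len(test_idx))
```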

Membership Inference Attacks Against Fine-tuned Diffusion Language Models

arXiv:2601.20125v1 Announce Type: cross Abstract: Diffusion Language Models (DLMs) represent a promising alternative to autoregressive language models, using bidirectional masked token prediction. Yet their susceptibility to privacy leakage via Membership Inference Attacks (MIA) remains critically underexplored. This paper presents the first systematic investigation of MIA vulnerabilities in DLMs. Unlike the autoregressive models’ single fixed prediction […]
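For background, the classic loss-thresholding baseline for membership inference can be adapted in spirit to masked-token prediction: samples the model reconstructs with unusually low masked-prediction loss are flagged as likely training members. The sketch below is not the attack proposed in the paper, and masked_prediction_loss() is a hypothetical stand-in for scoring a DLM.

```python
# Hedged loss-thresholding MIA baseline sketch, not the paper's attack.
import numpy as np

def masked_prediction_loss(text: str) -> float:
    """Hypothetical: average loss over random mask patterns for `text`."""
    return float(np.random.rand())  # placeholder so the sketch runs end-to-end

def infer_membership(candidates, calibration_texts, quantile=0.2):
    """Flag candidates whose loss falls below a low quantile of calibration losses."""
    calib = np.array([masked_prediction_loss(t) for t in calibration_texts])
    threshold = np.quantile(calib, quantile)
    return {c: masked_prediction_loss(c) < threshold for c in candidates}

print(infer_membership(["candidate A", "candidate B"],
                       [f"reference text {i}" for i in range(50)]))
```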
