March 19, 2026 – Page 17 – dijee Pharma Intelligence

Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients

arXiv:2603.17234v1 Announce Type: cross Abstract: Surgical co-management (SCM) is an evidence-based model in which hospitalists jointly manage medically complex perioperative patients alongside surgical teams. Despite its clinical and financial value, SCM is limited by the need to manually identify eligible patients. To determine whether SCM triage can be automated, we conducted a prospective, unblinded study […]

March 19, 2026

Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding

arXiv:2603.17307v1 Announce Type: cross Abstract: Despite rapid developments and widespread applications of MLLM agents, they still struggle with long-form video understanding (LVU) tasks, which are characterized by high information density and extended temporal spans. Recent research on LVU agents demonstrates that simple task decomposition and collaboration mechanisms are insufficient for long-chain reasoning tasks. Moreover, directly […]

March 19, 2026

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement

arXiv:2603.17441v1 Announce Type: cross Abstract: GUI grounding is a critical capability for vision-language models (VLMs) that enables automated interaction with graphical user interfaces by locating target elements from natural language instructions. However, grounding on GUI screenshots remains challenging due to high-resolution images, small UI elements, and ambiguous user instructions. In this work, we propose AdaZoom-GUI, […]

March 19, 2026

KineVLA: Towards Kinematics-Aware Vision-Language-Action Models with Bi-Level Action Decomposition

arXiv:2603.17524v1 Announce Type: cross Abstract: In this paper, we introduce a novel kinematics-rich vision-language-action (VLA) task, in which language commands densely encode diverse kinematic attributes (such as direction, trajectory, orientation, and relative displacement) from initiation through completion, at key moments, unlike existing action instructions that capture kinematics only coarsely or partially, thereby supporting fine-grained and […]

March 19, 2026

Joint Optimization of Storage and Loading for High-Performance 3D Point Cloud Data Processing

arXiv:2603.16945v1 Announce Type: cross Abstract: With the rapid development of computer vision and deep learning, significant advancements have been made in 3D vision, partic- ularly in autonomous driving, robotic perception, and augmented reality. 3D point cloud data, as a crucial representation of 3D information, has gained widespread attention. However, the vast scale and complexity of […]

March 19, 2026

PhysQuantAgent: An Inference Pipeline of Mass Estimation for Vision-Language Models

arXiv:2603.16958v1 Announce Type: cross Abstract: Vision-Language Models (VLMs) are increasingly applied to robotic perception and manipulation, yet their ability to infer physical properties required for manipulation remains limited. In particular, estimating the mass of real-world objects is essential for determining appropriate grasp force and ensuring safe interaction. However, current VLMs lack reliable mass reasoning capabilities, […]

March 19, 2026

MSRAMIE: Multimodal Structured Reasoning Agent for Multi-instruction Image Editing

arXiv:2603.16967v1 Announce Type: cross Abstract: Existing instruction-based image editing models perform well with simple, single-step instructions but degrade in realistic scenarios that involve multiple, lengthy, and interdependent directives. A main cause is the scarcity of training data with complex multi-instruction annotations. However, it is costly to collect such data and retrain these models. To address […]

March 19, 2026

The State of Generative AI in Software Development: Insights from Literature and a Developer Survey

arXiv:2603.16975v1 Announce Type: cross Abstract: Generative Artificial Intelligence (GenAI) rapidly transforms software engineering, yet existing research remains fragmented across individual tasks in the Software Development Lifecycle. This study integrates a systematic literature review with a survey of 65 software developers. The results show that GenAI exerts its highest impact in design, implementation, testing, and documentation, […]

March 19, 2026

Early Quantization Shrinks Codebook: A Simple Fix for Diversity-Preserving Tokenization

arXiv:2603.17052v1 Announce Type: cross Abstract: Vector quantization is a technique in machine learning that discretizes continuous representations into a set of discrete vectors. It is widely employed in tokenizing data representations for large language models, diffusion models, and other generative models. Despite its prevalence, the characteristics and behaviors of vector quantization in generative models remain […]

March 19, 2026

REAL: Regression-Aware Reinforcement Learning for LLM-as-a-Judge

arXiv:2603.17145v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed as automated evaluators that assign numeric scores to model outputs, a paradigm known as LLM-as-a-Judge. However, standard Reinforcement Learning (RL) methods typically rely on binary rewards (e.g., 0-1 accuracy), thereby ignoring the ordinal structure inherent in regression tasks; for instance, they fail to […]

March 19, 2026

Generalist Multimodal LLMs Gain Biometric Expertise via Human Salience

arXiv:2603.17173v1 Announce Type: cross Abstract: Iris presentation attack detection (PAD) is critical for secure biometric deployments, yet developing specialized models faces significant practical barriers: collecting data representing future unknown attacks is impossible, and collecting diverse-enough data, yet still limited in terms of its predictive power, is expensive. Additionally, sharing biometric data raises privacy concerns. Due […]

March 19, 2026

Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing

arXiv:2603.17199v1 Announce Type: cross Abstract: Large language models (LLMs) can produce chains of thought (CoT) that do not accurately reflect the actual factors driving their answers. In multiple-choice settings with an injected hint favoring a particular option, models may shift their final answer toward the hinted option and produce a CoT that rationalizes the response […]

March 19, 2026

Subscribe for Updates