Uncategorized – Page 39 – dijee Pharma Intelligence

Empirical Recipes for Efficient and Compact Vision-Language Models

arXiv:2603.16987v1 Announce Type: cross Abstract: Deploying vision-language models (VLMs) in resource-constrained settings demands low latency and high throughput, yet existing compact VLMs often fall short of the inference speedups their smaller parameter counts suggest. To explain this discrepancy, we conduct an empirical end-to-end efficiency analysis and systematically profile inference to identify the dominant bottlenecks. Based […]

March 19, 2026

Large Reasoning Models Struggle to Transfer Parametric Knowledge Across Scripts

arXiv:2603.17070v1 Announce Type: cross Abstract: In this work, we analyze shortcomings in cross-lingual knowledge transfer in large, modern reasoning LLMs. We demonstrate that the perceived gap in knowledge transfer is primarily a script barrier. First, we conduct an observational data analysis on the performance of thinking models on two datasets with local knowledge from around […]

March 19, 2026

Intent Formalization: A Grand Challenge for Reliable Coding in the Age of AI Agents

arXiv:2603.17150v1 Announce Type: cross Abstract: Agentic AI systems can now generate code with remarkable fluency, but a fundamental question remains: does the generated code actually do what the user intended? The gap between informal natural language requirements and precise program behavior (the intent gap) has always plagued software engineering, but AI-generated code amplifies […]

March 19, 2026

A scalable neural bundle map for multiphysics prediction in lithium-ion battery across varying configurations

arXiv:2603.17209v1 Announce Type: cross Abstract: Efficient and accurate prediction of multiphysics evolution across diverse cell geometries is fundamental to the design, management and safety of lithium-ion batteries. However, existing computational frameworks struggle to capture the coupled electrochemical, thermal, and mechanical dynamics across diverse cell geometries and varying operating conditions. Here, we present a Neural Bundle […]

March 19, 2026

Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients

arXiv:2603.17234v1 Announce Type: cross Abstract: Surgical co-management (SCM) is an evidence-based model in which hospitalists jointly manage medically complex perioperative patients alongside surgical teams. Despite its clinical and financial value, SCM is limited by the need to manually identify eligible patients. To determine whether SCM triage can be automated, we conducted a prospective, unblinded study […]

March 19, 2026

Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding

arXiv:2603.17307v1 Announce Type: cross Abstract: Despite rapid developments and widespread applications of MLLM agents, they still struggle with long-form video understanding (LVU) tasks, which are characterized by high information density and extended temporal spans. Recent research on LVU agents demonstrates that simple task decomposition and collaboration mechanisms are insufficient for long-chain reasoning tasks. Moreover, directly […]

March 19, 2026

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement

arXiv:2603.17441v1 Announce Type: cross Abstract: GUI grounding is a critical capability for vision-language models (VLMs) that enables automated interaction with graphical user interfaces by locating target elements from natural language instructions. However, grounding on GUI screenshots remains challenging due to high-resolution images, small UI elements, and ambiguous user instructions. In this work, we propose AdaZoom-GUI, […]

March 19, 2026

KineVLA: Towards Kinematics-Aware Vision-Language-Action Models with Bi-Level Action Decomposition

arXiv:2603.17524v1 Announce Type: cross Abstract: In this paper, we introduce a novel kinematics-rich vision-language-action (VLA) task, in which language commands densely encode diverse kinematic attributes (such as direction, trajectory, orientation, and relative displacement) from initiation through completion, at key moments, unlike existing action instructions that capture kinematics only coarsely or partially, thereby supporting fine-grained and […]

March 19, 2026

Joint Optimization of Storage and Loading for High-Performance 3D Point Cloud Data Processing

arXiv:2603.16945v1 Announce Type: cross Abstract: With the rapid development of computer vision and deep learning, significant advancements have been made in 3D vision, particularly in autonomous driving, robotic perception, and augmented reality. 3D point cloud data, as a crucial representation of 3D information, has gained widespread attention. However, the vast scale and complexity of […]

March 19, 2026

PhysQuantAgent: An Inference Pipeline of Mass Estimation for Vision-Language Models

arXiv:2603.16958v1 Announce Type: cross Abstract: Vision-Language Models (VLMs) are increasingly applied to robotic perception and manipulation, yet their ability to infer physical properties required for manipulation remains limited. In particular, estimating the mass of real-world objects is essential for determining appropriate grasp force and ensuring safe interaction. However, current VLMs lack reliable mass reasoning capabilities, […]

March 19, 2026

MSRAMIE: Multimodal Structured Reasoning Agent for Multi-instruction Image Editing

arXiv:2603.16967v1 Announce Type: cross Abstract: Existing instruction-based image editing models perform well with simple, single-step instructions but degrade in realistic scenarios that involve multiple, lengthy, and interdependent directives. A main cause is the scarcity of training data with complex multi-instruction annotations. However, it is costly to collect such data and retrain these models. To address […]

March 19, 2026

The State of Generative AI in Software Development: Insights from Literature and a Developer Survey

arXiv:2603.16975v1 Announce Type: cross Abstract: Generative Artificial Intelligence (GenAI) rapidly transforms software engineering, yet existing research remains fragmented across individual tasks in the Software Development Lifecycle. This study integrates a systematic literature review with a survey of 65 software developers. The results show that GenAI exerts its highest impact in design, implementation, testing, and documentation, […]

March 19, 2026
