SepsisSuite: Beyond Risk Stratification — A Comparative Analysis of Deep Fusion vs. Expert Stacking for Prescriptive Sepsis AI

arXiv:2512.14712v1 Announce Type: cross Abstract: Sepsis accounts for nearly 20% of global ICU admissions, yet conventional prediction models often fail to effectively integrate heterogeneous data streams, remaining either siloed by modality or reliant on brittle early fusion. In this work, we present a rigorous architectural comparison between End-to-End Deep Fusion and Context-Aware Stacking for sepsis […]
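
A minimal PyTorch sketch of the two fusion patterns the abstract contrasts; the modalities, module names, and sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DeepFusion(nn.Module):
    """End-to-end deep fusion: modality encoders and head trained jointly."""
    def __init__(self, vitals_dim=32, labs_dim=64, hidden=128):
        super().__init__()
        self.vitals_enc = nn.Sequential(nn.Linear(vitals_dim, hidden), nn.ReLU())
        self.labs_enc = nn.Sequential(nn.Linear(labs_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)  # sepsis risk logit

    def forward(self, vitals, labs):
        fused = torch.cat([self.vitals_enc(vitals), self.labs_enc(labs)], dim=-1)
        return self.head(fused)

class ExpertStacking(nn.Module):
    """Stacking: frozen per-modality experts, only the meta-learner trains."""
    def __init__(self, vitals_expert, labs_expert):
        super().__init__()
        self.experts = nn.ModuleList([vitals_expert, labs_expert])
        for p in self.experts.parameters():
            p.requires_grad = False           # experts stay fixed
        self.meta = nn.Linear(2, 1)           # combines the two expert logits

    def forward(self, vitals, labs):
        logits = torch.cat([self.experts[0](vitals), self.experts[1](labs)], dim=-1)
        return self.meta(logits)
```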

WaveGNN: Integrating Graph Neural Networks and Transformers for Decay-Aware Classification of Irregular Clinical Time-Series

arXiv:2412.10621v2 Announce Type: replace-cross Abstract: Clinical time series are often irregularly sampled, with varying sensor frequencies, missing observations, and misaligned timestamps. Prior approaches typically address these irregularities by interpolating data into regular sequences, thereby introducing bias, or by generating inconsistent and uninterpretable relationships across sensor measurements, complicating the accurate learning of both intra-series and inter-series […]
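
A decay-based alternative to the interpolation the abstract criticizes, in the spirit of GRU-D: trust in the last observation decays with elapsed time. The decay form and parameter names are assumptions for illustration, not WaveGNN's mechanism.

```python
import numpy as np

def decay_impute(values, mask, times, tau=1.0, empirical_mean=0.0):
    """Replace missing points with a decayed mix of last observation and mean."""
    out = np.empty_like(values, dtype=float)
    last_val, last_t = empirical_mean, times[0]
    for i, (v, m, t) in enumerate(zip(values, mask, times)):
        if m:                                    # observed: take it, reset decay
            out[i], last_val, last_t = v, v, t
        else:                                    # missing: decay toward the mean
            gamma = np.exp(-(t - last_t) / tau)  # weight shrinks with elapsed time
            out[i] = gamma * last_val + (1 - gamma) * empirical_mean
    return out

print(decay_impute(np.array([1.0, 0.0, 0.0, 2.0]),
                   mask=[1, 0, 0, 1], times=np.array([0.0, 0.5, 3.0, 3.5])))
```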

How a Bit Becomes a Story: Semantic Steering via Differentiable Fault Injection

arXiv:2512.14715v1 Announce Type: cross Abstract: Hard-to-detect hardware bit flips, from either malicious circuitry or bugs, have already been shown to make transformers vulnerable in non-generative tasks. This work, for the first time, investigates how low-level, bitwise perturbations (fault injection) to the weights of a large language model (LLM) used for image captioning can influence the […]
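
The kind of perturbation studied is easy to picture: reinterpret a float32 weight's bits and flip one. A NumPy sketch, where the tensor, index, and bit position are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 4)).astype(np.float32)

flat = weights.view(np.uint32).reshape(-1)   # reinterpret the bits, no copy
idx, bit = 5, 30                             # target weight and bit position
flat[idx] ^= np.uint32(1) << np.uint32(bit)  # flip one bit in place

print(weights.reshape(-1)[idx])  # bit 30 sits in the exponent: drastic change
```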

LLMs and Fuzzing in Tandem: A New Approach to Automatically Generating Weakest Preconditions

arXiv:2507.05272v2 Announce Type: replace-cross Abstract: The weakest precondition (WP) of a program describes the largest set of initial states from which all terminating executions of the program satisfy a given postcondition. The generation of WPs is an important task with practical applications in areas ranging from verification to run-time error checking. This paper proposes the […]
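
For context, the classic syntactic rules any WP generator must respect are wp(x := e, Q) = Q[e/x] for assignment and wp(S1; S2, Q) = wp(S1, wp(S2, Q)) for sequencing. A toy sympy illustration over a list-of-assignments mini-language (the mini-language is an assumption, not the paper's setting):

```python
import sympy as sp

x, y = sp.symbols("x y")

def wp(program, post):
    """program: list of (var, expr) assignments, executed left to right."""
    for var, expr in reversed(program):      # sequence rule: work backwards
        post = post.subs(var, expr)          # assignment rule: substitute e for x
    return sp.simplify(post)

# wp(x := x + 1; y := 2*x,  y > 4)  simplifies to  x > 1
print(wp([(x, x + 1), (y, 2 * x)], sp.Gt(y, 4)))
```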

Hybrid Attribution Priors for Explainable and Robust Model Training

arXiv:2512.14719v1 Announce Type: cross Abstract: Small language models (SLMs) are widely used in tasks that require low latency and lightweight deployment, particularly classification. As interpretability and robustness gain increasing importance, explanation-guided learning has emerged as an effective framework by introducing attribution-based supervision during training; however, deriving general and reliable attribution priors remains a significant challenge. […]
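
The general shape of attribution-based supervision: augment the task loss with a penalty comparing an attribution map against a prior. A PyTorch sketch using input gradients and a relevance mask; both the attribution choice and the penalty form are assumptions here, and the paper's hybrid priors differ.

```python
import torch
import torch.nn.functional as F

def attribution_loss(model, x, y, prior_mask, lam=0.1):
    """prior_mask: 1 where the prior deems a feature relevant, 0 elsewhere."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    task = F.cross_entropy(logits, y)
    # Input-gradient attribution of the task loss w.r.t. the inputs.
    grads, = torch.autograd.grad(task, x, create_graph=True)
    attrib = grads.abs()
    # Penalize attribution mass on features the prior marks irrelevant.
    penalty = (attrib * (1 - prior_mask)).sum() / attrib.numel()
    return task + lam * penalty
```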

Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers

arXiv:2512.15674v1 Announce Type: cross Abstract: Large language model (LLM) activations are notoriously difficult to understand, with most existing techniques using complex, specialized methods for interpreting them. Recent work has proposed a simpler approach known as LatentQA: training LLMs to directly accept LLM activations as inputs and answer arbitrary questions about them in natural language. However, […]
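
One plausible reading of "accept LLM activations as inputs": project captured hidden states into the explainer LLM's embedding space and prepend them as soft tokens before the question. All names, shapes, and the linear projector below are assumptions for illustration, not the paper's method.

```python
import torch
import torch.nn as nn

class ActivationPrefix(nn.Module):
    """Map a subject model's activation to soft tokens for the explainer LLM."""
    def __init__(self, act_dim=4096, embed_dim=2048, n_tokens=8):
        super().__init__()
        self.proj = nn.Linear(act_dim, n_tokens * embed_dim)
        self.n_tokens, self.embed_dim = n_tokens, embed_dim

    def forward(self, activation, question_embeds):
        # activation: (batch, act_dim) hidden state captured from the subject model
        prefix = self.proj(activation).view(-1, self.n_tokens, self.embed_dim)
        # The explainer LLM then consumes [soft prefix; question tokens].
        return torch.cat([prefix, question_embeds], dim=1)
```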

HATSolver: Learning Groebner Bases with Hierarchical Attention Transformers

arXiv:2512.14722v1 Announce Type: cross Abstract: At NeurIPS 2024, Kera et al. introduced the use of transformers for computing Groebner bases, a central object in computer algebra with numerous practical applications. In this paper, we improve this approach by applying Hierarchical Attention Transformers (HATs) to solve systems of multivariate polynomial equations via Groebner bases computation. The […]
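
Ground truth for such a model is computable in closed form; sympy's groebner gives a quick way to generate (system, basis) pairs, with the lex ordering here an arbitrary choice:

```python
from sympy import groebner, symbols

x, y = symbols("x y")
system = [x**2 + y**2 - 1, x - y]
gb = groebner(system, x, y, order="lex")
print(list(gb.exprs))   # e.g. [x - y, 2*y**2 - 1]
```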

RPM-MCTS: Knowledge-Retrieval as Process Reward Model with Monte Carlo Tree Search for Code Generation

arXiv:2511.19895v2 Announce Type: replace Abstract: Tree search-based methods have made significant progress in enhancing the code generation capabilities of large language models. However, because intermediate algorithmic steps are difficult to evaluate effectively and erroneous steps are hard to locate and correct promptly, these methods often generate incorrect code and incur increased computational costs. To […]
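
At the core of any such method is the UCT rule that trades off a step's mean reward against exploration; a minimal sketch (how RPM-MCTS derives step rewards from retrieved knowledge is not shown here):

```python
import math

def uct_select(children, c=1.4):
    """children: nodes exposing .visits and .value (total accumulated reward)."""
    total = sum(ch.visits for ch in children)
    def score(ch):
        if ch.visits == 0:
            return float("inf")              # explore unvisited steps first
        exploit = ch.value / ch.visits       # mean process reward of the step
        explore = c * math.sqrt(math.log(total) / ch.visits)
        return exploit + explore
    return max(children, key=score)
```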

Quantum Decision Transformers (QDT): Synergistic Entanglement and Interference for Offline Reinforcement Learning

arXiv:2512.14726v1 Announce Type: cross Abstract: Offline reinforcement learning enables policy learning from pre-collected datasets without environment interaction, but existing Decision Transformer (DT) architectures struggle with long-horizon credit assignment and complex state-action dependencies. We introduce the Quantum Decision Transformer (QDT), a novel architecture incorporating quantum-inspired computational mechanisms to address these challenges. Our approach integrates two core […]
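
Whatever the quantum-inspired layers add, the input side is the standard Decision Transformer recipe: condition each timestep on its return-to-go. A small NumPy helper (illustrative, not the paper's code):

```python
import numpy as np

def returns_to_go(rewards):
    """R_t = sum of rewards from step t to the end of the trajectory."""
    return np.cumsum(rewards[::-1])[::-1]

rewards = np.array([1.0, 0.0, 2.0, 1.0])
print(returns_to_go(rewards))   # [4. 3. 3. 1.]
```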

LATTE: Learning Aligned Transactions and Textual Embeddings for Bank Clients

arXiv:2508.10021v4 Announce Type: replace-cross Abstract: Learning client embeddings from sequences of their historical communications is central to financial applications. While large language models (LLMs) offer general world knowledge, their direct use on long event sequences is computationally expensive and impractical in real-world pipelines. In this paper, we propose LATTE, a contrastive learning framework that aligns […]
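
The alignment step suggests a CLIP-style symmetric contrastive objective between transaction-encoder outputs and text embeddings; a sketch under that assumption, with temperature and symmetry as common defaults rather than values taken from the paper:

```python
import torch
import torch.nn.functional as F

def contrastive_align(txn_emb, text_emb, temperature=0.07):
    """txn_emb, text_emb: (batch, dim), row i of each describes the same client."""
    txn = F.normalize(txn_emb, dim=-1)
    txt = F.normalize(text_emb, dim=-1)
    logits = txn @ txt.t() / temperature          # (batch, batch) similarities
    labels = torch.arange(len(txn), device=txn.device)
    # Matched (transactions, description) pairs sit on the diagonal.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```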

