Open Biomedical Knowledge Graphs at Scale: Construction, Federation, and AI Agent Access with Samyama Graph Database

arXiv:2603.15080v2 Announce Type: replace-cross Abstract: Biomedical knowledge is fragmented across siloed databases — Reactome for pathways, STRING for protein interactions, ClinicalTrials.gov for study registries, DrugBank for drug vocabularies, DGIdb for drug-gene interactions, SIDER for side effects. We present three open-source biomedical knowledge graphs — Pathways KG (118,686 nodes, 834,785 edges from 5 sources), Clinical Trials […]

Building a Correct-by-Design Lakehouse. Data Contracts, Versioning, and Transactional Pipelines for Humans and Agents

arXiv:2602.02335v3 Announce Type: replace-cross Abstract: Lakehouses are now the default substrate for analytics and AI, but they remain fragile under concurrent, untrusted change: schema mismatches often surface only at runtime, development and production easily diverge, and multi-table pipelines can expose partial results after failure. We present Bauplan, a code-first lakehouse that aims to eliminate a […]

LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations

arXiv:2602.09924v2 Announce Type: replace-cross Abstract: Running LLMs with extended reasoning on every problem is expensive, but determining which inputs actually require additional compute remains challenging. We investigate whether their own likelihood of success is recoverable from their internal representations before generation, and if this signal can guide more efficient inference. We train linear probes on […]

From Vulnerabilities to Remediation: A Systematic Literature Review of LLMs in Code Security

arXiv:2412.15004v4 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have emerged as powerful tools for automating programming tasks, including security-related ones. However, they can also introduce vulnerabilities during code generation, fail to detect existing vulnerabilities, or report nonexistent ones. This systematic literature review investigates the security benefits and drawbacks of using LLMs for code-related tasks. […]

Traj2Action: A Co-Denoising Framework for Trajectory-Guided Human-to-Robot Skill Transfer

arXiv:2510.00491v2 Announce Type: replace-cross Abstract: Learning diverse manipulation skills for real-world robots is severely bottlenecked by the reliance on costly and hard-to-scale teleoperated demonstrations. While human videos offer a scalable alternative, effectively transferring manipulation knowledge is fundamentally hindered by the significant morphological gap between human and robotic embodiments. To address this challenge and facilitate skill […]

Conservative Continuous-Time Treatment Optimization

arXiv:2603.16789v1 Announce Type: cross Abstract: We develop a conservative continuous-time stochastic control framework for treatment optimization from irregularly sampled patient trajectories. The unknown patient dynamics are modeled as a controlled stochastic differential equation with treatment as a continuous-time control. Naive model-based optimization can exploit model errors and propose out-of-support controls, so optimizing the estimated dynamics […]

SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

arXiv:2511.21471v3 Announce Type: replace Abstract: Spatial cognition is fundamental to real-world multimodal intelligence, allowing models to effectively interact with the physical environment. While multimodal large language models (MLLMs) have made significant strides, existing benchmarks often oversimplify spatial cognition, reducing it to a single-dimensional metric, which fails to capture the hierarchical structure and interdependence of spatial […]

FederatedFactory: Generative One-Shot Learning for Extremely Non-IID Distributed Scenarios

arXiv:2603.16370v1 Announce Type: cross Abstract: Federated Learning (FL) enables distributed optimization without compromising data sovereignty. Yet, where local label distributions are mutually exclusive, standard weight aggregation fails due to conflicting optimization trajectories. Often, FL methods rely on pretrained foundation models, introducing unrealistic assumptions. We introduce FederatedFactory, a zero-dependency framework that inverts the unit of federation […]

To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

arXiv:2603.15159v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have shown strong potential for code generation, yet they remain limited in private-library-oriented code generation, where the goal is to generate code using APIs from private libraries. Existing approaches mainly rely on retrieving private-library API documentation and injecting relevant knowledge into the context at inference time. […]

Age Predictors Through the Lens of Generalization, Bias Mitigation, and Interpretability: Reflections on Causal Implications

arXiv:2603.16377v1 Announce Type: cross Abstract: Chronological age predictors often fail to achieve out-of-distribution (OOD) generalization due to exogenous attributes such as race, gender, or tissue. Learning an invariant representation with respect to those attributes is therefore essential to improve OOD generalization and prevent overly optimistic results. In predictive settings, these attributes motivate bias […]

Bayesian Inference in Epidemic Modelling: A Beginner’s Guide

arXiv:2603.15175v2 Announce Type: replace-cross Abstract: This lecture note provides a self-contained introduction to Bayesian inference and Markov Chain Monte Carlo (MCMC) methods for parameter estimation in epidemic models. Using the classical Susceptible-Infectious-Recovered (SIR) compartmental model as a running example, we derive the likelihood function from first principles, specify priors on the transmission and recovery parameters, […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number 16808844.