From Knowledge to Action: Outcomes of the 2025 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry

arXiv:2605.03205v1 Announce Type: cross Abstract: Large language models (LLMs) are rapidly changing how researchers in materials science and chemistry discover, organize, and act on scientific knowledge. This paper analyzes a broad set of community-developed LLM applications in an effort to identify emerging patterns in how these systems can be used across the scientific research lifecycle. […]

S^2tory: Story Spine Distillation for Movie Script Summarization

arXiv:2605.03244v1 Announce Type: cross Abstract: Movie scripts pose a fundamental challenge for automatic summarization due to their non-linear, cross-cut narrative structure, which makes surface-level saliency methods ineffective at preserving core story progression. To address this, we introduce S^2tory (Story Spine Distillation), a narratology-grounded framework that leverages character development trajectories to identify plot nuclei, the essential […]

SHIELD: A Diverse Clinical Note Dataset and Distilled Small Language Models for Enterprise-Scale De-identification

arXiv:2605.03301v1 Announce Type: cross Abstract: De-identification of clinical text remains essential for secondary use of electronic health records (EHRs), yet public benchmarks such as i2b2 2006/2014 are over a decade old and lack the semantic and demographic diversity of modern narratives. While Large Language Models (LLMs) achieve state-of-the-art zero-shot extraction, enterprise deployment is hindered by […]

Smart Passive Acoustic Monitoring: Embedding a Classifier on AudioMoth Microcontroller

arXiv:2605.03412v1 Announce Type: cross Abstract: Passive Acoustic Monitoring (PAM) is an efficient and non-invasive method for surveying ecosystems at a reduced cost. Typically, autonomous recorders allow the acquisition of vast bioacoustic datasets which are then analyzed. However, power consumption and data storage are both scarce and limit the duration of acquisition campaigns. To address this […]

CuraView: A Multi-Agent Framework for Medical Hallucination Detection with GraphRAG-Enhanced Knowledge Verification

arXiv:2605.03476v1 Announce Type: cross Abstract: Discharge summaries require extracting critical information from lengthy electronic health records (EHRs), a process that is labor-intensive when performed manually. Large language models (LLMs) can improve generation efficiency; however, they are prone to producing faithfulness hallucinations, statements that contradict source records, posing direct risks to patient safety. To address this, […]

Predicting Euler Characteristics and Constructing Topological Structure Using Machine Learning Techniques

arXiv:2605.02947v1 Announce Type: cross Abstract: This study proposes a novel approach to extract topological properties, specifically the Euler characteristic, from input images using neural networks without relying on large pre-existing datasets but with a single geometric image. Inspired by solid-state physics, where topological properties of magnetic structures are derived from spin field analysis, our model […]

Inferring Phylogenetic Networks from Allowed and Forbidden LCA-Constraints

arXiv:2605.03827v1 Announce Type: cross Abstract: Phylogenetic networks provide a framework for representing evolutionary histories involving reticulate events such as hybridization or horizontal gene transfer. A central problem is to infer such networks from local structural information. In this paper, we study network inference from least common ancestor (LCA) constraints, which specify relative ancestral relationships between […]

Decompose to Understand, Fuse to Detect: Frequency-Decoupled Anomaly Detection for Encrypted Network Traffic

arXiv:2605.02970v1 Announce Type: cross Abstract: Network traffic anomaly detection represents a critical cybersecurity task, yet widespread encryption makes this task increasingly challenging. In response, image-based methods that model traffic as visual patterns have emerged as the dominant approach. However, this work pioneers the identification of a pervasive “full-frequency” characteristic and an associated limitation termed “spectral […]

NeuralSet: A High-Performing Python Package for Neuro-AI

arXiv:2605.03169v1 Announce Type: new Abstract: Artificial intelligence (AI) is increasingly central to understanding how the brain processes information. However, the integration of neuroscience and modern AI is bottlenecked by a fragmented software ecosystem. Current tools are siloed by recording modality and optimized for small-scale, in-memory workflows, limiting the use of massive, naturalistic datasets. Here, we […]

Mixed-Precision Information Bottlenecks for On-Device Trait-State Disentanglement in Bipolar Agitation Detection

arXiv:2605.03039v1 Announce Type: cross Abstract: Continuous monitoring of bipolar disorder agitation via voice biomarkers requires disentangling stable speaker traits from volatile affective states on resource-constrained edge devices. We introduce MP-IB, the first framework to treat mixed-precision quantization as an information bottleneck for clinical trait-state separation. The core insight is that numerical precision itself controls capacity: […]

Label-Efficient School Detection from Aerial Imagery via Weakly Supervised Pretraining and Fine-Tuning

arXiv:2605.03968v1 Announce Type: cross Abstract: Accurate school detection is essential for supporting education initiatives, including infrastructure planning and expanding internet connectivity to underserved areas. However, many regions around the world face challenges due to outdated, incomplete, or unavailable official records. Manual mapping efforts, while valuable, are labor-intensive and lack scalability across large geographic areas. To […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844