Text-as-Signal: Quantitative Semantic Scoring with Embeddings, Logprobs, and Noise Reduction

arXiv:2604.13056v1 Announce Type: cross Abstract: This paper presents a practical pipeline for turning text corpora into quantitative semantic signals. Each news item is represented as a full-document embedding, scored through logprob-based evaluation over a configurable positional dictionary, and projected onto a noise-reduced low-dimensional manifold for structural interpretation. In the present case study, the dictionary is […]
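The abstract's pipeline of full-document embedding, dictionary-based scoring, and noise-reduced low-dimensional projection can be illustrated with a toy sketch. Everything below is illustrative, not the paper's method: the term-count "embedding", the scoring weights, and the PCA-style projection via SVD are all stand-ins for the paper's configurable components.

```python
import numpy as np

# Hypothetical vocabulary and scoring dictionary (the paper's positional
# dictionary is configurable; these terms and weights are made up).
vocab = ["growth", "decline", "risk", "profit", "loss"]
weights = {"growth": 1.0, "profit": 1.0, "decline": -1.0, "risk": -0.5, "loss": -1.0}

def embed(doc):
    """Full-document embedding as a normalized term-count vector."""
    tokens = doc.lower().split()
    v = np.array([tokens.count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

def score(doc):
    """Dictionary-based semantic score: weighted sum over matched terms."""
    return sum(weights[t] for t in doc.lower().split() if t in weights)

docs = ["profit growth continues", "loss and decline deepen", "risk of loss rises"]
X = np.stack([embed(d) for d in docs])

# Noise-reduced low-dimensional view: centre the embeddings and keep only
# the top-2 principal axes, discarding the remaining variance as noise.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
proj = Xc @ Vt[:2].T          # (n_docs, 2) structural projection
scores = [score(d) for d in docs]
```

A real deployment would swap the term-count vectors for learned embeddings and the hand-set weights for logprob-based evaluation, but the signal-extraction shape (embed, score, project) is the same.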

Trust and Reliance on AI in Education: AI Literacy and Need for Cognition as Moderators

arXiv:2604.01114v2 Announce Type: replace-cross Abstract: As generative AI systems are integrated into educational settings, students often encounter AI-generated output while working through learning tasks, either by requesting help or through integrated tools. Trust in AI can influence how students interpret and use that output, including whether they evaluate it critically or exhibit overreliance. We investigate […]

Bi-Predictability: A Real-Time Signal for Monitoring LLM Interaction Integrity

arXiv:2604.13061v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in high-stakes autonomous and interactive workflows, where reliability demands continuous, multi-turn coherence. However, current evaluation methods either rely on post-hoc semantic judges, measure unidirectional token confidence (e.g., perplexity), or require compute-intensive repeated sampling (e.g., semantic entropy). Because these techniques focus exclusively on the […]
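The unidirectional token-confidence baseline the abstract contrasts against, perplexity, is computed directly from per-token log-probabilities. A minimal sketch (the function and example values are illustrative, not from the paper):

```python
import math

def perplexity(logprobs):
    """Perplexity: exp of the mean negative log-probability the model
    assigned to each generated token (natural-log inputs assumed).
    Lower values mean higher unidirectional token confidence."""
    return math.exp(-sum(logprobs) / len(logprobs))

# A confident continuation (high per-token probabilities) ...
confident = [math.log(0.9), math.log(0.8), math.log(0.85)]
# ... versus an uncertain one.
uncertain = [math.log(0.3), math.log(0.2), math.log(0.25)]
```

Because this only looks at the model's own forward confidence, it cannot by itself detect a coherent-sounding drift in a multi-turn interaction, which is the gap a bidirectional signal aims to fill.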

Soft $Q(\lambda)$: A multi-step off-policy method for entropy regularised reinforcement learning using eligibility traces

arXiv:2604.13780v1 Announce Type: cross Abstract: Soft Q-learning has emerged as a versatile model-free method for entropy-regularised reinforcement learning, optimising for returns augmented with a penalty on the divergence from a reference policy. Despite its success, the multi-step extensions of soft Q-learning remain relatively unexplored and limited to on-policy action sampling under the Boltzmann policy. In […]
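The entropy-regularised objective behind soft Q-learning replaces the hard max over actions with a log-sum-exp weighted by the reference policy. The sketch below shows this textbook soft Bellman value and its Boltzmann policy; it is generic background, not the paper's multi-step estimator, and the example Q-values are arbitrary.

```python
import math

def soft_value(q_values, ref_policy, tau=1.0):
    """Soft state value: tau * log sum_a pi_ref(a) * exp(Q(a)/tau).
    As tau -> 0 this approaches max_a Q(a); larger tau penalises
    divergence from the reference policy more strongly."""
    return tau * math.log(sum(p * math.exp(q / tau)
                              for q, p in zip(q_values, ref_policy)))

def boltzmann_policy(q_values, ref_policy, tau=1.0):
    """Optimal policy under the soft objective: the reference policy
    reweighted by exp((Q - V)/tau); sums to 1 by construction."""
    v = soft_value(q_values, ref_policy, tau)
    return [p * math.exp((q - v) / tau) for q, p in zip(q_values, ref_policy)]
```

On-policy multi-step extensions sample actions from this Boltzmann policy; the off-policy question the abstract raises is how to correct multi-step returns when behaviour actions come from elsewhere.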

Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data

arXiv:2604.13066v1 Announce Type: cross Abstract: In-context learning has established itself as an important learning paradigm for Large Language Models (LLMs). In this paper, we demonstrate that LLMs can learn encoding keys in-context and perform analysis directly on encoded representations. This finding enables lossless prompt compression via dictionary encoding without model fine-tuning: frequently occurring subsequences are […]
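The core idea of dictionary-encoding compression can be sketched greedily: repeatedly replace the most frequent long substring with a short key, then prepend the key table so the model can decode in-context. This is a toy construction under assumed parameters (`min_len`, `min_count`, `<K0>`-style keys assumed absent from the prompt); the paper's actual encoding may differ.

```python
def compress(prompt, min_len=8, min_count=3):
    """Greedily build a substitution dictionary: while some substring of
    at least min_len chars occurs at least min_count times, replace it
    with a short key. Returns (encoded_text, dictionary)."""
    dictionary = {}
    text = prompt
    while True:
        counts = {}
        for i in range(len(text) - min_len + 1):
            sub = text[i:i + min_len]
            counts[sub] = counts.get(sub, 0) + 1
        if not counts:
            break
        sub, n = max(counts.items(), key=lambda kv: kv[1])
        if n < min_count:
            break
        key = f"<K{len(dictionary)}>"   # assumed not to occur in the prompt
        dictionary[key] = sub
        text = text.replace(sub, key)
    return text, dictionary

def decompress(text, dictionary):
    """Lossless inverse (what the LLM performs implicitly in-context).
    Later keys may contain earlier ones, so expand in reverse order."""
    for key, sub in reversed(list(dictionary.items())):
        text = text.replace(key, sub)
    return text
```

In use, the dictionary entries would be serialised into a short header ahead of the encoded text, letting the model learn the keys in-context and analyse the encoded representation directly.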

Hydra: Unifying Document Retrieval and Generation in a Single Vision-Language Model

arXiv:2603.28554v2 Announce Type: replace-cross Abstract: Visual document understanding typically requires separate retrieval and generation models, doubling memory and system complexity. We present Hydra, a dual-head approach that provides both ColBERT-style late-interaction retrieval and autoregressive generation from a single vision-language model (VLM). A single LoRA adapter, trained only for retrieval, is toggled at inference: enabling it […]

EVE: A Domain-Specific LLM Framework for Earth Intelligence

arXiv:2604.13071v1 Announce Type: cross Abstract: We introduce Earth Virtual Expert (EVE), the first open-source, end-to-end initiative for developing and deploying domain-specialized LLMs for Earth Intelligence. At its core is EVE-Instruct, a domain-adapted 24B model built on Mistral Small 3.2 and optimized for reasoning and question answering. On newly constructed Earth Observation and Earth Sciences benchmarks, […]

From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models

arXiv:2604.13777v1 Announce Type: cross Abstract: Large language models (LLMs) may memorize sensitive or copyrighted content, raising significant privacy and legal concerns. While machine unlearning has emerged as a potential remedy, prevailing paradigms rely on user-provided forget sets, making unlearning requests difficult to audit and exposing systems to secondary leakage and malicious abuse. We propose MAGE, […]

OmniTrace: A Unified Framework for Generation-Time Attribution in Omni-Modal LLMs

arXiv:2604.13073v1 Announce Type: cross Abstract: Modern multimodal large language models (MLLMs) generate fluent responses from interleaved text, image, audio, and video inputs. However, identifying which input sources support each generated statement remains an open challenge. Existing attribution methods are primarily designed for classification settings, fixed prediction targets, or single-modality architectures, and do not naturally extend […]

ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding

arXiv:2603.27064v2 Announce Type: replace-cross Abstract: Understanding charts requires models to jointly reason over geometric visual patterns, structured numerical data, and natural language — a capability where current vision-language models (VLMs) remain limited. We introduce ChartNet, a high-quality, million-scale multimodal dataset designed to advance chart interpretation and reasoning. ChartNet leverages a novel code-guided synthesis pipeline to […]

Document-tuning for robust alignment to animals

arXiv:2604.13076v1 Announce Type: cross Abstract: We investigate the robustness of value alignment via finetuning with synthetic documents, using animal compassion as a value that is both important in its own right and orthogonal to existing alignment efforts. To evaluate compassionate reasoning, we develop and publicly release the Animal Harm Benchmark (AHB), a 26-question evaluation spanning […]

A Dynamic-Growing Fuzzy-Neuro Controller, Application to a 3PSP Parallel Robot

arXiv:2604.13763v1 Announce Type: cross Abstract: To date, various paradigms of soft computing have been used to solve many modern problems. Among them, a self-organizing combination of fuzzy systems and neural networks can form a powerful decision-making system. Here, a Dynamic Growing Fuzzy Neural Controller (DGFNC) is combined with an adaptive strategy and applied to […]


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.