Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity

arXiv:2510.27378v1 Announce Type: cross Abstract: Chain-of-thought (CoT) outputs let us read a model’s step-by-step reasoning. Since any long, serial reasoning process must pass through this textual trace, the quality of the CoT is a direct window into what the model is thinking. This visibility could help us spot unsafe or misaligned behavior (monitorability), but only […]

GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation

arXiv:2510.27210v1 Announce Type: new Abstract: While Multimodal Large Language Models (MLLMs) have advanced GUI navigation agents, current approaches face limitations in cross-domain generalization and effective history utilization. We present a reasoning-enhanced framework that systematically integrates structured reasoning, action prediction, and history summarization. The structured reasoning component generates coherent Chain-of-Thought analyses combining progress estimation and decision […]

Atlas-Alignment: Making Interpretability Transferable Across Language Models

arXiv:2510.27413v1 Announce Type: cross Abstract: Interpretability is crucial for building safe, reliable, and controllable language models, yet existing interpretability pipelines remain costly and difficult to scale. Interpreting a new model typically requires costly training of model-specific sparse autoencoders, manual or semi-automated labeling of SAE components, and their subsequent validation. We introduce Atlas-Alignment, a framework for […]

TetraJet-v2: Accurate NVFP4 Training for Large Language Models with Oscillation Suppression and Outlier Control

arXiv:2510.27527v1 Announce Type: cross Abstract: Training Large Language Models (LLMs) is prohibitively expensive, driving interest in low-precision fully-quantized training (FQT). While novel 4-bit formats like NVFP4 offer substantial efficiency gains, achieving near-lossless training at such low precision remains challenging. We introduce TetraJet-v2, an end-to-end 4-bit FQT method that leverages NVFP4 for activations, weights, and gradients […]

Protein-protein interaction networks can be highly sensitive to the membrane phase transition

arXiv:2510.26949v1 Announce Type: cross Abstract: Many protein-protein interaction (PPI) networks take place in the fluid yet structured plasma membrane. Lipid domains, sometimes termed rafts, have been implicated in the functioning of various membrane-bound signaling processes. Here, we present a model and a Monte Carlo simulation framework to investigate how changes in the domain size that […]

Reinforcement Learning for Long-Horizon Unordered Tasks: From Boolean to Coupled Reward Machines

arXiv:2510.27329v1 Announce Type: new Abstract: Reward machines (RMs) inform reinforcement learning agents about the reward structure of the environment. This is particularly advantageous for complex non-Markovian tasks because agents with access to RMs can learn more efficiently from fewer samples. However, learning with RMs is ill-suited for long-horizon problems in which a set of subtasks […]

Can machines think efficiently?

arXiv:2510.26954v1 Announce Type: cross Abstract: The Turing Test is no longer adequate for distinguishing human and machine intelligence. With advanced artificial intelligence systems already passing the original Turing Test and contributing to serious ethical and environmental concerns, we urgently need to update the test. This work expands upon the original imitation game by accounting for […]

Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models

arXiv:2510.27629v1 Announce Type: cross Abstract: Open-weight bio-foundation models present a dual-use dilemma. While holding great promise for accelerating scientific research and drug development, they could also enable bad actors to develop more deadly bioweapons. To mitigate the risk posed by these models, current approaches focus on filtering biohazardous data during pre-training. However, the effectiveness of […]

Frame Semantic Patterns for Identifying Underreporting of Notifiable Events in Healthcare: The Case of Gender-Based Violence

arXiv:2510.26969v1 Announce Type: cross Abstract: We introduce a methodology for the identification of notifiable events in the domain of healthcare. The methodology harnesses semantic frames to define fine-grained patterns and search them in unstructured data, namely, open-text fields in e-medical records. We apply the methodology to the problem of underreporting of gender-based violence (GBV) in […]

Discriminative Rule Learning for Outcome-Guided Process Model Discovery

arXiv:2510.27343v1 Announce Type: new Abstract: Event logs extracted from information systems offer a rich foundation for understanding and improving business processes. In many real-world applications, it is possible to distinguish between desirable and undesirable process executions, where desirable traces reflect efficient or compliant behavior, and undesirable ones may involve inefficiencies, rule violations, delays, or resource […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.