Augmenting Intelligence: A Hybrid Framework for Scalable and Stable Explanations

arXiv:2512.19557v1 Announce Type: new Abstract: Current approaches to Explainable AI (XAI) face a “Scalability-Stability Dilemma.” Post-hoc methods (e.g., LIME, SHAP) scale easily but suffer from instability, while supervised explanation frameworks (e.g., TED) offer stability but require prohibitive human effort to label every training instance. This paper proposes a Hybrid LRR-TED framework that addresses this […]
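
The instability at stake here is easy to demonstrate. The sketch below is illustrative only, not the paper's code: it fits a crude LIME-style local linear surrogate to the same instance twice under different perturbation seeds and measures how much the top-k feature sets agree.

```python
# Toy demonstration of post-hoc explanation instability: the same instance,
# explained twice with different random perturbations, can yield different
# top-k feature rankings. All names and parameters here are illustrative.
import numpy as np

def black_box(X):
    # Stand-in model: a fixed nonlinear decision function.
    return (X[:, 0] * X[:, 1] + 0.5 * X[:, 2] > 0).astype(float)

def lime_like_explanation(x, seed, n_samples=200, k=2):
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.size))  # local perturbations
    y = black_box(Z)
    # Crude local surrogate: least-squares fit of output deltas on input deltas.
    w = np.linalg.lstsq(Z - x, y - black_box(x[None])[0], rcond=None)[0]
    return set(np.argsort(-np.abs(w))[:k])  # top-k influential features

x = np.array([0.2, -0.1, 0.8, 0.05])
e1, e2 = lime_like_explanation(x, seed=0), lime_like_explanation(x, seed=1)
jaccard = len(e1 & e2) / len(e1 | e2)
print(f"top-k overlap across reruns (Jaccard): {jaccard:.2f}")
```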

Confucius Code Agent: Scalable Agent Scaffolding for Real-World Codebases

arXiv:2512.10398v5 Announce Type: replace-cross Abstract: Real-world software engineering tasks require coding agents that can operate over massive repositories, sustain long-horizon sessions, and reliably coordinate complex toolchains at test time. Existing research-grade coding agents offer transparency but struggle when scaled to heavier, production-level workloads, while production-grade systems achieve strong practical performance but provide limited extensibility, interpretability, […]
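
For readers unfamiliar with agent scaffolding, the sketch below shows the bare ReAct-style tool loop that both research- and production-grade coding agents elaborate on. The stub model, tool names, and stopping rule are assumptions for illustration, not details of the Confucius system.

```python
# Minimal agent scaffold: the model repeatedly picks a tool, the scaffold runs
# it, and the observation is appended to the transcript until the model stops.
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"<contents of {path}>",   # stubbed tool
    "grep": lambda pattern: f"<matches for {pattern}>",  # stubbed tool
}

def model(transcript: str) -> tuple[str, str]:
    # Stub for an LLM policy: returns (tool_name, argument) or ("done", answer).
    return ("done", "patch ready")

def agent_loop(task: str, max_steps: int = 10) -> str:
    transcript = f"TASK: {task}"
    for _ in range(max_steps):
        action, arg = model(transcript)
        if action == "done":
            return arg
        observation = TOOLS[action](arg)
        transcript += f"\nACTION: {action}({arg})\nOBS: {observation}"
    return "step budget exhausted"

print(agent_loop("fix failing test in repo"))
```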

Generation of Programmatic Rules for Document Forgery Detection Using Large Language Models

arXiv:2512.19228v1 Announce Type: new Abstract: Document forgery poses a growing threat to legal, economic, and governmental processes, requiring increasingly sophisticated verification mechanisms. One approach uses plausibility checks: rule-based procedures that assess the correctness and internal consistency of data in order to detect anomalies or signs of manipulation. Although these verification procedures are essential for […]
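
As a concrete, hypothetical example of such a plausibility check, the rule below tests the internal arithmetic and date consistency of an invoice-like record. Field names and the one-cent tolerance are assumptions for the sketch, not rules from the paper.

```python
# One illustrative plausibility check: flag an invoice whose internal data
# is inconsistent (line items vs. total, VAT arithmetic, date ordering).
from datetime import date

def plausibility_check(doc: dict) -> list[str]:
    violations = []
    if abs(sum(doc["line_items"]) - doc["net_total"]) > 0.01:
        violations.append("line items do not sum to net total")
    if abs(doc["net_total"] * doc["vat_rate"] - doc["vat_amount"]) > 0.01:
        violations.append("VAT amount inconsistent with rate")
    if doc["issue_date"] > doc["due_date"]:
        violations.append("issue date after due date")
    return violations

doc = {"line_items": [100.0, 50.0], "net_total": 150.0,
       "vat_rate": 0.19, "vat_amount": 28.5,
       "issue_date": date(2025, 3, 1), "due_date": date(2025, 3, 31)}
print(plausibility_check(doc) or "no anomalies flagged")
```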

A probabilistic foundation model for crystal structure denoising, phase classification, and order parameters

arXiv:2512.11077v2 Announce Type: replace-cross Abstract: Atomistic simulations generate large volumes of noisy structural data, but extracting phase labels, order parameters (OPs), and defect information in a way that is universal, robust, and interpretable remains challenging. Existing tools such as PTM and CNA are restricted to a small set of hand-crafted lattices (e.g. FCC/BCC/HCP), degrade under […]
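
For contrast with the proposed probabilistic model: classical tools like PTM and CNA classify local structure from hand-crafted geometric descriptors. The sketch below computes one of the simplest such descriptors, a cutoff-based coordination count; the cutoff value and the toy lattice are illustrative assumptions.

```python
# Hand-crafted structural descriptor: count neighbors within a fixed cutoff.
import numpy as np

def coordination_numbers(positions: np.ndarray, cutoff: float) -> np.ndarray:
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-distances
    return (d < cutoff).sum(axis=1)      # neighbors within the cutoff

# Toy 3x3x3 simple-cubic block (lattice constant 1.0) plus mild "thermal" noise.
grid = np.stack(np.meshgrid(*[np.arange(3.0)] * 3), axis=-1).reshape(-1, 3)
noisy = grid + np.random.default_rng(0).normal(scale=0.05, size=grid.shape)
print(coordination_numbers(noisy, cutoff=1.2))  # bulk sites -> 6, surfaces fewer
```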

LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer

arXiv:2512.18930v1 Announce Type: cross Abstract: Artistic style transfer in generative models remains a significant challenge, as existing methods often introduce style only via model fine-tuning, additional adapters, or prompt engineering, all of which can be computationally expensive and may still entangle style with subject matter. In this paper, we introduce a training- and inference-light, interpretable […]
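
For orientation, the sketch below shows the standard sparse-autoencoder building block the title refers to: an overcomplete encoder/decoder pair trained with an L1 penalty on the latent code (PyTorch assumed). Dimensions and the penalty weight are assumptions; this is not the LouvreSAE recipe.

```python
# Minimal sparse autoencoder over model activations: reconstruct the input
# through an overcomplete, non-negative latent code penalized toward sparsity.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        self.enc = nn.Linear(d_model, d_latent)
        self.dec = nn.Linear(d_latent, d_model)

    def forward(self, x):
        z = torch.relu(self.enc(x))      # sparse, non-negative feature code
        return self.dec(z), z

sae = SparseAutoencoder(d_model=512, d_latent=4096)
x = torch.randn(32, 512)                 # stand-in for model activations
x_hat, z = sae(x)
loss = ((x_hat - x) ** 2).mean() + 1e-3 * z.abs().mean()  # recon + L1 sparsity
loss.backward()
```

In a style-transfer setting, individual latent units could then be scaled or zeroed before decoding, which is presumably what makes such a representation controllable.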

When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models

arXiv:2512.18934v1 Announce Type: cross Abstract: Catastrophic forgetting poses a fundamental challenge in continual learning, particularly when models are quantized for deployment efficiency. We systematically investigate the interplay between quantization precision (FP16, INT8, INT4) and replay buffer strategies in large language models, revealing unexpected dynamics. While FP16 achieves superior initial task performance (74.44% on NLU), we […]
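
For reference, the sketch below implements textbook symmetric per-tensor INT8 quantization, the kind of scheme the abstract's INT8 setting relies on. It is a generic illustration, not necessarily the paper's exact recipe.

```python
# Symmetric per-tensor INT8 quantization: scale weights into [-127, 127],
# round, then dequantize; the round-trip error is the precision cost paid.
import numpy as np

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0      # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(dequantize(q, s) - w).mean()
print(f"mean absolute round-trip error: {err:.5f}")
```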

Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives

arXiv:2512.12620v2 Announce Type: replace-cross Abstract: We study syllogistic reasoning in LLMs from the logical and natural language perspectives. In the process, we explore the fundamental reasoning capabilities of LLMs and the direction in which this research is moving. To aid our studies, we use 14 large language models and investigate their syllogistic reasoning capabilities in terms […]
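
On the formal side, syllogistic validity can be checked mechanically. The sketch below brute-forces all small set-models of the classic Barbara mood; the encoding is an illustrative assumption, not the paper's evaluation harness.

```python
# Semantic check of a syllogism: a mood is valid iff no small model satisfies
# the premises while falsifying the conclusion (terms are subsets of a universe).
from itertools import product

def all_are(a, b):  # "All a are b" over finite sets
    return a <= b

def valid(premises, conclusion, universe_size=4):
    universe = range(universe_size)
    for bits in product([0, 1], repeat=3 * universe_size):
        S, M, P = ({x for x in universe if bits[i * universe_size + x]}
                   for i in range(3))
        if all(p(S, M, P) for p in premises) and not conclusion(S, M, P):
            return False  # found a countermodel
    return True

barbara = [lambda S, M, P: all_are(M, P),   # All M are P
           lambda S, M, P: all_are(S, M)]   # All S are M
print(valid(barbara, lambda S, M, P: all_are(S, P)))  # -> True
```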

Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement

arXiv:2512.18950v1 Announce Type: cross Abstract: We present MACLA, a framework that decouples reasoning from learning by maintaining a frozen large language model while performing all adaptation in an external hierarchical procedural memory. MACLA extracts reusable procedures from trajectories, tracks reliability via Bayesian posteriors, selects actions through expected-utility scoring, and refines procedures by contrasting successes and […]
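
The Bayesian bookkeeping described here has a compact core. The sketch below is an interpretation under assumed priors, utilities, and procedure names, not the MACLA code: each stored procedure carries a Beta posterior over its success rate, updated from observed outcomes, and selection maximizes posterior-mean expected utility.

```python
# Track each procedure's reliability with a Beta posterior and pick the one
# with the highest expected utility (posterior success rate times reward,
# minus a fixed execution cost).
from dataclasses import dataclass

@dataclass
class Procedure:
    name: str
    successes: int = 0
    failures: int = 0

    def posterior_mean(self, alpha=1.0, beta=1.0) -> float:
        # Beta(alpha + s, beta + f) posterior under a Beta(alpha, beta) prior.
        return (alpha + self.successes) / (alpha + beta + self.successes + self.failures)

def select(procedures, reward=1.0, cost=0.1):
    return max(procedures, key=lambda p: p.posterior_mean() * reward - cost)

memory = [Procedure("retry-with-backoff", successes=8, failures=2),
          Procedure("parse-then-validate", successes=3, failures=0)]
best = select(memory)
print(best.name, round(best.posterior_mean(), 3))
```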

Population-Evolve: a Parallel Sampling and Evolutionary Method for LLM Math Reasoning

arXiv:2512.19081v1 Announce Type: new Abstract: In recent years, test-time scaling has emerged as a promising direction for enhancing the reasoning capabilities of Large Language Models. In this work, we propose Population-Evolve, a training-free method inspired by Genetic Algorithms to optimize LLM reasoning. Our approach maintains a dynamic population of candidate solutions for each problem […]
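
The genetic-algorithm analogy suggests a simple loop: keep a population of candidate solutions, score them, retain the fittest, and refill by resampling. The sketch below stubs out the LLM sampler and the verifier; it illustrates the pattern, not the paper's method.

```python
# GA-style test-time loop with stubbed sampler and scorer: select survivors,
# refill the population, and return the best candidate found.
import random

def sample_solution(problem: str) -> str:           # stub for an LLM call
    return f"solution({random.random():.3f})"

def score(solution: str) -> float:                  # stub verifier / reward
    return random.random()

def population_evolve(problem: str, pop_size=8, keep=3, generations=4) -> str:
    population = [sample_solution(problem) for _ in range(pop_size)]
    for _ in range(generations):
        survivors = sorted(population, key=score, reverse=True)[:keep]
        # "Mutation" here just resamples; an LLM would rewrite a survivor.
        offspring = [sample_solution(problem) for _ in range(pop_size - keep)]
        population = survivors + offspring
    return max(population, key=score)

random.seed(0)
print(population_evolve("prove x^2 >= 0"))
```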

An Empirical Study of Developer-Provided Context for AI Coding Assistants in Open-Source Projects

arXiv:2512.18925v1 Announce Type: cross Abstract: While Large Language Models (LLMs) have demonstrated remarkable capabilities, research shows that their effectiveness depends not only on explicit prompts but also on the broader context provided. This requirement is especially pronounced in software engineering, where the goals, architecture, and collaborative conventions of an existing project play critical roles in […]
