Primary – Page 6 – dijee Pharma Intelligence

Value of Information-Enhanced Exploration in Bootstrapped DQN

arXiv:2511.02969v2 Announce Type: replace-cross Abstract: Efficient exploration in deep reinforcement learning remains a fundamental challenge, especially in environments characterized by high-dimensional states and sparse rewards. Traditional exploration strategies that rely on random local policy noise, such as $\epsilon$-greedy and Boltzmann exploration methods, often struggle to efficiently balance exploration and exploitation. In this paper, we integrate […]

November 24, 2025

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers

arXiv:2509.06938v2 Announce Type: replace-cross Abstract: As generative AI systems become competent and democratized in science, business, and government, deeper insight into their failure modes now poses an acute need. The occasional volatility in their behavior, such as the propensity of transformer models to hallucinate, impedes trust and adoption of emerging AI solutions in high-stakes areas. […]

November 24, 2025

Generative AI and Power Imbalances in Global Education: Frameworks for Bias Mitigation

arXiv:2406.02966v4 Announce Type: replace-cross Abstract: This study examines how Generative Artificial Intelligence reproduces global power hierarchies in education and proposes a framework to address resulting inequities. Using a critical qualitative design, the study conducted zero-shot prompt testing with two leading systems, ChatGPT-4 Turbo and Gemini 1.5, and collected real-time outputs from Global North and South […]

November 24, 2025

Lost in Translation and Noise: A Deep Dive into the Failure Modes of VLMs on Real-World Tables

arXiv:2511.17238v1 Announce Type: cross Abstract: The impressive performance of VLMs is largely measured on benchmarks that fail to capture the complexities of real-world scenarios. Existing datasets for tabular QA, such as WikiTableQuestions and FinQA, are overwhelmingly monolingual (English) and present tables in a digitally perfect, clean format. This creates a significant gap between research and […]

November 24, 2025

Defending the Edge: Representative-Attention Defense against Backdoor Attacks in Federated Learning

arXiv:2505.10297v2 Announce Type: replace-cross Abstract: Federated learning (FL) remains highly vulnerable to adaptive backdoor attacks that preserve stealth by closely imitating benign update statistics. Existing defenses predominantly rely on anomaly detection in parameter or gradient space, overlooking behavioral constraints that backdoor attacks must satisfy to ensure reliable trigger activation. These anomaly-centric methods fail against adaptive […]

November 24, 2025

Approximating a gene regulatory network from non-sequential data

arXiv:2401.11858v3 Announce Type: replace Abstract: Given non-sequential snapshots from instances of a dynamical system, we design a compressed sensing based algorithm that reconstructs the dynamical system. On the theoretical side, we show that: (1) successful reconstruction is possible under the assumption that we can construct an approximate clock from a subset of the coordinates of […]

November 24, 2025

How LLMs Learn to Reason: A Complex Network Perspective

arXiv:2509.23629v2 Announce Type: replace Abstract: Training large language models with Reinforcement Learning with Verifiable Rewards (RLVR) exhibits a set of distinctive and puzzling behaviors that remain poorly understood, including a two-stage learning curve, a V-shaped response-length trajectory, and a pronounced vulnerability to catastrophic forgetting. In this work, we propose that these behaviors are emergent collective […]

November 24, 2025

A Reinforcement Learning-Based Telematic Routing Protocol for the Internet of Underwater Things

arXiv:2506.00133v2 Announce Type: replace-cross Abstract: The Internet of Underwater Things (IoUT) has a lot of problems, like low bandwidth, high latency, mobility, and not enough energy. Routing protocols that were made for land-based networks, like RPL, don’t work well in these underwater settings. This paper talks about RL-RPL-UA, a new routing protocol that uses reinforcement […]

November 24, 2025

Model Inversion Attack Against Deep Hashing

arXiv:2511.12233v2 Announce Type: replace-cross Abstract: Deep hashing improves retrieval efficiency through compact binary codes, yet it introduces severe and often overlooked privacy risks. The ability to reconstruct original training data from hash codes could lead to serious threats such as biometric forgery and privacy breaches. However, model inversion attacks specifically targeting deep hashing models remain […]

November 24, 2025

MonoKAN: Certified Monotonic Kolmogorov-Arnold Network

arXiv:2409.11078v2 Announce Type: replace-cross Abstract: Artificial Neural Networks (ANNs) have significantly advanced various fields by effectively recognizing patterns and solving complex problems. Despite these advancements, their interpretability remains a critical challenge, especially in applications where transparency and accountability are essential. To address this, explainable AI (XAI) has made progress in demystifying ANNs, yet interpretability alone […]

November 24, 2025

SMILE: A Composite Lexical-Semantic Metric for Question-Answering Evaluation

arXiv:2511.17432v1 Announce Type: cross Abstract: Traditional evaluation metrics for textual and visual question answering, like ROUGE, METEOR, and Exact Match (EM), focus heavily on n-gram based lexical similarity, often missing the deeper semantic understanding needed for accurate assessment. While measures like BERTScore and MoverScore leverage contextual embeddings to address this limitation, they lack flexibility in […]

November 24, 2025

Meta-World+: An Improved, Standardized, RL Benchmark

arXiv:2505.11289v2 Announce Type: replace Abstract: Meta-World is widely used for evaluating multi-task and meta-reinforcement learning agents, which are challenged to master diverse skills simultaneously. Since its introduction however, there have been numerous undocumented changes which inhibit a fair comparison of algorithms. This work strives to disambiguate these results from the literature, while also leveraging the […]

November 24, 2025
