LLM-Agent-UMF: LLM-based Agent Unified Modeling Framework for Seamless Design of Multi Active/Passive Core-Agent Architectures

arXiv:2409.11393v3 Announce Type: replace-cross Abstract: In an era where vast amounts of data are collected and processed from diverse sources, there is a growing demand for sophisticated AI systems capable of intelligently fusing and analyzing this information. To address these challenges, researchers have turned towards integrating tools into LLM-powered agents to enhance the overall information […]

Large Language Models for Sentiment Analysis to Detect Social Challenges: A Use Case with South African Languages

arXiv:2511.17301v1 Announce Type: cross Abstract: Sentiment analysis can aid in understanding people’s opinions and emotions on social issues. In multilingual communities sentiment analysis systems can be used to quickly identify social challenges in social media posts, enabling government departments to detect and address these issues more precisely and effectively. Recently, large-language models (LLMs) have become […]

ISS-Geo142: A Benchmark for Geolocating Astronaut Photography from the International Space Station

arXiv:2504.21194v2 Announce Type: replace-cross Abstract: This paper introduces ISS-Geo142, a curated benchmark for geolocating astronaut photography captured from the International Space Station (ISS). Although the ISS position at capture time is known precisely, the specific Earth locations depicted in these images are typically not directly georeferenced, making automated localization non-trivial. ISS-Geo142 consists of 142 images […]

T2I-RiskyPrompt: A Benchmark for Safety Evaluation, Attack, and Defense on Text-to-Image Model

arXiv:2510.22300v2 Announce Type: replace-cross Abstract: Using risky text prompts, such as pornography and violent prompts, to test the safety of text-to-image (T2I) models is a critical task. However, existing risky prompt datasets are limited in three key areas: 1) limited risky categories, 2) coarse-grained annotation, and 3) low effectiveness. To address these limitations, we introduce […]

SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense

arXiv:2506.08255v3 Announce Type: replace-cross Abstract: Continual learning under adversarial conditions remains an open problem, as existing methods often compromise either robustness, scalability, or both. We propose a novel framework that integrates Interval Bound Propagation (IBP) with a hypernetwork-based architecture to enable certifiably robust continual learning across sequential tasks. Our method, SHIELD, generates task-specific model parameters […]

Where Culture Fades: Revealing the Cultural Gap in Text-to-Image Generation

arXiv:2511.17282v1 Announce Type: cross Abstract: Multilingual text-to-image (T2I) models have advanced rapidly in terms of visual realism and semantic alignment, and are now widely utilized. Yet outputs vary across cultural contexts: because language carries cultural connotations, images synthesized from multilingual prompts should preserve cross-lingual cultural consistency. We conduct a comprehensive analysis showing that current T2I […]

Posts of Peril: Detecting Information About Hazards in Text

arXiv:2405.17838v2 Announce Type: replace-cross Abstract: Socio-linguistic indicators of affectively-relevant phenomena, such as emotion or sentiment, are often extracted from text to better understand features of human-computer interactions, including on social media. However, an indicator that is often overlooked is the presence or absence of information concerning harms or hazards. Here, we develop a new model […]

Bridging the Semantic Gap: Contrastive Rewards for Multilingual Text-to-SQL with GRPO

arXiv:2510.13827v2 Announce Type: replace-cross Abstract: Current Text-to-SQL methods are evaluated and only focused on executable queries, overlooking the semantic alignment challenge — both in terms of the semantic meaning of the query and the correctness of the execution results. Even execution accuracy itself shows significant drops when moving from English to other languages, with an […]

Text-guided multi-property molecular optimization with a diffusion language model

arXiv:2410.13597v3 Announce Type: replace-cross Abstract: Molecular optimization (MO) is a crucial stage in drug discovery in which task-oriented generated molecules are optimized to meet practical industrial requirements. Existing mainstream MO approaches primarily utilize external property predictors to guide iterative property optimization. However, learning all molecular samples in the vast chemical space is unrealistic for predictors. […]

Leveraging CVAE for Joint Configuration Estimation of Multifingered Grippers from Point Cloud Data

arXiv:2511.17276v1 Announce Type: cross Abstract: This paper presents an efficient approach for determining the joint configuration of a multifingered gripper solely from the point cloud data of its poly-articulated chain, as generated by visual sensors, simulations or even generative neural networks. Well-known inverse kinematics (IK) techniques can provide mathematically exact solutions (when they exist) for […]

Sionna RT: Technical Report

arXiv:2504.21719v2 Announce Type: replace-cross Abstract: Sionna is an open-source, GPU-accelerated library that, as of version 0.14, incorporates a ray tracer, Sionna RT, for simulating radio wave propagation. A unique feature of Sionna RT is differentiability, enabling the calculation of gradients for the channel impulse responses (CIRs), radio maps, and other related metrics with respect to […]

Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding

arXiv:2510.09110v3 Announce Type: replace-cross Abstract: Visual grouping — operationalized through tasks such as instance segmentation, visual grounding, and object detection — enables applications ranging from robotic perception to photo editing. These fundamental problems in computer vision are powered by large-scale, painstakingly annotated datasets. Despite their impact, these datasets are costly to build, biased in coverage, […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number 16808844.