arXiv:2510.23693v1 Announce Type: cross Abstract: This PhD thesis investigates the societal impact of machine learning (ML). ML increasingly informs consequential decisions and recommendations, significantly affecting many aspects of our lives. As these data-driven systems are often developed without explicit fairness considerations, they carry the risk of discriminatory effects. The contributions in this thesis enable more […]
Covert Surveillance in Smart Devices: A SCOUR Framework Analysis of Youth Privacy Implications
arXiv:2510.24072v1 Announce Type: cross Abstract: This paper investigates how smart devices covertly capture private conversations and discusses in more depth the implications for youth privacy. Using a structured review guided by the PRISMA methodology, the analysis focuses on privacy concerns, data capture methods, data storage and sharing practices, and proposed technical mitigations. To […]
Genotype-Phenotype Integration through Machine Learning and Personalized Gene Regulatory Networks for Cancer Metastasis Prediction
arXiv:2510.23620v1 Announce Type: new Abstract: Metastasis is the leading cause of cancer-related mortality, yet most predictive models rely on shallow architectures and neglect patient-specific regulatory mechanisms. Here, we integrate classical machine learning and deep learning to predict metastatic potential across multiple cancer types. Gene expression profiles from the Cancer Cell Line Encyclopedia were combined with […]
Reproducible workflow for online AI in digital health
arXiv:2509.13499v3 Announce Type: replace-cross Abstract: Online artificial intelligence (AI) algorithms are an important component of digital health interventions. These online algorithms are designed to continually learn and improve their performance as streaming data is collected on individuals. Deploying online AI presents a key challenge: balancing adaptability of online AI with reproducibility. Online AI in digital […]
Multi-Environment POMDPs: Discrete Model Uncertainty Under Partial Observability
arXiv:2510.23744v1 Announce Type: new Abstract: Multi-environment POMDPs (ME-POMDPs) extend standard POMDPs with discrete model uncertainty. ME-POMDPs represent a finite set of POMDPs that share the same state, action, and observation spaces, but may arbitrarily vary in their transition, observation, and reward models. Such models arise, for instance, when multiple domain experts disagree on how to […]
MiniOneRec: An Open-Source Framework for Scaling Generative Recommendation
arXiv:2510.24431v1 Announce Type: cross Abstract: The recent success of large language models (LLMs) has renewed interest in whether recommender systems can achieve similar scaling benefits. Conventional recommenders, dominated by massive embedding tables, tend to plateau as embedding dimensions grow. In contrast, the emerging generative paradigm replaces embeddings with compact Semantic ID (SID) sequences produced by […]
Evaluating In Silico Creativity: An Expert Review of AI Chess Compositions
arXiv:2510.23772v1 Announce Type: new Abstract: The rapid advancement of Generative AI has raised significant questions regarding its ability to produce creative and novel outputs. Our recent work investigates this question within the domain of chess puzzles and presents an AI system designed to generate puzzles characterized by aesthetic appeal, novelty, and counter-intuitive, unique solutions. We […]
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents
arXiv:2510.24702v1 Announce Type: cross Abstract: Public research results on large-scale supervised fine-tuning of AI agents remain relatively rare, since the collection of agent training data presents unique challenges. In this work, we argue that the bottleneck is not a lack of underlying data sources, but that a large variety of data is fragmented across heterogeneous […]
Zero-Shot Cross-Lingual Transfer using Prefix-Based Adaptation
arXiv:2510.24619v1 Announce Type: cross Abstract: With the release of new large language models (LLMs) like Llama and Mistral, zero-shot cross-lingual transfer has become increasingly feasible due to their multilingual pretraining and strong generalization capabilities. However, adapting these decoder-only LLMs to new tasks across languages remains challenging. While parameter-efficient fine-tuning (PeFT) techniques like Low-Rank Adaptation (LoRA) […]
Training-Free Safe Text Embedding Guidance for Text-to-Image Diffusion Models
arXiv:2510.24012v1 Announce Type: cross Abstract: Text-to-image models have recently made significant advances in generating realistic and semantically coherent images, driven by advanced diffusion models and large-scale web-crawled datasets. However, these datasets often contain inappropriate or biased content, raising concerns about the generation of harmful outputs when provided with malicious text prompts. We propose Safe Text […]