Toward domain-specific machine translation and quality estimation systems

March 27, 2026

arXiv:2603.24955v1
Abstract: Machine Translation (MT) and Quality Estimation (QE) perform well in general domains but degrade under domain mismatch. This dissertation studies how to adapt MT and QE systems to specialized domains through a set of data-focused contributions. Chapter 2 presents a similarity-based data selection method for MT. Small, targeted in-domain subsets outperform much larger generic datasets and reach strong translation quality at lower computational cost. Chapter 3 introduces a staged QE training pipeline that combines domain adaptation with lightweight data augmentation. The method improves performance across domains, languages, and resource settings, including zero-shot and cross-lingual cases. Chapter 4 studies the role of subword tokenization and vocabulary in fine-tuning. Aligned tokenization-vocabulary setups lead to stable training and better translation quality, while mismatched configurations reduce performance. Chapter 5 proposes a QE-guided in-context learning method for large language models. QE models select examples that improve translation quality without parameter updates and outperform standard retrieval methods. The approach also supports a reference-free setup, reducing reliance on a single reference set. These results show that domain adaptation depends on data selection, representation, and efficient adaptation strategies. The dissertation provides methods for building MT and QE systems that perform reliably in domain-specific settings.
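To make the idea of similarity-based data selection from Chapter 2 more concrete, here is a minimal sketch. The abstract does not say which similarity measure or sentence representation the dissertation uses, so this example assumes a simple bag-of-words cosine similarity against a small in-domain seed set. The function names (`bow`, `cosine`, `select_in_domain`) and the toy sentences are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words term counts for one sentence."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_in_domain(generic_pool, domain_seed, k):
    """Return the k pool sentences most similar to the in-domain seed set.

    Assumption: the domain is represented by the summed term counts of the
    seed sentences. A real system would more likely use learned embeddings.
    """
    centroid = Counter()
    for sentence in domain_seed:
        centroid.update(bow(sentence))
    scored = [(cosine(bow(s), centroid), s) for s in generic_pool]
    # Sort by descending score; the sort is stable, so ties keep pool order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


# Hypothetical toy data: a generic pool and a tiny medical-domain seed set.
pool = [
    "the patient received a dose of insulin",
    "the match ended in a draw",
    "insulin dose was adjusted for the patient",
    "stocks fell sharply on monday",
]
seed = ["patient insulin dose", "adjust the dose for each patient"]
selected = select_in_domain(pool, seed, 2)
```

With this toy data, `selected` holds the two insulin-related sentences, and the sports and finance sentences are left out. In an MT setting the selected source sentences would be kept together with their target-side translations to form the small in-domain fine-tuning subset.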