Volumetric Ergodic Control

arXiv:2511.11533v2 Announce Type: replace-cross Abstract: Ergodic control synthesizes optimal coverage behaviors over spatial distributions for nonlinear systems. However, existing formulations model the robot as a

Generative AI-assisted Participatory Modeling in Socio-Environmental Planning under Deep Uncertainty

arXiv:2603.17021v1 Announce Type: new Abstract: Socio-environmental planning under deep uncertainty requires researchers to identify and conceptualize problems before exploring policies and deploying plans. In practice

Sympatric speciation by symmetry-breaking: The three-clade case

arXiv:2603.17026v1 Announce Type: new Abstract: In this paper we expand the concept of biological speciation by symmetry breaking of Golubitsky and Stewart to the case

Intermitotic timing and motility patterns in the cell division of the diatom Seminavis robusta

arXiv:2603.16984v1 Announce Type: new Abstract: Many diatoms follow a size diminution – size restoration cycle in their vegetative phase, leading to daughter cells that differ

Topology-Guided Biomechanical Profiling: A White-Box Framework for Opportunistic Screening of Spinal Instability on Routine CT

arXiv:2603.16963v1 Announce Type: new Abstract: Routine oncologic computed tomography (CT) presents an ideal opportunity for screening spinal instability, yet prophylactic stabilization windows are frequently missed

NOBLE: Accelerating Transformers with Nonlinear Low-Rank Branches

March 9, 2026

arXiv:2603.06492v1 Announce Type: cross
Abstract: We introduce NOBLE (Nonlinear lOw-rank Branch for Linear Enhancement), an architectural augmentation that adds nonlinear low-rank branches to transformer linear layers. Unlike LoRA and other parameter-efficient fine-tuning (PEFT) methods, NOBLE is designed for pretraining from scratch: the branch is a permanent part of the architecture rather than an adapter fine-tuned on top of frozen weights. The branch computes $\sigma(x W_{\text{down}}) W_{\text{up}}$, where $\sigma$ is a learnable nonlinearity. We evaluate several activation functions and find that CosNet performs best; it is a two-layer cosine nonlinearity with learnable frequency and phase, with a linear projection between the two layers in the bottleneck space. NOBLE achieves substantial improvements with minimal overhead: up to 1.47x step speedup to reach baseline eval loss (up to 32% fewer training steps), with as little as 4% additional parameters and 7% step-time overhead, resulting in up to 1.22x net wallclock speedup. Experiments on LLMs (250M and 1.5B parameters), BERT, VQGAN, and ViT consistently show improved training efficiency. We identify one caveat: Mixup/CutMix and other stochastic augmentations interfere with NOBLE’s benefits in ImageNet classification, but when they are disabled, ViT also improves. A possible explanation is that these regularization techniques encourage smoother fits to the target function, whereas NOBLE may specialize in its sharper aspects.
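To make the branch structure described in the abstract concrete, here is a minimal NumPy sketch of a linear layer with a CosNet-style low-rank branch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the branch output is combined with the main layer (summation is assumed here), nor the initialization (the zero-initialized up-projection and unit frequencies are guesses), and all function and parameter names are hypothetical.

```python
import numpy as np

def init_noble_linear(d_in, d_out, rank, seed=0):
    """Create parameters for a linear layer plus a low-rank cosine branch.

    All initialization choices here are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, d_in ** -0.5, (d_in, d_out)),      # main linear weight
        "W_down": rng.normal(0.0, d_in ** -0.5, (d_in, rank)),  # projection into the bottleneck
        "freq1": np.ones(rank), "phase1": np.zeros(rank),       # first cosine layer
        "M": rng.normal(0.0, rank ** -0.5, (rank, rank)),       # linear map between the cosines
        "freq2": np.ones(rank), "phase2": np.zeros(rank),       # second cosine layer
        "W_up": np.zeros((rank, d_out)),  # assumption: zero init, so the branch starts silent
    }

def cosnet(h, p):
    """Two cosine layers with per-channel frequency and phase, linear map in between."""
    a = np.cos(p["freq1"] * h + p["phase1"])
    b = a @ p["M"]
    return np.cos(p["freq2"] * b + p["phase2"])

def noble_linear(x, p):
    """Main linear output plus the branch sigma(x W_down) W_up (sum is an assumption)."""
    main = x @ p["W"]
    branch = cosnet(x @ p["W_down"], p) @ p["W_up"]
    return main + branch

def branch_param_fraction(p):
    """Branch parameter count relative to the main weight matrix."""
    branch = sum(v.size for k, v in p.items() if k != "W")
    return branch / p["W"].size

# Usage: a batch of 4 vectors through a 1024 -> 1024 layer with a rank-16 branch.
params = init_noble_linear(1024, 1024, rank=16)
y = noble_linear(np.ones((4, 1024)), params)
print(y.shape, branch_param_fraction(params))
```

In this sketch the branch costs roughly rank * (d_in + d_out) + rank^2 extra parameters, which is why a small rank keeps the overhead to a few percent of the main weight matrix. The step figures in the abstract are also consistent with each other: a 1.47x step speedup corresponds to 1 - 1/1.47 ≈ 32% fewer training steps.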

