arXiv:2601.08146v3 Announce Type: replace-cross
Abstract: Existing circuit discovery methods rely on templated tasks with clean counterfactuals, limiting their use on diverse natural text. We adapt Contextual Decomposition for Transformers (CD-T) for unstructured settings via label-balanced activation means and task-directional relevance scoring, enabling counterfactual-free circuit discovery. We leverage these circuits for Circuit-Targeted Supervised Fine-Tuning (CT-SFT), restricting parameter updates to task-relevant heads and LayerNorm. Experiments on NusaX cross-lingual sentiment transfer show that CT-SFT is highly competitive for low-resource adaptation. While non-circuit sparse updates and full fine-tuning sometimes match target accuracy through capacity recruitment, CT-SFT uniquely minimizes catastrophic forgetting, preserving source-language and related-task performance. Extensions to XNLI confirm these findings hold across broader tasks and model families, demonstrating that circuit-targeted adaptation provides a safer, causally grounded alternative to global fine-tuning.
Digital health tools and point solutions—pitfalls in population health program measurement
Digital health tools are generally poorly regulated and often lack strong research evidence, posing challenges for purchasers of point solutions such as employer groups and