arXiv:2510.26969v1 Announce Type: cross
Abstract: We introduce a methodology for the identification of notifiable events in the domain of healthcare. The methodology harnesses semantic frames to define fine-grained patterns and search for them in unstructured data, namely, open-text fields in e-medical records. We apply the methodology to the problem of underreporting of gender-based violence (GBV) in e-medical records produced during patients’ visits to primary care units. A total of eight patterns are defined and searched for in a corpus of 21 million sentences in Brazilian Portuguese extracted from e-SUS APS. The results are manually evaluated by linguists, and the precision of each pattern is measured. Our findings reveal that the methodology effectively identifies reports of violence with a precision of 0.726, confirming its robustness. Designed as a transparent, efficient, low-carbon, and language-agnostic pipeline, the approach can be easily adapted to other health surveillance contexts, contributing to the broader, ethical, and explainable use of NLP in public health systems.
Fast Approximation Algorithm for Non-Monotone DR-submodular Maximization under Size Constraint
arXiv:2511.02254v1 Announce Type: cross
Abstract: This work studies non-monotone DR-submodular maximization over a ground set of size $n$ subject to a size constraint $k$. We

