arXiv:2603.21991v1 Announce Type: cross
Abstract: Gaussian Error Linear Unit (GELU) is a widely used smooth alternative to the Rectified Linear Unit (ReLU), yet many deployment, compression, and analysis toolchains are most naturally expressed for piecewise-linear (ReLU-type) networks. We study a hardness-parameterized formulation of GELU, f(x; lambda) = x Phi(lambda x), where Phi is the Gaussian CDF and lambda in [1, infty) controls gate sharpness, with the goal of turning smooth gated training into a controlled path toward ReLU-compatible models. Learning lambda is non-trivial: naive updates yield unstable dynamics and effective gradient attenuation, so we introduce a constrained reparameterization and an optimizer-aware update scheme.
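The abstract does not specify the constrained reparameterization; the following is a minimal PyTorch sketch of a lambda-GELU activation, assuming a softplus-based parameterization to keep lambda >= 1. The class name LambdaGELU, the init_lambda argument, and the inversion scheme are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn


class LambdaGELU(nn.Module):
    """lambda-GELU: f(x; lambda) = x * Phi(lambda * x), with lambda in [1, inf).

    The constraint lambda >= 1 is enforced here via a softplus reparameterization;
    the paper's exact scheme may differ (this is an illustrative sketch).
    """

    def __init__(self, init_lambda: float = 1.0):
        super().__init__()
        # rho is the unconstrained trainable parameter; lambda = 1 + softplus(rho).
        # Invert the softplus so that lambda starts at init_lambda (hypothetical choice).
        init_rho = torch.log(torch.expm1(torch.tensor(max(init_lambda - 1.0, 1e-6))))
        self.rho = nn.Parameter(init_rho)

    @property
    def lam(self) -> torch.Tensor:
        # Maps the free parameter back to the constrained hardness lambda >= 1.
        return 1.0 + nn.functional.softplus(self.rho)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Phi is the standard Gaussian CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
        z = self.lam * x
        phi = 0.5 * (1.0 + torch.erf(z / 2.0 ** 0.5))
        return x * phi
```

At lambda = 1 this reduces to the standard GELU, x Phi(x), and as lambda grows the gate Phi(lambda x) sharpens toward a step function, so f approaches ReLU pointwise.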
Empirically, across a diverse set of model–dataset pairs spanning MLPs, CNNs, and Transformers, we observe structured layerwise hardness profiles and assess their robustness under different initializations. We further study a deterministic ReLU-ization strategy in which the learned gates are progressively hardened toward a principled target, enabling a post-training substitution of lambda-GELU by ReLU with reduced disruption. Overall, lambda-GELU provides a minimal and interpretable knob to profile and control gating hardness, bridging smooth training with ReLU-centric downstream pipelines.
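The hardening schedule and the substitution step are likewise only described at a high level; a hedged sketch of what progressive ReLU-ization could look like, reusing the LambdaGELU module above, is shown below. The multiplicative scale, the lambda_max cap, and the helper names harden_gates and reluize are assumptions for illustration.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def harden_gates(model: nn.Module, scale: float = 2.0, lambda_max: float = 50.0):
    """Push each learned lambda toward a large target so that x * Phi(lambda * x)
    approaches ReLU(x) pointwise. Schedule and cap are illustrative assumptions."""
    for module in model.modules():
        if isinstance(module, LambdaGELU):  # the module sketched above
            target = torch.clamp(module.lam * scale, max=lambda_max)
            # Invert lambda = 1 + softplus(rho) to write the hardened value back.
            module.rho.copy_(torch.log(torch.expm1(target - 1.0)))


def reluize(model: nn.Module):
    """Post-training substitution of hardened lambda-GELU gates by plain ReLU."""
    for name, child in model.named_children():
        if isinstance(child, LambdaGELU):
            setattr(model, name, nn.ReLU())
        else:
            reluize(child)
```

Calling harden_gates periodically late in training and then reluize before export would mirror the progressive-hardening-then-substitution path the abstract describes, though the paper's actual schedule and target may differ.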
Improving Fine-Grained Rice Leaf Disease Detection via Angular-Compactness Dual Loss Learning
arXiv:2603.25006v1 Announce Type: cross
Abstract: Early detection of rice leaf diseases is critical, as rice is a staple crop supporting a substantial share of the



