Low-rank structures for robustness in Deep Learning / Savostianova, Dayana. - (2025 May 23).
Low-rank structures for robustness in Deep Learning
SAVOSTIANOVA, DAYANA
2025-05-23
Abstract
The notion of the general stability of deep neural networks has been widely discussed virtually since their introduction. The overall performance of DNN models is inherently linked to overparameterization; however, the excessive number of parameters introduces natural instabilities in the model's output. As a result, one frequently considers the effects of extreme perturbations of the input data, known as adversarial attacks, to characterize the robustness of the models. Over the years, a variety of attacks have been proposed, exploiting different levels of access, targets, and structures, thereby enabling the development of defense mechanisms. Simultaneously, pruning or downsizing overparameterized models, along with various dimensionality reduction techniques applied to input data, has led to the incorporation of low-rank structures in the DNN pipeline; these structures significantly influence stability and robustness. In this thesis, we discuss various questions related to the stability and robustness of DNN models through the prism of low-rank structures. First, we propose a novel adversarial attack that explicitly searches for low-rank adversarial perturbations. We demonstrate its competitive performance and computational efficiency, showing that this low-rank attack can effectively enhance model robustness through adversarial training. Second, we propose a stabilization technique that enhances DNN robustness by constraining trainable weight matrices to remain close to the low-rank orthogonal manifold. This is achieved by modifying the existing Low-Rank Training procedure through the introduction of the relaxed Stiefel manifold. Finally, we explore the construction of universal adversarial attacks by computing the generalized singular values and vectors of the composite operator norm, which is linked to the model's generalized condition number.
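The core ingredient of a low-rank adversarial perturbation can be illustrated with a minimal sketch: project a candidate perturbation onto the set of rank-r matrices via truncated SVD (the Eckart–Young best rank-r approximation). This is a generic illustration under assumed names (`project_rank_r` is a hypothetical helper), not the thesis's actual attack algorithm.

```python
import numpy as np

def project_rank_r(delta: np.ndarray, r: int) -> np.ndarray:
    """Best rank-r approximation of a perturbation matrix via truncated SVD
    (Eckart-Young). A low-rank attack would apply such a projection after
    each gradient step to keep the perturbation rank-constrained."""
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

# Hypothetical illustration on a random "perturbation":
rng = np.random.default_rng(0)
delta = rng.standard_normal((8, 8))
delta_r = project_rank_r(delta, r=2)
```

A rank-r perturbation of an n-by-n image needs only 2nr numbers instead of n^2, which is one intuition behind the computational efficiency of searching in this restricted set.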
We evaluate the performance of each proposed method on image classification tasks using popular DNN architectures (VGG16, ResNet) and several benchmark datasets.
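The second contribution's idea of keeping trainable factors near the orthogonal (Stiefel) manifold can be sketched with a generic soft-orthogonality penalty, which vanishes exactly on the manifold. This is a standard regularizer shown for intuition only; the thesis's relaxed-Stiefel construction may differ.

```python
import numpy as np

def stiefel_penalty(W: np.ndarray) -> float:
    """Soft orthogonality penalty ||W^T W - I||_F^2. It is zero exactly when
    the columns of W are orthonormal (W lies on the Stiefel manifold) and
    grows as W drifts away from it."""
    k = W.shape[1]
    G = W.T @ W - np.eye(k)
    return float(np.sum(G * G))

# A matrix with orthonormal columns (from a QR factorization) incurs
# essentially zero penalty; a generic matrix does not.
Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((16, 4)))
```

Adding such a term to the training loss pulls low-rank factors toward well-conditioned, orthonormal representations, which is one mechanism by which near-orthogonality can stabilize a network's output.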