
A new quantization method ensures reliable binary neural networks.

A team of Russian researchers from Smart Engines and the Moscow Institute of Physics and Technology has presented an innovative method for quantizing binary neural networks, achieving significant improvements in how these networks are trained.

The work has been published in the journal Computer Optics. Modern neural networks are widely utilized across various fields, ranging from natural language processing and image generation to character recognition on mobile devices. In the rapidly evolving world of artificial intelligence, computational efficiency is a critical factor. For many applications, especially those running on low-power devices (such as mobile phones, embedded systems, and autonomous driving systems), the speed and size of the neural network are essential.

Binary neural networks (BNNs) represent one approach to creating compact and fast networks. In these networks, weights and activations are represented by just one bit of information (-1 or 1), which significantly reduces the memory required to store the model and allows fast bitwise operations to replace costly multiplications. However, training BNNs is a complex task that has long hindered their widespread application.
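As a rough illustration (not taken from the paper), the dot product at the heart of a binary layer can be computed with an XNOR and a population count once the -1/+1 values are packed into machine words; the function name and bit layout below are hypothetical:

    # Illustrative sketch: a dot product of two {-1, +1} vectors computed with
    # bitwise operations on packed bits (bit 1 stands for +1, bit 0 for -1).
    def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
        xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # 1 wherever the signs agree
        matches = bin(xnor).count("1")              # population count
        return 2 * matches - n                      # equals sum(a_i * b_i)

    # a = [+1, -1, +1, +1] -> 0b1011, b = [+1, +1, -1, +1] -> 0b1101
    print(binary_dot(0b1011, 0b1101, 4))            # prints 0, the true dot product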

Traditional neural network training methods cannot be applied directly to binary neural networks. The primary challenge is that the activation function (the transformation that maps inputs to binary values) is a piecewise constant function (the sign function) whose derivative is zero at every point where it is defined, which breaks standard backpropagation. Various approaches have been employed to address this issue.
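A minimal PyTorch snippet (relying only on the standard torch.sign autograd behavior) shows why naive backpropagation fails: the gradient of the sign activation is zero everywhere it is defined, so no learning signal reaches the weights.

    import torch

    x = torch.randn(4, requires_grad=True)
    y = torch.sign(x).sum()   # piecewise-constant activation
    y.backward()
    print(x.grad)             # tensor([0., 0., 0., 0.]): nothing to learn from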

The direct estimation method (the straight-through estimator) uses the sign function during the forward pass and a smooth approximation of it during the backward pass to compute the gradient. A drawback of this approach is the mismatch between the forward and backward passes and the resulting weight oscillations, which lead to slow and unstable training.
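A minimal sketch of this idea in PyTorch (a common "straight-through" variant, not necessarily the exact approximation used in the paper): sign in the forward pass, a clipped identity in the backward pass.

    import torch

    class SignSTE(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)
            return torch.sign(x)               # binary values in the forward pass

        @staticmethod
        def backward(ctx, grad_output):
            (x,) = ctx.saved_tensors
            # Pretend the forward pass was the identity, but only where |x| <= 1;
            # this forward/backward mismatch is the drawback described above.
            return grad_output * (x.abs() <= 1).to(grad_output.dtype)

    binarize = SignSTE.apply                   # usage: a_bin = binarize(pre_activation)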

Self-binarizing neural networks employ a smooth approximation of the sign function (such as the hyperbolic tangent), which gradually approaches the sign function as training progresses. A limitation is the gap between the trained model and the final binary model, resulting in decreased accuracy.
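Conceptually (a minimal sketch; the scale parameter beta and its schedule are assumptions, not the paper's values), the smooth surrogate looks like this and is gradually sharpened during training:

    import torch

    def soft_sign(x, beta):
        # tanh(beta * x) approaches sign(x) as beta grows; training schedules
        # typically increase beta over epochs and swap in the hard sign at the end.
        return torch.tanh(beta * x)

The residual gap between the surrogate at the final beta and the hard sign is the source of the accuracy drop mentioned above.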

Researchers from MIPT, along with colleagues from Smart Engines, made a breakthrough by developing a new uncertainty-based quantization (UBQ) method, which addresses this issue by ensuring stable training and high quality of binary neural networks even with a limited number of parameters. This method combines the advantages of the two previously described techniques.

The key idea behind uncertainty-based quantization is the use of probabilistic activation, which takes into account the uncertainty in the values of weights and activations.

“At the core of our UBQ method is a new concept of activation uncertainty, allowing for a more accurate approximation of the binary function and, consequently, more effective training of binary neural networks,” said Anton Trusov, a graduate student at the Department of Cognitive Technologies of MIPT's School of Applied Mathematics and Computer Science.

In uncertainty-based quantization, for each weight and activation, an uncertainty value is calculated, reflecting how "confident" the network is about its sign (+1 or -1). If the uncertainty is high, a smooth approximation of the sign function is used to ensure stable training.

If the uncertainty is low, direct estimation is applied, facilitating a rapid transition to the binary representation. Additionally, to smooth the transition from training to inference, the authors propose gradually "freezing" the network's layers and replacing the standard normalization procedure with a simplified version.
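The paper's exact formulas are not reproduced here, but a toy sketch conveys the switching logic: pre-activations near zero are treated as "uncertain" and passed through the smooth surrogate, while confident ones get the hard sign with a straight-through gradient. The threshold, beta, and the uncertainty criterion below are hypothetical, not the authors' formulation.

    import torch

    def ubq_like_activation(x, threshold=0.1, beta=5.0):
        # Toy illustration of uncertainty-based switching, not the UBQ method itself.
        uncertain = x.abs() < threshold            # "low confidence" about the sign
        smooth = torch.tanh(beta * x)              # stable gradients near the boundary
        hard = torch.sign(x) + (x - x.detach())    # sign forward, identity gradient
        return torch.where(uncertain, smooth, hard)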

To test the effectiveness of uncertainty-based quantization, experiments were conducted on the widely used MNIST (handwritten digit recognition) and CIFAR-10 (image classification) datasets. Several small and large convolutional neural networks with binary layers were trained using the two previously described methods and the newly proposed one, and the results were compared by classification accuracy.

The experiments demonstrated that the new method outperforms previous methods when working with small networks and shows comparable results to the direct estimation method for larger networks. Furthermore, the uncertainty-based quantization method exhibited more stable training than the direct estimation method, as evidenced by a smaller variance in results across repeated experiments.

The uncertainty-based quantization method can be optimized for various tasks and network architectures. Future research may involve adapting the method's parameters for different tasks, utilizing dynamic weight uncertainty, and applying the method to other types of quantized networks.