
A new quantization method will ensure the stability of binary neural networks.

A team of Russian scientists from Smart Engines and the Moscow Institute of Physics and Technology (MIPT) has introduced a novel method for quantizing binary neural networks. The method makes training these networks more stable and more accurate, particularly for compact models.

The work is published in the journal Computer Optics.

Modern neural networks are widely used across many fields, from natural language processing and image generation to character recognition on mobile devices. In the rapidly evolving world of artificial intelligence, computational efficiency is a critical factor. For many applications, especially those running on low-power devices (mobile phones, embedded systems, autonomous driving systems), the speed and size of the neural network are crucial.

Binary neural networks (BNNs) are one approach to creating compact and fast networks. In these networks, weights and activations are represented by a single bit of information (–1 or 1), which significantly reduces the memory required to store the model and allows for fast bitwise operations instead of time-consuming multiplications. However, training BNNs is a complex task that has long hindered their widespread application.
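To see where the speed-up comes from, consider how a dot product between two vectors of +1/–1 values can be computed without multiplications. The sketch below illustrates the general idea only; the packing scheme and function names are assumptions for illustration, not code from the paper.

```python
# Sketch: dot product of two {-1, +1} vectors via XNOR + popcount.
# Illustrative only; the packing scheme and names are assumptions.

def pack_bits(values):
    """Pack a list of +1/-1 values into an integer bit mask (+1 -> 1, -1 -> 0)."""
    mask = 0
    for i, v in enumerate(values):
        if v == 1:
            mask |= 1 << i
    return mask

def binary_dot(a_mask, b_mask, n):
    """Dot product of two packed ±1 vectors of length n."""
    # XNOR marks the positions where the signs agree; popcount counts them.
    agree = ~(a_mask ^ b_mask) & ((1 << n) - 1)
    matches = bin(agree).count("1")
    # matches - mismatches = 2 * matches - n
    return 2 * matches - n

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(binary_dot(pack_bits(a), pack_bits(b), len(a)))  # 0, same as sum(x * y)
```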

Traditional neural network training methods are not suitable for binary neural networks. The main challenge is that the activation function (the transformation that maps inputs to binary values) is a piecewise constant sign function whose derivative is zero at every point where it is defined, so standard error backpropagation cannot be applied directly. Various approaches have been used to address this issue.
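The zero-gradient problem is easy to reproduce. In the small snippet below (the article does not name a framework; PyTorch is used purely for illustration), differentiating through the sign function returns zeros, so no error signal reaches the weights:

```python
import torch

x = torch.tensor([0.7, -1.3, 0.2], requires_grad=True)
y = torch.sign(x).sum()   # piecewise-constant binarizing activation
y.backward()
print(x.grad)             # tensor([0., 0., 0.]) -- no gradient flows back
```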

The direct estimation method uses the sign function during the forward pass and a smooth approximation of it during the backward pass to compute the gradient. Its drawback is the resulting mismatch between the forward function and the gradient used in the backward pass, which causes weight oscillations and slow, unstable training.
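In code, direct estimation (often called a straight-through estimator) can be sketched as a custom autograd function that applies sign in the forward pass and a clipped-identity gradient in the backward pass. This is a generic textbook variant, not necessarily the exact estimator used by the authors:

```python
import torch

class SignSTE(torch.autograd.Function):
    """Sign in the forward pass; approximate (clipped identity) gradient backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass the gradient through only where |x| <= 1 (hard-tanh derivative).
        return grad_output * (x.abs() <= 1).float()

x = torch.tensor([0.3, -2.0, 0.9], requires_grad=True)
SignSTE.apply(x).sum().backward()
print(x.grad)   # tensor([1., 0., 1.]) -- a gradient now reaches the weights
```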

Self-binarizing neural networks use a smooth approximation of the sign function (for example, the hyperbolic tangent) that gradually approaches the sign function as training progresses. Their drawback is the gap between the smoothly trained model and the final, fully binarized model, which reduces accuracy.
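Self-binarization can be illustrated with an activation tanh(βx) whose slope β is increased during training, so the smooth function drifts toward the sign function. The schedule below is a hypothetical example:

```python
import torch

def soft_sign(x, beta):
    """Smooth surrogate for sign(x): approaches sign as beta grows."""
    return torch.tanh(beta * x)

x = torch.tensor([0.5, -0.2, 1.5])
for beta in (1.0, 5.0, 50.0):          # hypothetical annealing schedule
    print(beta, soft_sign(x, beta))
# At beta = 50 the outputs are already close to {-1, +1}, but the deployed
# model still has to be hard-binarized, which is where the accuracy gap appears.
```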

Scientists from Smart Engines and MIPT addressed this problem by developing a new quantization method based on uncertainty, which ensures stable training and high quality of binary neural networks even with a limited number of parameters. It combines the advantages of the two approaches described above.

The key idea of uncertainty-based quantization is the use of probabilistic activation, which takes into account the uncertainty in the values of weights and activations.

“At the core of our method, UBQ, lies a new concept of activation uncertainty, allowing for a more accurate approximation of the binary function and, consequently, more efficient training of binary neural networks,” said Anton Trusov, a graduate student at the Department of Cognitive Technologies at the MIPT School of Applied Mathematics and Computer Science.

In uncertainty-based quantization, the uncertainty value is computed for each weight and activation, reflecting how "confident" the network is about its sign (+1 or –1). If the uncertainty is high, a smooth approximation of the sign function is used, ensuring stable training.

If the uncertainty is low, direct estimation is applied, facilitating a rapid transition to the binary representation. Additionally, to smooth the transition from training mode to execution mode, the authors suggest a gradual "freezing" of the network layers and replacing the standard normalization procedure with a simplified version.
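From the article's description alone, such a combination can only be sketched. In the illustration below, the uncertainty measure (how close a pre-activation is to zero), the threshold, the surrogate slope, and all names are assumptions rather than the authors' actual UBQ implementation:

```python
import torch

class HybridBinarize(torch.autograd.Function):
    """Sketch of an uncertainty-gated binarization loosely following the
    article's description. The uncertainty measure (distance of the
    pre-activation from zero), the threshold and the slope are assumptions."""

    THRESHOLD = 0.1   # hypothetical "high uncertainty" band around zero
    BETA = 5.0        # hypothetical slope of the smooth surrogate

    @staticmethod
    def forward(ctx, x):
        uncertain = x.abs() < HybridBinarize.THRESHOLD   # unsure of the sign
        ctx.save_for_backward(x, uncertain)
        # Smooth surrogate where the sign is uncertain, hard sign elsewhere.
        return torch.where(uncertain,
                           torch.tanh(HybridBinarize.BETA * x),
                           torch.sign(x))

    @staticmethod
    def backward(ctx, grad_output):
        x, uncertain = ctx.saved_tensors
        beta = HybridBinarize.BETA
        # Exact tanh gradient on uncertain inputs,
        # straight-through (clipped identity) gradient on confident ones.
        smooth_grad = beta * (1.0 - torch.tanh(beta * x) ** 2)
        ste_grad = (x.abs() <= 1).float()
        return grad_output * torch.where(uncertain, smooth_grad, ste_grad)
```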

To evaluate the effectiveness of uncertainty-based quantization, experiments were conducted on widely used datasets such as MNIST (handwritten digit recognition) and CIFAR-10 (image classification). Several small and large convolutional neural networks with binary layers were trained using the two methods described above and the newly proposed method, and the results were compared by classification accuracy.

The experiments showed that the new method outperforms previous methods when working with small networks and demonstrates comparable results with the direct estimation method for larger networks. Furthermore, the uncertainty-based quantization method exhibited more stable training than the direct estimation method, as evidenced by the lower variance in results across repeated experiments.

The uncertainty-based quantization method can be optimized for various tasks and network architectures. Further research may include adapting the method's parameters for different tasks, utilizing dynamic weight uncertainty, and applying the method to other types of quantized networks.