The article, published in the "Herald of SGUGiT," presents the best parameters for training the model to ensure maximum accuracy.
For construction and privatization tasks, regular monitoring of territories is required. Typically, this is done using traditional methods: employees visit the site and conduct a visual inspection. The process is time-consuming, and a shortage of personnel slows it further. Researchers from MIPT and Kuban State Technological University have proposed automating it.
The authors of the study cite the implementation of the "garage amnesty" law in Krasnodar as an example. Under this law, citizens can legalize their garages and acquire ownership of the land beneath them. The municipal property department is currently processing 7,000 applications, and people wait six to 16 months for document approval, although the regulation allows one month.
The process can be expedited by laser scanning (Lidar) of the area. To recognize objects, the researchers proposed using the PointNext neural network, which builds on PointNet++. This open-source model is designed to work with laser reflection point clouds and is used for segmentation, classification, and identification of three-dimensional objects.
"Usually, neural networks are used for object recognition in photos or videos, while PointNext works with laser reflection point clouds. That's why we decided to use it," explained Sergey Samarin, a graduate student at the MIPT School of Radio Engineering and Computer Technologies.
The Lidar scans the territory with laser pulses, determining the distance to an object based on the time it takes for the pulses to return. This results in a point cloud, which is then fed into the neural network.
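The time-of-flight principle described above can be illustrated with a short sketch. It assumes the pulse travels at the vacuum speed of light and ignores atmospheric and instrument corrections; the function name and the sample timing are illustrative, not taken from the study.

```python
# Speed of light in vacuum, metres per second (exact SI value).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(round_trip_seconds: float) -> float:
    """Distance to the reflecting surface for one laser pulse.

    The pulse travels to the object and back, so the one-way distance
    is half of the total path covered during the round trip.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


# Illustrative value: a pulse that returns after one microsecond
# was reflected by something roughly 150 metres away.
distance_m = range_from_round_trip(1e-6)
print(f"{distance_m:.3f} m")
```

Combining each measured range with the direction of the beam and the position of the scanner gives one three-dimensional point; millions of such points make up the cloud.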
However, to produce quality results, the network must be trained. This is done using reference datasets. In this case, the researchers utilized the Terra_Maker system developed at Kuban State University. This system generated a laser reflection point cloud of a 1000 by 1000 meter area containing more than 500 real estate objects. The total number of points exceeded 4.7 million, all categorized into five classes: land, building roofs, low vegetation, medium vegetation, and high vegetation.
Various metrics are used to assess the model's performance, primarily accuracy, which is the proportion of correct predictions. A good model's accuracy approaches 100 percent without reaching it. To get as close as possible, the parameters for training the neural network must be chosen correctly. This was the problem the authors set out to solve: they reconfigured PointNext for the task and began training.
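As a minimal sketch of the metric, accuracy for a labeled point cloud is the share of points whose predicted class matches the reference class. The class names below follow the five classes of the dataset; the short label lists are made up for illustration.

```python
def overall_accuracy(true_labels, predicted_labels):
    """Fraction of points whose predicted class equals the reference class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one labeled point is required")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


# Made-up example: five points, one of which is misclassified.
reference = ["land", "roof", "low_veg", "medium_veg", "high_veg"]
predicted = ["land", "roof", "low_veg", "high_veg", "high_veg"]
print(overall_accuracy(reference, predicted))  # 4 of 5 correct -> 0.8
```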
Twelve experiments were conducted to determine the optimal number of points in a single training sample, the grid size, and the number of epochs (an epoch is one full pass of the dataset through the algorithm). The study used the CrossEntropy loss function, the Adam optimizer, and a learning rate that is reduced in steps during training (Step Decay).
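The learning-rate schedule named here can be sketched as follows. Step Decay is commonly understood as multiplying the rate by a fixed factor every set number of epochs. The initial rate, the factor, and the step length below are hypothetical values chosen for illustration; the article does not report the ones used in the study.

```python
def step_decay_lr(initial_lr: float, epoch: int, step_size: int, gamma: float) -> float:
    """Learning rate at a given epoch under a step decay schedule.

    The rate stays constant within each block of `step_size` epochs and is
    multiplied by `gamma` each time a new block starts.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return initial_lr * gamma ** (epoch // step_size)


# Hypothetical settings: start at 0.001 and halve the rate every 10 epochs.
for epoch in (0, 9, 10, 20, 30):
    print(epoch, step_decay_lr(0.001, epoch, step_size=10, gamma=0.5))
```

Larger steps early in training let the optimizer move quickly, while the smaller rate later helps it settle near a minimum of the loss.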
The neural network's results were presented as three-dimensional plots in which each point is colored according to its class. For instance, building roofs are purple, while high vegetation is red.
The most accurate result was achieved with 2500 points in a single training sample and a grid size of 25 meters. During training, a pattern emerged: the smaller the grid side and the fewer points in the cloud, the higher the accuracy. Adding color information to the dataset decreased accuracy, though only slightly. Overall, the fewer input parameters, the more accurately the model predicts. The best accuracy achieved in the experiment was 0.9998 (99.98 percent). A result this close to one reflects the idealized dataset the neural network worked with. With a real dataset containing distortions and noise, accuracy would be lower.
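To make the two winning parameters concrete, the sketch below assumes that "grid size" means the side of a square tile into which the scanned area is cut, and that each training sample is drawn from one tile. Under that assumption, a 1000 by 1000 meter area with 25-meter tiles gives 40 × 40 = 1600 tiles, or about 2,900 points per tile on average if 4.7 million points were spread evenly. The function names and the sampling strategy are illustrative and are not taken from the study or from PointNext itself.

```python
import math
import random
from collections import defaultdict

AREA_SIDE_M = 1000   # side of the scanned square, from the article
TILE_SIDE_M = 25     # best-performing grid size, from the article
POINTS_PER_SAMPLE = 2500  # best-performing sample size, from the article

tiles_per_side = AREA_SIDE_M // TILE_SIDE_M   # 40
total_tiles = tiles_per_side ** 2             # 1600


def split_into_tiles(points, tile_side_m):
    """Group (x, y, z) points by the square grid cell that contains them."""
    tiles = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / tile_side_m), math.floor(y / tile_side_m))
        tiles[key].append((x, y, z))
    return dict(tiles)


def sample_fixed_size(tile_points, num_points, rng):
    """Return exactly `num_points` points from one tile.

    Dense tiles are subsampled without replacement; sparse tiles keep all
    their points and are padded with randomly repeated ones.
    """
    if not tile_points:
        raise ValueError("tile contains no points")
    if len(tile_points) >= num_points:
        return rng.sample(tile_points, num_points)
    missing = num_points - len(tile_points)
    return list(tile_points) + [rng.choice(tile_points) for _ in range(missing)]


# Small synthetic demo: 10,000 random points over a 100 x 100 m patch.
rng = random.Random(0)
cloud = [(rng.uniform(0, 100), rng.uniform(0, 100), rng.uniform(0, 10))
         for _ in range(10_000)]
tiles = split_into_tiles(cloud, TILE_SIDE_M)
samples = {key: sample_fixed_size(pts, POINTS_PER_SAMPLE, rng)
           for key, pts in tiles.items()}
print(total_tiles, len(tiles), {len(s) for s in samples.values()})
```

Fixing the number of points per sample lets the tiles be stacked into batches of equal shape, which is a common requirement when training point cloud networks.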
The next step for the researchers is to implement aerial laser scanning on real objects, followed by processing the data with the neural network.
"Instead of spending a whole day surveying the land plots, we launch a drone equipped with Lidar to perform the scan. We clean the data from noise and send it to the neural network. It segments and classifies the data so that we can understand where buildings, such as garages, are located on the territory," shared Sergey Samarin about their plans.
This work matters not only for implementing the "garage amnesty" law but also for identifying illegal structures and monitoring compliance with construction regulations, such as building height limits and setbacks from land boundaries.