The results are presented in a review article published in the journal Nature Communications.
“We found that the key challenges include imbalanced and unevenly distributed data, spatial autocorrelation, data biases, prediction errors, and the difficulty of assessing model uncertainty. Although these issues are well known, existing approaches often overlook them and rely on standard training and validation procedures for machine learning models,” said the lead author of the study, Diana Koldasbaeva, a graduate student at Skoltech in the “Computational Systems and Data Analysis in Science and Technology” program.
“To address these limitations, it is necessary to develop methods that take into account the unique characteristics of ecological data and spatiotemporal processes. The article presents a unified approach to these challenges, including tools and techniques that improve model accuracy, as well as recommendations for better assessing model quality. We hope that our results will help researchers in various countries choose research directions,” said co-author Alexey Zaitsev, a senior lecturer at the Skoltech Center for Artificial Intelligence.
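As an illustration of why standard validation can mislead on spatially autocorrelated data, one widely used alternative is spatially blocked cross-validation, in which nearby observations are kept in the same fold so that test points are not near-duplicates of training points. The sketch below is not taken from the article; the function names, the square grid, and the block size are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign each point to a fold according to the grid block it falls in.

    coords: list of (x, y) pairs in the same units as block_size.
    Returns a list with one fold index per point.
    """
    # Group point indices by the grid cell (block) that contains them.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (int(x // block_size), int(y // block_size))
        blocks[key].append(i)

    # Shuffle whole blocks, then deal them out to folds round-robin,
    # so points in the same block always share a fold.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    fold_of = [None] * len(coords)
    for rank, key in enumerate(keys):
        for i in blocks[key]:
            fold_of[i] = rank % n_folds
    return fold_of


def train_test_splits(fold_of, n_folds):
    """Yield (train_indices, test_indices) for each fold in turn."""
    for k in range(n_folds):
        test = [i for i, f in enumerate(fold_of) if f == k]
        train = [i for i, f in enumerate(fold_of) if f != k]
        yield train, test


# Synthetic sampling locations on a 100 x 100 area (illustrative data only).
rng = random.Random(42)
coords = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]
fold_of = spatial_block_folds(coords, block_size=25.0, n_folds=4)

for train, test in train_test_splits(fold_of, n_folds=4):
    print(f"train={len(train)} test={len(test)}")
```

In practice the block size would be chosen relative to the range of spatial autocorrelation in the data; with a purely random split, by contrast, neighboring points can land on both sides of the split and make error estimates look better than they are.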
The authors also identified key directions for the development of geospatial research that account for the specifics of ecological data, and they presented their own selection of advanced tools, resources, and projects that use geospatial technologies to address environmental issues. The researchers have made this collection freely available on GitHub and invite colleagues to use it and contribute to it.
“In the study, we identified new datasets, models, and approaches that can ensure the quality needed for applied scientific work and help solve the problem of interpreting data-driven predictions. For instance, it is crucial to create well-organized databases. Higher-quality data naturally reduces the distortions related to imbalance and autocorrelation. We anticipate the emergence of self-supervised learning for geospatial mapping in ecological research, similar to what we have already seen in language modeling and computer vision,” commented Evgeny Burnaev, director of the Center for Artificial Intelligence at Skoltech and head of the “Learnable Intelligence” research group at the AIRI Institute.