Optimized Spatial Data Collection
Balancing quantity and representativeness in constrained geospatial dataset design for machine learning applications
Overview
This project studies how to optimize spatial data collection for machine learning when resources are limited. It addresses the challenge of designing geospatial datasets that balance quantity and representativeness under budget constraints.
Key Contributions
- Developed methods for optimizing spatial sampling strategies
- Addressed tradeoffs between data quantity and geographic representativeness
- Applied these techniques to real-world machine learning applications in remote sensing
Publication
- Betti, L., Sanni, F., Sogoyou, G., Agbagla, T., Molitor, C., Carleton, T., & Rolf, E. (2025). Mapping on a Budget: Optimizing Spatial Data Collection for ML. arXiv preprint arXiv:2509.03749. (Accepted)
Presentation
- Presented at the Machine Learning for Remote Sensing Workshop (2025)
For more details or to collaborate, please contact me.