Blog

2026

Rethinking Geolocation Features in Deep Learning and Geospatial Applications

Most labeled datasets for environmental and geospatial applications, especially those that represent true “ground truth” from field measurements, are collected in fairly limited regions. Whether I am reading research papers or reviewing student proposals in my machine learning and spatial data class, I see one common mistake: adding latitude and longitude as features to machine-learning models. The trouble is that adding lat/lon does seem to improve classification, regression, or estimation performance on these limited datasets. Reported metrics go up, errors go down, and everything looks better on paper. So why do I call this a mistake? Because in many cases the model isn’t learning the underlying environmental process at all; it is memorizing location, learning a direct mapping from lat/lon to the target.
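To make the memorization point concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the covariate function, the noise level, the two regions, and the choice of a 1-nearest-neighbor model on coordinates as a stand-in for "a model that uses lat/lon". It compares a random train/test split inside the sampled region with a spatial holdout in a region the model has never seen.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_region(lon_min, lon_max, n):
    """Synthetic region: the target depends only on a covariate x, not on location."""
    lon = rng.uniform(lon_min, lon_max, n)
    lat = rng.uniform(0.0, 1.0, n)
    # Hypothetical environmental covariate that varies smoothly in space.
    x = np.sin(3.0 * lon) + np.cos(3.0 * lat)
    # The true process: y is a linear function of x plus measurement noise.
    y = 2.0 * x + rng.normal(0.0, 0.1, n)
    return np.column_stack([lon, lat]), x, y

def nn_predict(train_xy, train_y, test_xy):
    """1-nearest-neighbor regression on coordinates only (pure location lookup)."""
    dists = np.linalg.norm(test_xy[:, None, :] - train_xy[None, :, :], axis=2)
    return train_y[dists.argmin(axis=1)]

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# Region A is where the field data was collected; region B is unsampled.
xy_a, x_a, y_a = make_region(0.0, 1.0, 400)
xy_b, x_b, y_b = make_region(2.0, 3.0, 100)

# Random split inside region A: test points sit right next to training points.
idx = rng.permutation(400)
train, test = idx[:300], idx[300:]

# Covariate model: ordinary least-squares line fitted on x only.
slope, intercept = np.polyfit(x_a[train], y_a[train], 1)

results = {
    "coords_random": rmse(nn_predict(xy_a[train], y_a[train], xy_a[test]), y_a[test]),
    "coords_spatial": rmse(nn_predict(xy_a[train], y_a[train], xy_b), y_b),
    "covariate_random": rmse(slope * x_a[test] + intercept, y_a[test]),
    "covariate_spatial": rmse(slope * x_b + intercept, y_b),
}

for name, value in results.items():
    print(f"{name}: RMSE = {value:.3f}")
```

In this toy setup the coordinate-only model looks fine under the random split and degrades sharply on the held-out region, while the covariate model's error stays near the noise level in both. Real data is messier, but this is the reason to evaluate with a spatial holdout before trusting a gain from lat/lon features.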


2025

Location Encoder Earth Embeddings as Priors for Dynamic Air Quality Estimation

There has been considerable excitement, along with reasonable amounts of skepticism, around geospatial foundation models and so-called Earth embeddings. These models promise reusable representations of places learned from large, heterogeneous datasets, and they are increasingly treated as a general-purpose building block for downstream geospatial machine learning tasks. Yet one question kept resurfacing: what actually happens when static Earth embeddings are used in a highly dynamic estimation problem?


2022

Forecasting the Geographic Incidence of COVID-19: How Does Social Media Help?

Forecasting COVID-19 geographic spread is a challenging task. Reliable forecasting is crucial, as it is used in resource allocation and allows local authorities and health officials to implement timely interventions.
