The application of artificial intelligence to geospatial problems is not new. Neural networks have been applied to satellite image classification since the 1990s. Machine learning models for spatial prediction — species distribution modelling, property valuation, crop yield estimation — have been developed and deployed for over two decades. Remote sensing analysts have used automated classification algorithms throughout their careers.

What is new, and what makes the current moment qualitatively different from previous waves of AI enthusiasm in the geospatial field, is the emergence of foundation models: large-scale models trained on enormous and diverse datasets that have demonstrated generalisation capabilities well beyond what previous approaches achieved.

This article examines the ways foundation models and AI-native approaches are changing geospatial intelligence, and what this means for practitioners.

# What Makes Foundation Models Different

Traditional machine learning for spatial tasks was largely supervised: you collected labelled training data for a specific task (land cover type labels for satellite imagery, damage assessment labels for post-disaster imagery), trained a model on that data, and deployed it for that specific task in that specific domain. Collecting sufficient labelled training data was often the rate-limiting step, and models generalised poorly to out-of-distribution data.

Foundation models flip this paradigm. Models like Segment Anything Model (SAM), CLIP, and DINOv2 are pre-trained on vast and diverse datasets — hundreds of millions of images, in some cases — using self-supervised objectives that do not require manual labelling. The resulting models learn general visual representations that can be adapted to many specific tasks with minimal additional training data.

For geospatial applications, this has profound implications. A foundation model trained on general imagery can often be adapted to satellite image analysis with a small amount of labelled geospatial data — far less than was previously required. The barrier to building effective computer vision applications for satellite and aerial imagery is falling rapidly.

# Segment Anything for Geospatial Analysis

Meta’s Segment Anything Model (SAM), released in 2023, is a vision foundation model that can segment objects in arbitrary images when prompted with points, boxes, or coarse masks. Its release triggered a wave of experimentation in the remote sensing community, because it offered the prospect of segmenting satellite and aerial imagery at scale without task-specific training.

Early results were mixed: SAM, trained predominantly on natural ground-level photographs, performs less well on the overhead perspective of satellite imagery and does not natively handle the spectral characteristics of remote sensing data (multiple spectral bands, varying spatial resolutions). But fine-tuned variants — SAM-Geo, GeoSAM, SAM-CD — have demonstrated that modest amounts of geospatial fine-tuning can produce models that perform very well on satellite image segmentation tasks.

The practical applications are significant:

Building footprint extraction: Automatic extraction of building outlines from high-resolution satellite or aerial imagery. Previously this required careful manual digitisation or expensive specialised models. With SAM-derived tools, it is increasingly automated.

Agricultural parcel delineation: Identifying field boundaries in agricultural remote sensing data, which is needed for subsidy administration, yield monitoring, and precision agriculture applications.

Infrastructure mapping: Detecting and mapping roads, utility infrastructure, and other built features from overhead imagery.

Change detection: Comparing segmentations from imagery acquired at different times to identify what has changed — new construction, deforestation, flooding extent.
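As a toy illustration of the last item, change detection over two segmentations reduces to a per-pixel comparison of binary masks. The grid and masks below are made up for the example; real pipelines would operate on co-registered rasters:

```python
# Toy change detection: compare binary segmentation masks from two dates.
# 1s mark pixels labelled "built-up"; grid size and values are illustrative.

def change_map(mask_t0, mask_t1):
    """Return per-pixel change labels: 'gain', 'loss', or None."""
    changes = []
    for row0, row1 in zip(mask_t0, mask_t1):
        changes.append([
            "gain" if (a == 0 and b == 1) else
            "loss" if (a == 1 and b == 0) else None
            for a, b in zip(row0, row1)
        ])
    return changes

t0 = [[0, 0, 1],
      [0, 1, 1],
      [0, 0, 0]]
t1 = [[0, 1, 1],
      [0, 1, 0],
      [0, 0, 0]]

cm = change_map(t0, t1)
gains = sum(cell == "gain" for row in cm for cell in row)   # new construction
losses = sum(cell == "loss" for row in cm for cell in row)  # removal
```

The same gain/loss logic applies whether the masks come from manual digitisation or from a SAM-derived segmenter; what changes at scale is only the source of the masks.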

# Large Language Models and Spatial Reasoning

The integration of large language models (LLMs) into geospatial workflows is proceeding along several dimensions.

# Natural Language Spatial Querying

One of the most immediately practical applications is using LLMs to translate natural language spatial questions into SQL queries against spatial databases. Non-technical users can ask questions like “Which neighbourhoods in the city have the highest concentration of food banks within 1 kilometre of schools?” and receive a generated PostGIS query, which can be executed and the results returned.

This is a form of text-to-SQL that incorporates spatial reasoning: the LLM must understand that “within 1 kilometre” implies a spatial proximity operation, that “concentration” likely means a count or density calculation, and that the query will need to join multiple tables with a spatial join.
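For the food-bank question above, the generated query might look like the following sketch. The table and column names (`neighbourhoods`, `schools`, `food_banks`, `geom`) are assumed for illustration, not taken from a real schema; in practice the schema is supplied in the prompt:

```python
# Illustrative target output for the food-bank question. A real deployment
# would pass the actual database schema to the LLM alongside the question.

generated_sql = """
SELECT n.name,
       COUNT(DISTINCT fb.id) AS food_banks_near_schools
FROM neighbourhoods n
JOIN schools s
  ON ST_Contains(n.geom, s.geom)
JOIN food_banks fb
  ON ST_DWithin(s.geom::geography, fb.geom::geography, 1000)  -- 1 km in metres
GROUP BY n.name
ORDER BY food_banks_near_schools DESC;
"""

# The prompt pairs the natural language question with the schema:
prompt = (
    "Schema: neighbourhoods(name, geom), schools(id, geom), "
    "food_banks(id, geom). Question: Which neighbourhoods have the "
    "highest concentration of food banks within 1 km of schools? "
    "Answer with a single PostGIS SQL query."
)
```

Note the `::geography` cast: on plain geometry in a lat/lon CRS, `ST_DWithin(..., 1000)` would mean 1000 degrees, which is exactly the kind of silent error a reviewer must catch.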

For common query patterns against well-structured databases, the generated queries are impressive and largely reliable. The limitations show up with complex spatial reasoning, unusual coordinate reference systems, and queries that require spatial operations beyond the familiar ST_DWithin/ST_Intersects/ST_Contains vocabulary.

# LLM-Assisted Analysis Workflows

LLMs are increasingly useful as coding assistants for spatial analysis. Writing PostGIS SQL, GeoPandas Python, or MapLibre GL JS style specifications is work that requires familiarity with specific APIs and spatial concepts. LLMs with strong code generation capabilities can substantially accelerate this work for practitioners who know what they want to achieve but are not expert in the specific tool.

The practical workflow is: describe the analysis goal in natural language, have the LLM generate the code, review and correct it, execute. This is not a replacement for spatial expertise — you need to understand what the code is doing, whether the spatial operations are correct, and how to interpret the results — but it reduces the barrier to using unfamiliar tools.
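A concrete instance of that review step: LLM-generated proximity code sometimes computes distances in degrees rather than metres, and a practitioner has to catch it. A minimal pure-Python haversine check makes the units explicit (the coordinates and the 1 km threshold are illustrative):

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two WGS84 lon/lat points."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# The reviewer confirms the threshold is in metres, not degrees:
school = (-0.1276, 51.5074)     # illustrative lon/lat pairs
food_bank = (-0.1226, 51.5101)
within_1km = haversine_m(*school, *food_bank) <= 1000
```

In real workflows the equivalent check is choosing a projected CRS or a geography type before buffering or measuring; the point is that the human, not the LLM, owns that decision.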

# Grounding LLMs with Spatial Context

A more sophisticated integration pattern uses LLMs as reasoning engines over spatial data, with the spatial data provided as context. The LLM does not need to know the spatial data in advance; it is given relevant spatial features, statistics, or analysis results as part of its prompt and asked to reason about them.

For example: perform a spatial analysis to identify the ten locations most affected by a flood event, compute statistics about each location (population, infrastructure, economic activity), and pass these as context to an LLM that synthesises a situation report. The LLM handles the natural language synthesis; the spatial analysis tools handle the computation.
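The pattern can be sketched as follows. The statistics and the `call_llm` placeholder are illustrative, not a specific dataset or API:

```python
# Grounding pattern sketch: spatial tools compute, the LLM narrates.
# flood_stats stands in for the output of an upstream spatial analysis.

flood_stats = [
    {"location": "Riverside Ward", "population": 12_400, "clinics": 2},
    {"location": "Mill Quarter", "population": 8_100, "clinics": 1},
]

context = "\n".join(
    f"- {s['location']}: population {s['population']:,}, "
    f"{s['clinics']} clinic(s) in flooded area"
    for s in flood_stats
)

prompt = (
    "You are drafting a flood situation report. Using ONLY the figures "
    "below, summarise the most affected locations:\n" + context
)
# report = call_llm(prompt)  # placeholder: any chat completion API fits here
```

Constraining the model to the supplied figures ("using ONLY the figures below") is the key design choice: it keeps the numbers traceable to the spatial analysis rather than to the model's memory.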

# Geospatial Foundation Models

Beyond general-purpose foundation models applied to geospatial tasks, a category of purpose-built geospatial foundation models has emerged. These are models pre-trained on large datasets of spatial data — satellite imagery, GPS traces, OpenStreetMap data, climate datasets — and designed for geospatial-specific tasks.

# Satellite Image Foundation Models

IBM and NASA’s Prithvi model (a ViT-based model pre-trained on Harmonized Landsat Sentinel-2 imagery), alongside comparable efforts from Microsoft and Google, represents a new generation of models with built-in understanding of satellite imagery characteristics: spectral bands, spatial resolution, temporal sequences, and the overhead viewing geometry.

These models can be fine-tuned for specific tasks — crop type mapping, disaster damage assessment, urban change detection — with substantially less labelled data than training from scratch, and they show better generalisation to geographies and conditions not well-represented in the training data.

# Trajectory Foundation Models

Movement data — GPS traces, vessel tracks, flight paths — has a temporal sequential structure that makes it amenable to transformer-based foundation models. Pre-trained on large corpora of movement data, these models develop representations of typical movement patterns that can be fine-tuned for anomaly detection, mode of transport classification, or predictive trajectory completion.
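One common design, sketched below under illustrative assumptions (a ~1 km grid, a made-up trace), is to discretise each GPS fix into a grid cell so that a trajectory becomes a token sequence a transformer can consume:

```python
# Trajectory tokenisation sketch: map each (lon, lat) fix to a grid cell,
# collapsing consecutive repeats. Cell size and the trace are illustrative.

CELL_DEG = 0.01  # roughly 1 km at mid-latitudes

def to_tokens(trace):
    """Convert (lon, lat) fixes to grid-cell tokens, deduplicating runs."""
    tokens = []
    for lon, lat in trace:
        tok = (int(lon // CELL_DEG), int(lat // CELL_DEG))
        if not tokens or tokens[-1] != tok:
            tokens.append(tok)
    return tokens

trace = [(-0.1276, 51.5074), (-0.1270, 51.5076), (-0.1150, 51.5120)]
tokens = to_tokens(trace)
# the first two fixes fall in the same cell, so the sequence has two tokens
```

With fixes reduced to a vocabulary of cells, the usual sequence-model machinery (masking, next-token prediction) applies directly, which is what makes pre-training on large movement corpora feasible.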

# Spatial Language Models

Research prototypes trained on combinations of spatial data and text — geocoding training data, geographic encyclopaedias, spatial metadata — have demonstrated that models can develop implicit geographic knowledge: knowing that London is near Paris, that the Nile flows north, that high altitude correlates with cold temperatures. This geographic prior knowledge, when embedded in the model, improves performance on tasks that combine spatial and textual reasoning.

# Digital Twins and Spatial Simulation

A digital twin is a dynamic, continuously-updated digital representation of a physical system. At city or infrastructure scale, a digital twin integrates spatial data (3D building models, infrastructure networks, terrain), IoT sensor data (traffic, utilities, environmental monitoring), and simulation models to create a virtual replica that can be used for planning, optimisation, and prediction.

The enabling technologies for digital twins — 3D spatial data at scale, efficient cloud rendering, real-time data ingestion, simulation engines — have all matured significantly in the past five years. CesiumJS, the open source 3D geospatial visualisation library, can stream and render 3D Tiles (an OGC Community Standard for streaming large-scale 3D datasets) in a browser. Platforms such as Esri’s ArcGIS Urban and Bentley’s iTwin provide frameworks for building city-scale digital twins.

The intelligence layer of a digital twin is where AI becomes critical: models that predict traffic flow given infrastructure changes, models that simulate flood propagation through a city, models that optimise energy consumption across a building portfolio. This is applied geospatial intelligence at its most ambitious.

# What This Means for Practitioners

The AI-driven transformation of geospatial intelligence is creating new capabilities faster than most organisations can adopt them. Some practical implications:

The skill profile is shifting. The most valuable geospatial professionals in the next decade will combine spatial domain knowledge with the ability to evaluate, adapt, and deploy AI models. Pure cartography and basic GIS operations are increasingly automated; the human value-add is in defining the right questions, designing the analysis, validating results, and communicating findings.

Data quality becomes more important, not less. AI models amplify the quality of their training and input data. Poor-quality spatial data fed to a sophisticated model produces confidently wrong results. Investment in spatial data quality — accurate geometries, consistent attribution, careful provenance management — becomes more valuable as the analytical capabilities processing that data improve.

The evaluation problem is harder. When a rule-based spatial analysis produces a wrong result, the error is often traceable to a specific calculation or data quality issue. When a foundation model produces a wrong result, the cause can be much harder to identify. Rigorous evaluation frameworks — held-out test sets with realistic distribution, spatial cross-validation, uncertainty quantification — become even more important.
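Spatial cross-validation, one of the techniques mentioned above, can be sketched in pure Python: hold out whole spatial blocks rather than random points, so that test locations are geographically separated from training locations and spatial autocorrelation does not inflate the score. The block size and points below are illustrative:

```python
import random

def spatial_blocks(points, block_deg=1.0):
    """Assign each (lon, lat) point to a coarse grid block id."""
    return [(int(lon // block_deg), int(lat // block_deg)) for lon, lat in points]

def block_split(points, test_fraction=0.25, seed=0):
    """Hold out entire blocks, not random points, for spatial evaluation."""
    blocks = spatial_blocks(points)
    unique = sorted(set(blocks))
    rng = random.Random(seed)
    n_test = max(1, round(len(unique) * test_fraction))
    test_blocks = set(rng.sample(unique, n_test))
    train = [p for p, b in zip(points, blocks) if b not in test_blocks]
    test = [p for p, b in zip(points, blocks) if b in test_blocks]
    return train, test

# Three clusters of points, each falling in its own block:
points = [(lon + 0.1 * i, lat) for lon, lat in [(0, 0), (5, 5), (10, 0)]
          for i in range(4)]
train, test = block_split(points)
# no training point shares a block with any test point
```

A random split over these points would almost certainly place near-identical neighbours on both sides of the split; the block split is what makes the held-out score an honest estimate of performance at new locations.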

Ethical considerations are more acute. AI-powered geospatial intelligence can enable applications that raise serious ethical questions: mass surveillance through behavioural pattern analysis, predictive policing based on movement data, environmental displacement based on satellite-derived land use classification. Practitioners working in this field need to engage with the ethical dimensions of their work, not just the technical ones.

The trajectory of geospatial AI is clear: more capable, more accessible, and more deeply integrated into the spatial data pipelines that organisations depend on. The organisations that will extract the most value from it are those that invest in understanding both the capabilities and the limitations of these tools, and in the data quality and analytical rigour that makes AI-derived spatial intelligence trustworthy.


Related reading: Deriving Intelligence from Location Data: From Coordinates to Insight · Understanding Spatial Data Intelligence: A Modern Framework · Cloud-Orchestrated Geospatial Workflows