AI Geospatial Data Analyst
The AI Geospatial Data Analyst transforms satellite imagery, LiDAR, and sensor data into actionable intelligence using machine lea…
Skill Guide
The systematic process of identifying, acquiring, cleaning, transforming, and integrating spatially-referenced data from diverse sources into an analysis-ready format.
Scenario
Acquire and preprocess satellite imagery and vector land-use data for a selected city to calculate green space per capita.
Scenario
Build a pipeline that periodically acquires precipitation forecasts, elevation models, and river network data to generate potential inundation maps.
Scenario
Design and implement a near-real-time system to detect forest loss using multi-source satellite data, integrating alerts with enterprise GIS.
The fundamental Python/CLI toolkit for reading, writing, and manipulating virtually all geospatial data formats. GDAL is the industry backbone; the others provide more Pythonic interfaces for raster and vector operations.
Provide access to petabytes of pre-processed and raw satellite imagery via scalable APIs and cloud computing environments. Essential for large-area, time-series analysis without managing local storage.
For storage, complex querying, and large-scale distributed processing of vector and raster data. PostGIS is the standard for spatial SQL; Sedona and Dask enable distributed computation on clusters.
Answer Strategy
Demonstrate knowledge of CRS concepts, reprojection methods, and practical tool usage. Sample Answer: 'First, I would use `geopandas` to read the Shapefile and `rasterio` to open the GeoTIFF, inspecting their `.crs` attributes to confirm the mismatch. I would then reproject the vector data to match the raster's UTM CRS using `geopandas.to_crs(epsg=32633)`. For the overlay, I would vectorize the raster's footprint using `rasterio.features.shapes` or use `rasterstats` to extract values directly, ensuring all operations are performed in the same projected coordinate system.'
Answer Strategy
Tests problem-solving, understanding of remote sensing principles, and pipeline robustness. Focus on systematic diagnosis and modular design. Sample Answer: 'I would isolate the failure point by checking logs for specific errors (e.g., band mismatch, projection issues). I would then validate the new data's metadata against our schema. To adapt, I'd implement a configuration-driven preprocessing step where parameters like band order, radiometric calibration coefficients, and mosaicking logic are sourced from a config file per sensor, not hard-coded. This allows the pipeline to ingest new data types by updating configuration, not rewriting code.'
1 career found
Try a different search term.