lecture: mapchete - parallelized batch geoprocessing using Python
Processing geodata can be fairly simple until the input data reaches a certain size. Creating a hillshade or extracting contour lines from a DEM can be done quickly, but if you try this with, for example, the global SRTM dataset (1296001 x 417601 pixels), the process will crash (unless you are visiting from the future). And if additional steps are required, such as clipping the data to the 400 MB land polygon behemoths from OSM or applying custom filters, you will probably find yourself writing your own tool to chunk the data.
mapchete tries to solve this issue by letting you focus on developing your geoprocess in Python and then applying that process to the data. It does so by automatically reprojecting and chunking the input datasets into tiles (based on the “WMTS simple profile”) and running your Python process for each tile individually, in parallel on all available CPU cores.
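To illustrate the tile-and-parallelize idea described above, here is a minimal sketch in plain Python. It does not use mapchete's actual API: the function names (tile_bounds, tiles_for_bounds, run_on_tiles, example_process) are hypothetical, and the grid is assumed to be a geodetic one with 2 x 1 tiles at zoom 0, with each further zoom level halving the tile edge length. The per-tile function only computes the tile's extent in square degrees as a stand-in for real raster work.

```python
import math
import os
from concurrent.futures import ProcessPoolExecutor


def tile_size(zoom):
    """Edge length of a tile in degrees (assumed grid: 2 x 1 tiles at zoom 0)."""
    return 180.0 / 2 ** zoom


def tile_bounds(zoom, row, col):
    """Return (left, bottom, right, top) of a tile; row 0 is at the north edge."""
    size = tile_size(zoom)
    left = -180.0 + col * size
    top = 90.0 - row * size
    return (left, top - size, left + size, top)


def tiles_for_bounds(bounds, zoom):
    """List all (zoom, row, col) tiles intersecting a lon/lat bounding box."""
    left, bottom, right, top = bounds
    size = tile_size(zoom)
    max_row = 2 ** zoom - 1
    max_col = 2 * 2 ** zoom - 1
    # Clamp indices so boxes touching the grid edge stay inside the grid.
    col_min = max(0, math.floor((left + 180.0) / size))
    col_max = min(max_col, math.ceil((right + 180.0) / size) - 1)
    row_min = max(0, math.floor((90.0 - top) / size))
    row_max = min(max_row, math.ceil((90.0 - bottom) / size) - 1)
    return [
        (zoom, row, col)
        for row in range(row_min, row_max + 1)
        for col in range(col_min, col_max + 1)
    ]


def example_process(tile):
    """Hypothetical per-tile process: returns the tile area in square degrees."""
    zoom, row, col = tile
    left, bottom, right, top = tile_bounds(zoom, row, col)
    return (right - left) * (top - bottom)


def run_on_tiles(process, tiles, executor_cls=ProcessPoolExecutor):
    """Apply `process` to every tile independently, one worker per CPU core."""
    tiles = list(tiles)
    with executor_cls(max_workers=os.cpu_count()) as pool:
        results = list(pool.map(process, tiles))
    return dict(zip(tiles, results))
```

Calling run_on_tiles(example_process, tiles_for_bounds((10, 40, 20, 50), 2)) would process the two zoom-2 tiles covering that box. Because each tile is handled independently, the work spreads across processes without shared state. On platforms that start worker processes by spawning, the call should be placed under an `if __name__ == "__main__":` guard.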
mapchete offers two command line tools:
mapchete_execute runs the process on the full dataset, similar to tile pyramid seeding for map caches.
mapchete_serve hosts an OpenLayers interface and processes only the data in areas and zoom levels you are currently inspecting. This allows you to test and assess your process on the full dataset on the server, instead of clipping and downloading subsets on your laptop.
mapchete is used as the data preprocessing backbone of EOX Maps, a service which provides background maps to, for example, the European Space Agency.
Link to project: https://github.com/ungarj/mapchete