lecture: Scalability of GeoNetwork: Current Status and Future Directions

In recent times, phenomena such as the Internet of Things and the popularity of social networks, among others, have been responsible for an increased availability of sensor data and user-generated content. Ingesting, storing and analyzing these massive volumes of information is a standing challenge that can no longer be ignored. The data about this data is, generally speaking, less of a problem: trillions of sensor records, for instance, may share the same metadata record. For this reason, catalogs have been less exposed to the challenges that took the database community by storm.

Nevertheless, a large variety of datasets can also pose performance challenges to traditional catalogs and demand increased scalability. In this talk we will look at strategies for scaling GeoNetwork through load balancing and at its current limitations, and we will discuss potential improvements from adopting distributed search server technologies such as SOLR or ElasticSearch. On the database side, we will review the current database support, which is limited to relational databases accessed through ORM, and discuss the possibility of extending it to NoSQL databases, which can be scaled horizontally and could enable a new generation of metadata storage.

