Digital Geography

18. May 2015

How to organize geodata storage?

In the past I’ve learned one thing: the only way to run geodata projects efficiently is to keep a clean data structure.
In this post I’ll show you some tools and share my thoughts on organizing (geo)data for more efficient project workflows. Afterwards I would be very interested in your own solutions and approaches to smart data projects.



1. Local geodata storage: think about data types, providers and/or topics

At the beginning of a small project I usually make one decision: I determine the main criterion for storing my project files. Depending on the main project goals, I either segment the data by data type (e.g. raster datasets, vector datasets, numeric datasets), store them separated by data provider (e.g. customer datasets, governmental datasets), or split them into topic segments (e.g. background map, roads, rivers, POI datasets).
As a result I have a folder structure on my local workstation. Often, especially when projects grow, this technique runs the risk of messing up the given data structure.
So bigger projects should be organized at the database level (e.g. PostGIS) to get a good overview, data-filtering possibilities and indexing features that speed up data analyses.
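
For the folder-based approach described above, a minimal sketch in Python could look like this. It only plans where each file would go when the data is segmented by data type. The extension mapping, the folder names and the `project` root are my assumptions for illustration, not a fixed convention.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to data-type folder.
# Shapefile sidecar files (.dbf, .shx, .prj) are kept next to the .shp.
DATA_TYPE_FOLDERS = {
    ".tif": "raster",
    ".tiff": "raster",
    ".asc": "raster",
    ".shp": "vector",
    ".dbf": "vector",
    ".shx": "vector",
    ".prj": "vector",
    ".geojson": "vector",
    ".csv": "numeric",
    ".xlsx": "numeric",
}


def target_folder(filename, project_root="project"):
    """Return the folder a file would be stored in, based on its extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    # Unknown extensions end up in a catch-all folder for manual sorting.
    folder = DATA_TYPE_FOLDERS.get(suffix, "unsorted")
    return str(PurePosixPath(project_root) / folder)


def plan_layout(filenames, project_root="project"):
    """Group file names by the folder they would be stored in."""
    layout = {}
    for name in filenames:
        layout.setdefault(target_folder(name, project_root), []).append(name)
    return layout


if __name__ == "__main__":
    files = ["dem.tif", "roads.shp", "roads.dbf", "population.csv", "notes.txt"]
    for folder, names in sorted(plan_layout(files).items()):
        print(folder, "<-", ", ".join(names))
```

Sorting by provider or by topic would work the same way. Only the mapping changes, for example from a file-name prefix to a provider folder.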

2. Geodata databases

If you work on a professional level, sooner or later there is no getting around geodatabase solutions for storing your project geodata. If you’re totally new to this term, I suggest reading this short article from my dear colleague Riccardo: “Geodatabases: a little insight”. If you’re interested in setting up such a geodatabase, please try this tutorial, also from Riccardo: “OpenSource QGIS + PostGIS installation: ‘the Windows way’”.


PostGIS is “the” open source geodatabase extension for PostgreSQL


In general, geodatabases allow you to store geodata in a database format together with its geometry information, add (connect) the data directly to your desktop GIS software and process the geodata directly on the database server. The last point is a very powerful one. With SQL queries, for example, you can select exactly the data you need directly from the source and save it as intermediate datasets. Of course you can also run really smart requests and get final results straight out of the database.
From my point of view, working at the geodatabase level has many advantages. The biggest disadvantages are the steep learning curve for handling database requests and the database configuration needed in advance.
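
To give a feeling for such a database request, here is a small Python sketch. It builds a PostGIS statement that selects all features inside a bounding box and saves them as an intermediate table. `ST_Intersects` and `ST_MakeEnvelope` are PostGIS functions. The table names, the `geom` column and the helper `build_extract_query` are hypothetical and only serve as an example.

```python
import re

# Table and column names cannot be passed as query parameters,
# so they are checked against a conservative pattern instead.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_extract_query(source_table, target_table, bbox, srid=4326, geom_column="geom"):
    """Build a statement that copies all features intersecting bbox into a new table.

    bbox is (xmin, ymin, xmax, ymax) in the coordinate system given by srid,
    which is assumed to match the SRID of the geometry column.
    Returns (sql, params) for a database driver that uses %s placeholders.
    """
    for name in (source_table, target_table, geom_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = (
        f"CREATE TABLE {target_table} AS "
        f"SELECT * FROM {source_table} "
        f"WHERE ST_Intersects({geom_column}, ST_MakeEnvelope(%s, %s, %s, %s, %s))"
    )
    xmin, ymin, xmax, ymax = bbox
    return sql, [xmin, ymin, xmax, ymax, srid]


if __name__ == "__main__":
    # Hypothetical example: cut a study area out of a "roads" table.
    sql, params = build_extract_query("roads", "roads_study_area", (11.0, 50.0, 12.0, 51.0))
    print(sql)
    print(params)
```

With a driver such as psycopg2 the result could be passed on as `cursor.execute(sql, params)`. I have left the database connection out here, so the sketch runs without a server.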

Here you are:
PostgreSQL with PostGIS
ESRI GIS Tools for Hadoop

3. Geodata clouds

After local storage and dedicated databases, cloud storage is the third possibility.

Online services like GeoCloud2 or QGIS Cloud allow you to upload spatial data, manage and edit it online and combine it with fancy online techniques like JavaScript magic, the speed enhancement of a CDN (content delivery network) and so on.

This could be a good way to work together with people who haven’t got any GIS skills but are familiar with data manipulation in web interfaces.

Here you are:
QGIS Cloud
MapCentia GeoCloud2

4. So what?

  • What do you think about these three possibilities of storing geodata?
  • What’s your “best case” solution?
  • Which tools do you use for organizing geodata in projects?
  • How do you deal with participants without GIS-skills?

I’m very interested in your comments!