3. March 2016
“Data Lake” has become a popular term in the Big Data community, where it refers to a large storage repository and processing engine. However, there is now a technology from NOAA (the National Oceanic and Atmospheric Administration of the USA) that turns its existing distributed data network, holding petabytes of Open Data, into what can be described as a Data Ocean! This technology is called ERDDAP, and it provides fixed entry points on the Internet from which data can be searched, queried and transformed. This functionality is made available via a human interface (a web site) and RESTful web services.
What is ERDDAP?
ERDDAP, to quote NOAA, “is a data server that gives you a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps… Our focus is on making it easier for you to get scientific data”. It is Open Source and is used by other agencies and nations, not just NOAA.
So what types of data are available? Currently, much of the available data comes from the marine and atmospheric science domains. For example:
- NOAA/NCEP Global Forecast System (GFS) Atmospheric Model. The GFS numerical weather prediction model for the globe at approximately 50-km or 0.5-deg resolution.
- Real Time Ocean Forecast System (RTOFS). A global operational forecast model run by NOAA. This model includes sea ice predictions.
- Observations from satellites, sensors, radar, animal tracking and many more…
There are a number of ERDDAP servers in existence, some with thousands of data sets and others with only a few. For example:
- NOAA Global Earth Observation ERDDAP server has over 6,000 data sets in its catalog.
- The BlueHub ERDDAP Server of the Maritime Affairs Unit of the EU Commission’s Joint Research Centre contains over 100 data sets, collated from other ERDDAP servers and brought together under the theme of Maritime Research.
- The Marine Institute of Ireland ERDDAP server has around 20 data sets.
In this short introductory article we are going to look at a simple example of how to get data from a weather station located on an ocean buoy via the ERDDAP human interface. In a later article we will examine the RESTful Web API. To keep things simple we will use the ERDDAP server mentioned above that is hosted by the Marine Institute.
What is the Wave Height in the Irish Sea?
So let us jump straight in and get some data! A simple example is to look at the data feed coming from a weather buoy (M2) in the Irish Sea, approximately 20 nautical miles (37 km) east of Dublin.
Incidentally, the above screen grab comes from the Marine Institute web site Digital Ocean, which combines data feeds from ERDDAP and a number of other sources to give you an idea of what is happening in the seas around Ireland.
Some of the data used in Digital Ocean, and the M2 ocean buoy data shown above, can be found on the following ERDDAP server:
If you click on the above link you will be brought to the data catalog. Below is a screenshot of the data catalog, in which I have highlighted the link you need to follow to graph the data from the weather buoy.
If you then click on the graph link you will be brought to a web page that you can use to query the data resource. In the screenshot of the query page (below) I have highlighted in green the parameters that were changed to generate a graph of wave height data recorded on the weather buoy.
Once you are happy with your query you can click the “Download the Data or an Image” button, which in this case will produce the following image.
Converting the Data
Now, we could just as easily have downloaded the data by changing the file type via the drop-down menu on the query form. You could also copy the URL to use as a data resource in a coding project. For example, here is the URL for the above graph:
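ERDDAP request URLs for tabular data follow a regular pattern: the server address, then `tabledap`, the dataset ID, a file-type extension, and a query listing the variables and constraints. The following is a minimal Python sketch of how such a URL might be assembled in a coding project. The server address, the dataset ID `exampleBuoys` and the variable names `WaveHeight` and `station_id` are placeholders chosen for illustration, not values taken from the Marine Institute server, and `tabledap_url` is a hypothetical helper rather than part of ERDDAP.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, file_type, variables,
                 constraints=(), graph_options=()):
    """Assemble an ERDDAP tabledap request URL (no network access)."""
    # Requested variables are a comma-separated list at the start of the query.
    query = ",".join(variables)
    # Each constraint or graph option is appended with '&'. Characters such
    # as '>' , '"' and ':' are percent-encoded; '=' is left as-is.
    for part in list(constraints) + list(graph_options):
        query += "&" + quote(part, safe="=")
    return f"{server}/tabledap/{dataset_id}.{file_type}?{query}"


# Placeholder values: substitute a real server, dataset ID and variables.
SERVER = "https://example.org/erddap"
DATASET = "exampleBuoys"
VARIABLES = ["time", "WaveHeight"]
CONSTRAINTS = ['station_id="M2"', "time>=2016-03-01T00:00:00Z"]

# Same query, two file types: CSV for the data, PNG for a line graph.
csv_url = tabledap_url(SERVER, DATASET, "csv", VARIABLES, CONSTRAINTS)
png_url = tabledap_url(SERVER, DATASET, "png", VARIABLES, CONSTRAINTS,
                       graph_options=[".draw=lines"])

print(csv_url)
print(png_url)
```

Changing only the extension (`csv`, `json`, `png` and so on) is the code equivalent of changing the file type in the drop-down menu: the query stays the same and the server returns the same subset in a different format.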
So we have just completed a brief introduction to ERDDAP, which is really just the start of a journey exploring Open Data from the Earth Science community. Without ERDDAP this would be a struggle, requiring us to trawl the web looking for data, download what we find and process it locally, and finally grapple with some very complicated data formats.
In my next article we will look at how to use ERDDAP to access the data from NOAA’s global weather forecast model (GFS) so that we can produce layers similar to the one shown below. In this example, we see the very rare event of three hurricanes in the Pacific, which occurred in 2015, as seen in an OpenLayers WebGIS served by GeoServer, PostGIS and, of course, ERDDAP.