Earthquake data with the Elastic Stack

INTRO

The Elastic Stack is fantastic for analysing time-based data, most of which we currently see coming from operational requirements around things like system load, memory or disk use, network flows, application error rates and so on.

However, the awesome thing about the stack is that it isn't tied to time-based data consisting of logs or metrics. Here we'll take a look at earthquake data from the Northern California Earthquake Data Center, via the ANSS Composite Catalog Search. This data set includes naturally occurring quakes, as well as man-made ones from nuclear blasts and quarry explosions.

You may not be shaking in your boots with excitement yet, but we'll show a few neat tricks around leveraging custom maps, Timelion visualisations and other tips to increase usability for your users and to give things a bit of extra shine.

FIRST STEPS

To get this dataset into Elasticsearch we built a very simple Logstash configuration with the CSV, mutate and date filters. We've also created mappings for the fields in the data and converted those into a template. If you want to play along at home, head over to this GitHub repo and either clone it or download the files, then follow the instructions to get the data loaded. The dashboards are all provided, but we'll dig into the bits that make this an interesting use case throughout this post, so don't go anywhere!
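
For a feel of what that pipeline does before you open the repo, here is a rough Python sketch of the same three steps: parse the CSV, convert the types, parse the date. It is not the Logstash configuration from the repo, and the column names, timestamp format and sample row are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column names and timestamp format; the real catalogue export may differ.
COLUMNS = ["datetime", "latitude", "longitude", "depth", "mag"]
TIME_FORMAT = "%Y/%m/%d %H:%M:%S.%f"


def parse_row(line):
    """Turn one CSV line into a document, roughly as csv + mutate + date would."""
    # "csv" step: split the line and name the columns.
    values = next(csv.reader(io.StringIO(line)))
    doc = dict(zip(COLUMNS, values))
    # "mutate" step: convert numeric strings to floats.
    for field in ("latitude", "longitude", "depth", "mag"):
        doc[field] = float(doc[field])
    # "date" step: parse the event time and store it as an ISO 8601 UTC timestamp.
    when = datetime.strptime(doc.pop("datetime"), TIME_FORMAT)
    doc["@timestamp"] = when.replace(tzinfo=timezone.utc).isoformat()
    return doc


# Made-up sample row, not taken from the dataset.
doc = parse_row("2000/01/02 03:04:05.00,36.5,140.25,10.0,4.5")
print(doc)
```

Storing the parsed time in @timestamp mirrors the Logstash date filter's default target field, which is what lets Kibana treat the events as time-based data.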

VISUALISING WITH YOUR EYEBALLS

Dashboard

When you open the main Earthquake dashboard (as seen above), you can see we've broken down the data into different groupings, which are also separate dashboards themselves. These are based on Hot Areas around the world, Catalogues (or archives), and finally a few specific Quarries and Nuclear Blasts.

This is a basic markdown visualisation, which is a handy way to introduce your data to your users. Rather than letting them click around and figure out what the data is all about, you can help frame their discovery process by creating dashboards that link to specific subsets of your data. It's also a great way to introduce concepts such as filters and the ability to build multiple dashboards with different visualisations for different audiences, all based on the one dataset.

One tip we also really like in markdown boxes like this is having a simple "return to home" or "reset time frame" link that can help if someone gets lost or makes changes they didn't want. This helps with time-based data, as you can link to the dashboard with the last 5/10/100 minutes/months/years preselected, so your users don't have to jump into the timepicker. We've called our reset link "Go back to the world map", and it's at the bottom of the markdown box.

Navigation - Go back to the world map

Next to that we have a heat map of all the events. It is populated thanks to the latitude and longitude in the dataset, which were converted to a single location field and then mapped to a geo_point using the Elasticsearch template. Note that this is a standard Kibana heatmap; we'll touch on some custom map goodness later.
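
The latitude and longitude conversion can be sketched in a few lines of Python. This is only an illustration of the idea, not the template from the repo: the input field names latitude and longitude are assumptions, and the mapping is trimmed down to the one field that matters here.

```python
import json

# Mapping fragment telling Elasticsearch to treat "location" as a geo_point.
# Trimmed down for illustration; a real template carries more fields and settings.
LOCATION_MAPPING = {"properties": {"location": {"type": "geo_point"}}}


def add_location(doc):
    """Return a copy of doc with latitude/longitude merged into one "lat,lon" string."""
    out = dict(doc)
    # A geo_point accepts a "lat,lon" string, with latitude first.
    out["location"] = f"{out.pop('latitude')},{out.pop('longitude')}"
    return out


# Made-up event, not taken from the dataset.
event = add_location({"latitude": 36.5, "longitude": 140.25, "mag": 4.5})
print(json.dumps(event))
print(json.dumps(LOCATION_MAPPING))
```

Without the geo_point mapping the location would be indexed as an ordinary string, and Kibana would not offer it for map visualisations.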

Then we have a few high-level metrics, a histogram of all the magnitudes for the given time period, a breakdown of quake type (natural or man-made) and then the source code (i.e. the network operator code for the detectors) of each of these.

Finally, we have a bar chart showing a date histogram of all the events for each month, with a count of each magnitude. Below that is a Timelion plot showing the average depth of the quakes, broken down per week as dots, with a moving average of those depths over the past 3 months as a line graph.
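
To make the moving average concrete, here is a minimal Python sketch of a trailing moving average over weekly values. It is an illustration of the idea rather than Timelion's implementation, which may treat the first few points differently, and the depths are made up. With weekly buckets, a 3 month window would be roughly 13 points; a window of 3 is used here to keep the arithmetic easy to follow.

```python
def moving_average(values, window):
    """Trailing moving average; each point averages up to `window` most recent values."""
    averaged = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# Made-up weekly average depths in km, purely for illustration.
weekly_depth = [10.0, 20.0, 30.0, 40.0]
smoothed = moving_average(weekly_depth, window=3)
print(smoothed)  # [10.0, 15.0, 20.0, 30.0]
```

The dots on the chart correspond to weekly_depth and the line to smoothed: the line reacts more slowly, which makes longer trends easier to see.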

Let's look at all of this in more detail.

SECOND STEPS

Now we will open up the Earthquake - Japan dashboard. You can do that by clicking Japan under the Hot Areas links, or by finding the dashboard using the Load Saved Dashboard button on the top right of the toolbar.

Hot Areas - Japan

The first thing we notice is that a saved filter has been applied, the little green box labelled "Japan Territorial Water". Interestingly, this is actually a geo_polygon query, which isn't currently exposed natively in Kibana. Luckily for us, Kibana does allow us to specify custom queries to pass to Elasticsearch and have them applied in our dashboard. If you mouse over the box and click the Edit button, the one that looks like a pencil and is the furthest icon on the right of the filter, you can see how this has been set up as a geo_polygon filter, with the points being an area that contains Japan's territorial waters.
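
The shape of such a filter can be sketched in Python as below. The geo_polygon structure follows the Elasticsearch query DSL, but the coordinates are a hypothetical rough box, not the polygon used by the saved filter, and the inside helper is a simple planar ray-casting check added only to show what the filter conceptually does. Elasticsearch's own geometry handling is more involved.

```python
def geo_polygon_filter(field, points):
    """Build a geo_polygon filter body from (lat, lon) pairs."""
    return {
        "geo_polygon": {
            field: {"points": [{"lat": lat, "lon": lon} for lat, lon in points]}
        }
    }


def inside(point, polygon):
    """Planar ray-casting test: is the (lat, lon) point inside the polygon?"""
    lat, lon = point
    hit = False
    for i in range(len(polygon)):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[i - 1]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            crossing = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            # Count crossings that lie to the east of the point.
            if lon < crossing:
                hit = not hit
    return hit


# Hypothetical rough box; NOT the polygon used by the saved filter.
BOX = [(24.0, 122.0), (24.0, 146.0), (46.0, 146.0), (46.0, 122.0)]
japan_filter = geo_polygon_filter("location", BOX)
print(japan_filter)
print(inside((35.7, 139.7), BOX), inside((51.5, -0.1), BOX))
```

A point near Tokyo falls inside the box and a point near London does not, which is exactly the kind of decision the filter makes for every event on the dashboard.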

As we are only looking at data points within the general Japan region, as defined by our polygon, the other visualisations also change the values that they report. This is one of the great features of Kibana, as it automatically puts all of the data we are looking at in the same frame of reference. If you compare the average depth of quakes in Japan to that of all events worldwide from the previous dashboard, it is nearly 300% more than the worldwide average, and the quakes that strike the region are also usually a magnitude 4.

People sometimes think Kibana visualizations are static, but most of them are capable of running queries and aggregations dynamically on the dashboard. Let's zoom in on the period from around 2010 to the end of 2011.

Change the time window from the chart

What is also obvious here is a spike in quakes from December 2010 through to July 2011; as we know, this is when the Tōhoku earthquake and tsunami occurred. Moving on to the Timelion visualisation, the average depth of the quakes around this time is also very consistent and relatively shallow, which is why the event was so devastating.

Charts of Timelion

The last few visualisations, which are also from Timelion, show a number of interesting statistics that are taken dynamically from the World Bank API. This is another neat feature that Timelion has and you can read more on it here. By mousing over any of the Timelion graphs we can see a line marking the point in time we are hovering over. This line is carried over to all other Timelion visualisations on the dashboard, which lets us quickly draw correlations between the data.

After the financial crisis of 2008, Japan's GDP growth was recovering, but that trend was broken and direct investment finally turned negative in 2011. In contrast, the Japanese yen was strong at that time.

DOWN AND DIRTY WITH MAPS

Those with sharp eyes may have noticed that the maps in the Earthquake - Japan dashboard are different to those in the main Earthquake dashboard. Here we have leveraged the ability for Kibana to import WMS-compliant maps from other sources and taken something from GEBCO.

We have used this map server because, when we display the heatmap on top of the supplied bathymetric chart, you can very easily see the connection between the earthquakes and oceanic trenches.

The dashboard can be used not only for showing statistics and exploring the data, but also for recording events that happened in the past. In the "Catalogues" section of the dashboard, we have included some historical tragedies caused by earthquakes. Click on Loma Prieta, 1989 under "Catalogues". According to Wikipedia, "The 1989 Loma Prieta earthquake occurred in Northern California on October 17 at 5:04 p.m. local time. The shock was centered in The Forest of Nisene Marks State Park approximately 10 mi (16 km) northeast of Santa Cruz on a section of the San Andreas Fault System and was named for the nearby Loma Prieta Peak in the Santa Cruz Mountains." Let's locate where it happened. We know the maximum magnitude of the earthquake was about 7, so type mag:[7 TO *] into the query area of Kibana. Now a blue dot appears in the center of the map, which is the epicenter of the biggest earthquake.
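
The query string mag:[7 TO *] is an inclusive range with no upper bound. As a rough illustration, the equivalent range clause in the Elasticsearch query DSL can be written as a Python dictionary. The matches helper and the sample magnitudes below are made up, and exist only to show the semantics locally.

```python
def magnitude_at_least(minimum):
    """Query DSL range clause matching mag >= minimum, like mag:[minimum TO *]."""
    return {"range": {"mag": {"gte": minimum}}}


def matches(doc, clause):
    """Tiny local evaluator for the clause above, just to show its semantics."""
    (field, bounds), = clause["range"].items()
    return doc[field] >= bounds["gte"]


# Made-up magnitudes for illustration only.
events = [{"mag": 3.2}, {"mag": 6.9}, {"mag": 7.0}]
clause = magnitude_at_least(7)
strong = [e for e in events if matches(e, clause)]
print(clause)
print(strong)  # [{'mag': 7.0}]
```

Because the square bracket makes the lower bound inclusive, an event of exactly magnitude 7 is kept, while one of 6.9 is filtered out.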

Loma Prieta

We sometimes find something unexpected while exploring public datasets. When we were looking at man-made quakes, typically quarries and blasts, in the state of Nevada, we found some hot spots to the north west of Las Vegas that could be nuclear bomb testing sites. Events there were quite frequent from the 60's to the 80's.
We also found two blue dots in the city of Henderson, which shouldn't be nuclear explosions. According to the timeline, those blasts happened on May 4th 1988. When we searched "henderson explosion may 4 1988" on Google, it seemed that they are evidence of the PEPCON disaster. Two of those explosions were huge, equivalent to magnitude 3 earthquakes.

Nevada State

OUTRO

Hopefully we have given you a few good tips for getting the most from Kibana, whether it's as simple as adding a markdown box with some intro text and links to other dashboards to help guide your users, integrating Timelion visualisations showing correlation of data from external sources against the data in Elasticsearch, or getting really advanced and importing your own maps.

As a reminder, you can find the dataset, Logstash configs, Elasticsearch mappings and Kibana dashboards at this Examples GitHub Repo - so please check it out, see how the methods we have talked about in the blog post were applied, and leverage it to make something awesome of your own.

The really cool part about all of what we have looked at here: if you're someone like the United States Geological Survey (USGS), you can ingest tweets in real time to detect when earthquakes have hit regions that may not have adequate detection systems, and then build insightful dashboards to visualise that data! To see more of how that works for them, watch their fantastic talk from Elastic{ON} 2015 here.

If you have further questions or comments about any of this, please head over to our Discuss forums, or hit us up on Twitter via @elastic.