Digital Geography

6. July 2015

Using Jupyter for data analysis

Data analysis has become essential in modern computing as organizations try to make sense of the data accumulated in systems around the world. Extracting useful information from that data is now a major focus for most organizations, since it increasingly determines how long they survive in the business world.

For some time, a number of platforms have allowed users and organizations to analyze big data and understand the trends in their workplaces and projects. R, a renowned language for data analysis, covers most user needs, both through its functions for analyzing data and through integration with other platforms that complement the analysis.

IPython Notebook has also been widely used for data analysis, and its user base has grown over the past few years. Recently, the IPython Notebook has transitioned into Jupyter (JUlia, PYThon, R) to accommodate a larger number of languages and uses. Jupyter comes with an interactive Python kernel that lets you write, edit and execute code all under one roof.

A few days ago, I attended the “Workshop on Data Science in Africa, 2015”, where interactive sessions were carried out using the Jupyter platform. The functionality built into the platform is great and robust. I had a chance to explore the various tools included in Jupyter and damn… it was really cool. For a few weeks now, I’ve had a chance to dig much deeper into Jupyter and write code to perform analysis on data. It really works magic.

This is how you go about setting up and using Jupyter (an overview):

  • First, you need to install all the dependencies required to run the Jupyter project. This sounds like a lot of work, right? To avoid the hassle, you can install Anaconda (a collection of powerful packages for Python that enables large-scale data management, analysis, and visualization for business intelligence, machine learning, scientific analysis and engineering).
  • To install Anaconda, follow the guide at the Anaconda site.
  • After installation, the Jupyter project can be run using the ipython notebook command on the terminal (newer installations use jupyter notebook instead).
  • This will launch the Jupyter project in the computer's default browser.
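If you want to confirm that the installation went through before launching anything, a quick check from Python can help. The snippet below is only a minimal sketch: the package list is my own assumption about what a typical data analysis setup needs, not an official list of Jupyter dependencies, so adjust it to your case.

```python
import importlib.util

# Hypothetical list of packages to look for; not an official dependency list.
PACKAGES = ["IPython", "notebook", "numpy", "pandas", "matplotlib"]


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one line per package so missing ones are easy to spot.
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:<12} {'found' if found else 'missing'}")
```

Anything reported as missing can then be installed before you start the notebook.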

Jupyter homepage view

The Jupyter project lets you create notebooks, which provide an interactive platform for executing Python, R and other code. The platform also provides useful functions for analysis, such as plotting graphs as you code.
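To give a feel for what a notebook cell might look like, here is a small sketch that summarizes some values and prepares a bar chart. The rainfall numbers and the names summarize and plot_bars are made up for illustration, and the plotting part assumes matplotlib is available (it ships with Anaconda); if it is not, the function simply returns None.

```python
import statistics

# Made-up sample values, used only for illustration.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
rainfall_mm = [12, 30, 45, 80, 60, 22]


def summarize(values):
    """Return a few basic summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def plot_bars(labels, values):
    """Draw a bar chart and return the figure, or None if matplotlib is missing."""
    try:
        import matplotlib.pyplot as plt
    except ImportError:
        return None
    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_xlabel("Month")
    ax.set_ylabel("Rainfall (mm)")
    ax.set_title("Monthly rainfall (sample data)")
    return fig


summary = summarize(rainfall_mm)
print(summary)
```

In a notebook you would typically run %matplotlib inline in an earlier cell and then call plot_bars(months, rainfall_mm) so that the chart shows up right below the code.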

Writing code in Jupyter

All the code written in this project is auto-saved as notebook files in the directory from which the notebook server was started, for later review. Brilliant! So far, I think the power of the platform has been of great help to data miners, data analysts and other data users interested in deriving useful information from their data.
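The saved notebooks are .ipynb files, which are plain JSON, so they can be reviewed later even outside the browser. The sketch below assumes the version 4 notebook layout (a top-level "cells" list where each cell has a "cell_type" and a "source"). The helper name code_cells and the tiny demo notebook are hypothetical and exist only to make the example runnable.

```python
import json
import os
import tempfile


def code_cells(path):
    """Return the source of every code cell in a notebook file as a list of strings."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            src = cell.get("source", "")
            # The source is stored either as one string or as a list of lines.
            sources.append("".join(src) if isinstance(src, list) else src)
    return sources


# A tiny hand-written notebook, for illustration only.
demo_notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["x = 1\n", "print(x)"],
        },
    ],
}

# Write the demo notebook to a temporary folder and read the code cells back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.ipynb")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(demo_notebook, f)
    extracted = code_cells(demo_path)

print(extracted)
```

Pointing code_cells at one of your own notebook files should list the code you wrote in it, cell by cell.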

Accessing parts of Jupyter project

The screenshots above are just a few examples of what I have in my Jupyter project.

Actually, this can be done in any browser at the moment. From my side, this is a journey geared towards fulfilling the need for data analysis in the GIS world. More and more work is being undertaken to meet the targets for data analysis around the world. We are all on the move, searching for the unknowns in this sector.