ELK+R Stack

Elasticsearch is a search engine based on the Lucene library. It provides a distributed full-text search engine with an HTTP web interface and schema-free JSON documents.

Elasticsearch is becoming a major player in document search in the NoSQL space and is going through a phase of rapid development (six versions in a few years and a fast-growing community).

I installed Elasticsearch 6.5.4 and Kibana (the matching version, 6.5.4) on my Mac, which already had R 3.5 installed.

First impression: it took me a few hours to get everything properly installed on my machine. Working with these technologies requires some basic hacking skills.

To install both Elasticsearch and Kibana I followed the instructions on the Elastic website.

I won't spend much time here on installation issues, since they depend heavily on the operating system and on personal skills. The documentation and online forums will help you if you run into problems.

After installation, Kibana was alive and kicking at http://localhost:5601/app/kibana

So, the time has come to feed Elasticsearch some data.

I immediately thought of the NYC flights dataset available in the nycflights13 package, which contains on-time data for all flights that departed NYC (i.e. JFK, LGA or EWR) in 2013.

library(nycflights13)
data(flights)
flights

I installed the elastic package from CRAN.

The connect command establishes the connection with my local Elasticsearch instance:

library(elastic)
connect()
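
For readers on a more recent release of the elastic package, the call may look slightly different. As far as I can tell, from version 1.0.0 onwards connect() returns a connection object instead of storing the settings globally, and that object is then passed to the other functions. The following is only a minimal sketch under that assumption; the host and port are the Elasticsearch defaults and are assumptions too, not values taken from the setup described here.

```r
library(elastic)

# Assumption: elastic >= 1.0.0, where connect() returns a connection object.
# Assumption: Elasticsearch listens on its default local address and port.
conn <- connect(host = "127.0.0.1", port = 9200)

# Ask the cluster for its basic information (name, version, ...).
# Wrapped in tryCatch so that a stopped server produces a readable message
# instead of aborting the script.
tryCatch(
  print(conn$ping()),
  error = function(e) message("Elasticsearch not reachable: ", conditionMessage(e))
)
```

With such a connection object, later calls take it as their first argument, for example docs_bulk(conn, flights, index = "flights_nyc_2013_idx").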

Then I sent the data frame to Elasticsearch with the simple bulk command:

docs_bulk(flights, index = "flights_nyc_2013_idx")

The index argument provides the name of the index to use. It is required for data.frame input and optional for file input.
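
A quick sanity check after a bulk load is to compare the number of documents in the index with the number of rows in the data frame. The sketch below is one possible way to do it, not necessarily what was done for this post. It assumes elastic 1.0.0 or later (where the connection object is passed explicitly) and a local Elasticsearch on the default port.

```r
library(elastic)
library(nycflights13)

# Assumption: elastic >= 1.0.0, where connect() returns a connection object
# that is passed as the first argument to the other functions.
conn <- connect()
idx <- "flights_nyc_2013_idx"

# Count the documents in the index. If the server is down or the index
# does not exist, report the problem and fall back to NA.
n_indexed <- tryCatch(
  count(conn, index = idx),
  error = function(e) {
    message("Could not count documents: ", conditionMessage(e))
    NA_integer_
  }
)

message("Documents indexed: ", n_indexed, " / rows in flights: ", nrow(flights))
```

If the two numbers differ right after loading, it may simply be that Elasticsearch has not refreshed the index yet, so it is worth repeating the count after a few seconds.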

When I opened Kibana, everything was OK and ready to play with.

[Screenshot, 2019-01-24 at 16.09.31]

Then I started working in Kibana to build a dashboard that gives useful insights from the data. Not surprisingly, June, July and December were the months at greatest risk of delayed arrivals. Visualizations and dashboards can then be embedded in websites through an iframe.
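
The monthly pattern can also be cross-checked directly in R, without going through Kibana. The sketch below uses the mean arrival delay per month as the metric. That choice is an assumption made for illustration, since the dashboard may well use a different measure (for example, the share of delayed flights).

```r
library(nycflights13)

# Mean arrival delay (in minutes) per month.
# With the formula interface, aggregate() drops rows where arr_delay
# is missing before computing the mean.
delay_by_month <- aggregate(arr_delay ~ month, data = flights, FUN = mean)

# Sort from the month with the largest mean delay to the smallest.
delay_by_month[order(-delay_by_month$arr_delay), ]
```

The months at the top of the sorted table are the ones to compare against what the dashboard shows.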

[Screenshot, 2019-01-24 at 15.50.16]
