A year ago I had the opportunity to test a very nice Kibana feature - visualizing events on a BetterMap that had been indexed using Pig. I was also able to verify that geo queries work.
I will try to explain step by step how to achieve that.
Where is the challenge?
My data has to be inserted into Elastic Search in GeoJSON format. A GeoJSON point is a two-element array [longitude, latitude]. The order is important, and it is different from the other Elastic Search geo formats. Elastic Search has the capability to guess types and create or extend a mapping at runtime, but geo_point isn't one of the types it can discover on its own.
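For illustration, here is the same point (the first row of the dataset below) in the two notations:
lat,lon string:  "location": "40.5833333,-4.1166667"
GeoJSON array:   "location": [-4.1166667, 40.5833333]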
Creating Index and Defining Type
First I have to create an empty index with default settings:
POST /zagwozdka/
{}
Then I can create a mapping for my new type; I call it events. My dataset row consists of a timestamp, an event string and a location - I want to index all the properties.
POST /zagwozdka/_mapping/events
{
  "events": {
    "properties": {
      "event_datetime": {
        "type": "date"
      },
      "event": {
        "type": "string"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}
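Before running the Pig job, the mapping can be sanity-checked by indexing one document by hand (this step is optional; the values are taken from the first row of the dataset below and the document id is auto-generated):
POST /zagwozdka/events
{
  "event_datetime": "2009-01-24T12:00:00.000Z",
  "event": "Travis",
  "location": [-4.1166667, 40.5833333]
}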
Dataset
My dataset is a CSV file with the following columns:
- datetime (yyyy-MM-dd HH:mm:ss)
- event (chararray)
- latitude (double)
- longitude (double)
Datetime                | Event   | Latitude    | Longitude
2009-01-24 12:00:00.000 | Travis  | 40.5833333  | -4.1166667
2009-01-28 11:19:00.000 | Diamond | 37.65361    | -101.19056
2009-01-07 17:48:00.000 | Stefan  | 32.51722    | -80.07583
2009-01-23 12:42:00.000 | Watson  | -31.6333333 | 150.3333333
2009-01-07 19:48:00.000 | Andrew  | -32.8833333 | 152.2166667
2009-01-26 11:19:00.000 | John    | 61.9666667  | 24.6666667
2009-01-05 13:23:00.000 | Greg    | 44.2166667  | 15.3666667
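On disk the file is assumed to be semicolon-delimited (matching the PigStorage(';') separator used below), so the first two rows would look like this:
2009-01-24 12:00:00.000;Travis;40.5833333;-4.1166667
2009-01-28 11:19:00.000;Diamond;37.65361;-101.19056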
Loading a File
First I need to load the file using the standard PigStorage loader.
events = LOAD 'demo.csv' using PigStorage(';')
AS (event_datetime:chararray, event:chararray, latitude:double, longitude:double);
Transforming into Proper Types
With Pig I can transform each record into a datetime object, the event, and a location tuple (longitude, latitude). The last tuple will later be interpreted as a geo point in GeoJSON format.
transformed_events = FOREACH events GENERATE
ToDate(event_datetime,'yyyy-MM-dd HH:mm:ss') as event_datetime,
event,
TOTUPLE(longitude,latitude) as location;
Dumping transformed_events to stdout should give output similar to this:
(2009-01-24T12:00:00.000Z,Travis,(-4.1166667,40.5833333))
(2009-01-28T11:19:00.000Z,Diamond,(-101.19056,37.65361))
(2009-01-07T17:48:00.000Z,Stefan,(-80.07583,32.51722))
(2009-01-23T12:42:00.000Z,Watson,(150.3333333,-31.6333333))
(2009-01-07T19:48:00.000Z,Andrew,(152.2166667,-32.8833333))
(2009-01-26T11:19:00.000Z,John,(24.6666667,61.9666667))
(2009-01-05T13:23:00.000Z,Greg,(15.3666667,44.2166667))
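To double-check the field names and types that will be pushed to Elastic Search, the relation's schema can also be inspected (DESCRIBE is standard Pig; the output shown in the comment is only what I would expect and may differ slightly between Pig versions):
DESCRIBE transformed_events;
-- expected roughly:
-- transformed_events: {event_datetime: datetime,event: chararray,location: (longitude: double,latitude: double)}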
Indexing
First I register the Elastic Search user defined functions for Pig, which enable storing data into Elastic Search; then I store the relation using the EsStorage output.
REGISTER elasticsearch-hadoop-2.0.2.jar;

STORE transformed_events INTO 'zagwozdka/events'
    USING org.elasticsearch.hadoop.pig.EsStorage(
        'es.mapping.names=event_datetime:event_datetime,event:event,location:location');
I could omit the es.mapping.names parameter here - the names in Pig are the same as in the Elastic Search type. When they differ, this parameter is helpful.
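Note that elasticsearch-hadoop connects to localhost:9200 by default. If Elastic Search runs elsewhere, connection settings can be passed to EsStorage in the same way (es.nodes and es.port are standard elasticsearch-hadoop options; the host name below is only a placeholder):
STORE transformed_events INTO 'zagwozdka/events'
    USING org.elasticsearch.hadoop.pig.EsStorage(
        'es.nodes=es-host.example.com',
        'es.port=9200');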
Configuring Panel
Finally, I can run Kibana and configure a dashboard for the new source of data. I just need to add a BetterMap panel and point it at the proper location field and tooltip source.
Result
If configured properly, Kibana presents the events nicely on the BetterMap.
QA
- Is datetime mapping required in this scenario?
No, but other diagrams in Kibana (like the Histogram) have useful filtering features based on the timestamp, so it is worth indexing properties with their correct types.
- Is geo_point mapping required in this case?
No, Kibana's BetterMap can display any array consisting of longitude and latitude, as long as it is an array in the proper order. The missing piece would be the geo query in Elastic Search - it won't work without this mapping. Left to itself, Elastic Search will guess the type during indexing, and the resulting mapping may look like this:
GET /zagwozdka/events/_mapping
{
  "zagwozdka": {
    "mappings": {
      "events": {
        "properties": {
          "event": {
            "type": "string"
          },
          "event_datetime": {
            "type": "date",
            "format": "dateOptionalTime"
          },
          "location": {
            "type": "double"
          }
        }
      }
    }
  }
}
Sample geoquery:
POST /zagwozdka/events/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "50km",
          "location": [-111.89028, 40.76083]
        }
      },
      "query": {
        "match_all": {}
      }
    }
  }
}
will give an exception:
org.elasticsearch.index.query.QueryParsingException: [zagwozdka] field [location] is not a geo_point field
Update for Kibana 4:
Kibana 4 uses a tilemap instead, but it still requires one field of the same geo_point type.
BetterMap required it to be a geo_point indexed in GeoJSON format; tilemap supports more input formats, for example the "lat,lon" string:
"location": "41.12,-71.34"
Some more mapping has to be added; more information can be found in the links below.
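As a minimal sketch of the "lat,lon" form mentioned above (assuming Elastic Search 1.x, where a geo_point field also accepts this string as input), a document could be indexed like this; the event value is only a placeholder:
POST /zagwozdka/events
{
  "event": "sample",
  "location": "41.12,-71.34"
}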


