
Amaze business, make your devs happy


1 Amaze business, make your devs happy
curl -XGET ElasticSearch. A few words about myself: Sebastian. 25/03/2013, #EllerslieDNUG

2 ElasticSearch: You know, for search
Real-time search and analytics engine. NoSQL document database. Uses Lucene for indexing. Horizontally and vertically scalable. Automatic cluster formation. Fault tolerant. Zero config (at the beginning). Nice RESTful API.
You know, for search: structured data, such as well-defined JSON objects; unstructured data, such as logs; full-text search (PDFs, real-world documents).
Real-time search and analytics: you basically feed in tons of data, then search it, and it's lightning fast.
NoSQL document database: uses JSON.
Uses Lucene for indexing: a Java library for creating full-text search indexes.
Horizontally scalable: sharding.
Automatic cluster formation: by default it uses multicast; new nodes connect to the cluster with the same name.
Fault tolerant: partition tolerant, shard replication, automatic data recovery.
Zero config (at the beginning): later you need to tune the configuration to your needs.
Who's using it: GitHub (migrated from Solr), Wikimedia, Guardian, LiveChat, XING, Fog Creek, SoundCloud.
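
To make the "nice RESTful API" point concrete, here is a minimal sketch of what a search call looks like as a plain HTTP request. It only builds the request and does not send it. The index name `products` and the base URL are assumptions for illustration (9200 is the default HTTP port); the `_search` endpoint accepts GET as in the talk title as well as POST.

```python
import json
import urllib.request

# Assumption: a local node listening on the default HTTP port.
BASE_URL = "http://localhost:9200"


def build_search_request(index, body):
    """Build (but do not send) a search request against one index."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "products" is a hypothetical index name.
request = build_search_request("products", {"query": {"match_all": {}}})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` once a node is actually running.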

3 ElasticSearch
[Diagram labels: Application, SQL DB, ElasticSearch; arrows: "Index data", "Search and retrieve"]

4 Data storage
ElasticSearch stores documents in indices. An index is something like a SQL database.
Each index can contain multiple types of documents. A type is something like a table. Each type has a type-specific schema, which defines the types of its fields.
An index is split into multiple shards.
Each shard may be stored on a different node.
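
A rough sketch of how the index / type / document hierarchy shows up in practice, assuming the type-based REST layout of the ElasticSearch versions current at the time of the talk. The index name `shop`, the type name `product`, and the field names are hypothetical.

```python
def document_path(index, doc_type, doc_id):
    """REST path of a single document: /index/type/id."""
    return f"/{index}/{doc_type}/{doc_id}"


def create_index_body(doc_type, fields):
    """Request body that creates an index with a schema (mapping) for one type.

    `fields` maps a field name to a field type such as "string" or "double".
    """
    properties = {name: {"type": field_type} for name, field_type in fields.items()}
    return {"mappings": {doc_type: {"properties": properties}}}


# Hypothetical names, for illustration only.
print(document_path("shop", "product", 42))
print(create_index_body("product", {"name": "string", "price": "double"}))
```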

5 Shards allocation
[Diagram: primary shards P1, P2, P3 and replicas R1, R2, R3 allocated across Node 1 and Node 2]
When we create an index we decide how many shards we want. By default it's 5, which means we can have up to 5 nodes, each containing one primary shard. A primary shard is a shard that is not a replica. Each primary shard is mapped 1:1 to a Lucene index. We use overallocation to accommodate future growth of the index. Depending on configuration, a search will be completed on the node we're connected to or on separate nodes (if we require the search to work on primary shards). If we add a node, shard distribution will be balanced.
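
Why the shard count has to be chosen when the index is created can be sketched with a toy router: a document's shard is derived from a hash of its id modulo the number of primary shards, so changing that number would send most documents to a different shard. The CRC32 hash here is a stand-in chosen for illustration, not the hash ElasticSearch actually uses, and the document ids are made up.

```python
import zlib


def shard_for(doc_id, number_of_shards):
    """Pick a primary shard for a document id (toy hash, see lead-in)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


def index_settings(number_of_shards=5, number_of_replicas=1):
    """Settings body sent when creating an index."""
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        }
    }


ids = [f"doc-{i}" for i in range(100)]
# Count how many documents would land on a different shard with 6 shards instead of 5.
moved = sum(1 for doc_id in ids if shard_for(doc_id, 5) != shard_for(doc_id, 6))
print(f"{moved} of {len(ids)} documents would change shard")
```

Overallocating shards up front avoids this problem: the shards stay the same and are simply moved to new nodes as the cluster grows.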

6 Shards allocation
[Diagram: shards P1, P2, P3 allocated across Node 1, Node 2, Node 3]

7 Querying
Search, based on a number of criteria: words and n-grams; geo location (geo_distance, geo_bounding_box, geo_polygon); date and time; value ranges; fuzzy matching.
Facets and aggregations: distinct values for a given field with document count; statistics for numeric fields (average, min, max); time series.
Suggestions: autocomplete; did you mean; more like this.
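
To show what "distinct values with document count" and "statistics for numeric fields" mean, here is a small sketch that computes both locally over a handful of made-up documents. It imitates the kind of result a terms or stats facet returns and does not talk to a cluster. The last dictionary is what an equivalent terms aggregation request body looks like; the aggregation name `by_category` is arbitrary.

```python
from collections import Counter
from statistics import mean

# Made-up sample documents.
docs = [
    {"name": "cpu a", "category": "CPUs", "price": 200},
    {"name": "cpu b", "category": "CPUs", "price": 300},
    {"name": "gpu a", "category": "GPUs", "price": 400},
]


def terms_facet(documents, field):
    """Distinct values of `field` with the number of documents for each."""
    return Counter(doc[field] for doc in documents if field in doc)


def stats_facet(documents, field):
    """Count, min, max and average of a numeric field."""
    values = [doc[field] for doc in documents if field in doc]
    return {"count": len(values), "min": min(values), "max": max(values), "avg": mean(values)}


print(dict(terms_facet(docs, "category")))
print(stats_facet(docs, "price"))

# Request body asking the server for the same per-category counts.
terms_aggregation_body = {"aggs": {"by_category": {"terms": {"field": "category"}}}}
```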

8 Query example

9
{
  "query": {
    "filtered": {
      "query": {
        "match": { "name": { "query": "amd" } }
      },
      "filter": {
        "bool": {
          "must": [
            { "term": { "category": "CPUs" } },
            { "range": { "price": { "from": 200, "to": 300 } } },
            { "term": { "cores": "4" } }
          ]
        }
      }
    }
  }
}

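The query on slide 9 can also be assembled programmatically. The sketch below is one way to do it, assuming the `filtered` query form of the ElasticSearch versions current at the time of the talk (later versions express the same thing with a `bool` query and its `filter` clause). The function and parameter names are made up for illustration.

```python
import json


def build_product_query(text, category, price_from, price_to, cores):
    """Full-text match on the name, narrowed by exact-value and range filters."""
    return {
        "query": {
            "filtered": {
                # Scored part: full-text match on the product name.
                "query": {"match": {"name": {"query": text}}},
                # Unscored part: every filter must hold.
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"category": category}},
                            {"range": {"price": {"from": price_from, "to": price_to}}},
                            {"term": {"cores": cores}},
                        ]
                    }
                },
            }
        }
    }


query = build_product_query("amd", "CPUs", 200, 300, "4")
print(json.dumps(query, indent=2))
```
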
10 .NET clients
NEST: most mature, static or dynamic.
PlainElastic.Net: no JSON generation.
ElasticSearch.NET: requires the Thrift plugin.

11

12 Scoring
Scoring functions. Boost queries. Boost filters. Decay functions. Custom score functions.
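
As an illustration of a decay function, here is a local sketch of the Gaussian decay as described in the ElasticSearch function score documentation: the score is 1 at the origin and drops to `decay` at a distance of `scale` from it. The price values in the usage lines are made up; the slide itself gives no formula, so treat this as a reading of the documentation rather than the talk's content.

```python
import math


def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 at the origin, `decay` at distance `scale` beyond `offset`."""
    distance = max(0.0, abs(value - origin) - offset)
    # Variance chosen so that the curve equals `decay` at distance == scale.
    sigma_squared = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_squared))


# Hypothetical example: prefer products priced near 250, with a scale of 50.
for price in (250, 275, 300, 400):
    print(price, round(gauss_decay(price, origin=250, scale=50), 4))
```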

13 Indexing
Client: Index -> stored in transaction log. Flush -> indexed in ES. Refresh -> available for search.
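
The difference between "stored" and "available for search" can be sketched with a toy in-memory model. This is a deliberately simplified picture, not how ElasticSearch is implemented: indexed documents go to a transaction log and a buffer, a refresh makes buffered documents searchable, and a flush commits them and empties the log. The class and attribute names are invented for the example.

```python
class ToyShard:
    """Very simplified model of the index / refresh / flush lifecycle."""

    def __init__(self):
        self.translog = []    # operations not yet committed
        self.buffer = []      # indexed but not yet searchable
        self.searchable = []  # visible to search
        self.committed = []   # safely persisted

    def index(self, doc):
        self.translog.append(doc)
        self.buffer.append(doc)

    def refresh(self):
        # Make everything indexed so far visible to search.
        self.searchable.extend(self.buffer)
        self.buffer.clear()

    def flush(self):
        # Commit and start a fresh transaction log.
        self.refresh()
        self.committed = list(self.searchable)
        self.translog.clear()

    def search(self, predicate):
        return [doc for doc in self.searchable if predicate(doc)]


shard = ToyShard()
shard.index({"name": "amd"})
print(shard.search(lambda d: d["name"] == "amd"))  # nothing visible before a refresh
shard.refresh()
print(shard.search(lambda d: d["name"] == "amd"))  # the document is now visible
```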

14 Indexing
When indexing a large number of documents, adjust: refresh_interval, translog.flush_threshold_period, translog.flush_threshold_ops.
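
A sketch of the settings bodies one might send to an index's `_settings` endpoint before and after a bulk load, using the setting names from the slide with their `index.` prefix. The concrete values are placeholders, not recommendations, and the translog settings are specific to the ElasticSearch versions of that era. A `refresh_interval` of `-1` disables periodic refresh; `1s` is the usual default.

```python
import json


def bulk_indexing_settings(flush_threshold_ops, flush_threshold_period):
    """Settings to apply before a large import: no periodic refresh, rarer flushes."""
    return {
        "index.refresh_interval": "-1",
        "index.translog.flush_threshold_ops": flush_threshold_ops,
        "index.translog.flush_threshold_period": flush_threshold_period,
    }


def normal_settings(refresh_interval="1s"):
    """Settings to restore once the import is finished."""
    return {"index.refresh_interval": refresh_interval}


# Placeholder values for illustration only.
print(json.dumps(bulk_indexing_settings(50000, "30m"), indent=2))
print(json.dumps(normal_settings(), indent=2))
```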

15 Testing

16 Deployment
Requirements: Java Server JRE; JAVA_HOME variable pointing to the JRE (not its bin directory).
Steps: from the ElasticSearch dir run bin/service install; change the service start mode to automatic and run the service.
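
The JAVA_HOME requirement is easy to get wrong, so a small pre-install check can help. The sketch below only inspects the variable's value; it does not verify that a JRE is really installed there. The function name and the example paths are hypothetical.

```python
import os
from pathlib import PureWindowsPath


def java_home_problem(env):
    """Return a description of the problem with JAVA_HOME, or None if it looks fine."""
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    # PureWindowsPath understands both backslash and forward-slash separators.
    if PureWindowsPath(java_home).name.lower() == "bin":
        return "JAVA_HOME should point to the JRE directory, not its bin subdirectory"
    return None


problem = java_home_problem(os.environ)
print(problem or "JAVA_HOME looks fine")
```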

17 Tools
Sense, Kibana, Logstash, Marvel, Rivers

18 Tools

19 Learning materials http://goo.gl/JUNWRZ
Videos Articles Books tml

