Amaze business, make your devs happy

curl -XGET ElasticSearch
Amaze business, make your devs happy
A few words about myself: Sebastian
25/03/2013 #EllerslieDNUG

ElasticSearch: You know, for search
Real-time search and analytics engine: you basically feed it tons of data, then search it, and it is lightning fast
NoSQL document database: documents are JSON
Uses Lucene for indexing: Lucene is a Java library for creating full-text search indexes
Horizontally and vertically scalable: horizontal scaling is done through sharding
Automatic cluster formation: by default it uses multicast, and new nodes join the cluster that has the same name
Fault tolerant: partition tolerant, with shard replication and automatic data recovery
Zero config (at the beginning): later you need to tune the configuration to your needs
Nice RESTful API
You know, for search:
  Structured data, like well-defined JSON objects
  Unstructured data, like logs
  Full-text search (PDFs, real-world documents)
Who's using it: GitHub (migrated from Solr), Wikimedia, Guardian, LiveChat, XING, Fog Creek, SoundCloud

Diagram: Application, SQL DB and ElasticSearch, with the flows "Index data" and "Search and retrieve"

Data storage
ElasticSearch stores documents in indices; an index is something like an SQL database
Each index can contain multiple types of documents; a type is something like a table
Each type has a type-specific schema, which defines the types of its fields
An index is split into multiple shards
Each shard may be stored on a different node
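
The index / type / document hierarchy above shows up directly in how a document is addressed over the REST API. A minimal sketch, assuming the `/index/type/id` path layout; the helper `document_path` and the names `shop` and `product` are hypothetical and used only for illustration:

```python
from urllib.parse import quote


def document_path(index, doc_type, doc_id):
    """Build the REST path of a single document: /index/type/id."""
    parts = (index, doc_type, doc_id)
    # Percent-encode each segment so unusual ids cannot break the path.
    return "/" + "/".join(quote(str(part), safe="") for part in parts)


# Hypothetical index "shop" (like a database) and type "product" (like a table).
print(document_path("shop", "product", 42))  # prints /shop/product/42
```

The same path would be used with PUT to index the document and with GET to retrieve it.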

Shards allocation
Diagram: one node (Node 1) compared with two nodes (Node 1, Node 2); the shards shown are primaries P1, P2, P3 and replicas R1, R2, R3
When we create an index we decide how many shards we want.
By default it is 5, which means we can have up to 5 nodes, each containing one primary shard.
A primary shard is a shard that is not a replica.
Each primary shard maps 1:1 to a Lucene index.
We use overallocation to accommodate future growth of the index.
Depending on configuration, a search is completed on the node we are connected to or on separate nodes (if we require the search to run on primary shards).
If we add a node, the shard distribution is rebalanced.
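
The allocation notes above can be sketched in a few lines. This is a toy model, not ElasticSearch's allocator: `allocate` is a hypothetical round-robin helper, and `shard_for` uses `zlib.crc32` as a stand-in for whatever hash ElasticSearch applies to the routing value. It illustrates why the primary shard count chosen at index creation caps the number of nodes that can hold a primary.

```python
import zlib


def shard_for(doc_id, number_of_primary_shards):
    """Pick a primary shard for a document: hash(id) modulo shard count.

    crc32 is only a stand-in hash chosen for this sketch.
    """
    return zlib.crc32(str(doc_id).encode("utf-8")) % number_of_primary_shards


def allocate(primaries, nodes):
    """Spread primary shards P1..Pn over the given nodes round-robin."""
    layout = {node: [] for node in nodes}
    for shard in range(primaries):
        layout[nodes[shard % len(nodes)]].append("P%d" % (shard + 1))
    return layout


# One node holds every primary; adding nodes spreads them out.
print(allocate(3, ["Node 1"]))
print(allocate(3, ["Node 1", "Node 2", "Node 3"]))

# With 5 primaries, a sixth node has nothing to hold.
six_nodes = ["Node %d" % n for n in range(1, 7)]
print(allocate(5, six_nodes)["Node 6"])  # prints []
```

Because the shard is derived from the shard count, changing that count would send existing documents to different shards, which is one reason to overallocate up front.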

Shards allocation
Diagram: three nodes (Node 1, Node 2, Node 3), with primary shards P1, P2, P3 spread across them

Querying
Search, based on a number of criteria:
  Words and n-grams
  Geo location: geo_distance, geo_bounding_box, geo_polygon
  Date and time
  Value ranges
  Fuzzy matching
Facets and aggregations:
  Distinct values for a given field, with document counts
  Statistics for numeric fields (average, min, max)
  Time series
Suggestions:
  Autocomplete
  Did you mean
  More like this
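
To make the facets and aggregations item concrete, here is a minimal sketch of a request body that asks for distinct values with document counts and for numeric statistics. It assumes the `terms` and `stats` aggregations of the aggregations DSL; the helper `facet_request`, the aggregation names, and the fields `category` and `price` are hypothetical.

```python
import json


def facet_request(terms_field, stats_field):
    """Build a search body that returns only aggregation results."""
    return {
        "size": 0,  # no hits, only the aggregations
        "aggs": {
            # Distinct values of a field, each with its document count.
            "distinct_values": {"terms": {"field": terms_field}},
            # min, max, avg, sum and count for a numeric field.
            "numeric_stats": {"stats": {"field": stats_field}},
        },
    }


print(json.dumps(facet_request("category", "price"), indent=2))
```

The printed JSON is what would be sent as the body of a search request.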

Query example

{
  "query": {
    "filtered": {
      "query": { "match": { "name": { "query": "amd" } } },
      "filter": {
        "bool": {
          "must": [
            { "term": { "category": "CPUs" } },
            { "range": { "price": { "from": 200, "to": 300 } } },
            { "term": { "cores": "4" } }
          ]
        }
      }
    }
  }
}

.NET clients
NEST: most mature, static or dynamic
PlainElastic.Net: no JSON generation
ElasticSearch.NET: requires the Thrift plugin

Scoring
Scoring functions
Boost queries
Boost filters
Decay functions
Custom score functions
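
A decay function lowers a document's score the further a field value is from a preferred value. The sketch below assumes the Gaussian form described in the ElasticSearch function_score documentation, with its `origin`, `scale`, `offset` and `decay` parameters; the function name `gauss_decay` and the price figures are hypothetical.

```python
import math


def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 at the origin, `decay` at distance `scale`."""
    # Distances within `offset` of the origin are not penalised.
    distance = max(0.0, abs(value - origin) - offset)
    # Choose the variance so the score equals `decay` when distance == scale.
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))


# Prefer prices near 250; a price 50 away scores half as much.
for price in (250, 275, 300, 400):
    print(price, round(gauss_decay(price, origin=250, scale=50), 3))
```

The result is a multiplier between 0 and 1 that would be combined with the query's relevance score.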

Indexing
Client -> Index -> Stored in transaction log -> Flush -> Indexed in ES -> Refresh -> Available for search

Indexing
When indexing a large number of documents, adjust:
refresh_interval
translog.flush_threshold_period
translog.flush_threshold_ops
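
As an illustration of the three settings above, the sketch below builds a settings body that could be sent to an index before a bulk load. The setting names are the ones listed on the slide, placed under an `index` key; whether they are available, and their exact spelling, depends on the ElasticSearch version. The helper `bulk_load_settings` and all values are assumptions chosen for illustration, not recommendations.

```python
import json


def bulk_load_settings(refresh_interval="30s",
                       flush_threshold_period="60m",
                       flush_threshold_ops=50000):
    """Build an index settings body that relaxes refresh and flush."""
    return {
        "index": {
            # Refresh less often, so fewer small segments are created.
            "refresh_interval": refresh_interval,
            # Flush the transaction log less often during the load.
            "translog.flush_threshold_period": flush_threshold_period,
            "translog.flush_threshold_ops": flush_threshold_ops,
        }
    }


print(json.dumps(bulk_load_settings(), indent=2))
```

After the load, the same call with the previous values would restore normal behaviour.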

Testing

Deployment
Requirements:
  Java Server JRE
  JAVA_HOME variable pointing to the JRE directory (not its bin directory)
Steps:
  From the ElasticSearch directory, run bin/service install
  Change the service start mode to automatic and start the service

Tools Sense Kibana Logstash Marvel Rivers

Learning materials http://goo.gl/JUNWRZ
Videos, Articles, Books
