Azure Search
Anton Boyko aka @BoykoAnt, Microsoft Azure MVP, MCP, Microsoft DevOps TE
If you have any questions – interrupt and ask.
You consider yourself a: web developer, mobile developer, IT pro, database engineer?
What? A fully managed search solution that lets developers add search experiences to applications. Embed a sophisticated search experience into web and mobile applications without worrying about the complexities of full-text search and without deploying, maintaining, or managing any infrastructure.
Why?
- Users find search a natural, low-friction way to interact with applications that manage lots of data
- Web search engines have set the bar high for search: instant results, auto-complete, hit highlighting, great ranking, linguistics
- Search is hard and rarely a core expertise area
- From the infrastructure standpoint: availability, durability, scale, operations
- From the functionality standpoint: ranking, geo-spatial, input handling
Azure Search functionality
- Simple HTTP/JSON API for creating indexes, pushing documents, searching
- Keyword search with user-friendly operators (+, -, *, "", etc.)
- Hit highlighting
- Faceting (histograms over ranges, typically used in catalog browsing)
- Suggestions (auto-complete)
- Rich structured queries (filter, select, sort) that combine with search
- Scoring profiles to model search result relevance
- Geo-spatial support integrated in filtering, sorting and ranking
Language support Arabic Armenian Bangla Basque Bulgarian Catalan Chinese Simplified Chinese Traditional Croatian Czech Danish Dutch English Estonian Finnish French Galician German Greek Gujarati Hebrew Hindi Hungarian Icelandic Indonesian (Bahasa) Irish Italian Japanese Kannada Korean Latvian Lithuanian Malayalam Malay (Latin) Marathi Norwegian Persian Polish Portuguese (Brazil) Portuguese (Portugal) Punjabi Romanian Russian Serbian (Cyrillic) Serbian (Latin) Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese
API
Managing indexes and documents
Define index
{
  "name": "hotels",
  "fields": [
    { "name": "hotelId", "type": "Edm.String", "key": true, "searchable": false, "sortable": false, "facetable": false },
    { "name": "baseRate", "type": "Edm.Double" },
    { "name": "description", "type": "Edm.String", "filterable": false, "sortable": false, "facetable": false },
    { "name": "description_fr", "type": "Edm.String", "filterable": false, "sortable": false, "facetable": false, "analyzer": "fr.lucene" },
    { "name": "hotelName", "type": "Edm.String", "facetable": false },
    { "name": "category", "type": "Edm.String" },
    { "name": "tags", "type": "Collection(Edm.String)" },
    { "name": "parkingIncluded", "type": "Edm.Boolean", "sortable": false },
    { "name": "smokingAllowed", "type": "Edm.Boolean", "sortable": false },
    { "name": "lastRenovationDate", "type": "Edm.DateTimeOffset" },
    { "name": "rating", "type": "Edm.Int32" },
    { "name": "location", "type": "Edm.GeographyPoint" }
  ]
}
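As a rough sketch of how an index definition like this one might be submitted, the Python below builds (but deliberately does not send) a create-index request. The service name `my-service`, the key `my-admin-key`, and the helper `build_create_index_request` are hypothetical placeholders; the `api-key` header and the `/indexes` endpoint follow the REST API as I understand it for the `2015-02-28` version used in these slides.

```python
import json
import urllib.request

API_VERSION = "2015-02-28"  # the API version used throughout the slides


def build_create_index_request(service_name, admin_key, index_definition):
    """Build (but do not send) the HTTP request that creates an index."""
    url = (
        f"https://{service_name}.search.windows.net/indexes"
        f"?api-version={API_VERSION}"
    )
    body = json.dumps(index_definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "api-key": admin_key},
    )


# A shortened version of the "hotels" index from the slide.
index_definition = {
    "name": "hotels",
    "fields": [
        {"name": "hotelId", "type": "Edm.String", "key": True, "searchable": False},
        {"name": "hotelName", "type": "Edm.String", "facetable": False},
        {"name": "rating", "type": "Edm.Int32"},
    ],
}

# "my-service" and "my-admin-key" are placeholders, not real credentials.
request = build_create_index_request("my-service", "my-admin-key", index_definition)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would actually send it; omitted on purpose.
```

Keeping request construction separate from sending makes the payload easy to inspect before it reaches a real service.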
Indexing data
{
  "value": [
    { "@search.action": "upload", … },
    { "@search.action": "merge", … },
    { "@search.action": "mergeOrUpload", … },
    { "@search.action": "delete", … }
  ]
}
Upload document { "@search.action": "upload", "hotelId": "1", "baseRate": 199.0, "description": "Best hotel in town", "description_fr": "Meilleur hôtel en ville", "hotelName": "Fancy Stay", "category": "Luxury", "tags": ["pool", "view", "wifi", "concierge"], "parkingIncluded": false, "smokingAllowed": false, "lastRenovationDate": "2010-06-27T00:00:00Z", "rating": 5, "location": { "type": "Point", "coordinates": [-122.131577, 47.678581] } }
Merge or upload document { "@search.action": "mergeOrUpload", "hotelId": "3", "baseRate": 129.99, "description": "Close to town hall and the river" }
Delete document { "@search.action": "delete", "hotelId": "6" }
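The upload, merge-or-upload and delete actions above can be combined into a single batch. The following is a minimal sketch of assembling such a batch in Python; `make_action` and `make_batch` are hypothetical helper names, and the check that every action carries the key field (`hotelId` here) reflects my understanding that the service needs the key to identify the document.

```python
import json

# The four action names used by the indexing API.
VALID_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def make_action(action, document, key_field="hotelId"):
    """Wrap a document in an indexing action, checking the basics first."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if key_field not in document:
        raise ValueError(f"document is missing key field {key_field!r}")
    return {"@search.action": action, **document}


def make_batch(actions):
    """Wrap a sequence of actions in the {"value": [...]} envelope."""
    return {"value": list(actions)}


batch = make_batch([
    make_action("upload", {"hotelId": "1", "hotelName": "Fancy Stay", "rating": 5}),
    make_action("mergeOrUpload", {"hotelId": "3", "baseRate": 129.99}),
    make_action("delete", {"hotelId": "6"}),
])

# This JSON would be the body of the indexing request.
print(json.dumps(batch, indent=2))
```

Validating action names locally catches typos such as `mergeOrUpdate` before the request is sent.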
Querying
Search
GET https://[service name].search.windows.net/indexes/hotels/docs?api-version=2015-02-28&search=budget&$select=hotelName
POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-version=2015-02-28
{ "search": "budget", "select": "hotelName" }
Search
GET https://[service name].search.windows.net/indexes/hotels/docs?api-version=2015-02-28&search=*&$top=2&$orderby=lastRenovationDate desc&$select=hotelName,lastRenovationDate
POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-version=2015-02-28
{ "search": "*", "orderby": "lastRenovationDate desc", "select": "hotelName,lastRenovationDate", "top": 2 }
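The GET and POST forms carry the same options: GET prefixes the OData options with `$`, POST uses plain JSON property names. A small sketch of building both from one set of options follows; the helper names and `my-service` are hypothetical. Note that in a real GET URL the space in `lastRenovationDate desc` has to be percent-encoded, which the slide omits for readability.

```python
from urllib.parse import quote, urlencode


def build_search_get_url(service_name, index_name, search,
                         api_version="2015-02-28", **options):
    """Build the GET form: OData options get a '$' prefix in the query string."""
    params = {"api-version": api_version, "search": search}
    for name, value in options.items():
        params[f"${name}"] = value
    # Keep '$', '*' and ',' readable; everything else unsafe is percent-encoded.
    query = urlencode(params, quote_via=quote, safe="$*,")
    return (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs?{query}"
    )


def build_search_post_body(search, **options):
    """Build the POST form: the same options as plain JSON properties."""
    return {"search": search, **options}


options = {
    "top": 2,
    "orderby": "lastRenovationDate desc",
    "select": "hotelName,lastRenovationDate",
}

url = build_search_get_url("my-service", "hotels", "*", **options)
body = build_search_post_body("*", **options)
print(url)
print(body)
```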
Pagination
GET /indexes/hotels/docs?api-version=2015-02-28&search=budget&$select=hotelName&$top=15&$skip=0&$count=true
{ "value": [ … ], "@odata.count": 42 }
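The arithmetic behind `$top`, `$skip` and `@odata.count` can be sketched as follows. The helpers `page_window` and `total_pages` are hypothetical names for illustration; pages are numbered from 1 here, which is an assumption of this sketch rather than anything the API prescribes.

```python
import math


def page_window(page, page_size):
    """Translate a 1-based page number into $top/$skip/$count options."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"$top": page_size, "$skip": (page - 1) * page_size, "$count": "true"}


def total_pages(total_count, page_size):
    """Number of pages needed to show total_count results."""
    return math.ceil(total_count / page_size)


# With the numbers from the slide: 15 results per page, 42 matches in total.
first_page = page_window(1, 15)
last_page = page_window(total_pages(42, 15), 15)
print(first_page)
print(last_page)
```

With 42 matches and 15 per page there are three pages, and the last one starts at `$skip=30`.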
Facets
POST /indexes/hotels/docs/search?api-version=2015-02-28
{ "search": "test", "facets": [ "tags", "baseRate,values:80|150|220" ], "filter": "rating eq 3 and category eq 'Motel'" }
Response:
"@search.facets": { "baseRate": [ { "count": 0, "to": 80 }, { "count": 10, "from": 80, "to": 150 }, { "count": 15, "from": 150, "to": 220 }, { "count": 4, "from": 220 } ] }
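On the client side, the range buckets in `@search.facets` usually need to be turned into labels for a catalog sidebar. The sketch below does this for the response shown on the slide; `describe_range_facets` and the label wording are hypothetical choices, not part of the API.

```python
def describe_range_facets(buckets):
    """Turn range facet buckets into (label, count) pairs for display."""
    described = []
    for bucket in buckets:
        lower = bucket.get("from")
        upper = bucket.get("to")
        if lower is None:
            label = f"up to {upper}"      # open-ended first bucket
        elif upper is None:
            label = f"from {lower}"       # open-ended last bucket
        else:
            label = f"{lower} to {upper}"
        described.append((label, bucket["count"]))
    return described


# The facet section of the response from the slide.
response = {
    "@search.facets": {
        "baseRate": [
            {"count": 0, "to": 80},
            {"count": 10, "from": 80, "to": 150},
            {"count": 15, "from": 150, "to": 220},
            {"count": 4, "from": 220},
        ]
    }
}

labels = describe_range_facets(response["@search.facets"]["baseRate"])
for label, count in labels:
    print(f"{label}: {count}")
```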
Controlling relevance
Controlling relevance (diagram: relevant results, irrelevant results, returned results)
Scoring profiles { "name": "hotels", "fields": [ … ], "scoringProfiles" : [ … ] }
Text weights scoring profile
{ "name": "boostName", "text": { "weights": { "hotelName": 10, "tags": 5, "description": 2, "description_fr": 2 } } }
Scoring functions
- Freshness should be used when you want to boost by how new or old an item is. This function can only be used with datetime fields (Edm.DateTimeOffset). Note that the boostingDuration attribute is used only with the freshness function.
- Magnitude should be used when you want to boost based on how high or low a numeric value is. Scenarios that call for this function include boosting by profit margin, highest price, lowest price, or a count of downloads. This function can only be used with double and integer fields.
- Distance should be used when you want to boost by proximity or geographic location. This function can only be used with Edm.GeographyPoint fields.
- Tag should be used when you want to boost by tags in common between documents and search queries. This function can only be used with Edm.String and Collection(Edm.String) fields.
Functions general properties
{ "name": "boostName", "functions": [ { "type": "magnitude | freshness | distance | tag", "boost": # (positive number used as multiplier for raw score != 1), "fieldName": "...", "interpolation": "constant | linear | quadratic | logarithmic" } ] }
Freshness function
{ "name": "boostName", "functions": [ { … , "freshness": { "boostingDuration": "..." (value representing the timespan over which boosting occurs) } } ] }
Magnitude function
{ "name": "boostName", "functions": [ { … , "magnitude": { "boostingRangeStart": #, "boostingRangeEnd": #, "constantBoostBeyondRange": true | false } } ] }
Distance function
{ "name": "boostName", "functions": [ { … , "distance": { "referencePointParameter": " … ", "boostingDistance": # (the distance in kilometers from the reference location where the boosting range ends) } } ] }
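The name given in `referencePointParameter` has to be supplied at query time. The sketch below builds such a query string, assuming the `scoringProfile` and `scoringParameter` options in the `name:longitude,latitude` form that, to my understanding, the `2015-02-28` API version expects (later versions changed the separator, so check the reference for the version in use). The profile name `geo`, the parameter name `currentLocation`, and the helper itself are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_scored_search_query(search, scoring_profile, parameter_name,
                              longitude, latitude, api_version="2015-02-28"):
    """Build a query string that applies a distance-based scoring profile."""
    params = {
        "api-version": api_version,
        "search": search,
        "scoringProfile": scoring_profile,
        # Assumed format for this API version: name:longitude,latitude
        "scoringParameter": f"{parameter_name}:{longitude},{latitude}",
    }
    # Keep ':', ',' and '*' readable in the resulting query string.
    return urlencode(params, quote_via=quote, safe=":,*")


# Coordinates reuse the hotel location from the upload example (lon, lat).
query = build_scored_search_query(
    "budget", "geo", "currentLocation", -122.131577, 47.678581
)
print(query)
```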
Tag function
{ "name": "boostName", "functions": [ { … , "tag": { "tagsParameter": " … " (parameter to be passed in queries to specify a list of tags to compare against the target field) } } ] }
Scoring functions - aggregation
{ "name": "boostName", "functions": [ … ], "functionAggregation": "sum | average | minimum | maximum | firstMatching" }
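Putting the pieces together, a complete scoring profile could be assembled like this. It is a sketch under stated assumptions: the layout (a `functions` array plus `functionAggregation` inside the profile) follows the REST API as I understand it, `make_scoring_profile` is a hypothetical helper, and the boost values and the `P365D` duration (an ISO 8601 style timespan) are made-up illustration values.

```python
import json

AGGREGATIONS = {"sum", "average", "minimum", "maximum", "firstMatching"}


def make_scoring_profile(name, weights=None, functions=None, aggregation=None):
    """Assemble one scoring profile as a plain dict ready for json.dumps."""
    profile = {"name": name}
    if weights:
        profile["text"] = {"weights": dict(weights)}
    if functions:
        profile["functions"] = list(functions)
        if aggregation is not None:
            if aggregation not in AGGREGATIONS:
                raise ValueError(f"unknown aggregation: {aggregation!r}")
            profile["functionAggregation"] = aggregation
    return profile


# Illustrative values only: boost recently renovated, highly rated hotels.
freshness = {
    "type": "freshness",
    "fieldName": "lastRenovationDate",
    "boost": 3,
    "interpolation": "quadratic",
    "freshness": {"boostingDuration": "P365D"},
}
magnitude = {
    "type": "magnitude",
    "fieldName": "rating",
    "boost": 2,
    "interpolation": "linear",
    "magnitude": {
        "boostingRangeStart": 1,
        "boostingRangeEnd": 5,
        "constantBoostBeyondRange": False,
    },
}

profile = make_scoring_profile(
    "boostName",
    weights={"hotelName": 10, "tags": 5},
    functions=[freshness, magnitude],
    aggregation="sum",
)
index_fragment = {"name": "hotels", "scoringProfiles": [profile]}
print(json.dumps(index_fragment, indent=2))
```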
Integration
Azure Search data sources
- SQL Server
- Azure SQL (aka SQL Database)
- DocumentDB
- Azure Blob Storage
- Azure Table Storage
Real life example
“Time Machine” app architecture (diagram components: API, Topic, Sub ×2, Worker ×2, Search, Storage)
Demo
Anton Boyko Microsoft Azure MVP, MCP Microsoft DevOps TE boyko.ant@live.com @BoykoAnt facebook.com/boyko.ant linkedin.com/in/boykoant