International Marketing and Output Database Conference 2005


1 Searching in StatLine
Edwin de Jonge
International Marketing and Output Database Conference 2005

2 What is StatLine?
Online statistical database Statistics Netherlands (SN), since 1996
Contains all output figures of SN
Currently > 1300 cubes
More than 2 billion data cells (200 million figures)

3 StatLine search
Large database needs good search!
Search available since 1997
Evolved into powerful search engine (StatLine 1, StatLine 2 and StatLine 3)
Improved further in forthcoming StatLine 4

4 Search Issues (pop. mode)
Keep it simple
All or no text?
It’s the content, …!
Broaden your scope
Tiny hits
Some properties are more equal than others
Use to the max!

5 Search Issues (expert mode)
Easy to use
Full text vs. Keyword search
Quality of content
Scope of search
Cell based indexing
Multiple weighted properties
External usage

6 Easy to use
Searching should be:
  Familiar
  Simple
  Like Google
StatLine 4 provides:
  Complex search algorithms and queries
  Simple familiar user interface

7 Free text / keyword search
Keyword search:
  All content is assigned keywords
  Requires keyword knowledge (experts)
Free text search:
  All texts are used during search
  Less specific
StatLine: combined full text and keyword search
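A combined search of this kind can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not StatLine's implementation: the `Cube` record, the `search` function, and the choice to count a keyword match twice as heavily as a free-text match are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cube:
    """Hypothetical record: a title, descriptive text, and assigned keywords."""
    title: str
    text: str
    keywords: tuple = ()


def tokenize(s):
    """Lower-case a string and split it on whitespace."""
    return s.lower().split()


def search(cubes, query):
    """Score each cube on keyword matches and free-text matches together.

    A keyword match counts double (an assumed weighting), so curated
    keywords outrank incidental words without excluding free-text hits.
    """
    terms = tokenize(query)
    hits = []
    for cube in cubes:
        words = set(tokenize(cube.title + " " + cube.text))
        keywords = {k.lower() for k in cube.keywords}
        keyword_hits = sum(t in keywords for t in terms)
        text_hits = sum(t in words for t in terms)
        score = 2 * keyword_hits + text_hits
        if score > 0:
            hits.append((score, cube.title))
    # Highest score first; ties broken alphabetically by title.
    hits.sort(key=lambda h: (-h[0], h[1]))
    return hits


cubes = [
    Cube("Consumer prices", "price index of consumer goods", ("inflation",)),
    Cube("Population", "inhabitants per municipality", ("population",)),
]
print(search(cubes, "inflation"))
print(search(cubes, "inhabitants"))
```

In this sketch a layman's term such as "inflation" reaches the cube through its keyword, while "inhabitants" is found only through the free text.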

8 Quality of content
Good titles, descriptions and labels crucial for searching!
StatLine:
  All metadata can be shared in StatLine
  All metadata can have popular/layman variant
  “Consumer Price Index” vs. “Inflation”

9 Scope of search
Search on Web site must return all relevant content
StatLine:
  Search on SN returns hits with all relevant content
  Narrow scope: search within a specific theme

10 Cell based indexing
Some cubes > 1 million cells
Search result must be relevant selection in cube
E.g. “Amsterdam inhabitants 2004” hits cube with 28 million cells.
StatLine returns 1 cell!
Cells in cubes are indexed
Contains default selections
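The idea of indexing cells rather than whole cubes can be sketched with a toy inverted index. This is an illustration under assumptions, not StatLine's index structure: the cube name, dimensions, and the functions `build_cell_index` and `find_cells` are hypothetical, and a real index over millions of cells would not enumerate every cell in memory as this one does.

```python
from collections import defaultdict
from itertools import product


def build_cell_index(cube_name, dimensions):
    """Map every word in a dimension label to the cells it describes.

    `dimensions` maps a dimension name to its list of labels; a cell is
    one combination of labels, one label per dimension.
    """
    index = defaultdict(set)
    names = list(dimensions)
    for combo in product(*(dimensions[n] for n in names)):
        cell = (cube_name, combo)
        for label in combo:
            for word in label.lower().split():
                index[word].add(cell)
    return index


def find_cells(index, query):
    """Return the cells matched by every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = None
    for term in terms:
        cells = index.get(term, set())
        result = cells if result is None else result & cells
    return result


# Toy cube with 2 x 2 x 2 = 8 cells.
index = build_cell_index("population", {
    "region": ["Amsterdam", "Rotterdam"],
    "topic": ["Inhabitants", "Households"],
    "period": ["2003", "2004"],
})
print(find_cells(index, "Amsterdam inhabitants 2004"))
```

Each additional query term intersects the candidate set, so three terms that each name one dimension item narrow the toy cube down to a single cell.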

11 Multiple weighted properties
Cube has many textual properties:
  Title
  Description
  Classification label
  Classification description
StatLine 4:
  All properties taken into account
  Each property different weight
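Weighting by property can be sketched as a weighted sum of term matches. The weights, property names, and the `score` function below are assumptions chosen for illustration; the slides do not state which weights StatLine 4 uses.

```python
# Hypothetical weights: a match in the title counts more than a match
# in a classification description.
WEIGHTS = {
    "title": 4.0,
    "classification_label": 3.0,
    "description": 2.0,
    "classification_description": 1.0,
}


def score(cube, query, weights=WEIGHTS):
    """Sum, over all properties, weight x number of query-term occurrences."""
    terms = query.lower().split()
    total = 0.0
    for prop, weight in weights.items():
        words = cube.get(prop, "").lower().split()
        total += weight * sum(words.count(t) for t in terms)
    return total


cube_a = {"title": "Population by region",
          "description": "Inhabitants per municipality"}
cube_b = {"title": "Housing stock",
          "description": "Dwellings and population"}

print(score(cube_a, "population"))
print(score(cube_b, "population"))
```

With these assumed weights, a cube that mentions the search word in its title ranks above one that mentions it only in its description.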

12 External usage
Search is effective when used by other organisations
StatLine:
  StatLine search = Web service
SN Web site:
  calls StatLine web service
  combines search results of website and StatLine
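How a site might combine two result lists can be sketched as follows. The slides do not say how the SN web site merges results, so the round-robin interleaving here is an assumption, picked because it does not require the two engines to produce comparable scores. The `combine` function and the sample hits are hypothetical, and no real web service is called.

```python
from itertools import zip_longest


def combine(website_hits, statline_hits):
    """Interleave two ranked result lists, labelling each hit with its source."""
    combined = []
    for web_hit, statline_hit in zip_longest(website_hits, statline_hits):
        if web_hit is not None:
            combined.append(("website", web_hit))
        if statline_hit is not None:
            combined.append(("statline", statline_hit))
    return combined


# Stand-ins for the two result lists, each already ranked by its own engine.
website_hits = ["Press release: inflation", "Article: price developments"]
statline_hits = ["Consumer prices (cube)"]

for source, hit in combine(website_hits, statline_hits):
    print(source, hit)
```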

13 Conclusion
StatLine has mature search facility
Easy to use
Full text combined with keyword search
Metadata sharing/popular terms
Search can be scoped
Cell based indexing
Weighted search
Web service
Future…

14 Future
Semantic searching
Developed demo in cooperation with University of Delft
Use semantic information to support search
Semantic Visual Graph:
  Show graph of concepts semantically close to search word
  Graph is interactive
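Finding concepts "semantically close" to a search word can be sketched as a bounded breadth-first walk over a concept graph. The graph contents, the `nearby_concepts` function, and the use of hop count as the measure of closeness are assumptions for illustration; the slides do not describe how the demo measures semantic distance.

```python
from collections import deque


def nearby_concepts(graph, start, max_distance):
    """Return concepts within `max_distance` links of `start`, with distances."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the requested radius
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]
    return distance


# Hypothetical concept graph: each concept lists its related concepts.
graph = {
    "inflation": ["consumer price index", "prices"],
    "consumer price index": ["inflation"],
    "prices": ["inflation", "wages"],
    "wages": ["prices", "income"],
}
print(nearby_concepts(graph, "inflation", 2))
```

The returned concepts and distances are the kind of data an interactive graph view could draw around the search word.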
