
1 Oracle vs SQL Server Dr. Alex Wang

2 Oracle Text
- Oracle Text uses standard SQL to do almost everything.
- Full-text retrieval technology for dealing with unstructured data.
- Data sources can be database tables, flat files, or web sites.
- Indexes, searches, and analyzes text and documents.
- Searching: keyword searching, context queries, pattern matching, thematic queries, HTML/XML section searching.
- Uses relevance ranking to improve search quality.
- Supported formats: PDF, MS Office, HTML, XML.
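As a minimal sketch of the "standard SQL" point above, assuming a hypothetical table docs(id, text) that does not appear in the slides, a CONTEXT index is created with ordinary DDL and queried with CONTAINS, while SCORE supplies the relevance ranking:

```sql
-- Hypothetical table; the names docs, id, and text are assumptions for illustration.
CREATE TABLE docs (
  id   NUMBER PRIMARY KEY,
  text VARCHAR2(4000)
);

-- Oracle Text CONTEXT index on the text column.
CREATE INDEX docs_text_idx ON docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT;

-- Keyword query; the label 1 ties SCORE(1) to this CONTAINS call.
SELECT id, SCORE(1) AS relevance
FROM   docs
WHERE  CONTAINS(text, 'database', 1) > 0
ORDER  BY SCORE(1) DESC;
```

By default a CONTEXT index is not updated as rows change, so rows inserted after the index is built typically become searchable only after the index is synchronized (for example with CTX_DDL.SYNC_INDEX).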

3 Search Operators used in Oracle
Context search
- Near: returns a score based on the proximity of two or more terms.
Pattern search
- Fuzzy: matches terms that are spelled similarly.
- Soundex: matches terms that sound alike.
- Stem: searches for all terms with the same root.
Thesaurus
- Preferred Term: replaces the query term with the preferred term defined in a thesaurus.
- Related Term: expands to all related terms defined in a thesaurus.
- Synonym: expands to all terms defined as synonyms.
- Narrower Term: expands to all terms defined as narrower (lower-level) terms.
- Broader Term: expands to all terms defined as broader (higher-level) terms.
- Top Term -
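The operators above might be written as follows inside a CONTAINS query string. This is a sketch only: the table docs(id, text) is hypothetical, the search terms are made up, and the thesaurus operators assume that a thesaurus has already been loaded as the default thesaurus.

```sql
-- Near: both terms within 10 words of each other.
SELECT id FROM docs WHERE CONTAINS(text, 'NEAR((oracle, server), 10)') > 0;

-- Fuzzy (? prefix): terms spelled similarly to the given term.
SELECT id FROM docs WHERE CONTAINS(text, '?databse') > 0;

-- Soundex (! prefix): terms that sound like the given term.
SELECT id FROM docs WHERE CONTAINS(text, '!smythe') > 0;

-- Stem ($ prefix): terms sharing the root of the given term.
SELECT id FROM docs WHERE CONTAINS(text, '$drive') > 0;

-- Thesaurus operators: synonyms, broader terms, narrower terms.
SELECT id FROM docs WHERE CONTAINS(text, 'SYN(metal)') > 0;
SELECT id FROM docs WHERE CONTAINS(text, 'BT(metal)') > 0;
SELECT id FROM docs WHERE CONTAINS(text, 'NT(metal)') > 0;
```

How fuzzy and stem expansion behave depends on the language settings of the index, so results on a real system may differ from what the term names suggest.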

4 Search Operators used in SQL Server
CONTAINS can search for:
- A word near another word.
- The prefix of a word or phrase.
- A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
- A word that is a synonym of another word, using a thesaurus (for example, the word metal can have synonyms such as aluminum and steel).
Sound-alike searches use the separate SOUNDEX function rather than CONTAINS.
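A hedged T-SQL sketch of these searches, assuming a hypothetical table documents(doc_id, body) with a full-text index on body and a hypothetical table authors(last_name); none of these names come from the slides:

```sql
-- A word near another word.
SELECT doc_id FROM documents WHERE CONTAINS(body, 'oracle NEAR server');

-- Prefix of a word: matches data, database, datatype, ...
SELECT doc_id FROM documents WHERE CONTAINS(body, '"data*"');

-- Inflectional forms: drive, drives, drove, driving, driven.
SELECT doc_id FROM documents WHERE CONTAINS(body, 'FORMSOF(INFLECTIONAL, drive)');

-- Thesaurus expansion; requires a configured thesaurus file.
SELECT doc_id FROM documents WHERE CONTAINS(body, 'FORMSOF(THESAURUS, metal)');

-- Sound-alike matching: SOUNDEX is a scalar function used outside CONTAINS.
SELECT last_name FROM authors WHERE SOUNDEX(last_name) = SOUNDEX('Smith');
```

The thesaurus query only expands to terms such as aluminum and steel if those synonyms have been defined in the thesaurus file for the index language.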

5
Feature                      Oracle    Microsoft
Available in                 SE, EE    EE
Decision Tree                Y         Y
Support Vector Machine       Y         N
Neural Network               N         Y
Naive Bayes                  Y         Y
Adaptive Bayes Network       Y         N
K-means                      Y         Y
Expectation Maximization     N         Y
Orthogonal Clustering        Y         N
Path cluster                 N         Y
Minimum Description Length   Y         N
Time Series                  Y         Y
Association Rules            Y         Y
Note: Minimum Description Length identifies the relative importance of an attribute in predicting a given outcome.

6 Oracle emphasizes PL/SQL statements
Simple Prediction Query
Question: Select all customers who have a high propensity to attrite (> 80% chance).
SQL Query:
SELECT A.cust_name, A.contact_info
FROM customers A
WHERE PREDICTION_PROBABILITY(tree_model, 'attrite' USING A.*) > 0.8

7 An Example of Oracle Text Mining
Building a Decision Tree (DT) Model
-- Settings table that tells CREATE_MODEL which algorithm to use
CREATE TABLE dt_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(30));

BEGIN
  -- Populate settings table
  INSERT INTO dt_settings VALUES
    (dbms_data_mining.algo_name, dbms_data_mining.algo_decision_tree);
  COMMIT;

  -- Build a classification model on sales_dataset
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'sales_type_model',
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'sales_dataset',
    case_id_column_name => 'sales_id',
    target_column_name  => 'sales_type',
    settings_table_name => 'dt_settings');
END;
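Once a model such as sales_type_model exists, it can be applied with the same SQL scoring functions used in the prediction query on the previous slide. The sketch below scores the training table itself purely for illustration; in practice a separate table of new cases would be scored, and the column aliases are assumptions.

```sql
-- Score each row with the decision tree model built above.
SELECT sales_id,
       PREDICTION(sales_type_model USING *)             AS predicted_type,
       PREDICTION_PROBABILITY(sales_type_model USING *) AS probability
FROM   sales_dataset;
```

PREDICTION returns the most likely target value for each row, and PREDICTION_PROBABILITY without a class argument returns the probability of that best prediction.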

8 An Example of SQL Server Text Mining
A Tutorial for Text Classification using SQL Server 2005 Beta2 Data Mining
Peter Pyungchul Kim, SQL Business Intelligence, Microsoft Corporation
http://www.sqlserverdatamining.com/dmcommunity/_tutorials/688.aspx

9 Data Source
- 5000 postings from 5 newsgroups.
- We know which group each posting belongs to.
- Flat text file.
- Goal: create a model based on these data to classify each posting into its group.
- Randomly chose 70% for training and 30% for testing.
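One way to produce a random 70/30 split directly in T-SQL is sketched below. This is not necessarily how the tutorial did it. The table postings(posting_id, newsgroup, body) and the output table postings_split are hypothetical names.

```sql
-- Assign each posting to 'train' or 'test' with roughly 70/30 odds.
-- NEWID() is evaluated per row, so each row gets its own random draw.
-- The cast to bigint avoids an overflow in ABS for the smallest int value.
SELECT posting_id,
       newsgroup,
       body,
       CASE WHEN ABS(CAST(CHECKSUM(NEWID()) AS bigint)) % 100 < 70
            THEN 'train'
            ELSE 'test'
       END AS split
INTO   postings_split
FROM   postings;
```

Because each row is drawn independently, the split is only approximately 70/30, and the exact counts vary from run to run.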

10 SQL Server
You can do it by clicking through the SQL Server GUI tools.
1. SQL Server Management Studio: create the database and import the data.
2. Business Intelligence Development Studio: build a dictionary and term vectors.
3. Build and test the data mining models.

11 Compare Classification Results

