Presentation is loading. Please wait.

Presentation is loading. Please wait.

9 Chapter Nine Extracting and Transforming Data with SQL Server 2000.

Similar presentations


Presentation on theme: "9 Chapter Nine Extracting and Transforming Data with SQL Server 2000."— Presentation transcript:

1 9 Chapter Nine Extracting and Transforming Data with SQL Server 2000

2 9 Objectives Understand the option in SQL Server 2000 for inserting from and extracting to external data sources Use the BULK INSERT statement and the BCP utility in SQL Server 2000 to move records into and out of tables and views Use the Data Transformation Services (DTS) interface to import and export data

3 9 Objectives Create and save DTS packages to perform transformations and add workflow to packages Identify and implement an appropriate strategy for replicating data among instances of SQL Server 2000 Understand and configure SQL Server 2000 XML support

4 9 The BULK INSERT Statement BULK INSERT statement –Used in SQL Server 2000 to insert data into a table or view from a data file Architecture –The OLE DB provider in SQL Server 2000 interprets the BULK INSERT statement natively As a result, it is possible to use the bulk copy API from ODBC, OLE DB, SQL-DMO and DB-Library- based applications

5 9 The BULK INSERT Statement The BULK INSERT statement uses a simple syntax for inserting data from a file into a table –The destination table name, qualified by the database name, and the source data file are the only two parameters that must be specified for each BULK INSERT operation to succeed

6 9 BULK INSERT Statement Usage Scenarios The BULK INSERT statement is a high-performance way to insert data from a file into SQL Server 2000 using a standard T-SQL statement –Best used when the data to be imported exist in a tabular, non-hierarchical format, such as a comma-delimited file –Can be used within a stored procedure as part of a data feed and can be initiated by scheduling a job that executes the stored procedure –Can be used if an event or program external to SQL Server 2000 initiates the data import

7 9 BULK INSERT Statement Performance BULK INSERT statement and BCP are two high-performance ways to insert data into database tables in SQL Server 2000 –There is an overhead associated with logging the insert of records using these methods –SQL Server 2000 incurs this overhead to ensure that records inserted can be rolled back and rolled forward

8 9 BULK INSERT Statement Performance To optimize the performance of a bulk insert, it is possible to cause SQL Server 2000 not to log the inserts of records –To do this, ensure the following are true: Database option “select into/bulkcopy” is set to true Target table has no indexes or if the table has indexes, it is empty when the bulk copy starts Target table is not being replicated TABLOCK hint is specified

9 9 Bulk Copy Program –Used to move data into and out of SQL Server 2000 BCP syntax –BCP utility requires a slightly different syntax than the BULK INSERT statement but uses similar parameters A key difference in syntax is that it is necessary to specify user name, password, database name and server name when using the BCP utility

10 9 BCP in Action Figure 9-1: The BCP command to export order data

11 9 BCP Usage Scenarios The same requirements that lead to the selection of the BULK INSERT statement for inserting large sets of data into SQL Server 2000 can be used to select the BCP utility –In addition, you should select the BCP utility when you need to export data from a SQL Server table into a data file and extensive transformations are not necessary

12 9 BCP Performance Like BULK INSERT statement, BCP utility is a high-performance, batch-oriented mechanism for transferring large sets of records –Because both implement the Bulk Copy API, the performance of BCP will not differ much from the performance of the BULK INSERT statement

13 9 Data Transformation Services Data Transformation Services (DTS) –Set of features bundled together for use in Enterprise Manager Interface –SQL Server 2000 provides an intuitive, easy-to- use interface to the Data Transformation Services in the format of the Data Transformation Services Import/Export Wizard

14 9 Interface Figure 9-2: Invoking the DTS Wizard from Enterprise Manager

15 9 Interface Figure 9-3: Invoking the DTS Wizard from the program group

16 9 Executing a Transformation Using DTS The first two steps when executing a transformation using DTS are to correctly specify the source and target data repositories –Subsequently, it is necessary to specify several options about the data transfer and customer transformations At a minimum, the user executing a transformation needs to select one or more tables or views to copy from the source to the target data repositories

17 9 Choosing a Data Source/Destination It is necessary to identify source and target data repositories involved in the data transfer/transformation after invoking DTS Import/Export Wizard To identify each repository, specify the following: –Type of data held in each repository –Physical location of the repository –Any information required to gain access to the repository

18 9 Choosing a Data Source/Destination Figure 9-4: The Choose a Data Source dialog box in the DTS Wizard

19 9 Specifying Table Copy or Query Figure 9-5: The Specify Table Copy or Query dialog box

20 9 Specifying Table Copy or Query Figure 9-6: The Select Source Tables and Views dialog box

21 9 Specifying Table Copy or Query Figure 9-7: The Type SQL Statement dialog box

22 9 Specifying Table Copy or Query Figure 9-8: Using Query Builder to create SQL statements

23 9 Specifying Table Copy or Query Figure 9-9: Column Mappings tab in the Column Mappings and Transformation dialog box, which allows customization of destination column mappings

24 9 Specifying Table Copy or Query Figure 9-10: Transformation tab in the Column Mappings and Transformations dialog box, which allows custom transformations to the source data

25 9 Copying Objects and Data between SQL Server Databases Figure 9-11: The Select Objects to Copy dialog box

26 9 Saving DTS Packages Figure 9-12: The Save, schedule and replicate package dialog box

27 9 Saving DTS Packages SQL Server –With this default save option, you can store a package as a table in the SQL Server msdb database SQL Server Meta Data Services –With this save option, you can maintain historical information about the data manipulated by the package Structured Storage File –With this save option, you can copy, move and send a package across the network without storing the file a SQL Server database Visual Basic File –With this save option, you can programmatically customize a package created in DTS Designer or DTS Import/Export Wizard

28 9 Editing DTS Packages Figure 9-13: Invoking the DTS Designer from Enterprise Manager

29 9 Editing DTS Packages Figure 9-14: The DTS Designer interface

30 9 Replication –Process of synchronizing copies of data stored in multiple sites Uses of replication: –Allowing multiple locations to keep copies of the same data –Separating transaction processing applications from analytical processing applications –Allowing greater autonomy

31 9 Replication Uses of replication (cont.): –Creating a scalable application that is based on strategically locating data that will be accessed –Bringing data closer to individuals or groups –Using replication as part of a customized standby server strategy

32 9 Replication Overview Publisher –Server that makes data available for replication to other servers Subscribers –Servers that receive replicated data Distributor –Server that contains the distribution database and stores meta-data, history data and/or transactions

33 9 Snapshot Replication Snapshot replication –Periodically copies entire publications as they exist at a moment in time Figure 9-15: The components involved in snapshot replication

34 9 Usage Scenarios for Snapshot Replication Snapshot replication is the best choice when the following are true: –Most of the data are static and do not change often –It is acceptable to have copies of data that are out of data for some period of time –The total size of data publications to replicate is small –The sites are often disconnected, eliminating the possibility to send constant updates of data as in transactional replication

35 9 Transactional Replication Transactional replication –Incremental changes made on the Publisher are sent to each Subscriber Figure 9-16: The components involved in transactional replication

36 9 Usage Scenarios for Transactional Replication Transactional replication is an excellent choice when: –Data modification should be propagated to Subscribers within seconds of when they occur –Transactions should be atomic –Subscribers are connected to the Publisher most of the time –The application cannot tolerate out-of-date data on the subscribers that receive the changes

37 9 Merge Replication Merge replication –Data modifications can be made at each site Figure 9-17: The components involved in merge replication

38 9 Usage Scenarios for Merge Replication Merge replication is the best selection when: –Multiple Subscribers need to update data at various times and propagate those changes to the Publisher and to other Subscribers –Subscribers need to receive data, make changes offline and synchronize changes later with the Publisher and other Subscribers –The ability for each site to operate independently is critical

39 9 Replication Tools SQL Server 2000 provides several easy-to- use tools for establishing and configuring replication –Each of these tools is available through Enterprise Manager –Most take the form of wizards that allow the definition of Publishers, Distributors, Subscribers, subscriptions and their properties

40 9 Replication Wizards Configure Publishing and Distribution Wizard –Typically the first wizard that is invoked to establish replication Create Publication Wizard –Allows creation of publication Push Subscription Wizard –Allows the creation of a subscription to an existing publication Pull Subscription Wizard –Allows the creation of a subscription to a publication requested by the Subscriber on the SQL Server specified in this wizard Disable Publishing and Distribution Wizard –Allows removal of replication publications and distribution service from the SQL Server machine

41 9 Filtering Published Data By filtering published data, it is possible to: –Minimize the amount of data sent over the network –Reduce the amount of storage space required at the Subscriber –Customize publications and applications based on individual Subscriber requirements –Avoid or reduce conflicts that could result if the same data were sent multiple Subscribers

42 9 Replication Performance Optimization There are several application-specific and network- specific techniques for optimizing the performance of replication –Simple way is to set minimum amount of memory allocated to SQL Server 2000 by using the Min Server Memory option –Using a separate disk drive for the transaction log on any server involved in replication will help to avoid the disk I/O bottleneck that can result from logging many operations –Ensuring that the data that is replicated is then filtered as much as possible will also help to avoid unnecessary performance problems

43 9 Heterogeneous Queries A heterogeneous query retrieves, inserts or updates data by joining together sets of data from multiple data source types –It is necessary to initiate a heterogeneous query from a SQL Server database that is linked to another data source –A linked server configuration allows remote server access and allows access of non-SQL Server data sources to be treated the same way as the access of SQL Server data sources

44 9 Configuring a Linked Server To configure a linked server definition, a SQL Server administrator specifies an OLE DB provider and OLE DB data source Table 9-1: OLE DB Providers and Data Sources

45 9 Configuring a Linked Server Figure 9-18: Linked server configuration

46 9 Managing a Linked Server Definition When setting up a linked server, register connection information and data source with SQL Server 2000 –After registration is accomplished, it is possible to refer to the data source with a single logical name You can create or delete a linked server definition with stored procedures or through Enterprise Manager –It it possible to use stored procedures to manage a linked server definition

47 9 Arguments Table 9-2 describes data returned by sp_linkedservers system-stored procedure Table 9-2: Returned Data

48 9 XML Support in SQL Server 2000 Standard data format eXtensible Markup Language (XML) is a format for marking up data with extensible tags that can be used by applications to determine the purpose of the data –In SQL Server 2000, it is possible to make a request for relational data in the form of XML and have the SQL OLE DB provider in SQL Server construct an XML document from the resulting data –It is possible to insert, update or delete data in SQL Server using XML syntax to describe the data that should be inserted, updated or deleted –Much of this functionality is available through stored procedures and general T-SQL statements that are executed against SQL Server 2000

49 9 XML Support in SQL Server 2000 There are features that can be used directly from a browser and do not require writing T-SQL code, including: –Executing a T-SQL Query in the URL window of a browser to retrieve the results as XML –Executing a query against a database object using the Xpath query syntax –Defining an XML template with a T-SQL statement that can be executed to return a set of data in the form of XML –Defining an XML View that uses XML Data Reduced (XDR) syntax instead of T-SQL syntax

50 9 Configuring Internet Information Services for XML Support Figure 9-19: The MMC snap-in to configure SQL SML support in IIS

51 9 Virtual Directories sqlisapi.dll –Implementation of the Internet Services API (ISAPI) filter that provides SQL-specific functionality to Internet Information Services (IIS) To allow IIS to differentiate between request files that exist in conventional virtual directories and XML requests, it is necessary to create SQL virtual directories

52 9 Virtual Directories Figure 9-20: The settings tab of the Demos Properties dialog box

53 9 Virtual Names Virtual names –Sub-namespace in IIS whose HTTP requests are forwarded to SQL Server 2000 as a specific type of request Figure 9-21: The Virtual Names tab of the Demos Properties dialog box

54 9 Virtual Names Dbobject –Allows access to objects in underlying database of the Allow XPath option is selected for the virtual directory Schema –Allows access to data in the underlying database without the use of a T-SQL statement Template –Allows access to an XML file that acts as a template if the “Allow template queries” option is selected for the virtual directory

55 9 Chapter Summary There are a variety of data formats and methods for moving data into and out of SQL Server 2000 For one-time importing and exporting of data, the bulk insert task and BCP utility can be executed using T-SQL statements and Windows command-line statements To apply user-defined transformations to data imports and exports, DTS packages can be managed in Enterprise Manager If it is necessary to retrieve or to insert data in XML format, the FOR XML clause can be used in T-SQL statements


Download ppt "9 Chapter Nine Extracting and Transforming Data with SQL Server 2000."

Similar presentations


Ads by Google