Siebel CRM Unicode Conversion – The DBA Perspective Brian Hitchcock OCP 8, 8i, 9i DBA Sun Microsystems DCSIT Technical.


Siebel CRM Unicode Conversion – The DBA Perspective
Brian Hitchcock, OCP 8, 8i, 9i DBA, Sun Microsystems
DCSIT Technical Services DBA, Brian Hitchcock, September 15, 2004

CRM Unicode Conversion
• Three separate presentations
  – 1) The overall conversion process
    • What we had, what we wanted, how to get there
    • Issues that come up during conversion
  – 2) Multi-byte data in the existing CRM db
    • What’s the issue, how did it happen
    • A general method to find and fix this problem
  – 3) The actual conversion
    • What really happened
    • Issues that came up and how they were resolved
• Focus on DBA issues, not Siebel application

How Did I Get Involved?
• Sleeping in a meeting…
• Heard someone say
  – “We told the users to stop entering Japanese into the CRM system but we aren’t sure they stopped”
• Woke up, said
  – “I’ve done that before…”
  – See “Case of the Missing Kanji”
• Don’t wake up in meetings…

What’s The Issue?
• Existing Siebel CRM system
  – Oracle
  – Single-byte character set (WE8ISO8859P1)
• Interface systems
  – Multi-byte character set(s) (UTF8)
  – Handle data between single-byte and multi-byte apps
• Want to convert to Unicode
  – Siebel, database, interfaces all should be UTF8
  – Eliminate interface systems

What we had
[Diagram: Siebel CRM Oracle db; Tcustdb Apac and Tcustdb Emea; Custdb Apac, Custdb Emea, and Custdb Amer; users in Amer, Emea, and Apac; Ordering System. Character set labels shown: UTF8, WE8ISO8859P1.]

What we wanted
[Diagram: Siebel CRM Oracle db; Custdb Apac, Custdb Emea, and Custdb Amer; users in Amer, Emea, and Apac; Ordering System. Character set labels shown: WE8ISO8859P1, UTF8, AL32UTF8.]

What We Wanted
• All data in one database
  – All languages
  – Unicode
• Eliminate interface systems
  – Reduce support costs
• Support increased CRM functionality
  – All data in one place
  – Supports new business functionality

Would you like fries with that?
• Unicode conversion includes
  – Oracle db
    • Convert to AL32UTF8 character set
      - Required by Siebel for Unicode
    • Upgrade to
      - Required to get AL32UTF8 character set
  – Remove Tcustdb databases
    • Modify triggers that link source db to Tcustdb

And A Shake?
• And, while you’re at it…
  – Application GUI
    • Retrieve different data, multi-byte, local language
  – Clients
    • Upgrade to Oracle (SQL*Plus)
• Lots of changes all at once
  – Testing
  – How to know impact of each change?

Converting to Unicode
• It’s easy – right?
  – Siebel CRM
    • Make some configuration changes
  – Oracle database
    • Export from single-byte database
    • Import into new db created with UTF8 char set
  – Testing
  – Done
• This is the ‘management’ view

What Is Unicode?
• International standard
• Collection of characters
  – Covers most of the world’s languages
    • Chinese poetry?
  – All characters have unique byte-code
• Application developers
  – Support Unicode
  – No need to worry about specific languages

You Make This Stuff Up!
• What follows can be found in
  – Oracle9i Database Globalization Support Guide
  – Release 2 (9.2)
  – Part Number A
• Or, you can trust me…
• Character sets, Unicode
  – Consist of set of characters
  – Encoding of the characters to byte-codes

Single Byte Encoding Schemes
• 7-bit encoding schemes
  – Single-byte 7-bit, up to 128 characters
  – Normally support just one language
  – US7ASCII
• 8-bit encoding schemes
  – Single-byte 8-bit, up to 256 characters
  – Often support a group of related languages
  – WE8ISO8859P1

8859P1 Character set
[Figure: code chart for Oracle character set WE8ISO8859P1; hex 0x41 is A]

Multi-byte Encoding Schemes
• Fixed-width
  – Each character occupies a fixed number of bytes
  – Faster text processing
  – AL16UTF16
• Variable-width
  – One or more bytes to represent a single character
  – Saves disk space (typically lots of disk space)
  – UTF8, AL32UTF8
• Shift-sensitive variable-width
  – Use control codes to differentiate between single-byte and multi-byte characters with the same code values
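The fixed-width versus variable-width distinction can be seen by encoding a few characters both ways. This is a minimal sketch, assuming Python's "utf-16-be" and "utf-8" codecs as stand-ins for Oracle's fixed-width AL16UTF16 and variable-width AL32UTF8; the sample characters and the `widths` helper are illustrative only.

```python
# Python codec names are stand-ins (an assumption) for Oracle character sets:
# "utf-16-be" for the fixed-width AL16UTF16, "utf-8" for AL32UTF8.
samples = ["A", "é", "日"]  # ASCII, accented Latin, Japanese kanji


def widths(text_items, codec):
    """Return the encoded byte length of each item under the given codec."""
    return [len(item.encode(codec)) for item in text_items]


fixed = widths(samples, "utf-16-be")
variable = widths(samples, "utf-8")
print(fixed)     # [2, 2, 2]
print(variable)  # [1, 2, 3]
```

Every sample costs two bytes in the fixed-width encoding, while the variable-width encoding charges only what each character needs.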

UTF8 Byte Storage
[Figure: different characters occupy 1, 2, 3 or 4 bytes]

AL32UTF8
• UTF8
  – Supports Unicode 3.0 since
  – Up to 3 bytes per character
  – Supplemental characters
    • Pairs of 3 byte character codes
• AL32UTF8
  – Supports Unicode 3.1 (latest version?), since 9i
  – Up to 4 bytes per character
    • Supplemental characters
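The difference in how the two character sets store a supplemental character can be approximated in Python. This is a sketch, assuming the "pairs of 3 byte character codes" means each UTF-16 surrogate encoded separately, which is imitated here with Python's `surrogatepass` error handler; it is not Oracle's own conversion code.

```python
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, a supplemental character

# Standard UTF-8 (the AL32UTF8 behavior): one 4-byte sequence.
four_byte = clef.encode("utf-8")

# Approximation of Oracle UTF8 (an assumption): split the character into its
# UTF-16 surrogate pair, then encode each surrogate as its own 3-byte sequence.
utf16 = clef.encode("utf-16-be")
surrogates = [chr(int.from_bytes(utf16[i:i + 2], "big")) for i in (0, 2)]
six_byte = b"".join(s.encode("utf-8", "surrogatepass") for s in surrogates)

print(len(four_byte), four_byte.hex())  # 4 f09d849e
print(len(six_byte), six_byte.hex())    # 6 eda0b4edb49e
```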

Confused?
• Unicode, a set of characters
• Character set, encoded set of characters
• Encoding scheme, UTF-8, ISO standard for variable width encoding of Unicode character set
• UTF8, Oracle implementation of UTF-8
• If you’re not confused, you aren’t paying attention!

Changing Character Set
• You can simply alter the database (right?)
• Only works if
  – New character set is strict superset of existing character set
  – For all characters in existing character set
    • All exist in new character set
    • All have exact same code in new character set
• Example
  – WE8MSWIN1252 (superset, includes euro)
  – WE8ISO8859P1 (subset)
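The strict-superset rule can be sketched as a byte-by-byte comparison. This example assumes Python's "latin-1" and "cp1252" codecs as stand-ins for WE8ISO8859P1 and WE8MSWIN1252, and it checks only the printable ranges of ISO 8859-1; Oracle's own character set definitions decide what an ALTER DATABASE will actually accept. The helper name `is_strict_superset` is hypothetical.

```python
def is_strict_superset(old_codec, new_codec, byte_values):
    """True if every byte decodes to the same character in both codecs."""
    for value in byte_values:
        raw = bytes([value])
        try:
            if raw.decode(old_codec) != raw.decode(new_codec):
                return False
        except UnicodeDecodeError:
            return False
    return True


# Printable ranges of ISO 8859-1 (an assumption about what matters here):
# 0x20-0x7E and 0xA0-0xFF. The 0x80-0x9F control range is left out.
printable = list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))

print(is_strict_superset("latin-1", "cp1252", printable))  # True
print("€".encode("cp1252"))                                # b'\x80'
```

The euro sign has a code in the superset but none in the subset, which is why the change works in one direction only.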

Complexities
• Even for the same character
  – Different encoding in different character set
• Example
  – Latin (Western European) character á
  – E1 in WE8ISO8859P1
  – C3A1 in UTF8
• If existing character not in new char set
  – ? (replacement character) displayed
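Both points can be checked directly in Python, again assuming "latin-1" and "utf-8" as stand-ins for WE8ISO8859P1 and UTF8. The kanji used for the replacement case is just an illustrative character that the single-byte set cannot hold.

```python
ch = "á"
latin1_hex = ch.encode("latin-1").hex()
utf8_hex = ch.encode("utf-8").hex()
print(latin1_hex)  # e1
print(utf8_hex)    # c3a1

# A character missing from the target set comes out as "?" when the
# conversion is told to substitute rather than fail.
replaced = "日".encode("latin-1", errors="replace")
print(replaced)    # b'?'
```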

Cure
• Create new database
  – Using new character set
• Extract data from old database
• Insert data into new database
• Export/import is most often used
  – Could use other methods
    • Extract data to flat files
    • SQL*Loader

Database Conversion
• Serial
  – Upgrade source, export, drop schemas, import
• Parallel
  – Create target
  – Export source
  – Import to target
• Chose Parallel
  – Source still available after target in use
    • User tablespace issue for example

Impact of Unicode
• Table columns must be widened
• Existing column
  – Holds up to 20 Latin characters
  – WE8ISO8859P1, each Latin character 1 byte
  – VARCHAR2(20)
• New column
  – UTF8
  – Each accented Latin character (code above 0x7F) occupies 2 bytes
  – Need VARCHAR2(40)
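The widening arithmetic can be reproduced with a worst-case value of 20 accented characters. A small sketch, assuming "latin-1" and "utf-8" stand in for the old and new database character sets and that the column length is counted in bytes:

```python
name = "é" * 20  # 20 accented Latin characters, the worst case for this column

old_bytes = len(name.encode("latin-1"))  # storage in the single-byte database
new_bytes = len(name.encode("utf-8"))    # storage after conversion
fits_old_width = new_bytes <= 20         # would VARCHAR2(20), in bytes, still hold it?

print(old_bytes, new_bytes, fits_old_width)  # 20 40 False
```

Plain ASCII text would still fit in 20 bytes; the doubling applies only to characters above 0x7F.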

Impact of Unicode
• Worst case
  – AL32UTF8 can have up to 4 bytes per character
  – For all existing character columns
  – Need to expand by 4x
• Disk space
  – CHAR – 4x disk space
  – VARCHAR2 – 1x to 4x
    • Depends on specific characters inserted

Impact of Unicode
• Tables
  – Columns must be wider
  – Each character can be up to 4 bytes
• Triggers, PL/SQL code
  – Modify to handle multi-byte data
• End-user front-end (browser)
  – Reconfigure to
    • Display multi-byte data, accept multi-byte data
• All app components must handle Unicode

User Impact
• VARCHAR2, AL32UTF8
  – 4000 byte limit
• How many characters can I enter?
  – Latin, 2000 (worst case, every character accented at 2 bytes each)
  – Japanese, 4000/3 (about 1333)
    • If moving from Japanese character set
    • 2 bytes per character
    • Max characters reduced by 1/3
  – Supplemental characters, 1000
    • Characters like ‘treble clef’
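The character counts follow from dividing the 4000-byte limit by the bytes each kind of character needs. A minimal sketch, assuming Python's "utf-8" codec matches AL32UTF8 byte lengths; `max_chars` is a hypothetical helper, not an Oracle function.

```python
LIMIT_BYTES = 4000  # VARCHAR2 byte limit quoted on the slide


def max_chars(sample_char, limit=LIMIT_BYTES):
    """How many copies of sample_char fit in the byte limit."""
    return limit // len(sample_char.encode("utf-8"))


print(max_chars("A"))           # 4000 (plain ASCII)
print(max_chars("é"))           # 2000 (accented Latin)
print(max_chars("日"))          # 1333 (Japanese)
print(max_chars("\U0001D11E"))  # 1000 (treble clef, supplemental)
```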

Disk Space
• How much multi-byte data do you have?
  – We found all of ours
  – Typically, 5-10%
  – See 2) Multi-byte data in the existing CRM db
• Compute disk space requirement
  – If you have 5% multi-byte character data
  – Need maximum of 20% more disk space
• Will you add more multi-byte data?
  – Once you have converted to Unicode…

Expanding Columns
• Need to expand lots of columns
  – Individual SQL statements
  – Lots of SQL to generate
• How to make Oracle do this for us?
  – Export existing database
  – New database has init.ora parameter
    • NLS_LENGTH_SEMANTICS = CHAR
  – Import into new database
    • All character columns widened as tables created
      - VARCHAR(10) becomes VARCHAR(40)

Character Semantics – 9i
• Change column data types
  – VARCHAR2(10 byte)
  – VARCHAR2(10 char)
  – Requires SQL statement for each column
• NLS_LENGTH_SEMANTICS
  – Init.ora parameter
  – What happens if init.ora changed?
  – BYTE or CHAR
  – All character columns created with byte or char
  – Handles PL/SQL code as well
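To give a sense of what "a SQL statement for each column" means in practice, the statements could be generated from a column list. This is a sketch only: the table and column names are hypothetical, the list would normally come from the data dictionary, and this is not the script Siebel supplies.

```python
def widen_statements(columns):
    """Build one ALTER TABLE statement per (table, column, length) tuple."""
    return [
        f"ALTER TABLE {table} MODIFY ({column} VARCHAR2({length} CHAR));"
        for table, column, length in columns
    ]


# Hypothetical column list for illustration.
columns = [("CONTACT", "LAST_NAME", 50), ("CONTACT", "FIRST_NAME", 50)]
statements = widen_statements(columns)
for stmt in statements:
    print(stmt)
```

Setting NLS_LENGTH_SEMANTICS = CHAR before the tables are created avoids having to generate and run such a statement for every column.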

The Siebel Process
• Create target database
• Shutdown app
• Upgrade Oracle client
• Source db character set
• Run migrate.sh script
• Full export source
• Import to target db
• Modify target db

Create target database
• Oracle
• Character set AL32UTF8
• Character semantics CHAR
• Tablespace names same as source db
  – 15% more space than source db
• Locally managed, uniform 130k
• Auto UNDO, tablespace

Shutdown app
• Shutdown various app servers
• Shutdown source db
• Cold backup
• Upgrade source db to
  – Migrate to

Upgrade Oracle client
• Upgrade Oracle client software to
  – For all machines that have SQL*Plus
  – Upgrade to
  – Install
    • Client install only
  – Tar up client ORACLE_HOME
  – ftp, untar on machines that need SQL*Plus

Source db character set
• Fix any user tablespace issues
  – Import won’t fix them for you
• Change source db character set
  – WE8MSWIN1252
    • Siebel requirement
    • Contains euro symbol
    • Is a strict superset of WE8ISO8859P1

Run migrate.sh script
• Siebel supplied script
  – Generates various scripts
    • Expand.ksh
      - Widen columns for Unicode
    • Impexp06.ksh
      - Import individual tables for large dbs
      - We use full export/import instead
• Run sun_expand.sql
  – Widen columns in tables outside Siebel schemas

Export Source, Import Target
• Full export of source db
  – Source db is now
    • NLS_LANG
      - AMERICAN_AMERICA.AL32UTF8
• Import into target db
  – Target db created as
    • NLS_LANG
      - AMERICAN_AMERICA.AL32UTF8

The conversion setup
[Diagram: Source Db (WE8ISO8859P1), Source Db (WE8MSWIN1252), export, import, Target Db (AL32UTF8)]

Modify target db
• Run impexp06.ksh
  – Handles sequences etc.
• Run check_schema.sql
  – Find columns that didn’t get widened
• Various changes on Siebel App side
• Verify db links to Custdb databases

Conversion Complete?
• Siebel process is done
• Fix any data issues
  – Multi-byte character data in source db
  – Convert properly to AL32UTF8
• Testing Unicode changes
  – GUI changes
  – Performance
    • Unicode processing
    • Users accessing different data

Multi-byte Data In Source Db?
• Source db is WE8ISO8859P1
  – Single-byte character set
  – Doesn’t support multi-byte characters
• That’s the official story
• The reality is somewhat different
• What, if any, multi-byte data is in source db?
  – How to determine correct character set?
  – How to find, how to fix?
  – Japanese, Russian, others?
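One possible way to flag suspect values, offered only as an illustrative heuristic and not as the method described in the second presentation: treat a column value as raw bytes and ask whether its high-bit bytes form valid UTF-8. The function name `looks_like_utf8` is hypothetical, and the check says nothing about other multi-byte encodings such as Japanese legacy character sets.

```python
def looks_like_utf8(raw):
    """Heuristic: high-bit bytes are present and together form valid UTF-8."""
    if all(b < 0x80 for b in raw):
        return False  # pure ASCII, nothing to decide
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_utf8("日本".encode("utf-8")))    # True: UTF-8 bytes stored as-is
print(looks_like_utf8("café".encode("latin-1")))  # False: genuine single-byte data
print(looks_like_utf8(b"plain"))                  # False: ASCII only
```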

CRM Unicode Conversion
• Three separate presentations
  – 1) The overall conversion process
    • What we had, what we wanted, how to get there
    • Issues that come up during conversion
  – 2) Multi-byte data in the existing CRM db
    • What’s the issue, how did it happen
    • A general method to find and fix this problem
  – 3) The actual conversion
    • What really happened
    • Issues that came up and how they were resolved
• Focus on DBA issues, not Siebel application