Presentation is loading. Please wait.

Presentation is loading. Please wait.

MySQL and GRID Gabriele Carcassi STAR Collaboration 6 May 2002 - Proposal.

Similar presentations


Presentation on theme: "MySQL and GRID Gabriele Carcassi STAR Collaboration 6 May 2002 - Proposal."— Presentation transcript:

1 MySQL and GRID Gabriele Carcassi STAR Collaboration 6 May 2002 - Proposal

2 Why?  STAR uses MySQL to keep track of data files (file catalog)  There are already many projects concerning file catalogs  MAGDA already uses MySQL to store file catalog information  We will not concentrate on this aspect

3 Why?  STAR reconstruction jobs need access to both data files and databases Job DB Raw data file Reconstructed data file Calibration constants Detector geometry Detector readings

4 Database replication  DB is being replicated through MySQL replication  Master/slave approach Master DB Slave DB Slave DB Slave DB write access update forwarded read access

5 GRID and DB replicas  Database replication is essential because:  Increase data availability  Allows locations to run jobs independently (in case of network congestion)  Correction to the DB are available to all locations  In order to successfully execute jobs in a GRID environment, we need database replication to be somehow integrated

6 Steps 1.Aid DB administrator to manage complexity of database replication 2.Tools to install a mirror using GRID technology 3.Database catalogs and integration with file catalogs 4.Integrate GRID authentication

7 1. Manage complexity  The more server you have, the more complicated is to manage the system  Build a GUI that helps the database administrator (DBA) to have a general picture

8 1. Manage complexity  Keep track of all the servers, and the relationship between each other.  Help manage more sophisticated network topologies, for example: The arrows shows the direction of the updates NB: This is an example, not STAR topology. BNLLBL Master BNL slaves Master mirrorSlaves Less traffic over WAN

9 1. Manage complexity  Configurations  Aid the comparison between server characteristics (i.e. OS version, MySQL version)  Aid the comparison of the settings of the different database servers  User management  Create and delete users on a group of servers

10 1. Manage complexity  Consistency checks  Compare row counts of different replicas  Compare master log pointer with slaves  Check slave connections to the master  Evaluate the replication  Monitor CPU and network activity of the servers to help decide if the current number of servers is sufficient

11 2. Creating a new mirror  MySQL keeps the slave synchronized, but you have to manually copy the db files during slave initialization  When the DB is already in place, you might have to copy each file by itself, since the total might exceed 2GB  GRID technologies can be used to transfer the first copy of the database from site to site

12 2. Creating a new mirror  New mirror  Steps for mirror creation 1. Install MySQL (manual?) 2. Copy database files (through GRIDFtp) 3. Configure master and slave  Creating/deleting a database on an existing mirror

13 2. Creating a new mirror  Ease of use  Integration with the previous GUI  Hide as much details as possible and encourage good configuration policies (i.e. create a user with suitable permissions to be used by the slave connection)

14 3. Catalogs  Jobs need to know to which database server to connect (for now, this is done by XML files)  In a GRID environment, jobs will contact a file catalog to determine the location of the files to be used  The scheme for the database catalog shouldn’t be different, and it should be as connected as possible to the file catalog

15 3. Catalogs  Database catalog  A job should query the catalog to know to which database server to connect  If possible, file catalogs could be used directly. For example, instead of a physical file location, the catalog could return the parameters for the database connection (i.e. “mydb.star.bnl.gov:3301”)  If no connection can be established, the job might ask the catalog for an alternative server

16 3. Catalogs  Administration  The DBA should be able to establish the policies that decide to which server a given job should connect (i.e. IP address, user groups)  Integration with the GUI  The GUI would use the catalog to keep track of the different servers  The GUI would help the DBA to assign policies, and check that the catalog information is not corrupted

17 4. Authentication  GRID authentication should be integrated to the whole scheme  Connection between database servers  Connection to the catalog  Connection job-database server

18 4. Authentication  MySQL 4.0 will integrate SSL for both user connections and between server connections  MySQL can be instructed to accept only certificate authentication  MySQL 4.0 is still in alpha and SSL is not GSI, but:  Some issues should be already addressed  A patch to enable GSI on SSH is available and might provide further insights

19 4. Authentication  Integration with the GUI  Just one authentication at the beginning of the application  Same authentication used to authenticate to the servers  User management


Download ppt "MySQL and GRID Gabriele Carcassi STAR Collaboration 6 May 2002 - Proposal."

Similar presentations


Ads by Google