CLIENT REGISTRY
OpenEMPI: Operations Support Training
SYSNET International, Inc.
OpenEMPI Software Stack
  Database Server: PostgreSQL open source database, version 9.1
  Application Server: JBoss version
  Web Server: Apache HTTP Server, version 2.4
Overview of Operational Support Tasks
  Starting the Application Server
  Stopping the Application Server
  Starting the Database Server
  Stopping the Database Server
  Changing database account passwords
  Managing OpenEMPI user accounts
  Backing up the application software
  Backing up the Client Registry data
  Viewing log files
  Monitoring the server load
Starting the Application Server
  JBoss installation directory:
    /home/sysnet/servers/jboss
  Set the environment:
    $ source /home/sysnet/openempi/openempi_env.sh
  Start the server:
    $ cd /home/sysnet/servers/jboss
    $ bin/run.sh
Server's Memory Configuration
  The memory settings will be configured appropriately; if necessary, adjustments can be made over time
  In bin/run.conf:
    JAVA_OPTS="-Xms128m -Xmx2048m -XX:MaxPermSize=512m …
  -Xms: sets the starting heap size
  -Xmx: sets the maximum heap size
  -XX:MaxPermSize: sets the maximum size of the memory allocated to storing class information
Stopping the Application Server
  JBoss installation directory:
    /home/sysnet/servers/jboss
  Set the environment:
    $ source /home/sysnet/openempi/openempi_env.sh
  Stop the server:
    $ cd /home/sysnet/servers/jboss
    $ bin/shutdown.sh --shutdown
Starting/Stopping the Database Server
  PostgreSQL 9.1 is installed as a Unix service
  Starting the server:
    $ sudo /etc/init.d/postgresql start
  Stopping the server:
    $ sudo /etc/init.d/postgresql stop
Database Accounts
  Account postgres is privileged; it is created when the PostgreSQL software is installed
  Account openempi is created during installation of OpenEMPI
  To change the openempi password, first connect to the Postgres server with the psql client application:
    $ psql --username=openempi --host=localhost openempi
Change Database Password
  Once connected, use the ALTER USER command:
    ALTER USER openempi WITH PASSWORD 'xxxxxxxxx';
  Note: 'xxxxxxxxx' is just a placeholder for the password; choose a strong password
  OpenEMPI must then be told that the database password has changed:
    $ cd /home/sysnet/openempi/conf
    $ vi jdbc.properties
  Set jdbc.password to the new password:
    jdbc.username=openempi
    jdbc.password=xxxxxxxxx
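The edit to jdbc.properties can also be scripted instead of done by hand in vi. The following is a minimal sketch, not part of the OpenEMPI distribution: set_jdbc_password is a hypothetical helper name, and it assumes the file contains a single line starting with jdbc.password= as shown on the slide.

```sh
# Sketch: rewrite the jdbc.password line of a properties file.
# set_jdbc_password is a hypothetical helper, not an OpenEMPI tool.
set_jdbc_password() (
  props_file=$1
  new_password=$2
  tmp_file=$(mktemp) || exit 1
  # Pass the password through the environment so awk treats it as data.
  NEW_PW=$new_password awk '
    /^jdbc\.password=/ { print "jdbc.password=" ENVIRON["NEW_PW"]; next }
    { print }
  ' "$props_file" > "$tmp_file" || exit 1
  cp "$props_file" "$props_file.bak" &&   # keep a copy of the previous file
  cat "$tmp_file" > "$props_file" &&      # overwrite in place, keeping permissions
  rm -f "$tmp_file"
)

# Example call (placeholder password):
#   set_jdbc_password /home/sysnet/openempi/conf/jdbc.properties 'xxxxxxxxx'
```

The .bak copy still contains the old password, so it should be removed once the change has been verified. The application server will most likely need a restart before it uses the new password.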
Managing OpenEMPI Accounts Use the security tab to manage accounts and privileges
Manage Roles From the Security tab select Manage Roles
Manage Roles Roles have one or more permissions assigned to them.
Backup Database
  Postgres includes the pg_dump tool for backups:
    $ pg_dump --username=openempi --password \
        --host=localhost openempi | gzip > \
        /home/sysnet/backups/openempi-db-backup-mm-dd-yyyy.sql.gz
  pg_dump backs up everything within the database
  A script will be developed to automate this process
  Backups should be done on a daily basis
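As an illustration of what such an automation script might look like, here is a minimal sketch. The function name backup_openempi_db is hypothetical. Because a scheduled job cannot answer a password prompt, the sketch uses pg_dump's --no-password option and assumes the openempi password is stored in the PostgreSQL password file (~/.pgpass) of the account running the backup.

```sh
# Sketch of a daily backup helper (hypothetical, not the final script).
# Assumes the database password is available through ~/.pgpass.
backup_openempi_db() (
  backup_dir=${1:-/home/sysnet/backups}
  stamp=$(date +%m-%d-%Y)                 # mm-dd-yyyy, as in the file name above
  out="$backup_dir/openempi-db-backup-$stamp.sql.gz"
  mkdir -p "$backup_dir" || exit 1
  # Dump to a temporary file first so the exit status of pg_dump is checked
  # before anything is compressed.
  if pg_dump --username=openempi --no-password --host=localhost openempi \
       > "$out.tmp.sql" && gzip -c "$out.tmp.sql" > "$out"; then
    rm -f "$out.tmp.sql"
    echo "$out"                           # print the path of the finished backup
  else
    rm -f "$out.tmp.sql" "$out"
    echo "backup failed" >&2
    exit 1
  fi
)
```

Calling backup_openempi_db with no argument writes to /home/sysnet/backups; a different directory can be passed as the first argument. Running it once a day from cron would satisfy the daily schedule.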
Backup Software
  All the software that needs to be preserved resides under:
    /home/sysnet
  Filesystem backups may be performed at the virtual machine level
  The backup strategy will be refined once the server hosting the client registry is made available to us
Log Files
  The primary log file for the Client Registry is:
    /home/sysnet/openempi/openempi.log
  It stores operational information at a configurable log level, set in:
    /home/sysnet/openempi/conf/log4j.properties
    log4j.rootCategory=warn, R, O
  The log level takes one of the values below (in order of increasing detail):
    error, warn, info, debug, trace
  Application server's log file:
    /home/sysnet/servers/jboss GA/server/default/log/server.log
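A quick way to check a log file for problems is to count the entries at a given level. The sketch below is an assumption-laden example: count_log_level is a hypothetical helper, and it assumes each log line contains its level as an upper-case word (ERROR, WARN, ...), which depends on the pattern configured in log4j.properties.

```sh
# Sketch: count log lines that mention a given level as a whole word.
# count_log_level is a hypothetical helper; the log line format is assumed.
count_log_level() (
  level=$1
  log_file=${2:-/home/sysnet/openempi/openempi.log}
  grep -c -w "$level" "$log_file"
)

# Example calls:
#   count_log_level ERROR
#   count_log_level WARN /path/to/another.log
```

To read the matching entries rather than count them, grep -w ERROR on the same file piped into tail shows the most recent ones.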
System Monitoring
  It is useful to periodically monitor the performance of the server:
    $ vmstat 5
  (sar is an alternative)
    procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
     r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
  You want to verify that:
    The system has plenty of free memory
    Disk I/O is low
    CPU utilization is not too high (not consistently in the 90% range)
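The CPU check can be automated by reading the id (idle) column of the vmstat output. The following is a sketch only: check_cpu_idle is a hypothetical helper, and the default threshold of 10% idle (that is, CPU more than 90% busy) is an assumption taken from the guideline above.

```sh
# Sketch: read vmstat output on stdin and flag samples with low idle CPU.
# check_cpu_idle is a hypothetical helper; $1 is the minimum idle percentage.
check_cpu_idle() {
  awk -v min_idle="${1:-10}" '
    # Column header row: remember which column holds "id".
    $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
    # Data rows start with a number; the banner row is skipped.
    col && $1 ~ /^[0-9]+$/ && $col + 0 < min_idle {
      busy++
      print "low idle CPU: " $col "%"
    }
    END { exit (busy > 0) }   # non-zero exit status if any sample was too busy
  '
}

# Example call: twelve samples, five seconds apart
#   vmstat 5 12 | check_cpu_idle 10
```

The first sample printed by vmstat reports averages since boot, so a single flagged line matters less than a sustained run of them.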
Why do we need an EMPI?
  A patient visits multiple separate healthcare providers:
    Radiology: patient known as John Smythe
    Hospital: patient known as John Smith
    Laboratory: patient known as J. M. Smith
  Enterprise MPI:
    Patient ID | John Smith | …
  The three entries are merged into a single record
Approaching Record Matching
Client Registry Architecture Overview of the software architecture of OpenEMPI
OpenEMPI Architecture Utilizing the idea of a service, interchangeable implementations of services can be plugged into the system transparently
Blocking Algorithms
  Comparing all records pairwise is quadratic in the number of records
    Two files of 300,000 records each generate 90 billion pairs
  Blocking variables are used to partition the records, so only records in the same block are compared
  Multiple passes with different blocking variables are used, so that an error in one blocking variable does not cause a match to be missed
  Selecting blocking variables:
    High selectivity factor
    Preferably uniformly distributed
  A wide variety of blocking algorithms is available:
    Sorted Neighborhood
    Bigram Indexing
    Canopy Clustering
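The effect of blocking on the number of comparisons can be checked with simple arithmetic. The sketch below uses hypothetical function names and makes an idealized assumption: the blocking variable splits both files into equally sized blocks (the figure of 1,000 blocks is only an example), and the file sizes divide evenly by the number of blocks.

```sh
# Sketch: number of record pairs to compare, with and without blocking.
# Assumes perfectly uniform blocks; real blocking variables are less even.
pairs_without_blocking() { echo $(( $1 * $2 )); }

# $1, $2: records in each file; $3: number of equally sized blocks
pairs_with_blocking() { echo $(( ($1 / $3) * ($2 / $3) * $3 )); }

# pairs_without_blocking 300000 300000     prints 90000000000
# pairs_with_blocking 300000 300000 1000   prints 90000000
```

Under these assumptions, 1,000 uniform blocks cut the work by a factor of 1,000, which is why a selective, uniformly distributed blocking variable is preferred.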
Field Comparison/Distance Algorithms
  Phonetic Encoding Algorithms
    Soundex: the oldest and best-known algorithm
    Phonex: aims to improve on Soundex by pre-processing names
    Phonix: extension of Phonex with > 100 rules
    NYSIIS: New York State Identification Intelligence System
    Metaphone/Double Metaphone
  Approximate String Matching
    Levenshtein or Edit Distance
    Longest Common Substring (LCS)
    Q-Grams
    Jaro/Jaro-Winkler
  Combinations of techniques
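As a concrete example of approximate string matching, the sketch below computes the Levenshtein (edit) distance between two names using the standard dynamic-programming recurrence. It is an illustration written with awk, not the implementation used inside OpenEMPI; the function name levenshtein is chosen for this example.

```sh
# Sketch: Levenshtein distance between two strings (illustration only).
levenshtein() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    la = length(a); lb = length(b)
    for (i = 0; i <= la; i++) d[i, 0] = i     # delete all of a
    for (j = 0; j <= lb; j++) d[0, j] = j     # insert all of b
    for (i = 1; i <= la; i++)
      for (j = 1; j <= lb; j++) {
        cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
        m = d[i-1, j] + 1                                         # deletion
        if (d[i, j-1] + 1 < m) m = d[i, j-1] + 1                  # insertion
        if (d[i-1, j-1] + cost < m) m = d[i-1, j-1] + cost        # substitution
        d[i, j] = m
      }
    print d[la, lb]
  }'
}

# levenshtein Smith Smythe   prints 2
```

Smith becomes Smythe with one substitution (i to y) and one insertion (e), so the two surnames are two edits apart.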
Matching Algorithms
  A variety of algorithms is available, both deterministic and probabilistic
  Fellegi-Sunter is the most popular probabilistic algorithm
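For reference, the textbook form of the Fellegi-Sunter match weight is sketched below. The symbols are not defined in this training material and are introduced here for illustration: m_i is the probability that field i agrees when two records are a true match, and u_i is the probability that it agrees when they are not. How OpenEMPI estimates and configures these values is not covered here.

```latex
w_i =
\begin{cases}
\log_2 \dfrac{m_i}{u_i} & \text{if field } i \text{ agrees} \\[6pt]
\log_2 \dfrac{1 - m_i}{1 - u_i} & \text{if field } i \text{ disagrees}
\end{cases}
\qquad
W = \sum_i w_i
```

A record pair is classified as a match when the total weight W is above an upper threshold, as a non-match when it is below a lower threshold, and as a possible match needing review when it falls in between.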