1
Optimized Data Migration within a System of Linked Medical Research Databases
By Jared Christopherson, U. of Connecticut
2
Problem to Address
- Medical research requires connections to multiple hospitals, institutions, or online databases
- Data must be compiled manually
- Process is time-consuming
3
General Project Goal
4
Project Goals:
- Present data as though from a single source
- Give the researcher flexibility with viewing the data
- Optimize data flow with site caching
5
Real World Issue: Data Formatting
7
Solution: Master Template
8
Master Templates and Display Templates
9
Basic Functionality
- Provides a simple search for users
- Researchers can select a pre-set Display Template to show only the data relevant to their needs
- Queries each database individually according to the Master Template
- Returns results and (optionally) compiles them into a single list
- Uses AJAX to return results for each database
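The search flow above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's actual code: the site names, field names, and the `search` and `normalize` helpers are hypothetical, and each database is represented by a plain Python callable rather than a real connection. The slides describe AJAX delivering results per database in the browser; the sketch covers only the server-side mapping and merging.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
QueryFn = Callable[[str], List[Record]]

# Hypothetical Master Template: for each site, maps a shared (master)
# field name to that site's local field name.
MASTER_TEMPLATE: Dict[str, Dict[str, str]] = {
    "hospital_a": {"patient_id": "pid", "diagnosis": "dx"},
    "hospital_b": {"patient_id": "PatientID", "diagnosis": "Diagnosis"},
}


def normalize(site: str, row: Record) -> Record:
    """Rename one site's local fields to the master field names."""
    mapping = MASTER_TEMPLATE[site]
    return {master: row[local] for master, local in mapping.items() if local in row}


def search(term: str, sites: Dict[str, QueryFn], display: List[str],
           compile_results: bool = True):
    """Query each site separately, keep only the Display Template fields,
    and optionally compile everything into a single list."""
    per_site: Dict[str, List[Record]] = {}
    for site, query in sites.items():
        rows = [normalize(site, r) for r in query(term)]
        per_site[site] = [{f: r[f] for f in display if f in r} for r in rows]
    if compile_results:
        return [r for rows in per_site.values() for r in rows]
    return per_site
```

Returning the per-site dictionary when `compile_results` is false mirrors the "optionally compiles" point: a front end could render each site's results as they arrive instead of waiting for the merged list.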
11
Caching and Optimization
- Goal: researchers should have the fastest possible access to the information they seek
- X-ray or MRI images could be 2-5 MB each
- What if researchers in the US consistently need access to data on a server in Asia?
- Local access would be fastest
12
Caching Goal
13
Caching and Optimization: Possible Solutions
- Move everything to a central server
- Move records around as they are accessed
- Cache everything
- Cache databases based on usage
14
Query and Result Set – No Caching
15
Caching Process
16
Caching Complete
17
Region Caching
18
Database Caching Queue – What to cache?
- For each region, determine the most heavily used external servers, based on each server's percentage of that region's queries
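One way to turn "percentage of queries" into a per-region queue is shown below. This is a rough sketch under assumptions: the `caching_queue` function, the shape of the `counts` mapping, and the 10% cutoff are all hypothetical, since the slides do not state a threshold.

```python
from typing import Dict, List, Tuple


def caching_queue(counts: Dict[Tuple[str, str], int], region: str,
                  min_share: float = 0.10) -> List[Tuple[str, float]]:
    """Return (database_id, share) pairs for one region, largest share first.

    counts maps (region_id, database_id) to a query count.
    min_share is an assumed cutoff; databases below it are not queued.
    """
    regional = {db: n for (r, db), n in counts.items() if r == region}
    total = sum(regional.values())
    if total == 0:
        return []  # no recorded queries for this region
    shares = [(db, n / total) for db, n in regional.items()]
    queue = [(db, share) for db, share in shares if share >= min_share]
    return sorted(queue, key=lambda item: item[1], reverse=True)
```

Using a share of the region's traffic rather than a raw count keeps busy and quiet regions comparable.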
19
Database Caching Queue
- Need a method to determine the most heavily requested external databases for each region
- Track statistics:
  - Convert IP address -> region whenever a user performs a search
  - Increment result count for the record that keeps track of the region ID and database ID
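The statistics tracking described above could look roughly like this. It is a minimal sketch, not the project's implementation: the network-to-region table, the region and database IDs, and the `record_search` helper are invented for illustration, and an in-memory `Counter` stands in for whatever table actually stores the counts.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup table: network prefix -> region ID.
REGION_BY_NETWORK = {
    ipaddress.ip_network("10.0.0.0/8"): "us",
    ipaddress.ip_network("192.168.0.0/16"): "asia",
}

# One counter per (region_id, database_id) pair.
usage: Counter = Counter()


def region_for_ip(ip: str) -> str:
    """Convert an IP address to a region ID using the lookup table."""
    addr = ipaddress.ip_address(ip)
    for network, region in REGION_BY_NETWORK.items():
        if addr in network:
            return region
    return "unknown"


def record_search(ip: str, database_id: str, result_count: int = 1) -> str:
    """Increment the count kept for this region and database."""
    region = region_for_ip(ip)
    usage[(region, database_id)] += result_count
    return region
```

A production system would more likely resolve regions with a GeoIP database and persist the counts, but the keying by region ID and database ID is the part the slide specifies.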
20
DB Queue to Cache
21
Caching and Optimization: Where to cache data
- Real-world constraints:
  - allow_cache
  - supersite
  - bandwidth
  - cache_size
23
Caching and Optimization: Script Process
- Runs at a frequency set by the administrator
- The process continues for each region: the program assigns data to servers with progressively lower bandwidth and cache_size scores until all the server space in that region is exhausted
24
Caching and Optimization: Script Process
- At the end, each region should have as many local copies of the most frequently requested databases as possible
- Cached copies are read-only
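The assignment step for a single region might be sketched as a greedy loop. Everything here is an assumption layered on the slides: the `Server` class, the `assign_caches` function, the units, and the rule that the best server with enough free space wins are illustrative choices. The `supersite` constraint is left out because the slides do not define it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Server:
    name: str
    allow_cache: bool   # whether the site permits hosting cached copies
    bandwidth: float    # higher is better (units assumed)
    cache_size: float   # space available for caches (assumed MB)


def assign_caches(servers: List[Server], queue: List[str],
                  db_sizes: Dict[str, float]) -> Dict[str, List[str]]:
    """Place queued databases on one region's servers, best server first.

    queue lists database IDs from most to least requested. Each database
    goes to the highest-ranked server that still has room; databases that
    fit nowhere are skipped.
    """
    eligible = sorted((s for s in servers if s.allow_cache),
                      key=lambda s: (s.bandwidth, s.cache_size), reverse=True)
    free = {s.name: s.cache_size for s in eligible}
    placed: Dict[str, List[str]] = {s.name: [] for s in eligible}
    for db in queue:
        for server in eligible:
            if db_sizes[db] <= free[server.name]:
                placed[server.name].append(db)
                free[server.name] -= db_sizes[db]
                break
    return placed
```

For example, with a fast server offering 500 MB and a slower one offering 1000 MB, a 400 MB database lands on the fast server, a following 300 MB database falls through to the slower one, and an 800 MB database is left uncached because neither has room.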
25
Further Work and Improvements
- Allow different types of databases (DAL)
- Remove overlapping data
- Script to determine when individual caches need to be updated