Pleiades Software Development, Inc. Automatic Merging of Pedigree Information Annual Workshop on Family History Technology April 3, 2003 Sue Dintelman.

Presentation transcript:

Automatic Merging of Pedigree Information
Annual Workshop on Family History Technology, April 3, 2003
Sue Dintelman and Tim Maness
Pleiades Software Development, Inc.

Source of Duplicates
Common Ancestry Trees
  – Most large pedigrees have branches that intermarry
Combining Data Sources
  – Working with other family members to build a common genealogy
  – Utilizing on-line or other sources to expand your genealogy

Current Solutions
Not automated
Utilize limited clustering options
Utilize limited family information (parents’ names)

Goals for Merge Utility
Automatic
Fast
Accurate
Eliminate duplicates in a single family database
Combine multiple family databases

Record Linking Background
Decide if two records are for the same individual
Use sum of weights for a comparison of each common field in the records
Use a cut-off score to choose “true” links
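
The weighted-sum idea on this slide can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the `Person` fields, the agreement and disagreement weights, and the cut-off value are all hypothetical, since the slides do not give the actual numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    birth_year: Optional[int] = None
    birth_place: Optional[str] = None


# Hypothetical weights per field: (added on agreement, added on disagreement).
WEIGHTS = {
    "name": (4.0, -3.0),
    "birth_year": (3.0, -2.0),
    "birth_place": (1.0, -0.5),
}
CUTOFF = 5.0  # hypothetical cut-off score


def link_score(a: Person, b: Person) -> float:
    """Sum the weights over the fields that both records actually contain."""
    score = 0.0
    for field, (agree, disagree) in WEIGHTS.items():
        va, vb = getattr(a, field), getattr(b, field)
        if va is None or vb is None:
            continue  # field is not common to both records, so it is skipped
        score += agree if va == vb else disagree
    return score


def is_link(a: Person, b: Person, cutoff: float = CUTOFF) -> bool:
    """Treat the pair as a link when the summed score reaches the cut-off."""
    return link_score(a, b) >= cutoff
```

A field missing from either record contributes nothing, which matches the slide's wording of comparing "each common field"; a real system would likely also give partial credit for near matches.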

Sample Scores

Problems Linking Individuals in Family Data
Few fields that can actually be compared (name, birth date and place, death date and place)
Many names will be similar or identical because of naming conventions
Many places will be the same because these are families

Advantages Linking Individuals in Family Data
Family members provide additional field values for comparison
Additional family information helps prevent incorrect matches

Other Record Linking Considerations
Misspellings of names and places
Incorrect dates
Initial inconsistencies
  – Any family database with 20+ generations has some type of serious inconsistency

The Process
Data preparation
Find initial duplicates
Use a recursive process to find other duplicates

Data Source Preparation
Find loops (an individual is his own ancestor)
Find inconsistent information (a person is born before his parents)
Identify connected components
Pre-process names, places and dates
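
The first three preparation checks on this slide can be sketched as below. The data layout is an assumption made for illustration: `parents` maps a person id to a list of parent ids, and `birth_year` maps a person id to a year. The slides do not describe the actual data structures.

```python
from collections import defaultdict


def find_loops(parents):
    """Return the ids of people who appear among their own ancestors."""
    in_loop = set()
    for start in parents:
        stack = list(parents.get(start, []))
        seen = set()
        while stack:
            p = stack.pop()
            if p == start:
                in_loop.add(start)
                break
            if p in seen:
                continue
            seen.add(p)
            stack.extend(parents.get(p, []))
    return in_loop


def born_before_parent(parents, birth_year):
    """Return (child, parent) pairs where the child is born in or before the parent's birth year."""
    problems = []
    for child, ps in parents.items():
        for p in ps:
            cy, py = birth_year.get(child), birth_year.get(p)
            if cy is not None and py is not None and cy <= py:
                problems.append((child, p))
    return problems


def connected_components(parents):
    """Group people into components linked by parent-child edges."""
    adj = defaultdict(set)
    for child, ps in parents.items():
        adj[child]  # touch the key so people without parents are included
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
    seen, components = set(), []
    for node in list(adj):
        if node in seen:
            continue
        comp, stack = set(), [node]
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        components.append(comp)
    return components
```

The loop search restarts from every person, which is simple but quadratic in the worst case; a large database would call for a single-pass cycle detection instead.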

Generate Duplicate List
Cluster using last name variation
  – Transducer
Compute score
  – Individual component
  – Family component
Choose the links with the highest scores
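
A rough sketch of the three steps on this slide follows. The slides do not describe the transducer, so `surname_key` below is a crude stand-in (drop vowels and repeated letters), not the presenters' method. The record fields, the weights, and the cut-off are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def surname_key(surname):
    """Stand-in for the transducer: keep the first letter, then drop vowels and repeated letters."""
    s = "".join(ch for ch in surname.lower() if ch.isalpha())
    if not s:
        return ""
    key = s[0]
    for ch in s[1:]:
        if ch in "aeiouy":
            continue
        if ch != key[-1]:
            key += ch
    return key


def individual_score(a, b):
    """Score based on the two people's own fields (hypothetical weights)."""
    score = 0.0
    if a.get("given") and a.get("given") == b.get("given"):
        score += 2.0
    if a.get("birth_year") and a.get("birth_year") == b.get("birth_year"):
        score += 2.0
    return score


def family_score(a, b):
    """Score based on relatives; here only the parents' given names (an assumption)."""
    score = 0.0
    for rel in ("father", "mother"):
        if a.get(rel) and a.get(rel) == b.get(rel):
            score += 1.5
    return score


def candidate_links(records, cutoff=4.0):
    """Cluster by surname key, score pairs within each cluster, return the best first."""
    clusters = defaultdict(list)
    for r in records:
        clusters[surname_key(r["surname"])].append(r)
    links = []
    for members in clusters.values():
        for a, b in combinations(members, 2):
            s = individual_score(a, b) + family_score(a, b)
            if s >= cutoff:
                links.append((s, a["id"], b["id"]))
    return sorted(links, reverse=True)
```

Clustering first means only records that share a surname key are compared pairwise, which keeps the number of comparisons far below all-pairs on a large database.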

Merge Duplicates
For each pair of duplicates:
  – Combine data
  – Recursively consider the relatives of the duplicates
  – Add any new duplicates to the list
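
The merge loop on this slide might look like the sketch below. It uses a work queue rather than literal recursion and considers only parents as relatives; both are simplifications. The record layout (`father` and `mother` holding parent ids) and the caller-supplied `same_person` predicate are assumptions, not details from the slides.

```python
from collections import deque


def merge_all(people, initial_pairs, same_person):
    """Merge duplicate pairs in place and return the list of (kept, dropped) ids.

    people: id -> dict with 'name', 'birth', 'father', 'mother' (parent ids or None).
    same_person: hypothetical predicate deciding whether two records match.
    """
    alias = {}  # merged-away id -> surviving id

    def resolve(pid):
        while pid in alias:
            pid = alias[pid]
        return pid

    queue = deque(initial_pairs)
    merged = []
    while queue:
        a, b = (resolve(x) for x in queue.popleft())
        if a == b:
            continue  # already merged through an earlier pair
        keep, drop = people[a], people[b]
        # Remember both sets of parents before the records are combined.
        parent_pairs = [(keep.get(r), drop.get(r)) for r in ("father", "mother")]
        # Combine data: fill the gaps in the kept record from the dropped one.
        for field, value in drop.items():
            if keep.get(field) is None:
                keep[field] = value
        alias[b] = a
        del people[b]
        merged.append((a, b))
        # Consider the relatives and add any new duplicates to the list.
        for pa, pb in parent_pairs:
            if pa is None or pb is None:
                continue
            pa, pb = resolve(pa), resolve(pb)
            if pa != pb and same_person(people[pa], people[pb]):
                queue.append((pa, pb))
    # Point remaining parent references at the surviving records.
    for person in people.values():
        for rel in ("father", "mother"):
            if person.get(rel) is not None:
                person[rel] = resolve(person[rel])
    return merged
```

Parents that differ and fail the `same_person` test are simply left alone here; in a fuller version they would be the natural source for a list of parent problems.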

New Duplicate
Misspelling:
  – Jones, Jerrolyn, Mary
  – Jonesanderson, Jerrolyn, Mary
Duplicate sibling:
  – Kimball, Lanette 3/4/1905
  – Kimball, Lannette 0/0/1905
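
One way to compare records like the two siblings above is sketched here. It assumes a zero in a date means "unknown" (as the 0/0/1905 entry suggests) and uses the standard-library `difflib.SequenceMatcher` for name similarity; the slides do not say how names and partial dates were actually compared.

```python
from difflib import SequenceMatcher


def parse_date(text):
    """Split 'M/D/YYYY' into integers; a zero is treated as an unknown part."""
    month, day, year = (int(part) for part in text.split("/"))
    return month, day, year


def dates_compatible(a, b):
    """Two dates are compatible when every part matches or is unknown on one side."""
    return all(x == y or x == 0 or y == 0
               for x, y in zip(parse_date(a), parse_date(b)))


def name_similarity(a, b):
    """Similarity ratio between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()
```

With these helpers, "Lanette" and "Lannette" score above 0.9 and the two 1905 dates are compatible. "Jones" against "Jonesanderson" scores much lower on spelling alone, so a pair like that presumably depends on the matching given names and family context.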

The Merge Reports
List of people who merged
List of new people
List of parent problems

Example Parent Problem
Jonathan Anderson, born 07/07/1848 Nauvoo, Hancock, OH
  Spouse: Maria Babcock, born 08/09/1852 Nauvoo, Hancock, OH (five children: Ann, John, Alex, Samantha, Elizabeth)
  Mother: Emily Adams, born 02/19/1823 Pomphret, Chautauqua, NY
  Father: Jonathan P. Anderson, born 10/28/1824 Wartrace Creek, Bedford, TN
Jonathan Anderson, born 07/07/1848 Nauvoo, Hancock, OH
  Spouse: Maria Babcock, born 08/09/1852 Nauvoo, Hancock, OH (five children: Ann, John, Alex, Samantha, Elizabeth)
  Mother: Theresa Johnson, born 04/17/1825 New York City, NY
  Father: Jonathan K. Anderson, born 08/15/1820 Weakly, TN
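
A parent-problem check for a case like this one could be sketched as follows. The nested record layout and the `same_parent` rule (exact match on name and birth date) are assumptions chosen to keep the example short; the slides do not state how the report is produced.

```python
def same_parent(p, q):
    """Hypothetical rule: parents match only when name and birth date are identical."""
    return p["name"] == q["name"] and p["born"] == q["born"]


def parent_problems(a, b):
    """For two records judged to be the same person, list the parents that conflict."""
    problems = []
    for rel in ("father", "mother"):
        pa, pb = a.get(rel), b.get(rel)
        if pa and pb and not same_parent(pa, pb):
            problems.append((rel, pa["name"], pb["name"]))
    return problems


# The two Jonathan Anderson records from the slide, reduced to their parents.
first = {
    "name": "Jonathan Anderson",
    "father": {"name": "Jonathan P. Anderson", "born": "10/28/1824"},
    "mother": {"name": "Emily Adams", "born": "02/19/1823"},
}
second = {
    "name": "Jonathan Anderson",
    "father": {"name": "Jonathan K. Anderson", "born": "08/15/1820"},
    "mother": {"name": "Theresa Johnson", "born": "04/17/1825"},
}

for rel, left, right in parent_problems(first, second):
    print(f"{rel}: {left} vs {right}")
```

Both parents are reported here: the individual, spouse, and children agree, but the mothers and fathers differ in name and birth date, so the pair is left for a person to review.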

GenMerge
Automates finding and eliminating duplicates in a single data source or when combining data sources
Fast
Accurate
Allows review of inconsistencies