05/06/2007 CRIS 2008
Personalizing Information Retrieval in CRISs with Fuzzy Sets and Rough Sets
Germán Hurtado Martín (1,2), Chris Cornelis (2), Helga Naessens (1)
(1) University College Ghent, (2) Ghent University (Belgium)

Overview
- Problems in CRISs
- Fuzzy sets and rough sets
- PAS project

Problems in CRISs
[Figure labels: Fuzzy; Rough; Term = Term]

Fuzzy sets and rough sets
Traditional approach: crisp sets
Young people = {x ∈ People | 0 < age(x) < 27}

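Fuzzy sets and rough sets
Fuzzy approach: fuzzy sets
$$\mathrm{Young}(x) = \begin{cases} 1 & \text{if } \mathrm{age}(x) \le 20 \\ (30 - \mathrm{age}(x))/10 & \text{if } 20 < \mathrm{age}(x) < 30 \\ 0 & \text{if } \mathrm{age}(x) \ge 30 \end{cases}$$
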
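The difference between the crisp and the fuzzy definition of "young" can be sketched in a few lines of Python. The function names are illustrative only; the thresholds are the ones given on the slides.

```python
def young_crisp(age: float) -> bool:
    """Crisp set: a person is young iff 0 < age < 27."""
    return 0 < age < 27


def young_fuzzy(age: float) -> float:
    """Fuzzy set: membership degree in [0, 1], falling linearly from 20 to 30."""
    if age <= 20:
        return 1.0
    if age >= 30:
        return 0.0
    return (30 - age) / 10


# Around the crisp cut-off the fuzzy degree changes gradually instead of jumping.
for age in (20, 25, 26, 27, 30):
    print(age, young_crisp(age), young_fuzzy(age))
```

A 26-year-old and a 27-year-old fall on opposite sides of the crisp boundary, while their fuzzy membership degrees differ only slightly.
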
Fuzzy sets and rough sets
Rough approach: rough sets
Upper approximation (R↑A)
A = {Numerical Analysis}
B = {Compilers}
R↑A = {Numerical Analysis, Ex. Sciences, Statistics,..., Coding Theory}
R↑B = {Compilers, Programming, GCC, YACC}

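A minimal sketch of the crisp upper approximation, assuming the relation R is stored as a mapping from each term to the set of terms related to it. The relation below is hypothetical: it is filled in only so that the result for B = {Compilers} matches the slide.

```python
# Hypothetical crisp relation: RELATED[x] is the set of terms y with (x, y) in R.
RELATED = {
    "Compilers": {"Compilers", "Programming", "GCC", "YACC"},
    "Programming": {"Programming", "Compilers"},
    "GCC": {"GCC", "Compilers"},
    "YACC": {"YACC", "Compilers"},
    "Hardware": {"Hardware"},
}


def upper_approximation(related, query):
    """Return every term related to at least one term of the query set."""
    result = set()
    for term in query:
        # A term missing from the relation is treated as related only to itself.
        result |= related.get(term, {term})
    return result


print(sorted(upper_approximation(RELATED, {"Compilers"})))
# ['Compilers', 'GCC', 'Programming', 'YACC']
```
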
Fuzzy rough sets
Fuzzy approach on rough sets
- Fuzzy set A
- Fuzzy relation R, with degrees R(x,y)
- Upper approximation:
$$(R{\uparrow}A)(y) = \sup_{x} \min\bigl(R(x,y),\, A(x)\bigr)$$

Fuzzy rough sets: application
Query expansion: allows more results by using R↑A
[Table: fuzzy relation R over the terms Programming, Hardware, C++, Java, Laptop, Algorithm]
Query: “Programming”
Expanded query: {(“Programming”,1.0), (“C++”,0.8), (“Java”,0.8), (“Algorithm”,0.6)}

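The query expansion can be sketched as a fuzzy upper approximation that takes, for each candidate term y, the largest value of min(R(x, y), A(x)) over the query terms x. The relation values below are assumptions: the "Programming" row is chosen to reproduce the expanded query shown on the slide, the "Hardware" row is invented, and every other pair is taken to be unrelated.

```python
TERMS = ["Programming", "Hardware", "C++", "Java", "Laptop", "Algorithm"]

# Hypothetical fuzzy relation: R[x][y] is the degree to which x is related to y.
# Pairs that are not listed have degree 0; every term is fully related to itself.
R = {
    "Programming": {"C++": 0.8, "Java": 0.8, "Algorithm": 0.6},
    "Hardware": {"Laptop": 0.9},
}


def relation(x, y):
    """Degree R(x, y) under the assumptions stated above."""
    if x == y:
        return 1.0
    return R.get(x, {}).get(y, 0.0)


def upper_approximation(query):
    """Map each term y to max over x of min(R(x, y), A(x)); drop zero degrees."""
    expanded = {}
    for y in TERMS:
        degree = max(
            (min(relation(x, y), a_x) for x, a_x in query.items()),
            default=0.0,
        )
        if degree > 0:
            expanded[y] = degree
    return expanded


print(upper_approximation({"Programming": 1.0}))
# {'Programming': 1.0, 'C++': 0.8, 'Java': 0.8, 'Algorithm': 0.6}
```

Because the term set is finite, the supremum in the definition reduces to a plain maximum.
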
PAS-project
What is the PAS-project?
- Personal Alert System (HoGent)
- Goal: to draw researchers' attention to funding possibilities that match their profiles
- Information about researchers, projects and funding possibilities (grants etc.) → matching/collaboration
- Automation and intelligence

PAS – How does it work?
Profile fields:
- Name
- Staff number
- Department(s)
- Group
- Date of creation of the profile
- Last update of the profile
- Percentage research time
- Skills description
- Diplomas
- Publications
- IWETO-keywords
- Free keywords
[Diagram labels: User; Fill in; IWETO Thesaurus; HoGent Thesaurus]

PAS – How does it work?
Message fields:
- Reference
- Title
- Content
- Attachment(s)
- Level
- Duration
- Institution
- Deadline
- Address
- Contact person
- IWETO-keywords
- Free keywords
[Diagram labels: Messages; IWETO Thesaurus; HoGent Thesaurus]

PAS – How does it work?
The IWETO-classification has 641 research fields:
- 5 at the 1st level
- 31 at the 2nd level
- 605 at the 3rd level

PAS – How does it work?
By adding “free keywords” we can refine the classification

PAS – How does it work?
Query: A = {k3}
Expanded query: R↑A = {(k1,0.8), (k3,1.0), …}
M1 → R2

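One way to read "M1 → R2" is that message M1, once its query is expanded, reaches researcher R2. The sketch below illustrates that reading under loud assumptions: the profile weights, the scoring rule (best product of query weight and profile weight over shared keywords) and the alert threshold are all hypothetical, since the slides only mention "weights and their products". Only the two keyword pairs visible on the slide are used for the expanded query.

```python
# Hypothetical researcher profiles: keyword -> weight.
PROFILES = {
    "R1": {"k2": 1.0},
    "R2": {"k1": 0.9},
}

# Expanded query of message M1, restricted to the pairs shown on the slide.
EXPANDED_M1 = {"k1": 0.8, "k3": 1.0}


def match_score(expanded_query, profile):
    """Best product of query weight and profile weight over shared keywords."""
    return max(
        (weight * profile[k] for k, weight in expanded_query.items() if k in profile),
        default=0.0,
    )


def recipients(expanded_query, profiles, threshold=0.5):
    """Researchers whose score reaches the (assumed) alert threshold."""
    return [
        name
        for name, profile in profiles.items()
        if match_score(expanded_query, profile) >= threshold
    ]


print(recipients({"k3": 1.0}, PROFILES))   # original query: no researcher matches
print(recipients(EXPANDED_M1, PROFILES))   # expanded query: R2 matches through k1
```

With the original query {k3} neither profile shares a keyword with the message; after expansion, k1 links the message to R2.
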
PAS – Current implementation
- Prototype that will be used as the skeleton for the final system
- Basic algorithm using weights and their products, and basic fuzzy rough query expansion [1]
- Basic profiles and messages
- Manual processing of feedback and manual data extraction from text files

[1] P. Srinivasan, M. E. Ruiz, D. H. Kraft, J. Chen: Vocabulary mining for information retrieval: rough sets and fuzzy sets. Information Processing and Management 37(1) (2001) 15-38.

PAS – Future work
- Richer representation of profiles and messages
- Automation of the feedback mechanism
- Dealing with imprecision and words from different thesauri
- Dealing with ambiguity and incomplete profiles
- Tracking research activities for collaboration
- Automatic extraction of information from text files
- Search engine

Thank you