Prior-Art Search
An important task in patent retrieval.
Recall is an important factor.
Given a query patent, the task is to find all related patents.

Prior-Art Retrieval Challenges
Patents have complex content and technical structure.
The vocabulary is diverse and large.
Writers often intentionally use vague terms and expressions.

Prior-Art Queries Construction
Query terms are extracted from query patents.
Selecting relevant query terms is a difficult task.
Queries suffer from document-term mismatch.
IR system bias affects the retrievability of patents.
A subset of patents becomes more retrievable at the expense of others.
A large number of patents either become low findable or cannot be found via any query.

Improving Patents Retrievability with Query Expansion
Missing terms are identified through query expansion.
Pseudo-relevance feedback (PRF) documents are used for identifying expansion terms.

Relevant PRF Identification with Query Patent Similarity
Relevant PRF documents are identified using the similarity between the query patent and the documents retrieved for its queries.
Different fields of the query patent are used for similarity computation: Title, Abstract, Claims, Background Summary, Figures Tags, Descriptions.
Patent Description and Background Summary give the best results.
Using the full-text query patent for PRF similarity computation has several limitations: the full text may contain a large number of irrelevant terms.
Our approach: we compute PRF similarity with only the relevant terms of the query patent.
Relevant terms are identified by their closeness/compactness to the query terms.
Closeness/compactness is determined from different features and a trained classifier.

Retrievability Measurement
Measures low/high findable documents in a collection D.
Defined for each document d ∈ D as r(d) = Σ_{q ∈ Q} f(k_dq, c), where:
c denotes the rank up to which the user is willing to proceed,
k_dq is the rank of document d in the result list of query q ∈ Q,
f(k_dq, c) is a cost function that returns 1 if k_dq <= c, and 0 otherwise.
Lorenz curve: used for visualizing retrievability inequality between documents.
The more skewed the curve, the greater the amount of bias.

Dataset and Experimental Setup
Patents were downloaded from the US Patent and Trademark Office website (http://www.uspto.gov/).
USPC classes 422 and 423, with 54,353 patents.
Retrieval systems: TFIDF, BM25, Exact Match, Language Modeling (LM).
Each patent is used as a seed for query generation.
Every patent is considered as a query patent, and the rest are searched for related patents.
Retrieval systems are evaluated based on the retrievability measurement.

Queries | Total Queries | Average Retrievability/Query
3-term queries | 2,908,972 | 373.49
4-term queries | 2,876,587 | 282.24

Conclusion
Prior-art search is a challenging task.
Without query expansion, we observed large retrievability inequality.
We improved patent retrievability using query expansion with PRF.
PRF documents are identified via query patent similarity with the documents retrieved for the queries.

32nd European Conference on Information Retrieval (ECIR'10), Milton Keynes, UK, 28th-31st March, 2010
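
The retrievability measurement described in the slides (count, for each document, the queries that rank it at or above the cutoff c) and the Lorenz curve used to visualize inequality can be sketched in Python as below. This is a minimal illustration, not the authors' implementation: the function names, the `rankings` input format (query id mapped to a ranked list of document ids), and the toy data are all assumptions. The Gini coefficient is not mentioned in the slides; it is included only as a common single-number summary of how skewed a Lorenz curve is.

```python
def retrievability(rankings, docs, c):
    """Count, per document, the queries that rank it within the top c.

    rankings: dict mapping a query id to its ranked list of document ids
              (hypothetical input format, best match first).
    docs:     every document id in the collection D.
    c:        rank cutoff up to which the user is willing to proceed.
    """
    r = {d: 0 for d in docs}
    for ranked in rankings.values():
        for d in ranked[:c]:          # ranks 1..c satisfy k_dq <= c
            r[d] = r.get(d, 0) + 1
    return r


def lorenz_points(scores):
    """Return (share of documents, cumulative share of retrievability) pairs."""
    values = sorted(scores)           # least retrievable documents first
    total = sum(values)
    points = [(0.0, 0.0)]
    running = 0
    for i, v in enumerate(values, start=1):
        running += v
        share = running / total if total else 0.0
        points.append((i / len(values), share))
    return points


def gini(scores):
    """Gini coefficient: 0 for perfect equality, approaching 1 for extreme skew."""
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (n * total)


# Toy data (hypothetical): three queries over a four-patent collection.
rankings = {
    "q1": ["p1", "p2", "p3"],
    "q2": ["p1", "p3", "p2"],
    "q3": ["p1", "p2", "p4"],
}
scores = retrievability(rankings, ["p1", "p2", "p3", "p4"], c=2)
curve = lorenz_points(scores.values())
bias = gini(scores.values())
```

With a cutoff of 2, patent p4 is never reachable in this toy collection, which is the kind of "not findable via any query" case the slides describe; the further the curve bends away from the diagonal, the larger the bias.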