Information Management for Digital Humanities and Diplomatics

Information Management for Digital Humanities and Diplomatics. Ralf Möller, Universität zu Lübeck, Institut für Informationssysteme

Charters in Information Systems: Steganographic representation

Document Representation. [Figure: documents represented as sets of terms, e.g. {ring, jupiter, ..., space, voyager} and {car, company, ..., dodge, ford}.]

Matrix Representation. C. Eckart, G. Young: The approximation of one matrix by another of lower rank. Psychometrika, 1, 211-218, 1936

Principal Components: set the smallest r-k singular values to zero, keeping only the first k rows of V^T (written VkT on the slide). [Figure: documents d1, d2 and query q in the term space t1, t2, t3, projected onto the axes x1, x2.] Scott Deerwester, Susan Dumais, George Furnas, Thomas Landauer, Richard Harshman: Indexing by Latent Semantic Analysis. In: Journal of the American Society for Information Science, 1990
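The truncation step on this slide can be illustrated with a minimal sketch in Python/NumPy. The toy term-document matrix, the choice k = 2, and the helper names (truncated_svd, fold_in_query, cosine) are assumptions made for illustration; they are not taken from the presentation. The query is folded into the latent space with the usual LSI mapping q_k = S_k^{-1} U_k^T q.

```python
import numpy as np

# Hypothetical toy term-document matrix (rows: terms, columns: documents).
terms = ["car", "company", "dodge", "ford", "ring", "jupiter", "space", "voyager"]
A = np.array([
    [1, 1, 0, 0],  # car
    [1, 1, 0, 0],  # company
    [1, 0, 0, 0],  # dodge
    [0, 1, 0, 0],  # ford
    [0, 0, 1, 0],  # ring
    [0, 0, 1, 1],  # jupiter
    [0, 0, 1, 1],  # space
    [0, 0, 0, 1],  # voyager
], dtype=float)


def truncated_svd(A, k):
    """Keep the k largest singular values (i.e. set the other r-k to zero)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def fold_in_query(q, U_k, s_k):
    """Map a term-space query vector into the k-dimensional latent space."""
    return (U_k.T @ q) / s_k


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


k = 2
U_k, s_k, Vt_k = truncated_svd(A, k)
A_k = U_k @ np.diag(s_k) @ Vt_k      # best rank-k approximation of A
doc_vectors = Vt_k.T                 # one latent vector per document

# Query consisting of the single term "ford".
q = np.zeros(len(terms))
q[terms.index("ford")] = 1.0
q_k = fold_in_query(q, U_k, s_k)

scores = [cosine(q_k, d) for d in doc_vectors]
print(np.round(scores, 3))
```

In this toy corpus the first document does not contain the term "ford", yet it lands next to the second document in the latent space because both share "car" and "company". That is the effect the rank reduction is meant to produce.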

Tagging

Matrix for Relational Structure. Maximilian Nickel, Volker Tresp, Hans-Peter Kriegel: A Three-Way Model for Collective Learning on Multi-Relational Data. In: Proc. 28th International Conference on Machine Learning, 2011
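The cited three-way model represents each relation k as an entity-by-entity matrix X_k and approximates it as A R_k A^T, with one latent vector per entity (rows of A) and one small core matrix R_k per relation. The sketch below only illustrates that model form. The entity and relation names are hypothetical, the entity factors come from a simple spectral heuristic rather than the alternating least-squares procedure of the paper, and the tensor is stored relation-first as X[k, i, j].

```python
import numpy as np

# Hypothetical entities, relations and triples, for illustration only.
entities = ["charter_1", "charter_2", "issuer_a", "seal_x"]
relations = ["issued_by", "sealed_with"]
triples = [
    ("charter_1", "issued_by", "issuer_a"),
    ("charter_2", "issued_by", "issuer_a"),
    ("charter_1", "sealed_with", "seal_x"),
]


def build_tensor(triples, entities, relations):
    """X[k, i, j] = 1 if relation k holds between entity i and entity j."""
    e = {name: i for i, name in enumerate(entities)}
    r = {name: k for k, name in enumerate(relations)}
    X = np.zeros((len(relations), len(entities), len(entities)))
    for s, p, o in triples:
        X[r[p], e[s], e[o]] = 1.0
    return X


def spectral_entity_factors(X, rank):
    """Assumed heuristic: leading eigenvectors of sum_k (X_k + X_k^T)."""
    S = sum(Xk + Xk.T for Xk in X)
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(-np.abs(eigvals))[:rank]
    return eigvecs[:, order]


def fit_relation_cores(X, A):
    """Least-squares R_k for fixed A: R_k = pinv(A) X_k pinv(A)^T."""
    A_pinv = np.linalg.pinv(A)
    return np.stack([A_pinv @ Xk @ A_pinv.T for Xk in X])


def reconstruct(A, R):
    """Model prediction: X_k is approximated by A R_k A^T."""
    return np.stack([A @ Rk @ A.T for Rk in R])


X = build_tensor(triples, entities, relations)
A = spectral_entity_factors(X, rank=2)
R = fit_relation_cores(X, A)
X_hat = reconstruct(A, R)
print(np.round(X_hat[0], 2))  # predicted scores for "issued_by"
```

Entries of X_hat that are large where X is zero can be read as candidate links suggested by the latent structure.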

Documents and Representations. D. Blei, A. Ng, and M. Jordan: Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, January 2003. [Figure: plate diagram with variables C, W, Z, θ, β and plates N, M, shown next to the example documents {car, company, ..., dodge, ford} and {ring, jupiter, space, voyager}; pseudo relation Rk.]
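The generative reading of the plate diagram can be sketched as follows, using the standard process from the cited LDA paper: draw topic proportions θ for a document, then for each of its N word positions draw a topic Z and a word W from that topic's word distribution β. The vocabulary, the two hand-written topics, and the Dirichlet parameter alpha are assumptions for illustration; the variable C from the slide is not modelled here.

```python
import numpy as np

# Hypothetical vocabulary and two hand-written topics (rows of beta sum to 1).
vocab = ["car", "company", "dodge", "ford", "ring", "jupiter", "space", "voyager"]
beta = np.array([
    [0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0],   # "cars" topic
    [0.0, 0.0, 0.0, 0.0, 0.25, 0.25, 0.25, 0.25],   # "space" topic
])
alpha = np.array([0.5, 0.5])  # assumed Dirichlet prior on topic proportions


def generate_document(n_words, alpha, beta, rng):
    """Sample one document: theta ~ Dir(alpha), z_n ~ Cat(theta), w_n ~ Cat(beta[z_n])."""
    theta = rng.dirichlet(alpha)
    z = rng.choice(len(alpha), size=n_words, p=theta)
    w = np.array([rng.choice(beta.shape[1], p=beta[k]) for k in z])
    return theta, z, w


rng = np.random.default_rng(0)
corpus = [generate_document(10, alpha, beta, rng) for _ in range(3)]  # M = 3, N = 10

for theta, z, w in corpus:
    print(np.round(theta, 2), [vocab[i] for i in w])
```

Inference runs in the opposite direction: given only the words, estimate θ and β. The sampler above covers only the forward model.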

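Latent Relational Structure: Generative Model. [Figure: plate diagram with variables C, W, Z, θ, β and plates N, M, coupled with a relational tensor Xkij of size N x N x k; pseudo relation Rk.]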

Achievements / Short-Term Goals
- Association of documents
- Certificate retrieval shows associated reports
- Added value for users
- Structure building based on sensible document grouping due to steganographic data associated with picture documents
- Relational descriptions for text sharpen associations
- Goal: Compute relational descriptions automatically
- Latent relational structures behind text/images

Long-Term Goal: Integrate Databases

Take-home messages
- Humanities researchers work on databases and text documents and can benefit from new ambient services
- Goal: compute underlying data automatically
- Computer science researchers help achieve these goals in cooperation with humanities researchers

Contact: Prof. Dr. rer. nat. Ralf Möller, Institute for Information Systems, Universität zu Lübeck, Ratzeburger Allee 160, Haus 64, 23562 Lübeck. Tel: +49 451 3101 5700. moeller@uni-luebeck.de