Martin Repka, Ján Paralič Katedra kybernetiky a umelej inteligencie Fakulta elektrotechniky a informatiky Technická univerzita v Košiciach {Martin.Repka,

Slides:



Advertisements
Similar presentations
Mining customer ratings for product recommendation using the support vector machine and the latent class model William K. Cheung, James T. Kwok, Martin.
Advertisements

Stock Price Prediction Based on Social Network A survey Presented by: CHEN En.
Arnd Christian König Venkatesh Ganti Rares Vernica Microsoft Research Entity Categorization Over Large Document Collections.
Does one size really fit all? Evaluating classifiers in Bag-of-Visual-Words classification Christian Hentschel, Harald Sack Hasso Plattner Institute.
Relevant characteristics extraction from semantically unstructured data PhD title : Data mining in unstructured data Daniel I. MORARIU, MSc PhD Supervisor:
Integrated Instance- and Class- based Generative Modeling for Text Classification Antti PuurulaUniversity of Waikato Sung-Hyon MyaengKAIST 5/12/2013 Australasian.
The Painter’s Feature Selection for Gene Expression Data Lyon, August 2007 Daniele Apiletti, Elena Baralis, Giulia Bruno, Alessandro Fiori.
TÍTULO GENÉRICO Concept Indexing for Automated Text Categorization Enrique Puertas Sanz Universidad Europea de Madrid.
A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts 04 10, 2014 Hyun Geun Soo Bo Pang and Lillian Lee (2004)
Naïve Bayes based Model Billy Doran “If the model does what people do, do people do what the model does?”
1. Abstract 2 Introduction Related Work Conclusion References.
Lesson learnt from the UCSD datamining contest Richard Sia 2008/10/10.
Mapping Between Taxonomies Elena Eneva 30 Oct 2001 Advanced IR Seminar.
Variations of Minimax Probability Machine Huang, Kaizhu
Mapping Between Taxonomies Elena Eneva 11 Dec 2001 Advanced IR Seminar.
Text Classification: An Implementation Project Prerak Sanghvi Computer Science and Engineering Department State University of New York at Buffalo.
Sentence Classifier for Helpdesk s Anthony 6 June 2006 Supervisors: Dr. Yuval Marom Dr. David Albrecht.
Large Lump Detection by SVM Sharmin Nilufar Nilanjan Ray.
Text Classification Using Stochastic Keyword Generation Cong Li, Ji-Rong Wen and Hang Li Microsoft Research Asia August 22nd, 2003.
Forecasting with Twitter data Presented by : Thusitha Chandrapala MARTA ARIAS, ARGIMIRO ARRATIA, and RAMON XURIGUERA.
Comparing the Parallel Automatic Composition of Inductive Applications with Stacking Methods Hidenao Abe & Takahira Yamaguchi Shizuoka University, JAPAN.
Data mining and machine learning A brief introduction.
VŠB - Technická univerzita Ostrava Text Mining Services for Trialogical Learning Pavel Smrž 1, Ján Paralič 2, Peter Smatana 2, Karol.
Towards Improving Classification of Real World Biomedical Articles Kostas Fragos TEI of Athens Christos Skourlas TEI of Athens
FIIT STU Bratislava Classification and automatic concept map creation in eLearning environment Karol Furdík 1, Ján Paralič 1, Pavel Smrž.
Use of web scraping and text mining techniques in the Istat survey on “Information and Communication Technology in enterprises” Giulio Barcaroli(*), Alessandra.
Application of Data Mining Algorithms in Atmospheric Neutrino Analyses with IceCube Tim Ruhe, TU Dortmund.
AUTOMATED TEXT CATEGORIZATION: THE TWO-DIMENSIONAL PROBABILITY MODE Abdulaziz alsharikh.
© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice George Forman Martin Scholz Shyam.
An Ensemble of Three Classifiers for KDD Cup 2009: Expanded Linear Model, Heterogeneous Boosting, and Selective Naive Bayes Members: Hung-Yi Lo, Kai-Wei.
Gary M. Weiss Alexander Battistin Fordham University.
CISC Machine Learning for Solving Systems Problems Presented by: Ashwani Rao Dept of Computer & Information Sciences University of Delaware Learning.
META-LEARNING FOR AUTOMATIC SELECTION OF ALGORITHMS FOR TEXT CLASSIFICATION Karol Furdík, Ján Paralič, Gabriel Tutoky {Jan.Paralic,
Gang WangDerek HoiemDavid Forsyth. INTRODUCTION APROACH (implement detail) EXPERIMENTS CONCLUSION.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Mining Logs Files for Data-Driven System Management Advisor.
A Repetition Based Measure for Verification of Text Collections and for Text Categorization Dmitry V.Khmelev Department of Mathematics, University of Toronto.
Consensus Group Stable Feature Selection
Text Document Categorization by Term Association Maria-luiza Antonie Osmar R. Zaiane University of Alberta, Canada 2002 IEEE International Conference on.
Nuhi BESIMI, Adrian BESIMI, Visar SHEHU
Text Annotation By: Harika kode Bala S Divakaruni.
Area/Density/Edge Metrics Patch radius of gyration – measure of avg distance organism can move within a patch before patch bnd(extent) Correlation length.
FUZZ-IEEE Kernel Machines and Additive Fuzzy Systems: Classification and Function Approximation Yixin Chen and James Z. Wang The Pennsylvania State.
Show Me Potential Customers Data Mining Approach Leila Etaati.
Ping-Tsun Chang Intelligent Systems Laboratory NTU/CSIE Using Support Vector Machine for Integrating Catalogs.
Does one size really fit all? Evaluating classifiers in a Bag-of-Visual-Words classification Christian Hentschel, Harald Sack Hasso Plattner Institute.
Data Resource Management – MGMT An overview of where we are right now SQL Developer OLAP CUBE 1 Sales Cube Data Warehouse Denormalized Historical.
Data Mining and Text Mining. The Standard Data Mining process.
Classify A to Z Problem Statement Technical Approach Results Dataset
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
A Smart Tool to Predict Salary Trends of H1-B Holders
Aim: Full House Use: 9 Grid Work out estimated answer & cross it off
Applying Deep Neural Network to Enhance EMPI Searching
CATEGORIZATION OF NEWS ARTICLES USING NEURAL TEXT CATEGORIZER
Alan D. Mead, Talent Algorithms Inc. Jialin Huang, Amazon
Statistical Techniques
Juweek Adolphe Zhaoyu Li Ressi Miranda Dr. Shang
CS6604 Project Ensemble Classification
Schizophrenia Classification Using
Project 2 k-NN 2018/11/10.
Classifying enterprises by economic activity
Feature Engineering Studio Special Session
Ninja Trader: Introduction to data mining in financial applications
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Text Classification - Accelerator
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Information Retrieval
Dynamic Category Profiling for Text Filtering and Classification
Elena Mikhalkova, Nadezhda Ganzherli, Yuri Karyakin, Dmitriy Grigoryev
Verification of forecasts against analysis - some problems
Advisor: Dr.vahidipour Zahra salimian Shaghayegh jalali Dec 2017
Presentation transcript:

Martin Repka, Ján Paralič Katedra kybernetiky a umelej inteligencie Fakulta elektrotechniky a informatiky Technická univerzita v Košiciach {Martin.Repka,

Company Network publicly accessed sources as Business Register of Slovak Republic (ORSR) aggregated, analyzed and integrated within project ITLIS 1 featuring complete data model, which integrates structural, compositional and temporal (historical) data aim correlation between compositional data and online catalogue categories 1

Data Amadeus - compositional data about companies Azet - online catalogue entries in cases Itlis - companies in 10 largest online cat. categories, 2040 companies (ranging from 105 to 389 companies in category)

Company categorization classification of documents into categories Text-mining technique List of company activities as unstructured document

Results average accuracy 10-fold cross-validation average performance using mode values = 19.07% RapidMiner

Conclusions data integration from several sources experimentally verified correlation of compositional data (company activities) and online categorization several model learners time complexity and model accuracy very efficient model learners seems to be SVM, Neural Network and Naive Bayes (Kernel) model information could be useful as support to other analyses of company network.