Supervisor: Mr. Phan Trường Lâm. Team information.


Supervisor: Mr. Phan Trường Lâm

Team information

Agenda
- Introduction
- Project plan
- System Requirement Specifications
- System Analysis and Design
- Testing
- Deployment and User Guide
- Summary
- Demo and Q&A

Introduction
- Initial Idea
- Literature Review of Existing Systems
- Proposal & Product

Initial Idea

We decided to develop a new system that integrates:
- Collecting documents
- Organizing these documents
- Extracting keywords
- Ranking
- Searching

Literature Review of Existing Systems
Methods that these websites use to build their systems:
- Big database
- Search
- Ranking and highlighting of returned results
- Comparing documents to detect plagiarism

Literature Review
Achievements of the existing systems:
- Attractive
- Easy to use
- Speed & Reliability
- Quality Results
- Ensuring Security Awareness
Limitations of the existing systems:
- Costs
- Privacy

Proposal
- Collect and manage Capstone projects
- Support looking up Capstone projects
- Avoid repeating and copying ideas
- Rank results
- Refer to other materials
- Friendly interface like Google
- Cheaper to build
- Free to use
- Public for everyone, inside and outside the University

Product (in future)
- Mobile application
- Web application

Project Plan
- Development environment
- Process
- Project organization
- Project schedule
- Risk management

Development Environment
Hardware:
- Core 2 Duo 2.0 GHz, … GB of RAM, 100 GB of hard disk
- Core 2 Duo 2.0 GHz, 2 GB of RAM, 100 GB of hard disk
Software: …

Process
- Follow the Waterfall model

Project organization

Project organization: Controlling and Monitoring
- Meetings
- Task assignment
- Task tracking
- Issue resolution
- Task review
- Reporting

Project organization: Communication control
- Online activity: chat, phone
- Offline activity: project kick-off, team building

Project Schedule
- Overall plan

Risk Management
- People risk
- Estimation risk
- Technology risk
- Requirement risk
- Schedule risk

System Requirement Specifications
- User Requirements
- System Requirements
- Non-functional requirements

User Requirements  Lecturers and Students: Search project documents. Download documents.  Librarians: Edit profile. Search documents. Add/Edit/Delete document. Add/Edit/Delete category.  Administrator Edit profile. Add/Edit/Delete account.

User Requirements  Other requirement Searched results will be ranked. Document has following information:  Name  Author  Supervisor  Category  Description

User Requirements Input files:  Keyword file  Abstract file  Full document file  Other materials

System Requirements  Communicate via the protocol HTTP to complete interactions based on service with client computers and use standard protocols.  Configuration  Server: Windows Server 2008 operating system.NET framework 3.5 SQL server 2008 IIS 7  Client: Web browser

Non-functional Requirements Usability Availability Security Reliability Performance Security Maintainability

System Analysis and Design
- Architectural design
- Detail design
- Database design
- Coding convention
- Extract Keyword algorithm
- Ranking

Architectural design
- Overall architecture
- MVC architecture design pattern

Detail design
- CProDMS component diagram

Database design
- Entity diagram

Coding convention
Follow:
- Microsoft .NET Library Standards
- FxCop rules and Code Analysis for Managed Code Warnings

Extract Keyword Algorithm
- Introduction
- Study Algorithm
- Evaluation
Source: "Keyword Extraction from a Single Document using Word Co-occurrence Statistical Information" (Yutaka Matsuo and Mitsuru Ishizuka) (Dec. 10, 2003)

Algorithm – What is a keyword?
- Position
- Meaning
- Frequency

Algorithm – Step by step
- Preprocessing: discard stop words, stem, extract frequency
- Processing: select frequent terms, compute expected probability, calculate χ'² value
- Output

Algorithm – Studying
Example:
Original text: Information is the most powerful weapon in the modern society. Every day we are overflowed with a huge amount of data in form of electronic newspaper articles, s, web pages and search results. Often, information we receive is incomplete, such that further search activities are required to enable correct interpretation and usage of this information.
Step 1, discarded stop words: Information powerful weapon modern society day overflowed huge amount data electronic newspaper articles s web pages search results Often information receive incomplete such further search activities required enable correct interpretation usage information
Step 2, stemmed words (using the Porter Stemming Algorithm): Informat power weapon modern societi day overflow huge amoun data electronic newspaper articl web page search result Often informat receive incomplet such further search activ requir enable correct interpret usag informat
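Step 1 above can be sketched in a few lines of Python. This is only an illustration: the `STOP_WORDS` set is a deliberately tiny, hypothetical list and `discard_stop_words` is a name introduced here, not one taken from the project. Step 2 (Porter stemming) is left out because it needs a separate stemmer implementation.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real system would load a much fuller list.
STOP_WORDS = {"is", "the", "most", "in", "every", "we", "are", "with", "a", "of", "and"}

def discard_stop_words(text):
    """Lowercase the text, split it into alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(discard_stop_words("Information is the most powerful weapon in the modern society."))
# → ['information', 'powerful', 'weapon', 'modern', 'society']
```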

Algorithm – Studying: select frequent terms
The top ten frequent terms (denoted G) are selected together with their probabilities of occurrence, normalized so that they sum to 1. According to our study, the number of keywords is about 10% of the number of terms in the document, and no more than 30.

Algorithm – Studying: co-occurrence and importance
Two terms that appear in the same sentence are considered to co-occur once.
Example: "The imitation game could then be played with the machine in question and the mimicking digital computer and the interrogator would be unable to distinguish them." Here "imitation" and "digital computer" have one co-occurrence.
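A minimal sketch of selecting the frequent-term set G with normalized probabilities, plus one possible reading of the "about 10%, at most 30" keyword limit. The function names and the integer rounding in `keyword_budget` are assumptions made for illustration.

```python
from collections import Counter

def select_frequent_terms(terms, top_n=10):
    """Return the top_n most frequent terms mapped to probabilities that sum to 1."""
    counts = Counter(terms).most_common(top_n)
    total = sum(count for _, count in counts)
    return {term: count / total for term, count in counts}

def keyword_budget(num_terms):
    """Assumed reading of the rule: about 10% of the terms, capped at 30 keywords."""
    return min(30, num_terms // 10)

terms = ["search", "information", "search", "data", "information", "search"]
G = select_frequent_terms(terms, top_n=2)
print(G)                    # probabilities of "search" and "information" only
print(keyword_budget(250))  # → 25
```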
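Sentence-level co-occurrence counting could look like the sketch below. It assumes each sentence has already been reduced to a list of terms (multi-word terms such as "digital computer" kept as single strings); `count_cooccurrences` is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations

def count_cooccurrences(sentences):
    """For each unordered pair of distinct terms, count the sentences containing both."""
    freq = Counter()
    for sentence in sentences:
        # set() ensures a pair is counted at most once per sentence;
        # sorted() gives every pair a canonical (alphabetical) key order.
        for a, b in combinations(sorted(set(sentence)), 2):
            freq[(a, b)] += 1
    return freq

sentences = [
    ["imitation", "game", "machine", "digital computer", "interrogator"],
    ["digital computer", "machine"],
]
freq = count_cooccurrences(sentences)
print(freq[("digital computer", "imitation")])  # → 1
print(freq[("digital computer", "machine")])    # → 2
```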

Algorithm – Studying Co-occurrence and Importance

Algorithm – Studying: co-occurrence and importance
The degree of bias of co-occurrence can be used as an indicator of term importance.

Algorithm – Studying
The statistical value of χ² is defined as

$$\chi^2(w) = \sum_{g \in G} \frac{\left(\mathrm{freq}(w, g) - n_w p_g\right)^2}{n_w p_g}$$

where
- $p_g$: unconditional probability of a frequent term $g \in G$ (the expected probability)
- $n_w$: the total number of co-occurrences of term $w$ and frequent terms $G$
- $\mathrm{freq}(w, g)$: frequency of co-occurrence of term $w$ and term $g$
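As a rough illustration, the χ² value can be computed as the sum over g in G of (freq(w, g) - n_w·p_g)² / (n_w·p_g), which is the standard chi-square form built from the three quantities defined above. The dictionary layout and the name `chi_square` are assumptions for this sketch.

```python
def chi_square(w, G, freq, n_w):
    """Sum over frequent terms g of (observed - expected)^2 / expected.

    G maps each frequent term g to its expected probability p_g;
    freq maps (w, g) pairs to co-occurrence counts; n_w is the total
    number of co-occurrences of w with the frequent terms.
    """
    total = 0.0
    for g, p_g in G.items():
        expected = n_w * p_g
        observed = freq.get((w, g), 0)
        total += (observed - expected) ** 2 / expected
    return total

G = {"a": 0.5, "b": 0.5}
biased = {("w", "a"): 8, ("w", "b"): 2}    # w co-occurs mostly with "a"
unbiased = {("w", "a"): 5, ("w", "b"): 5}  # w follows the expected distribution
print(chi_square("w", G, biased, n_w=10))    # about 3.6
print(chi_square("w", G, unbiased, n_w=10))  # → 0.0
```

A term whose co-occurrences match the expected distribution scores 0; the more biased the distribution, the higher the score.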

Algorithm – Studying
We consider the length of each sentence and revise our definitions:
- $p_g$: (the sum of the total number of terms in sentences where $g$ appears) divided by (the total number of terms in the document)
- $n_w$: the total number of terms in the sentences where $w$ appears, including $w$
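The sentence-length-aware definitions can be sketched as follows, assuming the document is a list of sentences and each sentence is a list of terms. Both function names are hypothetical.

```python
def terms_in_sentences_with(sentences, term):
    """n_term: total number of terms in the sentences where `term` appears."""
    return sum(len(sentence) for sentence in sentences if term in sentence)

def expected_probability(sentences, g):
    """p_g: terms in sentences containing g, divided by all terms in the document."""
    total_terms = sum(len(sentence) for sentence in sentences)
    return terms_in_sentences_with(sentences, g) / total_terms

sentences = [["a", "b", "c"], ["a", "d"], ["e", "f", "g", "h", "i"]]
print(terms_in_sentences_with(sentences, "a"))  # → 5
print(expected_probability(sentences, "a"))     # → 0.5
```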

Algorithm – Studying

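The following function measures the robustness of bias values. It subtracts the maximal term from the χ² value:

$$\chi'^2(w) = \chi^2(w) - \max_{g \in G} \left\{ \frac{\left(\mathrm{freq}(w, g) - n_w p_g\right)^2}{n_w p_g} \right\}$$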
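A small sketch of this robust variant: compute each per-term chi-square contribution (freq(w, g) - n_w·p_g)² / (n_w·p_g), then subtract the largest one from the sum. The data layout and the name `robust_chi_square` are assumptions made for illustration.

```python
def robust_chi_square(w, G, freq, n_w):
    """Chi-square of w over frequent terms G, minus its single largest contribution."""
    contributions = [
        (freq.get((w, g), 0) - n_w * p_g) ** 2 / (n_w * p_g)
        for g, p_g in G.items()
    ]
    if not contributions:  # no frequent terms: nothing to measure
        return 0.0
    return sum(contributions) - max(contributions)

G = {"a": 0.5, "b": 0.25, "c": 0.25}
freq = {("w", "a"): 4, ("w", "b"): 4, ("w", "c"): 0}
# Expected counts are 4, 2, 2, so the contributions are 0, 2, 2.
print(robust_chi_square("w", G, freq, n_w=8))  # → 2.0
```

Dropping the maximal contribution keeps a term from scoring highly merely because it co-occurs with one particular frequent term.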

Algorithm – Studying

Algorithm – Studying
To improve the extracted keywords, we cluster terms. The two major approaches (Hofmann & Puzicha 1998) are:
- Similarity-based clustering: if terms w1 and w2 have a similar distribution of co-occurrence with other terms, w1 and w2 are considered to be in the same cluster. E.g.: Monday is a day in week. Tuesday is a day in week. Wednesday is a day in week.
- Pairwise clustering: if terms w1 and w2 co-occur frequently, w1 and w2 are considered to be in the same cluster.

Algorithm – Studying
Similarity-based clustering centers on the red circles; pairwise clustering focuses on the green circles.

Algorithm – Studying: similarity-based clustering
Cluster a pair of terms whose Jensen-Shannon divergence is … where: … and: …

Algorithm – Studying: pairwise clustering
Cluster a pair of terms whose mutual information is … where: …

Algorithm – Evaluation
- Precision: ratio of right keywords to the number of keywords in the list
- Coverage: ratio of indispensable keywords in the list to all the indispensable terms
- Frequency index: average frequency of the keywords in the list
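The three evaluation measures can be sketched as below. The inputs (a judged set of "right" keywords, a set of indispensable terms, and a term-frequency table) and the name `evaluate_keywords` are assumptions for illustration; the sample values are made up.

```python
def evaluate_keywords(extracted, right, indispensable, term_freq):
    """Return (precision, coverage, frequency_index) for an extracted keyword list."""
    precision = sum(1 for k in extracted if k in right) / len(extracted)
    coverage = sum(1 for k in extracted if k in indispensable) / len(indispensable)
    frequency_index = sum(term_freq[k] for k in extracted) / len(extracted)
    return precision, coverage, frequency_index

extracted = ["search", "keyword", "day", "rank"]
right = {"search", "keyword", "rank"}                       # judged correct
indispensable = {"search", "rank", "document", "cluster"}   # must-have terms
term_freq = {"search": 6, "keyword": 4, "day": 1, "rank": 5}

print(evaluate_keywords(extracted, right, indispensable, term_freq))
# → (0.75, 0.5, 4.0)
```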

Ranking – Why? Ranking Result

Ranking

Ranking
Use the rank calculation formula for a term in a collection of documents (from "Automatic Keyword Extraction for Database Search"; first examiner: Prof. Dr. techn. Dipl.-Ing. Wolfgang Nejdl; second examiner: Prof. Dr. Heribert Vollmer; supervisor: MSc. Dipl.-Inf. Elena Demidova):

$$R(t) = F_d(t) \cdot \log\left(1 + \frac{N}{N(t)}\right) \quad (1)$$

where
- $R(t)$: rank of term $t$ in the whole collection
- $F_d(t)$: frequency of term $t$ in the given document
- $N$: total number of documents in the collection
- $N(t)$: total number of documents that contain term $t$

Ranking formula:

$$\mathrm{Rank} = d \cdot \frac{R_d(t)}{R(t)} \quad (2)$$

$$\Rightarrow \mathrm{Rank} = d \cdot \frac{R_d(t)}{F_d(t) \cdot \log\left(1 + \frac{N}{N(t)}\right)} \quad (3)$$

where
- $d$: reliability coefficient
- $R_d(t)$: rank of term $t$ in the document, as extracted by the Extract Service
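Formulas (1) to (3) translate directly into code. This sketch assumes the natural logarithm, since the slide does not state the base; the function and parameter names are introduced here for illustration.

```python
import math

def collection_rank(fd_t, n_docs, n_docs_with_t):
    """Formula (1): R(t) = Fd(t) * log(1 + N / N(t)); natural log is an assumption."""
    return fd_t * math.log(1 + n_docs / n_docs_with_t)

def rank(d, rd_t, fd_t, n_docs, n_docs_with_t):
    """Formulas (2)/(3): Rank = d * Rd(t) / R(t)."""
    return d * rd_t / collection_rank(fd_t, n_docs, n_docs_with_t)

# A term that occurs twice in the document and appears in every one of 100 documents.
r_t = collection_rank(fd_t=2, n_docs=100, n_docs_with_t=100)
print(r_t)                                      # 2 * ln(2)
print(rank(d=1.0, rd_t=r_t, fd_t=2, n_docs=100, n_docs_with_t=100))
```

With N(t) in the denominator of the logarithm, terms that appear in fewer documents get a larger R(t) for the same in-document frequency.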

Searching

Testing: V-model

Testing

Testing: test result
Columns: No | Tester | Module code | Pass | Fail | Untested | N/A | Number of test cases
Rows (tester, module):
- AnhNT: Master Page
- AnhNT: Home Page
- AnhNT: Search Result
- AnhNT: User Account
- AnhNT: Error Page
- NamH: Category
- NamH: Document
- NamH: Authenticated
- NamH: User Document Detail
Summary rows: Sub total; Test coverage (%); Test successful coverage (%)

Deployment
- Package source code
- Client side
- Server side

User guide

Summary  Strong point Enthusiasm Creative Cope with change  Weak point Lack of technical skill Lack of management skills  Lessons learned Improve technical & management skills Release on-time product with the restriction of time and resource Improve communication skills & problem solving

Demo & Q&A