PRIVACY IN A DEMOGRAPHIC DATABASE Milestone #1 Razi Mukatren, Golan Salman.

MILESTONE #1
We started the privacy analysis of the data. We manually generated more than 40 tables from the Israel Central Bureau of Statistics (CBS) website, which also helped us understand the specific technique the CBS uses for its website. We studied the tables we pulled and manually looked for intersections between the data in order to understand more about the surveys. Next step: pulling the data/tables from the website using a script.

THE PRIVACY ANALYSIS OF THE SYSTEM
We ran manual tests and saw that it is possible to derive information about a specific participant in the survey. For example, take the data of all 7,500 participants and keep only those who:
1) Studied a subject related to education.
2) Have an income of more than 24,000 NIS per month.
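The effect of stacking filters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Participant` record, its field names, and the sample values are all made up here and are not taken from the CBS survey; the point is only that each extra condition shrinks the matching group, possibly down to one person.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    # Hypothetical record layout; the real survey fields are not known here.
    pid: int
    field_of_study: str
    monthly_income_nis: int


def matching(participants, field, min_income):
    """Return participants who studied `field` and earn more than `min_income`."""
    return [
        p for p in participants
        if p.field_of_study == field and p.monthly_income_nis > min_income
    ]


# Made-up sample data, for illustration only.
sample = [
    Participant(1, "education", 9000),
    Participant(2, "education", 26000),
    Participant(3, "engineering", 30000),
    Participant(4, "education", 12000),
]

hits = matching(sample, "education", 24000)
print(len(hits), "participant(s) match both filters")
```

When the filtered group contains a single person, every further table generated with the same filters describes that person alone.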

FOR EXAMPLE:
- We generated 10 tables using the following filters:
  - Arab villages
  - Religion: Muslim
- The filters reduce the size of each table: we get only the information related to the filters above.

- The survey has only 12 people who are Muslim and live in Arab villages (we can learn this from Table #1). Six of them are men and six are women. We can also see the ages of those 12 people in the tables below.
- Now we look at the tables that include 12 participants in total, since they are certain to include all 12 participants from Table #1.

- Tables 5, 7, 9 and 10 include all 12 participants.
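Picking out the tables that cover the whole filtered group can be sketched as follows. This is a minimal sketch, assuming each table is held as a mapping from cell label to count; the table names and counts below are invented for illustration and are not the survey's tables.

```python
def tables_with_total(tables, population):
    """Return the names of tables whose cell counts sum to `population`."""
    return sorted(
        name for name, cells in tables.items()
        if sum(cells.values()) == population
    )


# Invented tables: each maps a cell label to a participant count.
tables = {
    "A": {"20-24": 2, "25-34": 10},
    "B": {"yes": 5, "no": 4},
    "C": {"north": 12},
}

full = tables_with_total(tables, 12)
print(full)
```

A table whose total is below the group size omits some participants, so conclusions drawn from it cannot be tied to the group as safely.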

- From Table #5 we can learn, for example, the heights of the two participants in the 20-24 age group.
- Going back to Table #1, we see that one of them is a man and one is a woman. To see who is who, we generate a new table with the same filters and add a second column for gender.
- We name it Table 11. From Table 11 we can read the woman's height and the man's height separately.
- Let's focus on only these 2 participants, because one of them appears in all 10 tables (we have age in all 10 tables).

- From Table #2 we can see that one of them is a hired worker. Let's generate a new table (Table 12) and check whether the hired worker is the man or the woman. Table 12 shows that the man is the hired worker.
- So far we know about the man: he is aged 20-24, Muslim, from an Arab village, a hired worker, and we know his height.
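The "add a gender column" step can be sketched as reading a two-way table. This is a hedged illustration: the dictionary layout, the function name `attribute_by_gender`, and the labels "range A" and "range B" are assumptions made here, not the CBS table format.

```python
def attribute_by_gender(crosstab):
    """Map each gender to its attribute value when the table pins it down.

    `crosstab` maps (value, gender) -> count. A gender is resolved only if
    exactly one person of that gender appears in the whole table.
    """
    totals = {}
    for (_value, gender), count in crosstab.items():
        totals[gender] = totals.get(gender, 0) + count
    return {
        gender: value
        for (value, gender), count in crosstab.items()
        if count == 1 and totals[gender] == 1
    }


# Invented two-way table for two participants in the same age group.
crosstab = {
    ("range A", "man"): 1,
    ("range A", "woman"): 0,
    ("range B", "man"): 0,
    ("range B", "woman"): 1,
}

resolved = attribute_by_gender(crosstab)
print(resolved)
```

The same lookup applies to any attribute (employment status, weight, family size) once the table is regenerated with gender as the second column.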

- From Table #3 and Table #4 we can learn that he works in construction and how many minutes' drive he lives from his work.
- From Table #6, both the man and the woman studied for 11-12 years.

- From Table #7, the two of them fall into different weight ranges. Let's generate a new table (Table 13) and check which one is the man; Table 13 gives the man's weight range in kg.

- From Table #8, he makes 5K-6K NIS gross.
- From Table #9, he is from the north.
- For Table #10 we need to generate a new table, Table 14; from Table 14 we can see that his family includes more than 7 members.

IN CONCLUSION:
- We know about the man:
  - His age: 20-24
  - Muslim
  - From an Arab village
  - His height
  - Hired worker
  - Driving distance from work, in minutes
  - Years of study: 11-12
  - His weight in kg
  - His salary: 5K-6K NIS gross per month
  - He is from the north
  - His family includes more than 7 members

WHERE ARE WE GOING FROM HERE: NEXT STEPS
- Two major points (the plan is to finish them by milestone 2):
  - Automatically extracting and generating the survey's tables from the CBS website (this will be the first script).
  - Starting work on the algorithm that searches the data for the "1" (table cells that count a single participant) and tries to find intersections between this information (this will be the second script).
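One possible shape for the second script is sketched below. It is not the project's algorithm, only an assumption about how a search for the "1" could work: every table is keyed by a shared linking column (here an age group), rows whose group contains exactly one participant are collected, and their attributes are merged into a profile. All names and counts are invented.

```python
from collections import defaultdict


def singleton_rows(table):
    """Return {key: value} for keys that hold exactly one participant.

    `table` maps (key, value) -> count, where `key` is the linking column.
    """
    totals = defaultdict(int)
    for (key, _value), count in table.items():
        totals[key] += count
    return {
        key: value
        for (key, value), count in table.items()
        if count == 1 and totals[key] == 1
    }


def link_profiles(tables):
    """Merge the singleton rows of every table into one profile per key."""
    profiles = defaultdict(dict)
    for attribute, table in tables.items():
        for key, value in singleton_rows(table).items():
            profiles[key][attribute] = value
    return dict(profiles)


# Invented tables sharing the linking column "age group".
tables = {
    "employment": {("age A", "hired"): 1, ("age B", "hired"): 2, ("age B", "self"): 1},
    "region": {("age A", "north"): 1, ("age B", "north"): 1, ("age B", "south"): 2},
}

profiles = link_profiles(tables)
print(profiles)
```

A cell with count 1 inside a larger group (such as "age B" above) is skipped, because the linking column alone does not single that person out.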

THE FIRST SCRIPT AND MAJOR ISSUES
- The website supports only IE.
- We thought we could use a macro script in FF or Chrome, but since the Israeli government sites support only IE, we can't use macro scripts.
- We are now testing alternatives:
  - Scrapy: used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
  - curl in bash
  - Java with JTidy, a Java port of HTML Tidy (http://jtidy.sourceforge.net/)
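Whichever tool fetches the pages, the extraction step comes down to turning an HTML table into rows. The sketch below uses Python's standard `html.parser` module, not Scrapy, curl, or JTidy, purely to illustrate that step; the HTML snippet is invented, and the real CBS pages may be structured differently.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Parse an HTML string and return its table rows as lists of strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented page fragment, standing in for a downloaded table.
page = (
    "<table><tr><th>Age group</th><th>Count</th></tr>"
    "<tr><td>20-24</td><td>2</td></tr></table>"
)
rows = extract_rows(page)
print(rows)
```

Rows in this form can be fed directly into the count-based checks, since each cell arrives as plain text ready to be converted to a number.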