Published by Allan Norman. Modified over 9 years ago.
1
Secure Search Engine
Ivan Zhou, Xinyi Dong
2
Project Overview
The Secure Search Engine (SSE) project is a search engine that uses dedicated modules to test the validity of a web page. The tests verify the page's certificates and determine whether the page is a phishing site.
Changed goal: Set up a proxy in the cloud that mediates communication between the client and the SSE.
3
Detailed architecture
Components: browser / phone; proxy; SSE; certificate verification module; phishing status verification module.
4
Detailed architecture (diagram). Labels: Cell Phone, Browser, Internet, Proxy, SSE, Certificate Verification, Phishing Verification.
5
Project Description
Migrate a different SSE project to Mobicloud.
Test the SSE in the new environment and modify it where necessary.
Set up a proxy, with the code it needs to communicate with the SSE server and function correctly.
6
Roadmap (timeline chart)
Dates: 2/6, 2/15, 2/27, 3/10, 3/20, 4/8, 4/18.
Tasks: Migration; Testing / Fixing; Background Crawler; Android SSE; Migration of SSE (another version); Modify and Test SSE; Setup Proxy; Proxy SSE Communication.
7
Task Allocations
Ivan: migration of another version of SSE; modify and test the new SSE.
Xinyi: set up the proxy; communication between SSE and proxy.
8
Technical Details for task 1
Task 1: Migrate the existing SSE project from a local environment to Mobicloud.
Software installation: Apache Tomcat, MySQL, NetBeans, SVN, Java JDK, Jython.
Configuration: the VM's Internet connection, VNC, PATH for Java/Tomcat/SVN, and the connection to the MySQL server.
Publish the website to Apache Tomcat.
9
Technical Details for new task 2
Two parts need to be tested carefully: the phishing filter and the crawler.
Phishing filter: checks the database for whether the site is a phishing site; checks whether a third-party site (PhishTank) has flagged it as a phishing site; computes the confidence ourselves.
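The three checks of the phishing filter can be read as a fall-through decision. The following is only a minimal sketch of that order, not the SSE's actual code: the function name, the three callbacks, and the `threshold` default are all hypothetical, and the slides do not say how the confidence is computed or what cutoff is used.

```python
from typing import Callable, Optional


def is_phishing(
    url: str,
    local_lookup: Callable[[str], Optional[bool]],
    third_party_listed: Callable[[str], bool],
    confidence: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the slides
) -> bool:
    """Return True if the URL is judged to be a phishing site."""
    # 1. Local database: a stored verdict (True/False) is used directly;
    #    None means the URL is not in the database.
    verdict = local_lookup(url)
    if verdict is not None:
        return verdict

    # 2. Third-party list (for example PhishTank): a listing is treated
    #    as a positive result.
    if third_party_listed(url):
        return True

    # 3. Fall back to a locally computed confidence in [0, 1].
    return confidence(url) >= threshold
```

Passing the lookups in as callbacks keeps the decision order separate from the database and network code, so each stage can be tested on its own.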
10
Technical Details for task 2
crawler.py: a Python implementation of the Java code that crawls a web page's information.
Steps: seeds in database; crawl domain; crawl domain path; crawl child links.
Difficulties encountered: web pages' particularities (localhost) (solved); connects only on port 443, what about port 80? (solved); unreasonable logic in crawler.py (depth, ..) (exploring); other problems (exploring).
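As an illustration of the crawl steps and the depth issue mentioned above, here is a minimal depth-limited crawl sketch. It is not the project's crawler.py: `LinkParser`, `crawl`, the `fetch` callback, and the `max_depth` parameter are hypothetical names, and the page download is left to the caller so the traversal logic stays independent of ports and protocols.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_depth=2):
    """Breadth-first crawl from a seed URL, staying on the seed's host.

    fetch(url) is a caller-supplied function returning the page HTML as a
    string, or None on failure. Returns the visited URLs in visit order.
    """
    host = urlparse(seed).netloc
    seen = {seed}
    queue = deque([(seed, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            child = urljoin(url, href)  # resolve relative child links
            if urlparse(child).netloc == host and child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return visited
```

Tracking the depth alongside each queued URL, and recording every URL in `seen` before it is queued, is what keeps the crawl bounded even when pages link back to each other.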
11
Technical Details for task 3
Develop a background process that regularly updates the bank database for the crawler.
Edit the schedule with: crontab -e
Syntax: min|hour|day|month|weekday|command
Entry: 0 0 * * * /sse/crawler.py (runs daily at midnight)
12
Technical Details for task 4
Create an Android component to integrate SSE into a mobile device (tentative).
All applications are written in the Java programming language.
Tools: Android SDK; Eclipse with the ADT plugin.
Current firmware on the Droid: v2.1 update 1. Newest firmware available: v2.2.1.
13
Technical Details for task 5
Migration of another version of SSE.
Reasons: the previous SSE was buggy and therefore not stable; its phishing filter was not working; it did not work properly on some sites.
Same procedure as for the last version, but using the Eclipse IDE instead of the NetBeans IDE.
14
Technical Details for task 6
Modify and test the new SSE.
Clean up multiple copies of code.
Broken PhishingFilter / Google PageRank lookup:
Used to point to: http://zquery.com/api?q=
Now uses (limited): http://webinfodb.net/a/pr.php?url=
Additional: http://api.exslim.net/pagerank/check
Change of threshold value.
15
Technical Details for task 7
Set up the proxy.
We set up another VM in our Mobicloud system as the proxy.
The proxy (c-icap) forwards requests to the web server.
The client connects to the proxy over a VPN.
16
Technical Details for task 8
Communication between SSE and proxy.
At the proxy, add code in the check_url module to:
Request the SSE server with cURL and get the returned value.
Parse the returned web page and determine which kind of site it is (hasCertificate, isPhishing).
Warn about and block sites that have no certificate and phishing sites.
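The parse-and-block step can be sketched as follows. This is an illustration in Python only, not the code added to check_url. The slides do not show the format of the page returned by the SSE, so the sketch assumes the flags appear as text such as `hasCertificate=true` and `isPhishing=false`. The function names are hypothetical, and treating a missing flag as unsafe is a design assumption.

```python
import re

# Assumed format: "hasCertificate=true", "isPhishing: false", and so on.
_FLAG = re.compile(r"\b(hasCertificate|isPhishing)\s*[=:]\s*(true|false)\b")


def parse_sse_response(body):
    """Extract the two flags from the SSE response body as booleans."""
    flags = {}
    for name, value in _FLAG.findall(body):
        flags[name] = value == "true"
    return flags


def should_block(body):
    """Block when the site has no certificate or is a phishing site.

    A flag missing from the response is treated as the unsafe value, so an
    unparseable response leads to a block (fail closed).
    """
    flags = parse_sse_response(body)
    has_certificate = flags.get("hasCertificate", False)
    is_phishing = flags.get("isPhishing", True)
    return (not has_certificate) or is_phishing
```

For example, `should_block("<p>hasCertificate=true</p><p>isPhishing=false</p>")` evaluates to `False`, while an empty response body is blocked.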
17
Demo
18
Conclusion
The project is completed.
The SSE server, which previously had no phishing check, can now check both the certificate and the phishing status of a site.
The proxy takes the computation load off the client side: the requests to the SSE, and the parsing and analysis of the results, are all done at the proxy level.
19
Thank you! Comments & Questions.