
1 Fighting Comment Spam
Employing the site’s audience, coding skills, and free distributed solutions to fight back.

2 Brickboard.com
A friendly and open site focused on Volvo owners and enthusiasts.
Has been around since 1997.
Introduced membership in 2001 but still allows “anonymous” posting.
Has always made anti-spam efforts a priority; introduced an anonymous email system much like Craigslist’s.
Makes supplemental income with limited demand on my time.
100% custom web app code. No WordPress, phpBB, or similar. (Not easy to add to your target list with Google hacking.)

3 Why Comment Spam?
It used to be enough to play with meta tags and to set up sites that pointed to other sites to improve page rank. Google changed that.
Stealing Google juice. The little sites add up to a lot of relevance. They mean a lot to Google advertising and they mean a lot to the spammers.
Sites of questionable legality need to be found quickly when moving from address to address.
Tricking affiliate (pay per click/PPC) programs for fun and profit.
Porn, pills, and casinos.

4 Under Siege

5 Distressed Users
Users send emails alerting me to trouble.
I delete the spam postings as I find them. Easy enough.
There’s a real risk if these are allowed to stay on the site: people will abandon a site with too much irrelevant noise.

6 Shooting Gallery
I use RSS feeds both to alert me to new spam (as marked by users) and to take me to the posting so I can delete it with my admin account.

7 What am I up against?
A market for speedy top 10 search result listings. They aren't spammers, they're "search engine optimizers."
Automated, scripted, smart tools. Someone sets the tool up with all of the targeted sites and all of the spam content and sets it loose.
“Work from home” industries with turk-in-a-box tools and business models.

8 Joining the Fight with Code
Easy. Check the submitted post for Viagra, Cialis, and so on. Blacklisting.
People will legitimately use the words, so some qualifiers are needed.
I check for a link and the word, among other qualifiers.

    # Flag the post only if it contains a link AND a blacklisted term.
    $body =~ /http:\/\//i && (
        $body =~ /zulubucks/i   ||
        $body =~ /obsq/i        ||
        $body =~ /conegliano/i  ||
        $body =~ /protezione/i  ||
        $body =~ /\bcialis/i    ||
        $body =~ /viagra/i      ||
        $body =~ /zithromax/i   ||
        $body =~ /doxycycline/i ||
        $body =~ /accutane/i
    )
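The same "link plus blacklisted word" qualifier can be sketched in Python. This is only an illustration of the idea on the slide, not the site's actual code: the names `BLACKLIST`, `LINK`, and `looks_like_spam` are made up here, the term list is a subset of the one above, and the link pattern also accepts `https://`, which the original check does not.

```python
import re

# Hypothetical blacklist, seeded with a few of the terms from the slide.
# \b before "cialis" keeps words such as "specialist" from matching.
BLACKLIST = re.compile(
    r"zulubucks|\bcialis|viagra|zithromax|doxycycline|accutane",
    re.IGNORECASE,
)

# Assumption: treat both http:// and https:// as "contains a link".
LINK = re.compile(r"https?://", re.IGNORECASE)


def looks_like_spam(body: str) -> bool:
    """Flag a post only when it has both a link and a blacklisted term."""
    return bool(LINK.search(body)) and bool(BLACKLIST.search(body))
```

Requiring both conditions is what lets a member mention a prescription, or post a link to a parts diagram, without being blocked.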

9 Joining the Fight with Code
Some scripts were just stupid and always sent the tags field with the same value, one that no real user was likely to send. These were easily stopped.

    $tag eq "VDD 122 TACH"   # string comparison needs eq, not ==. Thanks!

10 Joining the Fight with Code
Matching IP manually isn’t valuable at all.
Analysis shows that what appears to be the same spammer script uses a different IP address for every POST.
Spammers use open proxies and botnets.

    mysql> select ip,count(ip) from posts where deleted='Y' group by ip limit 1000,10;
    +----------------+-----------+
    | ip             | count(ip) |
    +----------------+-----------+
    | 200.29.167.10  |         1 |
    | 200.31.123.162 |         1 |
    | 200.31.42.3    |         1 |
    | 200.35.164.226 |         1 |
    | 200.35.237.189 |         1 |
    | 200.35.36.143  |         1 |
    | 200.35.75.93   |         1 |
    | 200.37.212.8   |         2 |
    | 200.41.203.90  |         1 |
    | 200.42.216.147 |         1 |
    +----------------+-----------+

11 Anti-Spam Tricks Tried
Commented hidden fields that only a stupid bot would submit. E.g.: ”>-->
Moving the post submit URL.
Nofollow on meta tags and a (anchor) links. Don’t let them have your Google juice!
Disallowing the POST if it wasn’t preceded by a GET of the form. Doesn’t stop all of them, since some scripts do GET then POST.
Timestamps and signatures on forms. Doesn’t stop all of them, since some scripts do GET then POST.
Using JavaScript to set a flag on the form. Doesn’t always work.
Counting URLs. Most users wouldn’t submit more than one or two.
Counting real words.
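As one way to picture the "timestamps and signatures on forms" trick, here is a minimal Python sketch. It assumes the form carries a hidden field holding a timestamp signed with a server-side secret. The names `SECRET_KEY`, `MAX_AGE_SECONDS`, `issue_form_token`, and `token_is_valid` are hypothetical, and the one-hour expiry is an arbitrary choice, not something stated in the slides.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MAX_AGE_SECONDS = 3600     # hypothetical: forms expire after one hour


def _sign(timestamp: str) -> str:
    """HMAC-SHA256 signature of the timestamp, as a hex string."""
    return hmac.new(SECRET_KEY, timestamp.encode(), hashlib.sha256).hexdigest()


def issue_form_token(now=None) -> str:
    """Return 'timestamp:signature' to embed in a hidden form field."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}:{_sign(ts)}"


def token_is_valid(token: str, now=None) -> bool:
    """Accept a POST only if the signature matches and the form is fresh."""
    try:
        ts, sig = token.split(":", 1)
        issued = int(ts)
    except ValueError:
        return False  # missing separator or non-numeric timestamp
    if not hmac.compare_digest(sig.encode(), _sign(ts).encode()):
        return False  # forged or altered token
    current = time.time() if now is None else now
    return 0 <= current - issued <= MAX_AGE_SECONDS
```

A check like this stops scripts that replay an old form or POST without ever loading it. As the slide notes, it does nothing against a script that fetches the form first and posts straight away.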

12 Other Anti-Spam Tricks
Reverse Turing tests, including CAPTCHA. IRRITATING to users! Not accessibility friendly. Easily beaten. Adds expense to their operations, however.
Redirects instead of direct links to forms and posts. Increases load. Probably not effective.
CSS links. A script would need some sophisticated rendering to follow them. Could backfire and block out some users.

13 A Distributed Solution: Project HoneyPot

14 Project HoneyPot
HTTP Blacklist allows you to check an IP prior to allowing a post.
Modeled after DNSBL (email UCE systems that use DNS lookups on the reversed IP to flag untrusted sender IPs).
Takes advantage of efforts by people in the same fight.

15 Project HoneyPot
Uses the octets of the IPv4 address returned by the lookup to give detail about the IP: last seen, threat level, spammer type.
This detail allows you to set the bar as needed.
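To make the octet encoding concrete, here is a short Python sketch. It assumes the http:BL format as I understand it from Project Honey Pot's documentation: the query name is the access key, then the visitor's IP with its octets reversed, then the `dnsbl.httpbl.org` zone; the answer is `127.<days since last activity>.<threat score>.<visitor type>`. The helper names and the access key in the example are hypothetical, so check the current documentation before relying on the details.

```python
from typing import NamedTuple, Optional

# Visitor-type bit flags, as I understand them from the http:BL docs.
SUSPICIOUS, HARVESTER, COMMENT_SPAMMER = 1, 2, 4


class HttpBLResult(NamedTuple):
    days_since_last_seen: int
    threat_score: int
    visitor_type: int


def build_query_name(access_key: str, ip: str) -> str:
    """Build the DNS name to look up: key, reversed IP octets, zone."""
    reversed_ip = ".".join(reversed(ip.split(".")))
    return f"{access_key}.{reversed_ip}.dnsbl.httpbl.org"


def parse_response(answer: str) -> Optional[HttpBLResult]:
    """Decode the answer; return None if it is not a 127.x.y.z reply."""
    octets = [int(o) for o in answer.split(".")]
    if len(octets) != 4 or octets[0] != 127:
        return None
    return HttpBLResult(octets[1], octets[2], octets[3])
```

For example, under these assumptions an answer of `127.3.25.4` would read as: last seen 3 days ago, threat score 25, flagged as a comment spammer.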

16 Detecting Bad Traffic Using the HoneyPot
A user submits a form.
Check the IP against the HoneyPot server using the DNSBL approach (backend call to a remote server).
Compare the results with the established bar (# of days, risk, etc.).
When a match is found, redirect the spammer to a local URL that executes a Project HoneyPot script.
The script sends the request and any data it can gather to the HoneyPot server on the backend (the spammer never leaves my site).
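The lookup-and-compare steps above might look something like the following Python sketch. The thresholds `MAX_DAYS` and `MIN_THREAT`, the function names, and the choice to act only on comment spammers are all assumptions for illustration; the slides do not say where the bar was actually set. The DNS call is passed in as a parameter so the decision logic can be exercised without touching the network.

```python
import socket

MAX_DAYS = 30        # hypothetical bar: only act on recent activity
MIN_THREAT = 25      # hypothetical bar: minimum threat score to act on
COMMENT_SPAMMER = 4  # visitor-type bit, as I understand the http:BL docs


def resolve(name):
    """Return the A record for name, or None when the IP is not listed."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None


def should_block(ip, access_key, lookup=resolve):
    """Check the poster's IP against http:BL and compare with the bar."""
    reversed_ip = ".".join(reversed(ip.split(".")))
    answer = lookup(f"{access_key}.{reversed_ip}.dnsbl.httpbl.org")
    if answer is None:
        return False  # not listed
    first, days, threat, visitor_type = (int(o) for o in answer.split("."))
    if first != 127:
        return False  # not a valid http:BL reply
    return (
        days <= MAX_DAYS
        and threat >= MIN_THREAT
        and bool(visitor_type & COMMENT_SPAMMER)
    )
```

When `should_block` returns true, the form handler would issue the redirect to the local honeypot URL described on the slide instead of saving the post.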

17 Giving Back
It was very easy to use my other efforts to push spammers into the HoneyPot.
This has consistently made me a top 10 contributor to Project HoneyPot.

18 Driving the Bad Traffic Into the HoneyPot
A user submits a form.
Any number of rules, including keyword blacklists and expired form URLs, indicate that this is a spammer.
Redirect the spammer to a local URL that executes a Project HoneyPot script.
The script sends the request and any data it can gather to the HoneyPot server on the backend (the spammer never leaves my site).

