Finding exploit attempts

As I get down to just a dozen or so spam hits a day on my sites, I am getting bored with it all. Spammers don’t seem to run out of new servers to turn into zombies, but it is trivial to add them to the list. Every day I find a new hosting company with a website that has been hacked and is sending out robots to annoy me. I have most of the process automated, but I have to keep one last step manual because RSS GRAFFITI uses an AWS instance and looks just like a spam robot, so it keeps getting caught in my nets. Also, for some reason, I keep catching Yahoo. I think they are checking a login file, which makes my programs think Yahoo is trying to hack my site. I can’t simply whitelist them, because there are spammers who spoof the Googlebot, Slurp, and bingbot user-agent strings.

Since detecting new spammers is mostly automated, I am examining the logs a little closer. I get several thousand unique hits a day, which is not that hard to browse. I started looking closer at requests for pages that do not exist on my sites, so I summarized the 404 hits. I found a few places where I had mistyped a URL, but I also get lots of hits on something called “connector”. This is a vulnerability in FCKeditor, a web page editor. Looking at what else the same robot requested, I found that it was also probing for an upload.php, a browse.php, several asp files, and a plugin called “newtype”. I added these to my automated blocker programs, and now they are happily sending 403 errors to 20 or 30 robots a day that are trying to hack my site.
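For anyone who wants to try the same 404 summary, here is a minimal sketch in Python. It is not the code I run. It assumes the access log is in the common Apache/nginx "combined" format, and the names summarize_404s, looks_like_probe and SUSPICIOUS_NAMES are made up for this example. The sample log lines are invented too.

```python
import re
from collections import Counter

# Assumption: Apache/nginx "combined" log format. Adjust for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

# Name fragments taken from the post; extend the tuple as new probes show up.
SUSPICIOUS_NAMES = ("connector", "upload.php", "browse.php", "newtype")


def summarize_404s(lines):
    """Count 404 responses per requested path, ignoring the query string."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status") == "404":
            counts[m.group("path").split("?", 1)[0]] += 1
    return counts


def looks_like_probe(path):
    """True if the path contains one of the known probe names or ends in .asp."""
    p = path.lower().split("?", 1)[0]
    return p.endswith(".asp") or any(name in p for name in SUSPICIOUS_NAMES)


# Invented sample lines, using documentation-only IP addresses.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2014:13:55:36 +0000] "GET /editor/connector.php?Command=x HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2014:13:55:37 +0000] "GET /editor/connector.php HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2014:14:01:02 +0000] "GET /abuot.html HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2014:14:01:09 +0000] "GET /about.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for path, hits in summarize_404s(SAMPLE).most_common():
        flag = "PROBE?" if looks_like_probe(path) else "typo?"
        print(hits, path, flag)
```

Matching on name fragments is crude. A site that really has an upload.php would need that entry taken out of the list before blocking on it.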

I need to find a better list of these types of exploits. I can identify some of them by examining the logs manually, but it would be nice to grab a good long list and include it in my software.

I am getting many hits on xmlrpc.php. This is harder to work with. I know that xmlrpc can be used to ping back another site, but I have pingbacks turned off. I can’t assume that every hit on xmlrpc is a spammer. I’ve seen xmlrpc used in a DoS attack through the pingback functionality, but I would like to know if there is a way to tell a good hit from a bad hit. I am banning anyone from China, Russia, or eastern Europe who hits xmlrpc.php, just on principle.
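I do not have a real test for good versus bad xmlrpc hits, but a tally per client at least shows who is hammering the file. The sketch below is only an illustration: it again assumes the "combined" log format, and xmlrpc_hits_by_client and the sample lines are invented for the example. A high count is a reason to look closer, not proof of anything.

```python
import re
from collections import Counter

# Assumption: Apache/nginx "combined" log format.
XMLRPC_HIT = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)


def xmlrpc_hits_by_client(lines):
    """Count requests for xmlrpc.php, keyed by (client IP, request method)."""
    counts = Counter()
    for line in lines:
        m = XMLRPC_HIT.match(line)
        if not m:
            continue
        path = m.group("path").split("?", 1)[0]
        if path.endswith("/xmlrpc.php"):
            counts[(m.group("ip"), m.group("method"))] += 1
    return counts


# Invented sample lines, using documentation-only IP addresses.
XMLRPC_SAMPLE = [
    '203.0.113.9 - - [10/Oct/2014:15:00:01 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370 "-" "-"',
    '203.0.113.9 - - [10/Oct/2014:15:00:02 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370 "-" "-"',
    '203.0.113.9 - - [10/Oct/2014:15:00:03 +0000] "POST /blog/xmlrpc.php HTTP/1.1" 200 370 "-" "-"',
    '198.51.100.7 - - [10/Oct/2014:15:02:00 +0000] "GET /xmlrpc.php?rsd HTTP/1.1" 200 800 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2014:15:02:05 +0000] "GET /index.php HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for (ip, method), hits in xmlrpc_hits_by_client(XMLRPC_SAMPLE).most_common():
        print(ip, method, hits)
```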

I have also found that robots tend to use the HEAD request method. This is a method that works like GET, except that the server returns only the headers and no body. Robots use it to find out whether a file exists before they try to get it. I guess that it saves quite a bit of bandwidth. So far, it looks like only spam robots are using it, but some browser or app might use it too, and I want to be sure before I ban anyone for sending HEAD requests.
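Before banning on HEAD, it helps to list who is actually sending it and what they call themselves. This is a rough sketch rather than my production code. It assumes the "combined" log format, where the user agent is the last quoted field, and head_request_agents and the sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: Apache/nginx "combined" log format, with the referer and
# user agent as the last two quoted fields.
REQUEST = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>[A-Z]+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def head_request_agents(lines):
    """Map each client IP that sent a HEAD request to the user agents it used."""
    agents = defaultdict(set)
    for line in lines:
        m = REQUEST.match(line)
        if m and m.group("method") == "HEAD":
            agents[m.group("ip")].add(m.group("agent"))
    return dict(agents)


# Invented sample lines, using documentation-only IP addresses.
HEAD_SAMPLE = [
    '203.0.113.5 - - [10/Oct/2014:16:00:00 +0000] "HEAD /wp-login.php HTTP/1.1" 200 0 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [10/Oct/2014:16:00:01 +0000] "GET /wp-login.php HTTP/1.1" 200 2300 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [10/Oct/2014:16:05:00 +0000] "GET /index.php HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for ip, seen in sorted(head_request_agents(HEAD_SAMPLE).items()):
        print(ip, sorted(seen))
```

Reading through the user agents in that list is how I would check whether a legitimate app or link checker shows up before turning HEAD into a ban rule.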

What I need now is a truly large WordPress site with a PageRank of 5 or better whose owner would not mind me installing this stuff on the site and playing with the results (and occasionally crashing it for a minute or two). I need more data. My websites are mostly PR 3s, which get a little traffic but are not particularly attractive targets for spammers.
