Question for those more IT proficient than I am. If I am wrong about a certain block, I want to eliminate a particular rule that has blocked a human. But if I am right, I want that person to figure out what is wrong with their browser.
First let me describe two of the rules I use to block access to my site.
- If a browser tries to load “…./crosdomain.xml” I block that connection.
- If a browser presents a cookie with the name “mp_72366557fd3f1bd4fc7d4bfca5cd0a12_mixpanel” to my site, I block that connection. The reason is that there is no JavaScript, PHP or anything else that sets a cookie of that name at my site. (Or at least I think there isn’t!)
Note: Every single time I have seen a browser present that cookie name it has gone on to request “…./crosdomain.xml”. Also, “…./crosdomain.xml” does not exist anywhere on my site.
However, if I am wrong about this rule, I would like to lift it. Otherwise, I want to know what is trying to connect to that non-existent resource, and why. Is this something like a favicon.ico that I ought to create? Or what? I note that only a very small fraction of browsers try to hit it, but obviously, if requesting it is becoming some sort of routine, I don’t want to be blocking people.
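For anyone who wants to picture the rules, they amount to roughly the sketch below. This is only an illustration of the logic in PHP, not ZBblock’s actual code; the variable names and the 403 response are my own.

```php
<?php
// Sketch only (not ZBblock's code) of the two rules described above.
// Rule 1: crossdomain.xml does not exist on this site, so any request for it is suspect.
// Rule 2: nothing on this site sets the mixpanel-style cookie, so presenting it is suspect.

$badCookieName = 'mp_72366557fd3f1bd4fc7d4bfca5cd0a12_mixpanel';
$suspectFiles  = array('crossdomain.xml', 'crosdomain.xml'); // both spellings used in this post

$path = (string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

$askedForCrossdomain = in_array(strtolower(basename($path)), $suspectFiles, true);
$presentedBadCookie  = isset($_COOKIE[$badCookieName]);

if ($askedForCrossdomain || $presentedBadCookie) {
    http_response_code(403);   // refuse the connection
    exit('Blocked: request matched a rule for suspicious traffic.');
}
```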
#: 66276 @: Tue, 19 Jun 2012 07:19:43 -0700 Running: 0.4.10a1
Host: ----blanked out--
IP: ----blanked out--
Score: 2
Violation count: 2
Why blocked: ; You asked for crossdomain.xml ? Hack. bad cookie:(mp_72366557fd3f1bd4fc7d4bfca5cd0a12_mixpanel,); cookies:(good:7 other: 0 length:0) ( 0 ); c= AU
Query:
Referer:
User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; AskTbIMH6/5.13.1.18261)
Reconstructed URL: http:// rankexploits.com /crossdomain.xml
(Note: I do have many draconian blocks. If people ask nicely I’ll look into it. But I did get whammied after a complete stranger on Bezequint requested I unblock that site…. so I am super cautious if someone whose name I don’t recognize emails me or if they resort to irony in their first email and so on. Sorry….but… well.. In this case, someone asked nicely. I’m sure he’s human. I’d like to find out if my block is wrong. )
Update: Some are suggesting I just ignore these attempts to access /crossdomain.xml, even though there is no good reason for them. However, I would like to point them to the Wikipedia article on Cross-Site Request Forgery, which mentions this resource and describes the attack as “under-reported”. Here are the relevant bits:
The attack works by including a link or script in a page that accesses a site to which the user is known (or is supposed) to have been authenticated.[1] For example, one user, Bob, might be browsing a chat forum where another user, Fred, has posted a message. Suppose that Fred has crafted an HTML image element that references an action on Bob’s bank’s website (rather than an image file), e.g.,
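<img src="http://bank.example.com/withdraw?account=bob&amount=1000000&for=mallory"> (reconstructed here, since the image markup in the quote does not render; the URL is purely illustrative).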
If Bob’s bank keeps his authentication information in a cookie, and if the cookie hasn’t expired, then the attempt by Bob’s browser to load the image will submit the withdrawal form with his cookie, thus authorizing a transaction without Bob’s approval.
Note: Cookies are used to identify which users are “known” to be authenticated. That’s why I worry about browsers presenting cookies I did not set.
CSRF attacks using image tags are often made from Internet forums, where users are allowed to post images but not JavaScript.
Note that users are allowed to post images here– but not JavaScript. This is common to WordPress blogs and would be a reason why a script kiddie might try this attack at a blog (or even why someone might just write malware that rides along making attempts wherever it goes).
Web sites have various CSRF countermeasures available:
- Requiring a secret, user-specific token in all form submissions and side-effect URLs prevents CSRF; the attacker’s site cannot put the right token in its submissions[1]
- Requiring the client to provide authentication data in the same HTTP Request used to perform any operation with security implications (money transfer, etc.)
- Limiting the lifetime of session cookies
[Since I don’t set these, I’m watching for faked cookies whose lifetime I obviously cannot control.]
- Checking the HTTP Referer header and/or the HTTP Origin header[16]
- Ensuring that there is no clientaccesspolicy.xml file granting unintended access to Silverlight controls[17]
- Ensuring that there is no crossdomain.xml file granting unintended access to Flash movies[18]
[I don’t have this file. But I am suspicious when something tries to request it for no reason. –l]
- Verifying that the request’s header contains an X-Requested-With header. Used by Ruby on Rails (before v2.0) and Django (before v1.2.5). This protection has been proven insecure[19] under a combination of browser plugins and redirects which can allow an attacker to provide custom HTTP headers on a request to any website, hence allowing a forged request.
There is more. But basically, unless I read of a justifiable reason why a browser hits /crossdomain.xml, I’m continuing this block.
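(For concreteness, the first countermeasure quoted above, the secret user-specific token, boils down to something like the following sketch. This is illustrative PHP only; it is not anything running on this site, and the function and field names are invented.)

```php
<?php
// Illustrative CSRF-token sketch (hypothetical names; not ZBblock or WordPress code).
session_start();

// Issue a secret, user-specific token and remember it server-side.
function csrf_token() {
    if (empty($_SESSION['csrf_token'])) {
        $_SESSION['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $_SESSION['csrf_token'];
}

// Call this before performing any state-changing action submitted by POST.
function csrf_check() {
    $sent = isset($_POST['csrf_token']) ? (string) $_POST['csrf_token'] : '';
    if (!hash_equals(csrf_token(), $sent)) {
        http_response_code(403);
        exit('Invalid or missing CSRF token.');
    }
}
?>
<!-- In every form that changes state, embed the token: -->
<form method="post" action="/some-action.php">
  <input type="hidden" name="csrf_token" value="<?php echo htmlspecialchars(csrf_token()); ?>">
  <!-- ... other fields ... -->
</form>
```

A forged request coming from another site, like the bank-withdrawal image trick above, cannot supply the right token, so it fails the check even though the victim’s cookies are sent along.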
Adobe Flash uses a file called crossdomain.xml to allow a Flash object to access files outside the primary domain the Flash ‘swf’ file was served from. Note this is spelled slightly differently from the file you are asking about.
The mixpanel cookie seems to be associated with some kind of web analytics from http://www.mixpanel.com.
Maybe the offending browser is infected with malware that uses a similarly named xml file. Though my Bing/Google search didn’t turn up any reference to such a thing, I didn’t look very far in the search results either.
Stephen–
I had found that site. But the cookie convention means that my domain rankexploits.com should not see browsers presenting me cookies I didn’t set. So I haven’t figured out why I should be seeing that. Some browsers show me cookies for blogger_IDs. My theory is those are bots trying to spam Blogger.
I know that’s what crossdomain.xml is for. But why should anyone visiting my site try to load it? I don’t have any flash going on here.
This is my concern. If the browser is infected, I would prefer that people find the malware and disinfect their machine, rather than permitting the malware to do whatever it is programmed to do when it accesses my crossdomain.xml file!
If I can learn a good reason why browsers requesting that file is ok, I’ll stop blocking them. But right now… I can’t think of a good reason. And the only reason I can think of is… their browser is infected by malware!
I’ve been involved in building web sites since 1994, and doing so as my full-time profession since 1996-ish. I’m far from the most expert on the subject. But from where I sit, I’m not sure I understand the larger point of what you’re trying to achieve with these bans.
There are a great many non-human bots (for want of a better term) that access any website. They range from benign (well-behaved search engine spiders), to annoyingly bothersome (poorly-behaved content scrapers that ignore robots.txt and issue too many requests in too short a time), to outright malicious (scripts probing for exploitable vulnerabilities, or intentionally DOSing the host). A reasonable goal for someone operating a website is to be cooperative with the first group, and resilient in the face of the inevitable presence of the second and third groups.
The problem with going too far into the details of any given badly-behaved bot is that it’s just going to end up consuming more time and trouble than it’s worth. Sure, if you can identify one particular type of malicious behavior you can block the IP address from which it originates. You can automate the imposition of such blocks. But the more of that you do, the greater the risk you run of blocking legitimate traffic. And even without that, you’ll be sinking lots of time and mental energy into this jousting with nameless antagonists, most (all?) of whom are not giving your individual website any thought at all, but are simply running some script kiddie’s probe across a large list of hosts.
I’m probably missing something that would explain why you want to invest so much energy into this. Maybe it’s just a hobby of yours? You enjoy the idea that you are jousting with invisible antagonists, blocking their access to your content? If that’s the case, then I guess what you’re doing makes a certain kind of sense. But that’s not how I use the web, at least.
Websites work well as a tool for communicating information quickly and efficiently, where the publisher doesn’t worry too much about just who it is that is accessing it. Yes, you can selectively block access by certain individuals, or certain programs. But it’s hard to do that consistently or effectively, because the technology isn’t really set up for it, and you have limited knowledge about the enemy, and there are many, many more of them than there are of you. If you don’t want bad guys to access your content, don’t put it online. If you don’t want malicious bots hitting your website, don’t have a website.
With all that said, there are workable strategies for identifying abusive access patterns, and throttling IP addresses associated with them. A good approach for that is to impose an enforced waiting period before you’ll allow a subsequent request, with the wait time increasing in response to abuse and decaying back to 0 if the abuse stops. That’s nice because you don’t have to worry so much about false positives blocking legitimate traffic.
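Very roughly, such a throttle might look like the sketch below. It’s illustrative only: I’ve used file-based state because shared hosting usually rules out anything fancier, and all of the names, thresholds, and the placeholder rule check are mine.

```php
<?php
// Illustrative escalating-throttle sketch: each violation lengthens the enforced
// wait, and the penalty decays back toward zero when the abuse stops.

function request_violates_rules() {
    // Placeholder: plug in whatever checks the site actually runs
    // (bad cookie names, requests for crossdomain.xml, etc.).
    return false;
}

$ip    = $_SERVER['REMOTE_ADDR'];
$file  = sys_get_temp_dir() . '/throttle_' . md5($ip) . '.json';
$state = is_file($file) ? json_decode(file_get_contents($file), true) : null;
if (!is_array($state)) {
    $state = array('strikes' => 0, 'blocked_until' => 0, 'last_seen' => 0);
}

$now = time();

// Decay: forgive one strike for every hour of good behaviour.
$forgiven = (int) floor(($now - $state['last_seen']) / 3600);
$state['strikes'] = max(0, $state['strikes'] - $forgiven);

if ($now < $state['blocked_until']) {
    header('Retry-After: ' . ($state['blocked_until'] - $now));
    http_response_code(429);
    exit('Too many requests; please wait before trying again.');
}

if (request_violates_rules()) {
    $state['strikes']++;
    // 2 minutes, 4 minutes, 8 minutes, ... capped at one day.
    $state['blocked_until'] = $now + min(86400, 60 * pow(2, $state['strikes']));
}

$state['last_seen'] = $now;
file_put_contents($file, json_encode($state));
```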
But again, I don’t really know why you want to do this, so I don’t know if that would be helpful.
John
I want to block bots because otherwise my site
a) gets hacked
b) goes down because the bots suck excess server resources and cause my blog to crash.
No. You can’t block XSS attacks based on IP address alone. You can’t block attempts to access “timthumb.php” using IP alone. You can’t block all sorts of things based on IP address alone. But I do have plenty of IP blocks in place.
The fact that a script kiddie who wants to hack in and turn my domain into a zombie drone isn’t personally out to get me is irrelevant. I don’t want the ones who are attacking everyone to hack in either. I’ve spent practically no time on my solution to the hack attempts in the past month.
Because my site was crashing non-stop from November through February due to the bots, and also some climate blogs have been hacked and I don’t want to be hacked. The former interfered with people visiting much more than my current rules do. After implementing the rules, my real traffic is much higher than it was in November–February, when people arrived to find a crashed site and constantly lost comments because the site crashed when they hit “submit”.
Out of curiosity, are you going to suggest a workable strategy? Or are you just going to tell me that one exists? I have no idea how, in terms of nuts and bolts, you think I could do anything more nuanced. As far as I am aware, I can’t do most of the things that people who don’t propose concrete solutions think I “should” do, because I’m on shared hosting.
FWIW: I do impose an enforced waiting time after I detect a violation of “my rules”. It’s 7 days.
————- ————–
Out of curiosity, do you have an answer to my specific question? Because from my point of view, even though I’d like my false-positive rate to be lower, I’ve solved the problem of constant site crashes that were occurring before I implemented a solution to my very real problem. If you can give me information about the crossdomain.xml issue that suggests I should drop that block, I can do it.
If you’re on shared hosting without root on the server where you’re hosted, then your options for these sorts of things are going to be constrained. I don’t know enough to suggest a workable strategy for your earlier performance problem, but it sounds like you already solved that, so that’s a moot point.
I don’t have much more to suggest of a concrete nature, no. You asked for opinions from people “more IT proficient than I”. I’m not sure that I qualify there, but I had an opinion, so I offered it. True, it wasn’t an opinion on the specific thing you were asking for, but that’s not uncommon when you ask for opinions from people more knowledgeable than you: sometimes the answer is that the respondent thinks you’re asking the wrong question.
I think it’s probably not worth your time to worry about blocking people merely because they’ve issued a request for a file called crossdomain.xml. At least, I’m pretty sure that the vast majority of people operating websites (climate websites or otherwise) are not doing so, and they seem to be getting by more or less okay. But you know much more than I do about the specifics of your own hosting situation, so if you think that advice is bad, feel free to disregard it.
I’m not sure that anything requesting crossdomain.xml is necessarily going to be a bad bot.
It is used only by Flash (and maybe Silverlight?), and sites I manage get heaps of requests for this file which, after some study, I now simply ignore.
Given you have no Flash elements on the site, I think you can safely ignore this to be honest.
John– I know, and I thank you for taking the time. But I have frequently discussed here why I am blocking. In this particular case, I want to know why a normal, uninfected browser used by a person surfing might ask for crossdomain.xml out of the blue.
mct–
But since I don’t have Flash, why is a browser asking for it? And equally, why is the browser presenting me with a cookie as if I set one?
Those are my questions. Do you know?
The more interesting thing is the “AskTbIMH6/5.13.1.18261” token in the User Agent: if you look it up at http://www.useragentstring.com/, it is not a recognized UA string. You could block on it instead of the cookie.
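If you did want to key off that token, the check itself is trivial. Something roughly like this (my own sketch and variable names, not ZBblock’s code):

```php
<?php
// Sketch: refuse requests whose User-Agent contains a suspect token.
// The token below is the one from the log entry quoted above; add others as needed.
$suspectTokens = array('AskTbIMH6/5.13.1.18261');

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
foreach ($suspectTokens as $token) {
    if (stripos($ua, $token) !== false) {
        http_response_code(403);
        exit('Blocked: suspicious User-Agent.');
    }
}
```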
Kan–
I could block that, but it’s not the only user agent string that does this. If you read about requests to crossdomain.xml, you will see that lots of people see them, and many think it’s a) a broken browser add-on, b) malware from a specific company, c) spyware a browser picked up (unbeknownst to the browser owner) or d) brown spyware!
I could block ‘AskTbIMH6/5.13.1.18261’ too. But that wouldn’t help the actual humans blocked. After all, all this shows is that this particular person has three things going on that look like they might be malware/bot etc. If his browser is infected by malware, my preferred method of his gaining access is that he fix his browser, rather than my changing my rules to permit people whose machines are infected to gain access!
There’s another possibility. A user might have been the victim of a DNS poisoning attack that pointed rankexploits.com at a server that did have flash and did set the mp_… cookie and then redirected to your site.
You’re going to find that you’ll block more and more legitimate traffic as you improve your blocking strategy. You’ll have to decide where to draw that line.
Agreed on both counts. Because I don’t sell a product and this is hosted on my dime, I’m more tolerant of false positives than some others might be. (The people who are blocked might not like this… but… well…)
On this particular issue, I’m trying to find out what’s up with this thing. I do intercept quite a few of these. Up until now, they all carried another independent suspicious trait that people discuss as evidence of malware– and no human complained.
But now… I’m seeing them without that second suspicious trait. I still can’t help but think something happened with their browser– or to them. But if there are lots of people seeing this, I’m going to relax it. If it’s just this one person this one time….
Maybe I should delete their cookie if I see it? That way at least after they are blocked for that once, they won’t get blocked again?
That’s a pretty good test for a legitimate browser: it deletes the cookie if you tell it to.
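Roughly, clearing it is just a matter of re-sending the cookie with an expiry in the past. A sketch (the path and domain arguments may need to match however the cookie was originally set, which is a guess here):

```php
<?php
// Sketch: if the rogue mixpanel-style cookie shows up, tell the browser to drop it.
$badCookieName = 'mp_72366557fd3f1bd4fc7d4bfca5cd0a12_mixpanel';

if (isset($_COOKIE[$badCookieName])) {
    // An expiry in the past asks a well-behaved browser to delete the cookie.
    setcookie($badCookieName, '', time() - 3600, '/');
    // If it was set with an explicit domain, the deletion has to match it (guess shown).
    setcookie($badCookieName, '', time() - 3600, '/', '.rankexploits.com');
}
```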
EDIT: This may be of interest to you too: http://webmasters.stackexchange.com/questions/17323/why-am-i-seeing-unexpected-requests-for-crossdomain-xml-in-my-logs
Not Sure– A common referrer hitting …/crossdomain.xml is
Referer: http://s.nsdsvc.com/app/dddwrapper.swf?c=4
Ah, so it might be a Flash-laden add-on.
I was going to suggest maybe you were seeing visitors from a site which is showing this site within a frame, with Flash or advertising running in a bar across the top of the page.
For example, on About.com there is an article about a literary journal:
http://fictionwriting.about.com/od/thebusinessofwriting/p/Versal-An-International-Literary-Journal.htm
In the section “Submit to Versal” is a link to a site for submitting text. Clicking on that link makes the site show up underneath an About.com header which has advertising.
However, if this was happening I’d expect you’d notice a Referer in your logs which would tip you off.
Sounds like you need a deny-all crossdomain.xml
http://webmasters.stackexchange.com/questions/17323/why-am-i-seeing-unexpected-requests-for-crossdomain-xml-in-my-logs
And maybe start warning your users away from Yontoo plugins
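For reference, a deny-all policy is only a few lines of XML in Adobe’s standard master-policy format. A one-off sketch like this (the paths are my assumption) would drop it into the web root:

```php
<?php
// One-off sketch: write a deny-all master policy as crossdomain.xml in the web root,
// so anything probing for it is told "no cross-domain access permitted".
$policy = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE cross-domain-policy SYSTEM "http://www.adobe.com/xml/dtds/cross-domain-policy.dtd">
<cross-domain-policy>
  <site-control permitted-cross-domain-policies="none"/>
</cross-domain-policy>
XML;

// Adjust the path if the document root is not where crossdomain.xml should live.
file_put_contents($_SERVER['DOCUMENT_ROOT'] . '/crossdomain.xml', $policy);
```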
Lucia, no question, I’d block them. Crossdomain.xml is a ‘permissions’ file, a bit like robots.txt is for search spiders and robots, but with much wider capabilities. Offhand, I’d say you don’t want one, and if there is one, get rid of it.
see e.g.
http://jeremiahgrossman.blogspot.co.uk/2008/05/crossdomainxml-invites-cross-site.html
Mixpanel is a web 2.0 type web analytics crowd, so I’d imagine their cookies would be tracking the &((%$ out of anyone inflicted with said cookie, and that they’d be a useful screen for malicious cookies to mimic.
Same simple rule, if it looks odd to you, bye bye. Your site, your rules. And asking for absolution is much easier than permission or justification 🙂
Well, actually, you *do* on the odd occasion… witness the War of 1812 post for example.
I agree on the cookie… definitely not right.
lucia:
I believe you do have at least a few options for throttling people without outright banning them, if you want to take the time to implement them. I don’t know that it’d be worth your trouble, though, as I don’t see any benefit to throttling over outright banning.
mct:
That’s not actually having Flash on lucia’s server. It’s just embedding a link. All the Flash aspects of it are handled by the site she links to (in this case, YouTube). It’s an understandable confusion.
OK. Based on the comments of people who seem to know what /crossdomain.xml does (as opposed to those who just notice lots of browsers ask for it), I have edited the message ZBblock will present to those who present bad cookies or ask for /crossdomain.xml.
I’m taking this path because merely noticing there are many attempts to connect doesn’t mean this is OK. I see lots of attempts to connect to potentially vulnerable resources every day. The attempts fail– but I know they are attempts to hack. So, I ban those. Until such time as someone provides a valid reason why something would be trying to connect to the non-existent /crossdomain.xml or feed my server cookies I didn’t set, I’m banning connections that try to do either.
I reactivated Bad Behavior– test.
Also not quite the issue. Any old Flash can generate a request for crossdomain.xml, at least as I understand the architecture.
mct:
I’m afraid I don’t understand what point you’re trying to make. Could any Flash app make “a request for crossdomain.xml”? In theory, sort of. So what? What does that have to do with the fact that lucia embedding a link to a YouTube video doesn’t mean she has Flash on her site?
Beyond which, while any Flash app may be able to make such a request, it would have to be coded to do so. What kind of legitimate app would make such a request from lucia’s site? There’s no reason for it.
The only reason would be extremely poor coding or malevolence. That’s why it deserves to be blocked.