So the Climategate investigation is closed. We are told it was a “sophisticated and carefully orchestrated attack on the CRU’s data files, carried out remotely via the internet”
So, today, I saw something that made me wonder if The Blackboard is currently the focus of a “sophisticated and carefully orchestrated attack on [its] data files, carried out remotely via the internet”. Here’s a connection my software blocked today:
#: 77327 @: Fri, 20 Jul 2012 00:51:24 -0700 Running: 0.4.10a1
Host: 95-65-87-21.starnet.md
IP: 95.65.87.21
Score: 1
Violation count: 1
Why blocked: Bothost and/or Server Farm (credit: eclecticdjs.com). ( 0 ); c= MD
Query:
Referer:
User Agent: WhatWeb/0.4.7
Reconstructed URL: http:// rankexploits.com /
As happens on a nearly daily basis, something emanating from a suspicious server in Moldova got blocked. I think: “What the heck is WhatWeb/0.4.7?”
So I googled to find their self-description:
Next generation web scanner. Identify what websites are running.
Download whatweb-0.4.7.tar.gz
Latest Version 0.4.7, 5th April 2011
License GPLv2
Author urbanadventurer aka Andrew Horton from Security-Assessment.com
Wiki The WhatWeb Wiki
Development Version WhatWeb on GitHub
Introduction
WhatWeb identifies websites. Its goal is to answer the question, “What is that Website?”. WhatWeb recognises web technologies including content management systems (CMS), blogging platforms, statistic/analytics packages, JavaScript libraries, web servers, and embedded devices. WhatWeb has over 900 plugins, each to recognise something different. WhatWeb also identifies version numbers, email addresses, account IDs, web framework modules, SQL errors, and more.
WhatWeb can be stealthy and fast, or thorough but slow. WhatWeb supports an aggression level to control the trade-off between speed and reliability. When you visit a website in your browser, the transaction includes many hints of what web technologies are powering that website. Sometimes a single webpage visit contains enough information to identify a website but when it does not, WhatWeb can interrogate the website further. The default level of aggression, called ‘passive’, is the fastest and requires only one HTTP request of a website. This is suitable for scanning public websites. More aggressive modes were developed for use in penetration tests.
It continues….
For those wondering what I think this looks like:
Since I’m not running the scan, it looks like it could be someone running a scan to figure out how to hack my blog. More creatively, it could be someone who wants me to learn of the existence of their software hoping I will see this in my logs, visit their site and then buy the software.
For another discussion of “web application fingerprinting” you can go here:
Usage of Web Application Finger Printing
Web Application finger printing is a quintessential part of the Information Gathering phase [4] of (ethical) hacking. It allows narrowing / drilling down on specifics instead of looking for all clues. Also, an accurately identified application can help us in quickly pinpointing known vulnerabilities and then moving ahead with the remaining aspects. This step is also essential to allow the pen tester to customize the payload or exploitation techniques based on the identification and to increase the chances of successful intrusion.
Remember: I didn’t order this. So, it’s up to you to try to decide how one would “move ahead” after discovering known vulnerabilities.
It’s also your guess whether a person using available software to scan my site for vulnerabilities is “sophisticated” or whether this type of thing could be done by almost any motivated computer literate high school student. I suspect that the Norfolk Police would categorize it as the former. I suspect it’s the latter. Both are just my guesses.
I don’t know if it’s related, probably not, but for the last two weeks, I’ve encountered extremely slow service at only two websites: ClimateAudit and WUWT. I use an iMac. I don’t know if it’s something unique to my computer or server or if it’s more widespread than that. I find it annoying. If you type a sentence, it takes 30 seconds to show up on the screen. It’s particularly egregious at WUWT. I’ve now taken to typing the posts on a notepad and copying and pasting them into the comment block.
It just seems like more than a coincidence that it only happens at those two websites. It’s not happening here, but I will keep you posted.
Anybody got any ideas as to what is going on?
It seems to be more in the annoying category like spam rather than personal. GitHub terms of service might help get them removed.
Okay, I just updated Firefox for Mac and disabled FoxyProxy and things seem to be working a lot better at ClimateAudit and WUWT.
Sorry for the irrelevant post, Lucia.
“It’s also your guess whether a person using available software to scan my site for vulnerabilities is “sophisticated” or whether this type of thing could be done by almost any motivated computer literate high school student.”
But that’s just getting in the front door. From what I’ve read, they did more than just get in.
From what the Norfolk police say, they first gained entry to the CRU web server – that could be where they used these types of tools or “kiddie scripts”. However, they gained access to the backup server from the CRU web server. How these tools might accomplish that is unclear, especially since we don’t know specifically what was done.
Also, the Norfolk police said:
“We’ve used the expression ‘sophisticated’ and that’s because that’s the view of our experts who conducted that side of the investigation for us. They identified that, as well as achieving the breach, they also took significant steps to conceal their tracks and lay false trails and change information available to us in order to frustrate the investigation. The conclusion was the person /s were highly competent in what they were doing.”
So they are talking about more than just using a proxy server and whatever tools to get in. They also speak of what was done after access was gained – “steps to conceal their tracks and lay false trails and change information available to us in order to frustrate the investigation” speaks to some specific things they would have done – mainly certain logs that would have been altered. Some are simple to change, others not so simple. Some would be impossible if they didn’t crack the root password, which we don’t know if they managed or not. However, since the police said that “We identified that the attackers breached several password layers to get through…”, that indicates to me that they did not gain root access, or they would have had the ability to remove these traces as well (or they just weren’t sophisticated enough).
So IMO, it’s these other things besides just using a proxy that led them to call it “sophisticated” – after all, they pretty much say that in the quote above.
MrE-
It may not be personal. But that doesn’t put fingerprinting my site in the ‘only annoying’ category. Hackers who have nothing personal against me or my blog can cause just as much damage as hackers that have it in for me personally.
GitHub is not the possible hackers’ host. GitHub hosts perfectly legitimate, useful software that can be used for good or for evil. For example: If I downloaded the software and used it, that would be fine. I could learn where my security holes are and fix them. This is perfectly valid. The difficulty is that if someone else downloads it and scans my site, I would be foolhardy to assume their intention was to send me a report that I could use to fix my security holes. It’s much more likely they intend to exploit holes they find.
GitHub isn’t going to deep six the software just because some people can use it for evil. Website admins just need to learn to recognize when they are being scanned by strangers who might be up to no good.
Of course the people who copied the CRU emails did more than get in the front door. They copied files “in the living room” or what have you.
The attempt I show in the post found my front door locked because ZBblock blocks that host. But that still doesn’t tell us whether the Norfolk police would consider someone scanning my site remotely to “fingerprint” it sophisticated or not. The reason it doesn’t is that the Norfolk police have been so vague as to make it impossible to tell what level of hack they consider “sophisticated”.
Translation: We aren’t telling you what sort of actions we or our experts consider “sophisticated”. If they say sophisticated, then it’s sophisticated. By. Definition.
Translation: We also aren’t going to describe what sorts of steps they took, nor what we would call “significant”. We aren’t going to tell you what might constitute a “false trail” nor what sort of “information” they tried to change.
Translation: Take our word for it. Even though we aren’t going to tell you squat about the sorts of things they managed to do, believe us when we tell you they were highly competent.
Whatever “highly competent” might mean, that’s what they were.
Come on: That paragraph is amazingly uninformative. It tells us almost nothing.
Sure. A proxy server and then… something. Well… duh-huh! Proxy servers don’t just create commands on their own.
How in the world does this speak to anything specific? Using a proxy server is a step to conceal one’s tracks. What you quote is utterly ambiguous. Did they do things to avoid being identified? Yes. We knew that: Among other things, they used a proxy server. Duh. What the Norfolk constabulary says tells us nothing more than what we already knew.
Of course we don’t know because the Norfolk constabulary didn’t tell us. In fact, they tell us almost nothing. We don’t even know if they tried to crack the root password. Nor do we know how they tried to crack any passwords.
We know that there appear to be attempts to crack some passwords. Well… there are attempts to crack my login password at the blackboard multiple times a day. I have various methods at various layers to ban these attempts.
Sure. They did more than use a proxy. No one has ever suggested that merely connecting with a proxy server would cause the CRU server to say “proxy server detected! Here: let me give you these emails!” and then suddenly start sending out secret emails.
But based on the quote we have no idea whether what they did or attempted to do should be described as “sophisticated” or whether it is on the level of hack attempts that can be achieved by motivated computer literate high school students. The reason we don’t know is the constabulary didn’t describe what they did at all.
Mind you: They aren’t required to describe it. Nevertheless, it is perfectly fair to observe that they did not describe what was done and for all we know “sophisticated” involved:
1) using a proxy server.
2) running some freely downloadable dictionary-attack software to guess passwords
3) somehow finding the directory which contained the motherlode of email data. (If they were insiders or knew insiders, this might have been found by knowing more or less where it was.)
4) copying the data.
5) fiddling with a few unix commands to try to change a few records (and doing it in a way that was detectable.)
6) Nothing more.
Very few people would call this “sophisticated”. But it is “more” than just using a proxy server. But of course they did more than “just” use a proxy server. At a minimum, they found the directory and copied the data. That action– which everyone knows they did– is already “more”. But it’s not necessarily “sophisticated”.
Lucia,
“The attempt I show in the post found my front door locked because ZBblock blocks that host. But that still doesn’t tell us whether the Norfolk police would consider someone scanning my site remotely to “fingerprint” it sophisticated or not. The reason it doesn’t is that the Norfolk police have been so vague as to make it impossible to tell what level of hack they consider “sophisticated”.”
They said they called it sophisticated for more reasons than just an attempt to get in.
“Translation: We also aren’t going to describe what sorts of steps they took, nor what we would call “significant”. We aren’t going to tell you what might constitute a “false trail” nor what sort of “information” they tried to change.”
The Norfolk constabulary isn’t going to tell you this, and neither would I. You’re basically complaining that they are not telling you what you need to do when you hack into a server.
“steps to conceal their tracks and lay false trails and change information available to us in order to frustrate the investigation” speaks to some specific things they would have done – mainly certain logs that would have been altered.
“How in the world does this speak to anything specific? Using a proxy server is a step to conceal one’s tracks. What you quote is utterly ambiguous. Did they do things to avoid being identified? Yes. We knew that: Among other things, they used a proxy server. Duh. What the Norfolk constabulary says tells us nothing more than what we already knew.”
It speaks of specific things to me from experience.
Once you get in, there are certain things you would need to do to conceal your tracks and lay false trails. That involves deleting, changing or replacing records of what you actually did with something different. That requires knowing what’s recorded, where it’s recorded, and what you need to do to make it look like you did something else. That’s a bit more than “fiddling with a few unix commands to try to change a few records”.
It is pretty clear (to me at least) they did quite a bit more and needed to know quite a bit more than just using a proxy server. Whether that qualifies as “sophisticated” or not is a matter of opinion.
WTH
Sure. The Norfolk police don’t tell us what those reasons are and so we can’t begin to guess what sorts of things the Norfolk police call “sophisticated”. For all we know what they did would be called a “high school level script-kiddie hack attempt” by others.
Fair enough. But I merely observed they didn’t tell us.
No. I’m saying that because they don’t tell us what was done we have no way to know whether what was done was “sophisticated” or “bungling attempts”.
Oh? Well, quite honestly, I don’t believe you can tell what the people who got the data did from what was said either.
Sure. “Certain things.” And we don’t know what “certain things” the people who got in actually did, so we can’t tell if those things were “sophisticated” or “bungling”. We don’t know if they were the sorts of thing any high school student who has learned unix could try or something else.
How do you know? The Norfolk police didn’t tell us whether they succeeded or failed to cover the tracks, only that they attempted to cover them. In fact, it appears whatever they tried failed. Given that whatever they tried failed, how in the world can you say it was not just “fiddling with a few unix commands”? The fact that one must do more to succeed doesn’t mean they did more– after all they failed.
As far as we can tell based on what the Norfolk police said, the use of the proxy server was sufficient to cover tracks. Additional manipulations were tried, but those didn’t necessarily accomplish their intended result of changing logs.
We already all agreed they did more than use a proxy server since connecting with a proxy server does not cause all data one seeks to spill out of a server “poof”.
What we don’t know is precisely what they tried to do and whether it was sophisticated. If you agree that we don’t know what they did and don’t know its level of sophistication, then we agree.
WTH:
In which case nobody including you should believe a word they say if they refuse to explain, and if you gave me a similar value judgement without explanation, I wouldn’t accept a word of what you said as truth.
Not because they or you are necessarily lying but because they or you are giving us a judgement without giving us any substantiation for how they or you came to that judgement.
Maybe the blind trust of the judgement of insular officials is part of your worldview; it’s not part of mine.
“It speaks of specific things to me from experience.”
“Oh? Well, quite honestly, I don’t believe you can tell what the people who got the data did from what was said either.”
No, I don’t know exactly what they did. What I’m saying is that what Norfolk is saying is in line with what I have seen before (including reports from the authorities) while knowing what was done.
“Sure. “Certain things.” And we don’t know what “certain things” the people who got in actually did, so we can’t tell if those things were “sophisticated” or “bungling”. We don’t know if they were the sorts of thing any high school student who has learned unix could try or something else.”
I know that what Norfolk said is in line with what I’ve seen done.
When I hear Norfolk say they changed “information available to us in order to frustrate the investigation”, I understand what information that may be and what I’ve seen done. What I have a hard time understanding is what a “bungler” might have changed (or tried to) to lead Norfolk to say that. I’ve seen “bungling” attempts that got in, and none of them included making changes to cover their tracks or change stuff to make it harder to figure out what they had done.
“As far as we can tell based on what the Norfolk police said, the use of the proxy server was sufficient to cover tracks.”
Perhaps, but we don’t know if just using a proxy was sufficient or not because they listed other things as well. In the paragraph explaining their use of “sophisticated” they don’t even mention proxy servers, so this seems unlikely to me. In fact, in the first paragraph they say “which may have been proxy servers”. Does that mean then that they called it sophisticated because there -may have been- proxy servers? Lastly, the report says it was “from a number of different IP addresses, in various countries”, so it would be more than “a” proxy server – that doesn’t sound like a “bungler” to me.
“What we don’t know is precisely what they tried to do and whether it was sophisticated.”
Depends on what you consider “sophisticated”. Apparently they got through at least 2 password levels – on the web server and again on the backup server (perhaps that could have been as simple as logging in as admin/admin…) but does having the knowledge to get to the backup server via the web server constitute “sophisticated”?
“If you agree that we don’t know what they did and don’t know its level of sophistication, then we agree.”
Sure – we don’t know enough as to whether either of us would consider it sophisticated, but it sounds to me like they considered more than just the use of a proxy in calling it sophisticated.
“WTH:
The Norfolk constabulary isn’t going to tell you this, and neither would I”
“In which case nobody including you should believe a word they say if they refuse to explain, and if you gave me a similar value judgement without explanation, I wouldn’t accept a word of what you said as truth.
Not because they or you are necessarily lying but because they or you are giving us a judgement without giving us any substantiation for how they or you came to that judgement.”
Well, there are valid reasons for not telling you specifics. As I said, they are not going to give you information on what to do when hacking a server. That’s something no police department would do.
“Maybe the blind trust of the judgement of insular officials is part of your worldview; it’s not part of mine”
Blind trust? That’s a bit of a leap. Sure Norfolk could be lying through their teeth, but I don’t make that assumption beforehand. Besides, what Norfolk has said is in line with what I’ve seen other police departments say about breaches that I was familiar with.
WTH:
LOL. So what?!
It isn’t relevant whether they are lying or mistaken or otherwise. Absent a detailed explanation, I have no reason to trust their value judgement on the sophistication of the “attack”, and plenty of reason to mistrust the judgment.
It’s as simple as this, distrust in their judgement does not speak to character but to competency.
Maybe so. And I can simply observe that they have not told me specifics. Consequently, what they have said is utterly uninformative.
I’m really not seeing how this is moving away from what I have been saying all along: They have told us practically nothing. So we learn nothing from what they said.
I admit they have suggested conclusions they have made based on evidence they have not revealed. I accept they may have good reasons to reveal nothing. But … well.. then they have revealed nothing other than what amounts to the spin they should put on the nothing they have revealed. Sorry, but I really see no reason to accept the truth about what might be inside the “spin” cycle without being provided any glimpse of what’s inside the washing machine!
One thing that this doesn’t address is the possibility that an insider created all of this as a false trail, making it look like the breach came from outside, but basically letting someone know how to get in and what to do to cover their tracks. Paranoid, but still a possibility.
cloudflare back on…
Carrick:
“It isn’t relevant whether they are lying or mistaken or otherwise. Absent a detailed explanation, I have no reason to trust their value judgement on the sophistication of the “attack”, and plenty of reason to mistrust the judgment.
It’s as simple as this, distrust in their judgement does not speak to character but to competency.”
What I’m trying to say is that the police are not going to release the details you need to trust their judgment. What Norfolk has said is in line with police reports for breaches I’ve been involved in – if you saw those reports you wouldn’t have reason to trust their judgment either.
lucia:
“I’m really not seeing how this is moving away from what I have been saying all along: They have told us practically nothing. So we learn nothing from what they said.”
I’m not sure why you would expect more, especially for a case that is unsolved.
“Sorry, but I really see no reason to accept the truth about what might be inside the “spin” cycle without being provided any glimpse of what’s inside the washing machine!”
OK.
Let me try and explain how I see it. I know that someone trying to gain access needs to do x, y & z, I’ve worked on cases where I know x, y & z were done and have seen police reports on those cases. I read the Norfolk report and it sounds to me like they did x and y but maybe not z. I don’t know that Norfolk knows x & y were done, they could be spinning or outright lying to make it sound like they know more. All I know is it sounds like pretty standard police stuff to me.
As for sophistication – I personally wouldn’t call doing x & y sophisticated, but it does indicate they had more knowledge of what to do than the ones that didn’t. Maybe doing x & y would be enough for Norfolk to call sophisticated, I dunno. Most of the cases I’ve seen where they didn’t do x & y, they did (or tried to do) other things that indicate a lack of knowledge and sophistication (bungling if you will) – and I know that because they left a detailed record, not having done x & y.
WTH:
Um… no. I don’t need to trust a bloody thing they say. Period.
“Transparency in government”. New phrase?
Let’s skip to this:
What detailed record? Oh you mean the one you haven’t seen.
It occurs to me that if I were an insider that wanted to extract data from CRU and cover my tracks, then this path is a plausible one to use. However, I am a moderately sophisticated computer user and off hand, I do not know how to set up the proxies.
So to me, it is not as plausible as downloading to a thumb drive.
If, however, the forensics can reconstruct the path from the server to the proxy, perhaps it could reconstruct the path from the server to my desktop.
You posted lots of text that suggested they had informed us. I said they did not. You now seem to agree they did not.
I have no idea why you think my correctly observing they didn’t tell us things translates into “expect more”. I was merely pointing out that claiming they did tell us much of anything is incorrect.
Fine. And what’s standard is to leave us in the dark. That’s fine. But the fact that this behavior might be standard does not mean we should pretend we’ve learned anything. We haven’t learned anything.
Sure. If they did “x” they knew enough to do “x”. Some people don’t even know that “x” could be attempted.
But since we haven’t been told what “x” might even be, we have no means of assessing whether what they did is “sophisticated”.
As it stands: I have very little idea what the people who got the data did beyond some rather cursory — and all was stuff people either flat out knew or assumed to be true before the police said anything at all:
1) They used proxy servers. We knew that on day 1.
2) They had to enter passwords to get at files. Nearly everyone guessed that.
3) They tried to guess passwords. We couldn’t be sure of that, but many knew this was a possibility. People have talked about the variety of ways they might have tried to do this since day 1. We still don’t know which of the many methods they might have tried they did try— because the cops didn’t say. (Which is fine.) The only alternative to guessing passwords would have been someone knowing the password just using it. But since almost no one thought Phil Jones just logged in and sent the stuff out, this alternative was never seriously considered by anyone. So, the police information is not especially informative here.
4) They tried (and apparently failed) to alter log files to cover their tracks. So, they were someone who knew you could try to do this– but evidently they weren’t someone whose knowledge permitted them to actually do it.
None of this points to “sophisticated”. Guessing passwords? Anyone can google and find free dictionary attack software. Use a proxy server? Anyone can do that. Try to alter log files? Yet another thing a computer literate high school student could do.
Maybe the things people did were “sophisticated”. Or maybe it was the level that could be achieved by a computer literate high school student. Nothing we have been told can permit us to assess what level of activity is “sophisticated”.
I’m not saying “sophisticated” is a lie. I’m just saying that for all I know the cops are using “sophisticated” to mean something like “something a moderately computer literate person would have to use google to learn how to do because it’s above the level of sending and receiving email or writing a comment at a blog.”
My mother would then fall in the category of “not sophisticated” and I — and at least half the regular blog commenters here, at RC, WUWT or CA and even half the Blawgs — would then be “sophisticated”. I have no idea.
But we know the access was achieved by someone who has a better grasp of computer technology than my mom!
RobertInAZ
You don’t need to set up a proxy. Just use one.
Here’s how to find one:
Google “free proxy server”. (I did that.)
Click the first link. (I did that.)
I found this:
http://freeproxyserver.net/
See the box? Enter a uri. (I block lots so you may not be able to load my blog. It worked for me, but I’m going to go block that one now. 🙂 )
If a uri is not blocking that proxy, you’ll be able to access it through the proxy server. Once you know these exist you can find more proxies– and better ones. I wouldn’t call “knowledge that these exist” sophisticated. But maybe the Norfolk police do. Beats me.
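For the programmatically inclined, the same idea takes only a couple of lines of Python. This is just an illustrative sketch: the requests library is assumed to be installed, and the proxy address below is a placeholder from a reserved test range, not a real open proxy.
import requests

# Placeholder proxy address (TEST-NET range) -- substitute a real open proxy to try this.
proxies = {
    "http": "http://203.0.113.7:8080",
    "https": "http://203.0.113.7:8080",
}

# Fetch a page with all traffic routed through the proxy rather than your own ISP address.
resp = requests.get("http://example.com/", proxies=proxies, timeout=10)
print(resp.status_code, len(resp.text))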
Has anyone posed the valid reasons for the police not revealing the details from which their conclusions were made in their report? The most likely reason is the one staring us in the face and that being that police are making vague generalities in order to excuse their failure to apprehend the perpetrator within the allotted time. That would be even more obvious if we were told that police departments routinely make these comments when they meet with failure.
Perhaps we could learn more by looking at a case where the police actions led to a successful prosecution in a similar case – as I would think that all the police activities would have to be made public if and when the case went to trial.
Ok… I’m now blocking
http://whois.domaintools.com/67.159.44.96
” 5 websites use this address. (examples: freeproxyserver.net freewebproxy.net fsurf.com privacy-world.net) ” 🙂
By the way, when I googled, the second link was to:
http://hidemyass.com/proxy-list/
This lists a whole bunch of proxies. These are easy to find and easy to use. If I notice a particular proxy is used by hackers, I tend to block that IP. So, for example, relakks in Sweden is blocked. People who have paid for that service and want to visit The Blackboard will have to get off their expensive VPN and access using their ISP. If they complain they “need” the protection of a VPN to avoid tracking, I’ll tell them I “need” the protection of blocking the bots using that service. (It’s not most of their customers– but enough. So… block!)
Kenneth
No. I’m sure they have reasons and they consider them valid. Maybe I — or we– would consider them valid or not. None of us know what they are, so we can’t know whether they are valid. Of course we can all speculate — but that’s speculation.
I was contacted a couple of days ago by an IT security specialist and have been chatting with him. I asked him what questions could be asked of the police that would clarify whether RC-FOIA had something like defence-grade skills or merely IT-grade or student-geek-grade skills. His response was that the police should be able to release most of their forensic report without compromising anything relevant:
Personally, I see no reason to doubt the police or their experts, but I also find it hard to dismiss Mosher’s certainty that an insider may have been involved.
How about this: the insider had a friend who was a sophisticated hacker. Perhaps a high school chum who was already something of a computer whiz back then, and eventually got a job in IT. The two have stayed in touch, they are probably of the same political persuasion, the plot is hatched over a few beers.
Unlikely, I know, but not outlandish. I think that chance (in the form of coincidence, or opportunity) probably plays a big part in all successful crimes. And this particular coincidence might explain why only CRU was hacked (and not, say, GISS, or the University of Virginia), and also why an insider would dare to take such a risk…
” and I know that because they left a detailed record”
“What detail record? Oh you mean the one you haven’t seen.”
I said “Most of the cases I’ve seen”, are you saying I didn’t see detailed records?
lucia:
“You posted lots of text that suggested they had informed us. I said they did not. You now seem to agree they did not.”
In my first post I said “speaks to some specific things they would have done”. You asked about that and I tried to explain and now you think I’m saying something else. On all of the cases I’ve worked that the police reported on covering tracks and changing information, x, y and sometimes z were done or attempted. That’s why when I hear Norfolk reporting covering tracks and changing information, that speaks to some specific things – x, y and z. I don’t know how to explain it any simpler than that.
julio–
I don’t think it is even remotely outlandish to think two people were involved. I also don’t think it’s remotely outlandish to imagine that an insider at any research institute (including CRU) would have a friend who ended up with a job in IT nor that the two might hatch something over a beer.
Many people in IT learn a fair amount about how attacks are accomplished (and thwarted) and they know what might be worth a shot. An insider who didn’t know how to do this might know information worth exploiting existed. And then…. mission accomplished.
Heck, it’s not even outlandish to imagine an insider who is not particularly motivated to do anything at all, but nevertheless, discussing all the FOI requests emanating from blogs, said something like “Come on! We all know that tons of emails are routinely backed up on X.” while sharing a beer with an IT friend. Then later the IT guy with garden variety skillz (for an IT guy) thinking “worth a shot to look!” acted on his own.
Is this speculative? Yes. Might the police think it “nonsense”? Sure. But to the extent that the police pretty much told people nothing, there’s no reason to exclude all sorts of ways things might have been accomplished. We simply don’t know the level of sophistication involved.
Right now, we don’t even need to suggest this was an “IT” guy. Based on what we know, it could be a mischievous HS student. (Though, in that case, I would imagine they’d get caught because at that age, they might blurt it out to friends. Someone older is more likely to shut up for 3 years.)
This is what you said. And my point is what they wrote did not speak of anything specific. (Or at least to the extent they were specific, they told us almost nothing we did not already know: Example they used proxy servers.)
As far as I can tell, your explanation is that you agree they did not tell us anything specific and then told us we shouldn’t expect they would or complain about it.
Could you please clarify how you use the word “specific”? In my universe literally saying they did “x, y, and z” using letters is not saying anything specific. In contrast, saying “They drove to the grocery store, bought fruit, left through the side door and ate a banana” is “speaking to some specific things”.
I don’t know how to make this simpler for you either. But as far as I can tell, you seem to be using some sort of coded language where “specific” means “vague”. What the police wrote simply did not “speak to … specific things”. If they had spoken to specific things you could state what those specific acts were. But as far as I can tell, you can’t.
@Steve McIntyre (Comment #99840) “what questions could be asked of the police that would clarify whether RC-FOIA had something like defence-grade skills or merely IT-grade or student-geek-grade skills. ”
What about another option – “were completely devoid of any skills what-so-ever”?
Re: lucia (Comment #99843)
Yeah, I only suggested an IT guy on the assumption that the insider might be a professor, and the hacker a relatively close friend. If the insider was a grad student, the hacker would most likely be another grad student somewhere. There are many possibilities like that. (If he was a postdoc, who knows–the friend might even literally be in another country. Postdocs move around a lot.)
It still takes a particular conjunction–the person with the “right” motivation knowing somebody with the right skills–but a particular conjunction may ultimately be the reason why something happens somewhere and not somewhere else…
Since wild speculation seems the rage today…
I would expect some inside information to have been needed, while outsiders could then pull off the attack. Someone need only have had a password pilfered (or just selected one that was easy to guess) and the rest would not seem too difficult.
In terms of motivation, political disagreement would seem the most likely, but it could also have been a personal grudge…. maybe a really unhappy post doc or a grad student who felt mistreated by their advisor (Phil Jones maybe?). Once someone got into the email files, the potential for scandal and embarrassment would have been obvious. Whoever it was, they knew enough to do a keyword search and only release messages with those keywords. That suggests that they were aware of the FOIA controversy and looking for damaging information.
SteveF–
The situation we are faced with is that what the police said is so non-specific that almost none of the “favored theories” in the blogosphere can be excluded. There are some theories that can be excluded:
1) Someone walked by a camera, bashed down the door to whatever room (or even entered using a key) containing the computer, sat there and downloaded material onto a thumb drive.
2) Someone who didn’t know what a proxy server was accessed the computer, and within a few hours of obtaining the data, learned all there was to know about proxy servers and used one to post links to the files at various blogs.
Some people may have had theory 1. I can’t believe anyone had theory 2. I really don’t think any other theories people had are now excluded. Because the police statement is pretty vague and uninformative.
I don’t recall seeing a timeline for the original alleged hack which allowed enough time for the alleged hacker to do some work on those e-mails. Perhaps the alleged hacker had access starting in a time period far earlier than the police thought, and any evidence of a break-in that the police did find was there to make a distraction?
Lucia,
I don’t expect the police will ever disclose substantive information. It serves no purpose to disclose that information….. and I suspect they want still to ‘catch the perp’ whether they can prosecute or not (and some law enforcement agency somewhere almost certainly can still prosecute). All the talk about it being an external attack is probably political perfume to mask the unpleasant odor at UEA of a possible inside job. That may be the only motivation for the announcement.
lucia (Comment #99849)
July 22nd, 2012 at 4:53 pm
Very specific and no speculation here:
foi2009.zip file access times:
Wednesday Sept. 16, 2009 6:58 PM
Saturday Sept. 26th. 2009 5:04 AM
Sunday Sept. 27th. 2009 12:23 AM
Sunday Sept. 27th. 2009 7:58PM
Monday Sept 28th. 2009 7:37 and 9:49 AM
Tuesday Sept. 29th. 2009 6:12 AM
Weds. Sept. 30th. 2009 2:12 AM (FOIA/documents/briffa-treering-external, 2,224 files)
(After a short break he resumed at 2:16 AM and added /harris-tree/ and /osborn-tree3/)
Thursday Oct. 1st. 2009, 5:19PM, 7:03PM, 9:38PM
Saturday Oct. 3rd. 2009 1:00AM and 1:15AM (more Briffa stuff added)
Thursday Oct 8th. 2009 2:45PM and 5:25PM
Saturday Oct. 10th. 2009 1:25AM
Sunday Oct. 11th. 2009 3:06AM, 10:46AM, 12:30PM
Thursday Oct. 15th. 2009 9:19AM
Saturday Oct. 24th. 2009 6:00PM
Sunday Nov. 15th. 2009 5:55PM, 8:43PM
Monday Nov. 16th. 2009 4:43PM (Mbh98-osborn.zip), and at 7:27PM the last file (FOIA/documents/EURO4M_DoW_v2.doc)
Observations-
Our “file releaser” seemed to prefer weekends and late night/ early morning hours. This is when most of the heavy lifting was accomplished. Just about all of the data files were added during these times. Most of the odds-and-ends (doc, txt, pdf, xls, jpg, png, etc.) were added between Oct 8th and Oct 11th.
Interesting to note that most of the weekday activity was outside normal (9 to 5) work hours, with the exception of Monday Sept. 28th, Thursday Oct. 8th, and Thursday Oct. 15th. And there was no activity on Fridays.
This is the basic gumshoe detective work that should be expected from any police department. Norfolk should be no exception.
SteveF
I will be surprised if they ever do tell us much. I’m not surprised they are vague. I only object to people who suggest they told us ‘something’ specific when we can’t even say what the ‘something’ was based on what the police actually said.
Here’s a reason (in bold, below)
http://www.allpolicejobs.co.uk/index.php?p=police_recruitment_roles&roleID=14
dukec, I’ve been re-examining that timeline because there is one new fact/confirmation in the police statement: that the intrusion did not begin until September 2009. IMO, this is an extremely important piece of information that has not been considered in analysing the affair to date. I’m working on a CA post on this topic.
A related piece of information was also pointed out by Frank Swifthack: RC-FOIA assembled a directory of yamal data in five batches, with the first batch being placed in the yamal directory almost immediately after my first Yamal post. The Yamal data was also among the earliest collected data (of the documents with known access dates.)
On the other hand, the MBH data was accessed very very late in the process and the files were released within a couple of days of RC-FOIA locating this data.
dukec, excellent points about the access times. If you factor in British Daylight Time ( in effect until Oct 25 2009), the intrusions of Sep 28 and Oct 15 are both somewhat before 9 am. The intrusions of Thu Oct 1 were just after 4 pm British and Thu Oct 8 was at 13:54, a somewhat unique event.
Also if you stratify the times, here’s something else that’s interesting. In a first cut at days, there is access on all days of the week except Friday (although allowing for daylight time, there is access very late on a Friday.)
However, there is a change in pattern after the first week. There are documents dated every day from Sat Sep 26 to Thu Oct 1. After that, all the accesses are on the weekend, other than the two Thursday accesses on Oct 8 and Oct 15.
Steve McIntyre (Comment #99866)
July 23rd, 2012 at 5:59 am
I created a stripped down version (access dates/times, time zone, uids/gids, and filenames only) of the spreadsheet Frank Swifthack compiled back in Dec 2009. OpenOffice format:
http://dl.dropbox.com/u/18009262/foi2009atimes.ods
It occurred to me that the 4/5 hour offset may alter the timeline scenario a bit, in that the access times could be GMT -4/5 (East coast time). Maybe Frank will read this and chime in.
Re: lucia (Comment #99849)
Well, not knowing anything about this, I may have been willing to believe in the thumb drive idea originally. Now it seems pretty clear that the deed involved a level of hacking which–regardless of what you choose to call it–is not the kind of skill you associate with college professors, outside of a computer science department. (Most of my colleagues here have trouble figuring out how to configure an e-mail client correctly, and these are people with Ph.D.’s in physics.)
So I would personally be willing to rule out the insider hypothesis, except, as I said above, for the fact that Mosher still seems to cling to it. My comment #99841 above is just an attempt to make it work with the new (to me) data.
Duke C – thank you for that as it gives me some info that was on my to do list to look at. The file Frank created is broken. I am interested in the time zone derivation as it is likely to be the default value chosen for the operating system at install. The time zone is one of the reasons why I don’t think much of Steven Mosher’s loose cannon/false flag theory. Steven lost sight of the wider picture.
why do some documents, even in the same batch, have -400 and some -500?
Mosher’s candidate is not a generic professor, but someone who, on the available record, appears to have the requisite skills.
Steve – and also why over an extended period – too long for daylight savings?
One possibility is 2 PCs being used for compilation on a network but not sure that works either.
If one looks at access times on Sunday, in British time, they range from 9:46 to 23:23 (with a number of times in between) , whereas in Eastern time, they range from 4:46 to 18:23, the latter IMO both an early start and an early evening.
Thursday also has access time ranging from 09:19 to 21:38 UK time, with Eastern times leaving a very early middle of the night start and implausibly early end.
While it’s hardly more than a slight indication, the access times do fit better with UK time than Eastern time.
The inconsistency of -400 and -500 really needs to be explained as well. It’s only been armwaved so far.
Steve – I would speculate the file harvesting was performed outside of normal hours so any compilation times could be influenced by that.
I agree the -400/-500 inconsistency is interesting. I had not previously seen the pattern. Frank Swifthack mentioned -400 in relation to daylight saving but it is an odd timezone and the pattern is very strange.
The -400/-500 is even stranger on a second look.
Within batch inconsistencies occur on Sep 16; both Sep 27 batches; both Sep 30 batches; and Oct 8 14:54 batch. Other batches have either -400 or -500 but not both.
In the foi2009atimes.ods spreadsheet, the times listed are access times. The timezone information is being computed from the creation times of the files.
In Standard Time (winter), Eastern is -0500 to both British Standard and UTC. But during Daylight, Eastern Daylight is -0400 to UTC and -0500 to British Daylight (which is +0100), all attested in emails.
I don’t see any way that the same computer could on its own generate both -0400 and -0500 times in documents bearing the same date. Perhaps this points (very slightly, I agree) towards the offsets being intentionally planted as part of the bleaching, with the files being bleached in two batches, one slightly inconsistent with the other by (a rare) mistake.
Well I will ask an obvious question before embarking on a search for the less obvious.
Do we know if Frank’s program works correctly?
I will now go hide in case missiles come my way!
Steve – re bleaching.
On my to do list is trying to establish how the “attribute file” works in BackupPc.
Only one copy of any file is held on the server regardless of how many times it appears on different backups. The metadata is held in the attribute file. There are some reported issues with the retention of metadata for Windows systems.
I don’t know how readily the metadata is retrieved for the different types of queries from BackupPc. I would like to rule out the bleaching as being part of the retrieval process as opposed to a deliberate act.
clivere – I have no idea. Both checks seem worthwhile before placing any weight on arguments deriving from these times.
Steve M, Clivere
Here’s the zip containing the unaltered raw text data files created/ used by Frank B. It’s possible I ran a sort routine that mis-ordered the time zones, but that doesn’t appear to be the case. Strange indeed.
http://dl.dropbox.com/u/18009262/vomit-zip-FOI2009-all.out.zip
3 files w/ corrupt time zone data-
2009-09-30 02:12:17 [ tz -44 ] uid 1002 gid 1002 name FOIA/documents/briffa-treering-external/ecat/yamal/rw/82/l00331.rw
2009-09-30 02:12:17 [ tz -38 ] uid 1002 gid 1002 name FOIA/documents/briffa-treering-external/ecat/yamal/rw/82/l00321.rw
2009-09-30 02:12:17 [ tz -16 ] uid 1002 gid 1002 name FOIA/documents/briffa-treering-external/ecat/yamal/rw/82/l00311.rw
Duke C – thank you.
I suspect your earlier version combining all into one could be obscuring the patterns.
In addition I am wondering how much came directly from BackupPc and what timezone the server was set to.
http://backuppc.sourceforge.net/BackupPCRestoreOptions.html
Re: Steve McIntyre
> I don’t see any way that the same computer could on its own generate both -0400 and -0500 times in documents bearing the same date.
If the files in the zip archive were on a unix file system when they were zipped then the time is stored in 4 bytes as the number of seconds since 1st Jan 1970 UTC.
If the files were on an NTFS system when they were zipped then the time is stored in 8 bytes as (I think) the number of 100 nano second intervals since 1/1/1601 UTC.
In both cases the -0400 and -0500 are the result of the timezone on the system that interpreted the stored times within the zip file and have nothing to do with either the system that zipped them up or the files themselves.
If I get a chance later I’ll extract what information I can and post a link to it with all times in UTC.
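For anyone who wants to try the same kind of extraction, here is a minimal Python sketch. It assumes the archive carries the old Info-ZIP “UX” (0x5855) extra field, with atime and mtime stored as 32-bit seconds since 1/1/1970 UTC as described above; aside from the FOI2009.zip filename, everything here is generic zip handling rather than anything specific to this file.
import struct
import zipfile
from datetime import datetime, timezone

def unix_times(info):
    # Walk the extra-field records (2-byte id, 2-byte length, data) looking for the
    # old Info-ZIP "UX" field (0x5855), which holds atime then mtime as 32-bit
    # Unix timestamps (seconds since 1/1/1970 UTC).
    extra, pos = info.extra, 0
    while pos + 4 <= len(extra):
        field_id, size = struct.unpack("<HH", extra[pos:pos + 4])
        if field_id == 0x5855 and size >= 8:
            atime, mtime = struct.unpack("<II", extra[pos + 4:pos + 12])
            return atime, mtime
        pos += 4 + size
    return None, None

with zipfile.ZipFile("FOI2009.zip") as zf:
    for info in zf.infolist():
        atime, mtime = unix_times(info)
        if mtime is None:
            continue
        print(info.filename,
              datetime.fromtimestamp(mtime, tz=timezone.utc),   # modification time, UTC
              datetime.fromtimestamp(atime, tz=timezone.utc))   # last access time, UTC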
TerryS, that would be pretty interesting. Looking forward to your analysis.
julio, you are right that most scientists haven’t got a clue how their computers/drives and servers work. However, like all social structures, there is always someone who knows how things work. In almost every group you have ‘someone who knows about computers’.
Steve M., Clivere TerryS
Okay. I think the DST time zone data is associated with the original file modification times, which existed in one of the columns I deleted from the spreadsheet. It is not associated with the last access times.
I visually compared the tz value with the mtimes using the chart below and, so far, they all match.
YEAR DST START DST END
2009 Sun March 8 02:00 Sun November 1 02:00
2008 Sun March 9 02:00 Sun November 2 02:00
2007 Sun March 11 02:00 Sun November 4 02:00
2006 Sun April 2 02:00 Sun October 29 02:00
2005 Sun April 3 02:00 Sun October 30 02:00
2004 Sun April 4 02:00 Sun October 31 02:00
2003 Sun April 6 02:00 Sun October 26 02:00
2002 Sun April 7 02:00 Sun October 27 02:00
2001 Sun April 1 02:00 Sun October 28 02:00
2000 Sun April 2 02:00 Sun October 29 02:00
1999 Sun April 4 02:00 Sun October 31 02:00
1998 Sun April 5 02:00 Sun October 25 02:00
1997 Sun April 6 02:00 Sun October 26 02:00
1996 Sun April 7 02:00 Sun October 27 02:00
Double Forehead Slap.
Fully restored spreadsheet here:
http://dl.dropbox.com/u/18009262/FOI2009AtimesWithAllColumns.out.ods
I surmise that julio has not checked what I checked the first day.
It’s not the loose cannon, although there is a bit of motive there.
It’s really obvious.
I wonder if there was anyone at CRU who worked from home on Thursdays in 2009? No lectures that day, perhaps?
test TOR blocking
Re: DocMartyn (Comment #99899)
DocMartyn, that’s true, but those people you mention tend to stand out. A brief enquiry on computer skills would quickly rule out 80% of the faculty, and if you start looking for motive among the rest I can’t imagine you’d be left with many viable options. (On the “lone insider” assumption, that is.)
That seems to be what Mosher (Comment #99901) is saying: that if you know these people, the answer is obvious. Now, I have a lot of respect for Mosher’s investigative skills (after all, he ousted Gleick in no time flat), but I remain puzzled: if it is so obvious, then why have the police, after all this time, closed the case with an emphatic denial that any insiders were involved? Did CRU put up such a believable united front that nobody ever gave the insider theory any deep thought?
Note: I’m not fishing for any names here. This is none of my business, and I *really* do not want to know who the “prime suspect” might be (it would not mean anything to me, anyway). I’m just puzzled, that’s all.
I’ve extracted the information from FOI2009.zip and put it in a LibreOffice spreadsheet. You can download it here
It doesn’t look to have much more information than the original spreadsheet from Duke C.
Each file in the zip file has 3 different date/times stored in the zip header. Two of the dates are the unix modification and last access times stored as a 32bit number representing the number of seconds since 1/1/1970. The third date is the modification date and time as interpreted on the system that zipped up the files. This is stored in MS-DOS format so it has a resolution of 2 seconds and can only deal with dates after 1980.
By calculating the difference between the unix modification time and the MS-DOS modification time you can work out what the timezone was on the computer that created the FOI2009.zip. It looks like the timezone was for the Eastern United States/Canada.
Here’s a description of some of the fields in the spreadsheet:
System: The type of system used when adding the file to the zip archive. All files in the archive have this set to “unix”. Examples of other possible values are: Tandem, OpenVMS, Atari, Windows NTFS.
Version: The zipfile specification used when adding the file. All files used version 2.3.
Zip MDate: The modification date, taking account of the timezone, that the file was last modified.
Zip MTime: The modification time, taking account of the timezone, that the file was last modified.
Raw MTime: The number of seconds since 1/1/1970 that the file was last modified.
Raw ATime: The number of seconds since 1/1/1970 that the file was last accessed.
MDate: The date the file was modified (YYYY-MM-DD)
MTime: The time the file was modified
ADate: The date the file was accessed
ATime: The time the file was accessed
Time Difference: The difference, in seconds, between “Zip MTime” and MTime.
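The “Time Difference” field is essentially the timezone calculation described above: the standard zip header stores the DOS-format modification time as local time on the machine that built the archive, while the Unix extra field stores UTC, so subtracting one from the other recovers that machine’s UTC offset. A rough sketch of how such a column could be computed, reusing the unix_times() helper from the earlier sketch (not necessarily how the spreadsheet itself was built):
import zipfile
from datetime import datetime, timezone

with zipfile.ZipFile("FOI2009.zip") as zf:
    for info in zf.infolist():
        _, mtime = unix_times(info)            # helper from the earlier sketch
        if mtime is None:
            continue
        dos_local = datetime(*info.date_time)  # local time on the zipping machine (2 s resolution)
        utc = datetime.fromtimestamp(mtime, tz=timezone.utc).replace(tzinfo=None)
        offset = (dos_local - utc).total_seconds()
        # Round to the nearest quarter hour to absorb the 2-second DOS resolution.
        offset = round(offset / 900) * 900
        print(f"{info.filename}\tUTC{offset / 3600:+g}")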
If you want to have a look at what the Raw MTime looks like in different time zones then you can run the following command in linux:
$ TZ='America/New_York' date --date=@1062157981
Fri Aug 29 07:53:01 EDT 2003
$ TZ='America/Los_Angeles' date --date=@1062157981
Fri Aug 29 04:53:01 PDT 2003
$ TZ='Europe/London' date --date=@1062157981
Fri Aug 29 12:53:01 BST 2003
The 1062157981 is the Raw MTime from FOIA/documents/ECLAT2.doc
TerryS – thanks
I now apologise to Frank.
Julio,
“after all, he ousted Gleick in no time flat”
I assume you mean ‘outed’!
But if you really meant ‘ousted’, it was only temporary… Gleick is back in his ‘job’ already. 😉
.
WRT obvious: Harry (who was mucking about in ugly code…. and was critical of the muck) always seemed to me a logical candidate for an insider who might be motivated to at least supply someone else a password or two.
Re: SteveF (Comment #99915)
Ouch! Yes, I meant “outed”, thanks. Freudian lapse?
TerryS, Thanks.
The Atimes in your spreadsheet (columns J/K) appear to be CRU server times.
The Atimes in the Swifthack data (Columns H/I located here- http://dl.dropbox.com/u/18009262/FOI2009AtimesWithAllColumns.out.ods) Don’t appear in your spreadsheet.
I can only surmise that columns H/I are an additional attribute added at the time the files were copied to the destination computer running version 2.3. This would be local time as determined by the system clock (East coast in this case). They would have nothing to do with cruback3 access times. Am I correct in my thinking?
Re: Duke C
Oops, I appear to have copied the MTime into the ATime, I will correct it shortly…
I’ve corrected it now, so you can re-download it.
I think it is time for someone here to provide a summary of what all these time stamps on the emails mean. I assume that the information we are looking at here came from the emails that the CRU “provider” zipped. What does the information mean with regards to hacked versus leaked?
Re: Kenneth Fritsch
A zip archive is structured as follows:
Local File Header (LFH)
Compressed File
Local File Header
Compressed File
…
….
Central File Header (CFH)
Central File Header
….
The LFH and CFH contain nearly the same information.
In the standard LFH and CFH there is a field for the last modification time of the file and because zip dates back to early MS-DOS this time is in MS-DOS format and is the local time for when the file was modified. So if you are in New York and the local time for the last mod is 6:00pm then it will store 6:00pm in the time part of the standard header.
As new systems came along extensions were added to LFH and CFH to store information that more advanced file systems had. There are extensions for Unix, OpenVMS, Amiga, Atari, Windows NTFS, Tandem, OS/2 and several others.
One of the Unix extensions contains the modification time of the file, but, unlike the one in the standard header fields, this is stored as the number of seconds since 1/1/1970 GMT and is therefore independent of the timezone. A second Unix time field is the last access time which is also stored the same way. By subtracting the Unix modification time from the modification time in the standard LFH you can calculate the timezone for when the file was added to the archive.
As an example, if the Unix modification time shows it was modified at 11:00pm GMT and the standard LFH mod time shows it was modified at 6:00pm then you know the timezone was 5 hours behind GMT.
In Unix, the access time is the time the contents of the file was last looked at. It is reasonable to assume whoever collated the documents and emails looked at the contents of the files when deciding whether to release them or not and therefore the access time shows when they last looked at the file.
Since the hacker was “sophisticated” you can’t draw any concrete conclusions from the timezone or access times since these are easily manipulated.
I believe the Atime is also set when a read or write call from some other application (zip, for example) is initiated.
There are 26 unique Atime values starting in Sept. 2009 and ending mid Nov. 2009 spanning 3504 files. Hardly the trail that would be left by someone reading or opening the files one at a time. And it is supported by the Norfolk statement. Yes, the time attributes can be manipulated (the date/time being “touched” to 2009-01-01 00:00, for example). In this case, (wrt the Atimes starting on 9-16-2009 running through 11-16-2009), they’re authentic, IMO.
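As a small aside, anyone who wants to see this atime behaviour first-hand can try a quick check along these lines (the filename is just a placeholder; note that filesystems mounted with relatime or noatime may not update atime on every read):
import os
import time

path = "some_document.doc"          # placeholder: any existing file
before = os.stat(path).st_atime

time.sleep(2)
with open(path, "rb") as fh:        # simply reading the contents...
    fh.read()

after = os.stat(path).st_atime
print(before, after)                # ...bumps the access time on a classically mounted filesystem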
Is it possible that the timestamps are the way that Norfolk concluded the times of access?
TerryS and others,
I appreciate the diligence, but the new spreadsheet still doesn’t resolve the problem of -0400 and -0500 within the same batches, which still occurs as 14400 and 18000. The variants occur within the same batch of documents – no one has yet explained this. Even for documents coming from the same directory that must surely have been copied in the same copy command.
Steve – you are missing the patterns. Go to Frank’s text files to see them. There are groupings.
I always thought 0400 was odd when Frank mentioned it which is why I was looking to check the daylight saving theory. I now think the daylight saving theory is probably junk.
I think I am seeing 2 separate machines running here. I would like to know if -0400 was being created by BackupPc and whether the server timezone is -0400. It should also be noted that BackupPc has batch scripting options which could also account for any strange run times. What I would also like to know is whether those options allow adding to an existing zip file.
MikeN – BackupPc has logs which the Police must have checked and which would verify at least some actions, such as logins and retrievals. I am a bit bemused by what was apparently going on with the CRU web server, where the Police described some effort to hide activity. I don’t understand why RC/FOIA would have felt the need to go to that effort unless they were concerned about giving away how they first got in.
Steve M-
The -0400 and -0500 time zone offsets are linked to the “local mtime” in columns B/C, per my message #99900. There are no “local-atime” columns. mtime by necessity is anchored to local time when a file is overwritten so that there are no sync problems for users of the file in different time zones. atime is sort of a useless throwaway stat in that context.
“I now think the daylight saving theory is probably junk.”
Clivere, I have to disagree with you. The 0400/0500 variance matches up perfectly with the local-mtimes, even accounting for daylight savings.
Re: Steve McIntyre
Daylight savings time. During the summer they are 4 hours behind GMT and during the winter they are 5 hours behind.
When the date falls within US daylight saving time (roughly early April to late October in those years) the time difference is 4 hours. At all other times it is 5 hours.
Here is an example:
File FOIA/documents/Skagerrak-Foram-2010.doc has a time offset of -4 hours and was modified on Friday Oct 24th 2003 09:04:46 Eastern Daylight Time.
File FOIA/documents/cru-code/idl/pro/_oldest/plotcruts.pro.old has a time offset of -5 hours and was modified on Monday Oct 27th 2003 at 06:58:12 Eastern Standard Time.
Over the weekend of 24th – 27th Oct 2003 New York changed from Daylight Savings Time to Standard Time (the clocks went back).
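Anyone with a Linux box can confirm those two offsets with GNU date and the tz database’s New York zone (the timestamps are the ones quoted above):
$ TZ=America/New_York date -d '2003-10-24 09:04:46' '+%Z %z'
EDT -0400
$ TZ=America/New_York date -d '2003-10-27 06:58:12' '+%Z %z'
EST -0500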
TerryS, my point is different. I was aware of the Daylight time argument and the difference in British and Eastern fallbacks and had checked that. My point is within-batch differences.
For example, in the 2009-09-27 00:23 batch:
cru-code/f77/mnew/xspl12-tdm.for is -400 while
cru-code/f77/mnew/sh2sp_tdm.for is -500
The copy command must have been done at the same time on the same directory. So only one machine would be involved in the batch. When you look closely, there are many examples of this type.
dukec,
OK, I get your point now. The -400 and -500 definitely do correspond to the Daylight/Standard offsets of the dates of the original documents (m-times), not the access dates (a-times), even though the documents were originally made (for the most part) in the UK. Got it now.
TerryS,
it’s helpful to have the data in the raw form that you provided in the spreadsheet. Thanks.
The issue with the three garbled time zones that were referred to above can be discerned in this raw form.
For these three files, the dates were set to 1980-01-01 and the zip (DOS) mtimes were bleached to 00:00, but the raw mtimes are 00:16:46, 00:38:26 and 00:43:36.
Funny that the dates were bleached but not the times. Does this anomaly suggest anything?
Perhaps a little info for those who don’t know what the various times being discussed are.
In the Unix world there are three times associated with a filesystem object (file):
mtime: modification time, the time when one or more data blocks were last written
atime: access time, the time when one or more data blocks were last read (1)
ctime: change time, the time when the inode (metadata) information about a file was last modified. This includes owner, group, permissions, timestamps.
1: It is possible on some file systems to disable updates to atime for performance reasons since many times the last access is not interesting.
It is possible to set the atime and mtime of a file explicitly using the touch command or system call interface. This feature is used by archiving software to preserve timestamps and for this reason the times are only a hint at what reality might look like.
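As a minimal sketch of how easy that is with GNU touch (the file name is made up):
$ touch -d '2009-01-01 00:00:00' somefile       # set both atime and mtime
$ touch -a -d '2009-11-16 12:00:00' somefile    # set only the atime
$ touch -m -t 200901010000 somefile             # set only the mtime (older -t syntax)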
Directory times are slightly different (or appear to be). The mtime reflects the last time the directory entry was modified (a file was created, deleted, or renamed). Atime reflects the last time the directory entry was read to access the file list. Modifying an existing file does not modify the mtime of the directory containing it.
TerryS,
another question about your spreadsheet. You’ve converted the MS-DOS local time information into interpreted dates. While you have this information open, could you also add a column with the raw local time?
Another question about how document information is handled.
Let’s consider a garden variety document – say one of the cru-code fortran programs or one of the briffa-external tree ring series stored as a text file. What properties does the file have in itself before it goes into the zip?
I take it that the file itself only has mtime in seconds and that it is only at the zip operation that the original mtime is translated into local time according to the time zone to which the zip-computer has been set.
http://ijish.livejournal.com/15630.html
Steve,
The MS-DOS time is stored in 2 bytes as follows:
Bits: 0-4 Seconds/2
Bits: 5-10 Minutes
Bits: 11-15 Hour
The MS-DOS date is stored in 2 bytes as follows:
Bits: 0-4 Day
Bits: 5-8 Month
Bits: 9-15 Year
1980 needs to be added to the year. I’ll add the raw values but I don’t think it will help. The format also means that the earliest date that can be represented is 1980-01-01 00:00:00
> For these three files, the dates for all three files were set to 1980-01-01 but the zip mtimes were bleached to 00:00 in one case, but the raw mtimes are 00:16:46, 00:38:26 and 00:43:36.
MS-DOS cannot represent a date and time before 1980-01-01 00:00:00, and if you subtract 4 or 5 hours from the raw mtimes above you would get a 1979 date and time. Because of this, zip sticks in 1980-01-01 00:00:00.
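To illustrate the packing, here is a small bash sketch that decodes a hypothetical pair of 2-byte values (0x4897 and 0x2f58 were chosen to encode 09:04:46 on 2003-10-24; they are not values read out of the archive):
$ t=0x4897; d=0x2f58
$ echo "time: $(( (t >> 11) & 0x1f )):$(( (t >> 5) & 0x3f )):$(( (t & 0x1f) * 2 ))"
time: 9:4:46
$ echo "date: $(( ((d >> 9) & 0x7f) + 1980 ))-$(( (d >> 5) & 0xf ))-$(( d & 0x1f ))"
date: 2003-10-24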
> What properties does the file have in itself before it goes into the zip?
That depends on what file system it is stored on. Generally, on a Unix system, it will have at least the following properties:
Size, UID, GID, Last Modified Time, Last Accessed Time, Last Status Change Time, Owner permissions (read, write, execute/search, setuid, sticky bit), Group permissions (read, write, execute/search, setgid), Other permissions (read, write, execute/search).
The zip file saved all of the above attributes except for Last Status Change Time.
The bleaching of these three oddball files remains a curiosity. The next files in the sequence 82/I00341 and 82/I00351 have unbleached mtimes:
1991-06-03 06:15:02 675944102
1991-06-03 06:41:52 675945712
I wonder why the mtimes of the three oddball 82/I003.. series were partially bleached. Their raw mtimes are 315533806, 315535106 and 315535416.
Steve,
315533806 = Mon Dec 31 19:16:46 EST 1979
315535106 = Mon Dec 31 19:38:26 EST 1979
315535416 = Mon Dec 31 19:43:36 EST 1979
Because of the format Zip uses when storing the local date and time (see #99968) it can not represent a date before 1980 so it uses the earliest date it can which is 1980-01-01 00:00:00.
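Those conversions are easy to reproduce on a Linux box with GNU date:
$ TZ=EST5EDT date -d @315533806
Mon Dec 31 19:16:46 EST 1979
$ TZ=EST5EDT date -d @315535106
Mon Dec 31 19:38:26 EST 1979
$ TZ=EST5EDT date -d @315535416
Mon Dec 31 19:43:36 EST 1979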
Running the “stat” command from a linux CLP produces this:
File: `DukeCsAtimeTest.txt'
Size: 6568379 Blocks: 12832 IO Block: 4096 regular file
Device: 805h/2053d Inode: 143211 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ *******) Gid: ( 1000/ *******)
Access: 2012-07-22 17:12:30.655796819 -0700
Modify: 2010-11-18 13:47:56.455404406 -0800
Change: 2012-07-24 21:58:28.939248657 -0700
*******-linux-laptop:~$
Access (atime): the directory containing the file was opened on this date/time.
Modify (mtime): the file was created on this date/time.
Change (ctime): the filename was changed on this date/time.
As you can see, simply opening the directory changed the atime, and the local time zone is included (Pacific time, in this case)
Atimes are sort of like the Heisenberg Uncertainty Principle: the mere act of observing one changes it. But peering into the zipfile header catches the date when it’s frozen in suspended animation, if you will. As far as the computer doing the observing is concerned, it’s just binary data in the body of the zip archive. It doesn’t have any local time zone info because the computer doing the reading doesn’t KNOW it’s an attribute. Once it’s extracted, it comes to life and is set according to the system clock of the computer doing the extracting.
That being said-
Since the atime column is labeled “gm-atime” (in Frank B’s raw data) I take that to mean it’s GMT. But it was “frozen” on a computer whose system clock, based on overwhelming evidence, was set to east coast time. Therefore, adjusting the gm-atime (subtracting 4 hours for the files from 9-16-2009 up to 2 AM on 11-1-2009, and subtracting 5 hours for the files occurring after that date) would yield the time the files were accessed for the purposes of adding them to the zip archive.
A separate question I’ve had is why this was ever put in the hands of the local constabulary instead of being assigned to Scotland Yard? When I heard that originally, my first somewhat cynical assumption was “so they wouldn’t find an answer.”
It’s clearly international in scope, and deals with threats that are national rather than local.
It would have been very interesting to see how Scotland Yard would have handled this, and I bet they could have found laws without statutes of limitation to prosecute under.
[Ah, never mind. I’d always assumed Scotland Yard was the British equivalent of the FBI. It doesn’t appear that one exists.]
Lucia – is there a prize when the count gets to 100000?
TerryS (Comment #99972),
my point was different. I understand that the re-set mtime was before 1980-01-01 Eastern time. My question is: out of all the files in that directory, why was the mtime bleached for these three files while files of apparently identical provenance were left unbleached?
Carrick #99974
I agree with you on this. Early on (Nov 21), I observed that UEA was best off blaming things on Russian hackers and that there was potential downside if they caught RC_FOIA and he turned out to be an insider, as still seems possible to me:
http://climateaudit.org/2009/11/21/uk-whistleblower-legislation/
Also very relevant is that the UEA did not retain their own forensic specialist on their own nickel to ensure a thorough look, whereas they spent a lot of money on PR consultants.
Steve M-
I noticed that those 3 files are Shiyatov treering data. Does that data have any special context that would trigger extra attention from the leaker?
Steve McIntyre (Comment #99977):
It is very plausible that a nightmare scenario existed for UEA (and the hacker) in which if the alleged culprit were caught, he would then claim whistleblower status on the grounds that the material was subject to FOI and withheld. A formal record would be created on the FOI issue in a related criminal proceeding.
But I am not sure how that plays out as a defense for the hacker because of the amount of material, the fact that much of it is probably not subject to FOI disclosure and the manner in which it was collected.
So in that scenario both the leaker/hacker/whistleblower and UEA face potential liability–a mutually assured legal downside.
Therefore, the quasi-blackmail standoff suggested by Mosher in which UEA puts outside authorities through a sham investigation to save appearances but not actually unmask the perpetrator is not implausible.
RC-FOIA paid a lot of attention to Yamal. But there’s nothing about these three files that is of particular interest.
They are in the directory briffa-treering-external/ecat/yamal/rw/82/ The adjacent series have unbleached information. A small mystery still.
TerryS,
File FOIA/documents/Skagerrak-Foram-2010.doc, together with tdutch.pdf, is unique in a curious way.
These are the only two documents for which atime and mtime are identical, but which are not additionally bleached to January 1.
This is a seriously fascinating detective-work thread. A lot of unanswered questions, but all this obfuscation nicely covers FOIA’s tracks.
I’ve put up another version with extra fields. You can get it here
You should also look at the Zip file specification. This is the spec I used when developing the program to extract the raw data from a Zip archive. It should give you some background on the column headings.
The spreadsheet now shows what part of the Zip file I got the various pieces of information from, which is why you should look at the spec.
I’ve also added a few columns at the end which show the MTime and ATime using New York’s timezone.
TerryS #99993
thanks for this.
I win #100000.
SteveMc–
Wow! Good thing I’ve managed to ban most of the spam bots or there might not have been a 100,000!
Steve – looks like Lucia can count! I had the following possibilities under serious consideration.
1. #100000
2. 100000
3. #00000
4. #0
5. #1
6. #gobbledegook
7. The Blackboard crashes
8. The end of the internet
Because of the last 2 I thought it best not to stay up and compete.
The use of ‘touch’ seems rather inconsistent/sloppy.
Why not touch all the files and remove all this information?
In FOIA/documents/ there are 18 files that have been touched to 5:00 on 2009/01/01, the same time as the emails.
Then in the same folder there are 5 files with an ADATE of 09-16, one on 09-26, 5 on 09-28, etc., and finally the last file on 11-16.
This suggests to me that ‘RC’ got some files before Sept 16, then typed something like ‘touch *’, then got some more files but forgot or didn’t bother to touch them.
i.e. not ‘sophisticated and carefully orchestrated’, or ‘a high level of expertise and competence’ as stated in the Gregory interview.
[There was a similar style in the half-hearted redactions in the CG2 emails, which gave up after a small fraction of the emails.]
I second Kenneth’s call for an executive summary of what this all means. There have been some suggestions of a UK base (from the ATIMEs) and some of a US east coast base (the 5:00 touch hints at this). This would make a good blog post.
I haven’t seen this mentioned before-
There are 2 additional file-groups with what appear to be bleached timestamps.
1996-01-01 00:00:00
FOIA/documents/briffa-treering-external/
The directory and 16 subfolders share this date/time.
2004-01-01 00:00:00
FOIA/documents/cru-code/
The directory and 25 subfolders share this date/time
Paul Matthews-
The “touch” command isn’t recursive. Touching a directory and its contents (touch dir dir/*, say) changes the timestamps on the directory and the subfolders/files one step down, but it doesn’t reach down to lower folders and files.
This could have been an oversight.
Re: Paul Matthews (Jul 26 06:51),
I noticed the UTC/local and EST/DOS time delta.
It looks to me like the “touching” was done on a machine in EST (Jan 1 00:00:00) and the archive was created in UK summer time (Jan 1 06:00:00 local) / (Jan 1 05:00:00 UTC)
It should be noted that it’s not a requirement to buy a plane ticket to accomplish this, simply set TZ before running the respective command(s).
bob-> TZ=EST5EDT touch -d 2009-01-01T00:00:00 bfile
bob-> TZ=GMT0BST zip archive bfile
bob-> zipinfo -v archive | grep 'last modified'
file last modified on (DOS date/time): 2009 Jan 1 05:00:00
file last modified on (UT extra field modtime): 2009 Jan 1 06:00:00 local
file last modified on (UT extra field modtime): 2009 Jan 1 05:00:00 UTC
bob->
Duke 34, yes that’s interesting, 2 lots of directories all set to 5am GMT, midnight US eastern, like the emails but different year.
Duke 35, yes I know touch doesn’t go down the tree. I was talking about the files in the top directory itself, /documents.
Schnoerkelman, yes of course this all could have been done deliberately as a red herring!
However, find . -exec touch {} \; goes down the tree. If the person didn’t know this, that would set a limit on the sophistication of the operation.
Another attribute that hasn’t received attention thus far:
uid 1002 gid 1002
The User and Group ID.
I use Ubuntu 9.1 which assigns user/group ID #’s starting at 1000 by default. Most popular Linux distros follow the same scheme.
When a user is registered on a system he/she is assigned the next available number in sequence, and by default is assigned to a group with the same number. This applies, however, to users with limited permissions.
The Admin with SUDO permissions typically would be assigned 1000 but the group GID is always 0.
So the Admin attribute would look like this:
Uid: ( 1000/[username]) Gid: ( 0/ root)
uid1002/gid1002 spans the entire archive with the exception of 2 files:
russia.rw
russia.mxd
So the target Linux machine that created the zip archive had at least 3 users, and RC-FOIA was logged in with limited permissions, not as SUDO.
A large network would have hundreds of users with different uid/gid numbering and naming schemes, but in this case it appears to be the default scheme with very few users, possibly one machine. If RC-FOIA was the owner, why did he choose to log in with limited permissions when working on the archive? Or did he use a small system where he wasn’t the owner?
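As a point of reference, this is roughly what a third ordinary account on a stock Ubuntu install looks like (hypothetical path and user name; the numbers are the part that matters):
$ stat -c 'uid=%u(%U) gid=%g(%G)' /home/user3/somefile
uid=1002(user3) gid=1002(user3)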
Re: Carrick (Jul 26 10:05),
OK, but if you’re going to do it, do it properly:
find . -depth | xargs touch …
OK, I do this for money, silly them 🙂
find . -depth -print0 | xargs -0 touch
I used to do this for money 😉
Re: TerryS (Jul 27 02:05),
Good point, well argued! I had to go to the man page for that one.
It’s GNU only though (or at least not on SysV based systems) which takes 1/2 point off ;-).
I confess that I’ve not often had to deal with filenames with newlines in them but when users are involved better to be safe than sorry.
Re: Schnoerkelman
> This allows file names that contain newlines…
It’s file names with spaces as well.
xargs will treat “This File Name” as three separate arguments when it is passed from find. With -print0 and -0 it treats it as a single argument.
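A quick way to see the difference (the file name is made up):
$ touch 'This File Name'
$ find . -name 'This*' | xargs ls              # ls is handed ./This, File and Name as three arguments and fails
$ find . -name 'This*' -print0 | xargs -0 ls   # ls is handed the single argument ./This File Name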
Duke C – your latest post is of interest to me and increases the need for me to try to get an installation of BackupPc running.
I am a relative newcomer to Linux but have installed BackupPc on both the Ubuntu and Peppermint versions of Linux and have managed to get them networked. However, BackupPc needs further configuration work to get it running for both server and client, and I have still to get to grips with that.
If I get it running and test the creation of zip files from within BackupPc then I will also need to run the kind of commands that you and TerryS have been running to review the resultant zip files.
Any guidance on running the “stat” command you were mentioning earlier would be appreciated, as it is not something I am familiar with. I can have a look for instructions on the web, but if you have any advice it would be helpful.
Re: TerryS (Jul 27 05:32),
Yes, of course (insert facepalm here), sigh.
Not having one of my brighter days today, am I? I’ll blame it on the heat, which has finally arrived in Germany. Though I’m sure most in the US would find it laughable, 32C feels quite warm enough when sitting in non-air-conditioned rooms full of machines.
Actually, this morning I was thinking that find needs a builtin -xargs switch that works like -exec but starts only one command with multiple arguments as appropriate. Doing it within find would eliminate all the shell quoting problems, which I think would be a nice property.
Since this is fully OT I’ll stop now and wish all a Nice Week End!
Re: clivere (Jul 27 06:08),
The stat command is part of the GNU coreutils package.
Man page here: stat(1)
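A minimal sketch of the usage (the file name after FOIA/documents/ is made up; with no options stat prints the full block shown in Duke C’s comment above):
$ stat FOIA/documents/somefile.txt
$ stat -c '%x | %y | %z' FOIA/documents/somefile.txt   # just the atime, mtime and ctime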
clivere (Comment #100048) ,
My computer skills are merely above average. There is a level of expertise amongst the commenters here that’s above my paygrade.
I usually spend half my time furtively hunting and pecking through the Man Pages when I want to accomplish something, and almost always figure it out and get it done.
Using the Man Pages would be a good starting point.
Duke C – IT is one of those areas where it is possible to be a specialist but where there is not really an overall IT expert. From my own background which crosses many disciplines I know that it is often necessary to obtain the views of lots of specialists with different skills to try to problem solve.
Like many of the others here I am trying to figure out if the IT forensics are a deliberate red herring or information unknowingly left behind which we don’t yet fully understand.
Also based on my last comment going into moderation we also clearly have the worlds leading specialist of blog tweaking as webwaster of the Blackboard!!!!!
“Also if you stratify the times, here’s something else that’s interesting. In a first cut at days, there is access on all days of the week except Friday (although allowing for daylight time, there is access very late on a Friday.)”
What day were the backups run? If the person is accessing files on a backup server, one might want to avoid messing with them when backups are running to avoid “file in use” errors attracting attention and drawing scrutiny.
whooooops – unfortunate typo – w not even close to m on the keyboard – upside down dyslexia. Must have been the shock of being moderated. Only the 4th blog where that has happened to me!