Three misconfigured AWS S3 buckets have been discovered wide open on the public internet containing "dozens of terabytes" of social media posts and similar pages – all scraped from around the world by the US military to identify and profile persons of interest.
The archives were found by UpGuard's Chris Vickery, a veteran security breach hunter, during a routine scan of open Amazon-hosted data silos, and these ones weren't exactly hidden. The buckets were named centcom-backup, centcom-archive, and pacom-archive.
CENTCOM is the common abbreviation for the US Central Command, which controls military operations in the Middle East, North Africa and Central Asia. PACOM is the abbreviation for US Pacific Command, covering the rest of southern Asia, China and Australasia.
Vickery told The Register today he stumbled upon them by accident while running a scan for the word "COM" in publicly accessible S3 buckets. After he refined his search, the CENTCOM archive popped up. At first he thought it was related to Chinese multinational Tencent, but he quickly realized it was a US military archive of astounding size.
"For the research I downloaded 400GB of samples but there were many terabytes of data up there," he said. "It's mainly compressed text files that can expand out by a factor of ten so there's dozens and dozens of terabytes out there and that's a conservative estimate."
Just one of the buckets contained 1.8 billion social media posts automatically fetched over the past eight years up to today. It mainly contains postings made in Central Asia; however, Vickery noted that some of the material is taken from comments made by American citizens.
The databases also reveal some interesting clues as to what this information is being used for. Documents indicate that the archive was collected as part of the US government's Outpost program, which is a social media monitoring and influencing campaign designed to target overseas youths and steer them away from terrorism.
Vickery found the Outpost development configuration files in the archive, as well as Apache Lucene indexes of keywords designed to be used with the open-source search engine Elasticsearch. Another file refers to Coral, which may well be a reference to the US military's Coral Reef data-mining program.
"Coral Reef is a way to analyze a major data source to provide the analyst the ability to mine significant amounts of data and provide suggestive associations between individuals to build out that social network," Mark Kitz, technical director for the Army Distributed Common Ground System – Army, told the Armed Forces Communications and Electronics Association magazine Signal back in 2012.
"Previously, we would mine through those intelligence reports or whatever data would be available, and that would be very manual-intensive."
Before you start scrabbling for your tinfoil hats, the army hasn't made a secret of Coral Reef, even broadcasting the awards the software has won. And social media monitoring isn't anything new, either.
However, it is disturbing quite how easy this material was to find, how poorly configured it was, and that the archives weren't even given innocuous names. If America's enemies – or to be honest, anyone at all – are looking for intelligence, these buckets were a free source of information to mine.
After years of security cockups like this in the public and private sectors, Amazon has tried to help its customers avoid configuring their S3 buckets as publicly accessible stores, by adding full folder encryption, yellow warning lights when buckets aren't locked down, and tighter access controls.
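For readers wondering what "publicly accessible" looks like in practice, here is a minimal Python sketch that flags access-control grants made to S3's predefined "everyone" groups. It assumes the input is a dict shaped like an S3 GetBucketAcl response; the function name `public_grants` and the sample ACL are hypothetical and for illustration only. How the buckets in this story were actually misconfigured has not been disclosed, and bucket policies can also open a bucket up, so an ACL check like this covers only part of the picture.

```python
# URIs that S3 ACLs use for the two predefined broad-access groups.
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
PUBLIC_GROUPS = {ALL_USERS, AUTHENTICATED_USERS}


def public_grants(acl):
    """Return (group URI, permission) pairs that expose a bucket broadly.

    `acl` is assumed to be a dict shaped like an S3 GetBucketAcl response:
    {"Grants": [{"Grantee": {...}, "Permission": "READ"}, ...]}
    """
    exposed = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        # Only group grantees carry a URI; owners appear as CanonicalUser.
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUPS:
            exposed.append((grantee["URI"], grant.get("Permission")))
    return exposed


# Hypothetical ACL for illustration; not taken from the buckets in the story.
sample_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group", "URI": ALL_USERS},
         "Permission": "READ"},
    ]
}

if __name__ == "__main__":
    for uri, permission in public_grants(sample_acl):
        print(f"{permission} granted to {uri}")
```

In a real audit the dict would likely come from an AWS SDK call such as boto3's `get_bucket_acl`, run with credentials allowed to read the bucket's ACL; that step is left out here so the sketch runs without network access.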
"This was found before these new Amazon controls were added," Vickery said. "So we have yet to see how effective that yellow button will be."
Vickery said he notified the American military about the screwup, and the buckets have now been locked down and hidden. Unusually, the military contact thanked him for bringing the matter to their attention; talking to the armed forces is normally a "one-way street," Vickery said.
The Register asked the army for comment, and for more details on Outpost and Coral Reef, but wheels turn slowly in the Green Machine. We'll update the story as soon as more information is known. ®