Army project illustrates promise, shortcomings of data mining

Maverick analysts in 2000 turned up names of some 9/11 hijackers, but methods called into question.

In the spring of 2000, a year and a half before the 9/11 attacks, Erik Kleinsmith made a decision that history may judge as a colossal mistake.

Then a 35-year-old Army major assigned to a little-known intelligence organization at Fort Belvoir in Virginia, Kleinsmith had compiled an enormous cache of information -- most of it electronically stored -- about the Al Qaeda terrorist network. It described the group's presence in countries around the world, including the United States.

It was of great interest to military planners eager to strike the terrorists' weak spots. And it may have contained the names of some of the 9/11 hijackers, including the ringleader, Mohamed Atta.

The intelligence data totaled 2.5 terabytes, equal to about 12 percent of all printed pages held by the Library of Congress. Neither the FBI nor the CIA had ever seen the information. And that spring, Kleinsmith destroyed every bit of it.

Why did he do that? And how did a midlevel officer in a minor intelligence outfit obtain that information in the first place? Those questions lie behind the latest phase of a simmering controversy in Washington: whether something could have been done to prevent the terror attacks of September 11.

Kleinsmith worked for an Army project code-named "Able Danger." This past summer, a number of former project members -- none of whom had worked for Kleinsmith -- came forward to say that Able Danger had identified Atta and linked him to a convicted terrorist who is still serving time in federal prison for his role in the 1993 bombing of the World Trade Center.

The Able Danger members recalled charts showing names and pictures of suspects, and their links to each other. Rep. Curt Weldon, an outspoken Pennsylvania Republican and longtime supporter of intelligence reform, has demanded to know why the charts were never shared with an agency positioned to halt the attacks.

He also points out that the 9/11 commission failed to include any mention of Able Danger in its final report, which is regarded as an authoritative history of the attacks. The Pentagon searched more than 80,000 documents and found no chart with the name "Mohamed Atta." Weldon has accused the government of a cover-up and called for a criminal investigation.

But Able Danger, for all its intrigue, is just one piece of the unusual intelligence practices that Kleinsmith was engaged in, years before 9/11. In the late 1990s, Kleinsmith was the chief of intelligence for the Army's Land Information Warfare Activity, a support unit assigned to the Intelligence and Security Command. LIWA had broad authority to assist the Army and all military commands in conducting "information operations," a wide-ranging discipline that includes information warfare, public deception in combat, and intelligence analysis.

The Army's hub in this effort was the aptly named Information Dominance Center, based at Fort Belvoir. Since the late 1990s, the IDC has been home to some of the most innovative, unconventional, and controversial minds in the intelligence business. In its futuristic-style building -- its interior spaces designed by a Hollywood set artist to mimic the bridge of the starship Enterprise, complete with a large captain's chair in the center of the main room -- the IDC covered a range of topics.

Analysts tracked computer hackers who were targeting military networks, watched for potential avenues of Chinese government espionage, and charted the working relationships among foreign terrorists. To do this, the IDC relied heavily on a novel technique called "data mining."

On a recent afternoon at a coffee shop in Springfield, Va., not far from the IDC, Kleinsmith explained how data mining works. Putting pen to paper, Kleinsmith sketched clumps of circles, then surrounded some with concentric, wavy perimeters, until he'd drawn a crude version of a topographical map.

In data mining, he explained, a powerful search engine is used to "harvest" tens of thousands of Web pages that contain key words of interest -- "Al Qaeda" and "bin Laden," for instance. Another tool, called a data visualization program, then creates a three-dimensional map showing which words appear most often and how they relate.

The features and contours of the map tell an analyst about the underlying information's significance, Kleinsmith said. High peaks represent words that appear frequently. Peaks close together signal words that share some context. The analysts can click on a peak and pull up the information that helped create it. With data mining, analysts don't just read information, they "see" it. Kleinsmith called this kind of data mining "intelligence on steroids," and it was the IDC's hallmark.
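
The "peaks" and "closeness" Kleinsmith describes can be approximated, very roughly, by counting how often each word appears and how often two words turn up on the same page. The sketch below is only an illustration of that idea, not a reconstruction of the IDC's tools; the sample pages and the function name peaks_and_proximity are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus standing in for harvested Web pages.
pages = [
    "al qaeda financier met courier in hamburg",
    "courier travelled from hamburg to kandahar",
    "al qaeda camp near kandahar",
]

def peaks_and_proximity(docs):
    """Return (word counts, same-page pair counts) for a list of texts."""
    term_counts = Counter()
    pair_counts = Counter()
    for doc in docs:
        words = doc.lower().split()
        # Height of a "peak": total appearances of each word.
        term_counts.update(words)
        # Closeness of two peaks: number of pages on which both words appear.
        # Pairs are stored in alphabetical order so (a, b) and (b, a) match.
        for a, b in combinations(sorted(set(words)), 2):
            pair_counts[(a, b)] += 1
    return term_counts, pair_counts

terms, pairs = peaks_and_proximity(pages)
print(terms.most_common(5))
print(pairs[("courier", "hamburg")])
```

A visualization program would then plot the counts as a landscape; the counting step is the part shown here.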

Data mining works best with large sets of information, so it's particularly useful for Internet searches. At the IDC, Kleinsmith and three colleagues mapped Al Qaeda for Able Danger by mining open sources and fusing their results with classified government intelligence. But in addition to the mass of information they returned on suspected terrorists, they collected thousands of names of U.S. citizens.

People's names and personal information litter the Internet. Data harvesting, by its very nature, is indiscriminate and sweeping. Unavoidably, along with "Osama bin Laden," an often-mentioned name like "Bill Clinton" will be harvested. That says a lot about the power, and the limits, of data mining, and why Kleinsmith destroyed what he had: the military is not supposed to be gathering information on U.S. citizens.

A First Test

From its earliest days, the IDC was a haven for renegades who wanted to use technology to step outside traditional intelligence-gathering, which relies heavily on classified sources and labor-intensive analysis. The center had high-level champions, including Lt. Gen. Keith Alexander, who from 2000 to 2003 directed the Intelligence and Security Command, the IDC's parent. Alexander now heads the National Security Agency, which operates the most-sophisticated electronic eavesdropping devices in the world.

Alexander also worked closely with James Heath, who headed the IDC in the late 1990s and whom former employees recall as a mix of driven genius and mad scientist. According to one such former employee of the center, Heath saw the IDC as "an experimentation table" on which to try out all kinds of new tools, depending on what the Army wanted at the time. Analysts and technicians worked together, "speaking the same language" and building useful data-mining tools. This dynamic didn't exist in other intelligence agencies, the former employee noted.

The IDC earned a reputation for innovation, but it also stepped over the bounds of traditional military intelligence. One of its first outside fans was Curt Weldon. Rep. Weldon had been advocating a "national collaborative center" to fuse law enforcement and intelligence units, and their information, from across the government.

In 1999, as the U.S. intervened in the Balkan War, senior Russian officials wanted Weldon (who had good and long-standing contacts with the Russians) to meet in Belgrade with Yugoslavia's then-president, Slobodan Milosevic, to negotiate a peace settlement.

As Weldon stated on the House floor in 2002, the Russians offered to arrange a meeting between Weldon and Dragomir Karic, a rich Serb closely tied to Milosevic. Perhaps, the Russians said, Karic could act as a go-between with the Serbian president. But according to Weldon, State Department officials said they'd never heard of Karic, and thought the meeting was a ploy to manipulate the congressman.

Weldon met with Karic on neutral territory, in Vienna. But before leaving the States, he asked then-CIA Director George Tenet for background on the Serb. Tenet "called me back the next day and gave me two or three sentences ... and said they thought he was tied in with the corruption in Russia, but did not know much else about him," Weldon said.

Unsatisfied, Weldon contacted his "friends at the Information Dominance Center," which he considered a model for his own intelligence collaboration venture. The IDC "came back to me with eight pages about this man," who the analysts said "was very close to Milosevic personally." Former IDC employees confirmed that they provided Weldon with detailed information on Karic.

The talks with Karic bore no fruit. But when Weldon returned to Washington, he said, the FBI and CIA asked to debrief him on what he knew about Karic. Weldon delivered a thorough dossier.

"I told them that there were four Karic brothers; that they were the owners of the largest banking system in the former Yugoslavia; that they employed some 60,000 people; that their bank had tried to finance the sale of an SA-10 [missile system] from Russia to Milosevic; that their bank had been involved in a $4 billion German bond scam; that one of the brothers had financed Milosevic's election; that the house Milosevic lived in was really their house; that, in fact, the Karic brothers' wives were best of friends with Milosevic's wife; and that they were the closest people to this leader."

Surprised to hear such details on a man they barely knew of, the agents presumed Weldon got the information from the Russians. When he told them that the facts came from the Army's Information Dominance Center, Weldon recalls, the agents replied, "What ... is the Information Dominance Center?"

The event convinced Weldon that the CIA and the FBI didn't "get it," and that the IDC was the wave of the future. He became its biggest proponent in Congress, and sang its praises to the highest levels of the Defense Department.

After Weldon submitted the Karic dossier, word of the IDC's work spread outside the Army realm, Kleinsmith said. He had put just two analysts on the Weldon project, and they had taken only a day to generate the Karic profile. It "shocked me that we were outdoing these other organizations," namely the CIA, Kleinsmith said.

The China Problem

Intrigued with the Karic work, senior Pentagon officials decided to see if the tiny band of analysts could prove their mettle on a bigger problem. Officials were concerned about the possible leakage of U.S. military technology abroad, through unauthorized exports or through espionage. In the spring of 1999, the Pentagon "initiated a onetime project, to use data-correlation tools to decide if we could use those methods as a superior approach for counterintelligence," said John Hamre, the deputy Defense secretary at the time. "It was an experiment."

The people involved said the experiment looked specifically at technology transfers to China, whose military posed the gravest post-Cold War threat to the United States. Kleinsmith says the particular technology the IDC researched was arbitrary. "I think we flipped a coin" to decide. The point was to show the Pentagon that data mining could identify front companies, potential leaks of technology, and other vulnerabilities. "What we found was absolutely enormous," Kleinsmith said.

Former IDC employees and others familiar with the work say the China research exposed a variety of avenues through which military technology designs could end up in Chinese government hands. The IDC created a diagram showing how organizations and people in the United States were connected to the Chinese. Hamre had visited the center, and according to Weldon, reported back, "It is amazing what they are doing there."

The experiment "went well," the former IDC employee said. "Unfortunately, it went too well." During construction of those link diagrams, the names of a number of U.S. citizens popped up, including some very prominent figures. Condoleezza Rice, then the provost at Stanford University, appeared in one of the harvests, the by-product of a presumably innocuous connection between other subjects and the university, which hosts notable Chinese scholars.

William Cohen, then the secretary of Defense, also appeared. As one former senior Defense official explained, the IDC's results "raised eyebrows," and leaders in the Pentagon grew nervous about the political implications of turning up such high-profile names, or those of any American citizens who were not the subject of a legally authorized intelligence investigation. Rumors still abound about other notable figures caught up in the IDC's harvest. "I heard they turned up Hillary Clinton," the official said. The experiment was not continued.

"We determined that there were significant methodological problems," Hamre said of the IDC's techniques. Data-correlation analyses on raw information "produce impossibly large numbers of potential correlations. The numbers are too large to be operationally helpful."

But it appears not everyone in the military establishment agreed. Over the next several months, Kleinsmith estimated he gave more than 200 briefings on the IDC to members of Congress, generals, and senior government officials. "I could tell in three to four minutes if someone 'got it,' " Kleinsmith said. Hamre got it, he noted. And so, it seems, did officials with the Army's Special Operations Command, who, despite the unease over the China experiment, came to the IDC asking for information about a then-shadowy organization called Al Qaeda.

Able Danger

In the fall of 1999, top officials in the Special Operations Command were looking for a way to take the nascent fight on terrorism to its source. Al Qaeda had recently destroyed the U.S. embassies in Kenya and Tanzania. Special Operations' top officers, including the commander, Gen. Peter Schoomaker, "wanted the mission of 'putting boots on the ground' to get at [Osama] bin Laden and Al Qaeda," according to the 9/11 commission report.

But the military leadership believed that without concrete intelligence about Al Qaeda, a strike on the group was doomed to fail. President Clinton told the 9/11 commission, "If we had really good intelligence about ... where [bin Laden] was, I would have done it." Plans were already under way to attack Al Qaeda using AC-130 gunships. What was lacking was actionable intelligence to tell the military whom to hit and where.

Kleinsmith said that a pair of Special Operations officials visited him at the IDC in December 1999. At the instruction of the Joint Chiefs of Staff, the officials wanted as much intelligence on Al Qaeda and other transnational terrorists as could be mustered. They called the project Able Danger. (The word "able" has been commonly used for military exercises for more than two decades.)

The officials asked Kleinsmith about the technologies the IDC was using. "They didn't talk specifics," Kleinsmith said, but it was clear that "we had something they could really use." Later, he offered to "run some data" and produce a preliminary analysis. Within 90 minutes, Kleinsmith said, his analysts found evidence that Al Qaeda had a "worldwide footprint," including "a surprising presence in the U.S. That's when we started losing sleep."

In January 2000, Special Operations gave Kleinsmith and his team the green light to find as much information as they could. "They told us, 'Start with the words "Al Qaeda," and go,' " he said. A month later, the IDC conducted the first Able Danger harvest. The initial results, while impressive, were hardly what Special Operations forces needed to put boots on the ground.

The harvest "was a mile wide and an inch deep," Kleinsmith said. It included more than two terabytes of information, too vast an amount to provide specific targets. The IDC analysts could see the broad outlines of Al Qaeda, particularly its transformation from an idealistic movement into an operational network that could possibly inflict damage. Names, locations, and capabilities, and even the group's financial sources, were "coming together," Kleinsmith said. But the data set was still too big.

That didn't stop the analysts from trying to pare the information down. The former IDC employee said analysts played what they called "the Kevin Bacon game," referring to the popular notion that the prolific film actor can be linked to any other actor through no more than five people. (The game is based on the "six degrees of separation" theory that anyone on Earth can be linked to anyone else through five intermediaries.)

"Let's say you had a bad guy at each end of a string," the employee said. The analysts looked for the people between them, and then those people's ties to each other and to still others, asking whether any of the links came back to the initial bad guys. The analysts played this game routinely to firm up the connections in the large data sets. Eventually, they were able to isolate some 20 people about whom Special Operations wanted further, deeper analysis, Kleinsmith said.

The team developed charts to serve as "simplified explanations" of what they found. But those charts, now famously alluded to by Weldon and others as having named Mohamed Atta, sometimes measured 20 feet in length and were covered with small type, the former IDC employee said. The charts were so big, in fact, that analysts had to hang them on walls just to read them. The former employee doesn't remember seeing Atta's picture.

The IDC might have followed Atta's trail if it had been told to do so, the former employee said. But just pulling names at random from the chart was pointless. And a simple connection between two people on a chart was not evidence of any criminality or pending attack. "Do you have any idea how many people on the planet would go to jail just because they knew somebody bad?" the former employee asked.
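
The link-tracing game the former employee describes resembles a standard shortest-path search over a network of ties. The following is a minimal sketch under that assumption, using breadth-first search; the people, the links, and the function name shortest_chain are all placeholders, and nothing here is drawn from the IDC's actual data or software.

```python
from collections import deque

# Hypothetical link data: each key is a person, each value the set of
# people they are directly tied to. All names are placeholders.
links = {
    "suspect_a": {"p1", "p2"},
    "p1": {"suspect_a", "p3"},
    "p2": {"suspect_a"},
    "p3": {"p1", "suspect_b"},
    "suspect_b": {"p3"},
}

def shortest_chain(graph, start, goal):
    """Breadth-first search for the shortest chain of ties from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # Sorted only so the result is deterministic when ties exist.
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no chain connects the two people

chain = shortest_chain(links, "suspect_a", "suspect_b")
intermediaries = chain[1:-1]  # the people "between" the two ends of the string
print(chain, intermediaries)
```

As the former employee cautions, a chain like this shows only that a path of associations exists. It says nothing about what any person on the path has done.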

The IDC produced an impressive array of intelligence, but it also came dangerously close to an important legal line. The basic harvesting methodology guaranteed that the names of U.S. citizens would appear. "You'll pull in 16,000 people in a harvest," Kleinsmith said. It's "100 percent likely" that an American will be there. And sometimes the names themselves seemed meaningless.
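
Kleinsmith's figure of 16,000 people per harvest also helps explain Hamre's complaint that raw data yields "impossibly large numbers of potential correlations." If every pair of names is treated as a possible link, the count grows roughly with the square of the harvest size. The short calculation below illustrates that arithmetic; treating every pair as a candidate is an assumption made for the example, not a description of how the IDC scored links.

```python
from math import comb

def potential_links(n_entities, group_size=2):
    """Number of distinct groups of `group_size` that can be drawn from n entities."""
    return comb(n_entities, group_size)

harvest_size = 16_000  # figure Kleinsmith cites for a single harvest
pairs = potential_links(harvest_size)
print(f"{pairs:,} possible pairs")  # 127,992,000 possible pairs
```

Doubling the harvest roughly quadruples the number of candidate pairs, which is why analysts had to narrow the field before the results could be useful.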

If an analyst found "Clinton," Kleinsmith noted, that could mean George Clinton, the funk musician, or the town of Clinton, Md. Was the collection accidental or intentional? Regulations that restrict domestic surveillance of U.S. citizens don't necessarily distinguish between the two. The IDC team pulled in hundreds of names every hour, Kleinsmith said. When asked which prominent Americans were included, he replied, "Everybody was coming up."

Data Destruction

As quickly as the IDC garnered powerful fans, it also earned some enemies. The center was not a chartered member of the formal intelligence community -- the 14 agencies that in 1999 officially constituted the country's spy apparatus. For a support organization, buried several layers deep in the Army, to tread on territory normally reserved for big-name agencies like the CIA and the Defense Intelligence Agency, and to present intelligence gleaned from the Internet, of all places, was simply anathema to people steeped in decades of intelligence rules and culture. The IDC analysts were mavericks.

In particular, the Defense Intelligence Agency questioned the analysts' results on a number of projects, not just Able Danger, the former IDC employee said. "We'd show them our stuff, and they'd say, 'Show us the math.' " But the answers didn't always add up so neatly. The combination of data mining and hunches sometimes produced results that the bigger intelligence agencies viewed as murky, even if military commanders found them compelling.

At a Pentagon briefing on Able Danger in September of this year, Thomas Gandy, the Army's director of counterintelligence and human intelligence, cautioned reporters about inferring too much information from the "links" the IDC established, particularly because its data-mining tools were far less sophisticated than the ones used today. "Just that there are links established doesn't really mean anything," Gandy said. "In the primacy of this technology, you get some very goofy links that require research."

Kleinsmith and the former employee, as well as others who worked tangentially to the IDC over the years, insisted that the IDC analysts were senior and seasoned, and that they recognized that simple links required further investigation. Yet the analysts' enthusiasm for a less tidy sort of inquiry, which often raised more questions than answers, divided intelligence professionals. Some former government officials, who declined to be named, derided the IDC analysts as "zealots" and said their work never produced the eureka-like results that some, particularly former Able Danger members, now claim.

One senior IDC analyst, Eileen Preisser, who worked with Kleinsmith on Able Danger and other projects, was characterized by a former Defense official as "an uncontrolled flake." Kleinsmith, who called Preisser an "analytical genius," admitted that she "has constant trouble in working with others in the community." Preisser has worked in several intelligence jobs, inside and outside the government, and those who know her see her as the prototypical IDC believer.

She "is especially critical of those folks who she feels did not, or do not, 'get' the technology," Kleinsmith said. "Instead of working within the system, maneuvering around the tough spots, negotiating and dealing, she tends to burn her way through an issue to get where she needs to go." Preisser now works for the National Geospatial-Intelligence Agency. A spokeswoman there said Preisser declined all requests for interviews.

In early 2000, in the midst of Able Danger, a lawyer with the Army's general counsel visited Kleinsmith. As Kleinsmith testified before the Senate Judiciary Committee in September, the lawyer reminded him that under Army regulations, any data the IDC collected on U.S. persons -- even inadvertently -- had to be destroyed within 90 days. If analysts could establish a legitimate reason to investigate a person further, they could keep the corresponding data.
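
The rule as Kleinsmith recounts it amounts to a simple filter: data on a U.S. person must go once it is more than 90 days old, unless a legitimate reason to keep it has been established. The sketch below shows that logic only as the article describes it. The record fields, the sample entries, and the function name records_to_destroy are hypothetical, and the actual Army regulation is certainly more detailed.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90  # limit as described by Kleinsmith

def records_to_destroy(records, today):
    """Return records on U.S. persons held past the limit with no stated justification."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [
        r for r in records
        if r["us_person"] and r["collected"] < cutoff and not r["justification"]
    ]

# Hypothetical records; field names are illustrative only.
records = [
    {"name": "A", "us_person": True, "collected": date(2000, 1, 10), "justification": None},
    {"name": "B", "us_person": True, "collected": date(2000, 1, 10), "justification": "open inquiry"},
    {"name": "C", "us_person": False, "collected": date(2000, 1, 10), "justification": None},
    {"name": "D", "us_person": True, "collected": date(2000, 4, 20), "justification": None},
]

overdue = records_to_destroy(records, today=date(2000, 5, 1))
print([r["name"] for r in overdue])
```

The filter itself is trivial. The difficulty Kleinsmith describes lies in filling in the justification field for tens of thousands of names, each of which would need an analyst's review.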

But with potentially tens of thousands of names, checking each one would have been impossible, Kleinsmith said. In the Pentagon briefing, Gandy concurred: "I don't think they had the capability to scrub it in the fashion that the oversight rules could live with."

By the spring of 2000, Kleinsmith said, the IDC had the list of 20 individuals whom Special Operations wanted investigated further under Able Danger. But in March, Kleinsmith was ordered to cease all work on the project. He believes the order came from outside the IDC's command. From May to June, Kleinsmith and his team destroyed the information, and possibly the linkages between Mohamed Atta, Al Qaeda, and convicted terrorists already sitting in U.S. prisons.

"It was terrible," Kleinsmith said.

'So It Begins'

After the data purge, the heartbeat of the IDC slowed. In late September 2000, the center was authorized to begin new work on Able Danger, Kleinsmith said. A data harvest would take no time to replicate, but the analysis on people and locations was much harder to reproduce.

But Able Danger never ramped up a second time. On October 12, while the USS Cole was docked in Yemen's port city of Aden, Al Qaeda suicide bombers rammed the destroyer with a small explosive-laden boat, killing 17 U.S. sailors and wounding 39. From then on, U.S. Central Command, responsible for the Middle East, became the IDC's primary customer, Kleinsmith said. Special Operations Command, unhappy because the IDC's attention had shifted, moved Able Danger to a private intelligence research center run by Raytheon in Garland, Texas, Kleinsmith said.

A Raytheon spokesman did not respond to a request for comment. But Eileen Preisser, the IDC analyst who had worked on Able Danger with Kleinsmith, was working for Raytheon after the September 11 attacks. In a 2001 interview with National Journal, she spoke of projects she was involved with that were essentially the same as those at the IDC.

After the Cole bombing, the IDC concentrated on projects not related to Al Qaeda. "We went on to do some other things, other projects," the former IDC employee said. Less than a year later, the 9/11 attackers struck. Looking back, Kleinsmith doesn't claim that he saw the attacks coming. Rather, he felt resigned. "I wasn't surprised," he said. He had studied Al Qaeda's evolution and believed he knew its capabilities. "I thought, 'So it begins.' "

Total Information Awareness

The 9/11 attacks breathed some new life into the Information Dominance Center. In late 2001, retired Navy Adm. John Poindexter, who had served as President Reagan's national security adviser, met with the director of the Defense Advanced Research Projects Agency, where Poindexter was soon to be employed. Poindexter was looking for a site to test new technologies under his Total Information Awareness program, which, not unlike the IDC, aimed to use open-source data and government information to understand terrorism.

TIA also looked at tools to examine commercial databases containing information on U.S. citizens, within the context of privacy regulations.

Poindexter wanted a proving ground staffed by seasoned, technology-inclined analysts, a "Manhattan Project" for counterterrorism, he said. The DARPA director, Tony Tether, told him to consider the IDC. After meeting with Gen. Alexander, the Army commander overseeing the center, Poindexter agreed to test some of the TIA tools at the IDC.

"TIA was a very good concept," the former IDC employee said. The center offered TIA "a high-speed testing bed" for its new technologies. "Some of the tools sucked, and some of them were good ideas," the employee said. The frustration came from officials' reluctance to use the tools for active intelligence projects. Poindexter emphasized that TIA was a research project and wasn't using data mining as part of any real intelligence operations. TIA was an experiment.

But the experiment was short-lived. In late 2002, Poindexter's role in TIA was revealed in the press. The controversial retired admiral's past caught up with him -- Poindexter was the central figure in the Iran-Contra scandal, in which profits from covert arms sales to Iran were diverted to anti-Communist rebels in Nicaragua.

Members of Congress derided TIA as an Orwellian excess of the post-9/11 era. The funding was pulled. Kleinsmith, who had left the Army by the time TIA arrived, seemed perplexed by lawmakers' concerns. "We've had this capability for years," he remembered thinking. "Who cares?"

TIA's detractors declared a victory for privacy protection when they killed the project. Poindexter was forced to resign in August 2003. But research on TIA tools has hardly ceased.

Rather, it has moved into the intelligence agencies, where the work and the budgets for it are classified, Poindexter said, noting that now Congress has more-limited oversight and should be more concerned about privacy infringements. The former IDC employee concurred, saying "The [TIA] concept hasn't died off. It continues. And it continues elsewhere now, and I can't talk about that. The tools are continuing to be developed."

What-Ifs

Five years after Able Danger, Erik Kleinsmith seems oddly at ease for a key figure in a brewing political controversy. Inevitably, Kleinsmith would be a major witness in any investigation of the project. No one has suggested he did anything other than follow Army regulations in destroying the Able Danger documents.

Despite the IDC's innovations, Kleinsmith remains unconvinced that the 9/11 attacks were foreseeable. But "I do go to bed every night ... [thinking] that if we had not been shut down, we would have at least been able to prevent something or assist the United States in some way," Kleinsmith told the Senate Judiciary Committee during September's hearing. "Could we have prevented 9/11?" He paused, and then said: "I don't think I can ever speculate to that extent, that we could have done that."

Today, Kleinsmith is an employee with Lockheed Martin, working as a contractor to the Army's Information Operations Center, an IDC spin-off that is chartered to support the global war on terrorism. He oversees an intelligence training team of about 28 instructors, five of whom are working in Iraq to train U.S. analysts in data mining.

"One of the most amazing aspects of the Able Danger team is that, for a time, you had what I believe was the perfect combination of technology, data, and expert analysts that combined to create analysis that was above and beyond what the intelligence community was producing," Kleinsmith said. The results of the China experiment brought Special Operations Command to the IDC. That's proof enough for Kleinsmith that his group was providing what no one else could.

"I have been asked by several folks on Capitol Hill, members and staffers alike, whether the capability still exists to do what we did," Kleinsmith said. "My answer is, 'yes and no.' " Paradoxically, analysts are being trained to rely on the technological tools -- what Kleinsmith called "buttonology" -- too much, instead of thinking creatively on their own, he explained.

The technology is powerful, but needs to augment the analyst's work, he said. "There are still those who want to train analysts on how the engine of the car works instead of how to drive the car."

Kleinsmith recognized that the IDC's methods caused some consternation, but he takes pride in his former work and looks at the controversy pragmatically. "We understood that [there were objections], but we also understood that a lot of our customers didn't care."

Today, Kleinsmith is still struggling with the same puzzles. And, to hear him tell it, apart from the advancements in technology, little has changed. So much is still unknown, and undone, about the terrorist threat to the United States, he said. He can simply watch television to know that law enforcement isn't rounding up the terrorist cells he believes his team identified in the United States five years ago.

Ultimately, Kleinsmith sounds less like a man burdened by his past than one nervous about the future. No one seems to be acting on the information the IDC found that terrorists had taken up residence in the United States, far from New York, he said. And, as if they were listening, waiting for him to tip his hand, Kleinsmith cautiously added, "I'd just prefer not to say where they are."