Thursday, February 14, 2002

DARPA's Information Awareness Office, The Total Information Awareness System; Or, Big Brother Incarnate

IAO Mission:  The DARPA Information Awareness Office (IAO) will imagine, develop, apply, integrate, demonstrate and transition information technologies, components and prototype, closed-loop, information systems that will counter asymmetric threats by achieving total information awareness useful for preemption; national security warning; and national security decision making.

IAO Vision:  The most serious asymmetric threat facing the United States is terrorism, a threat characterized by collections of people loosely organized in shadowy networks that are difficult to identify and define.  IAO plans to develop technology that will allow understanding of the intent of these networks, their plans, and potentially define opportunities for disrupting or eliminating the threats.  To effectively and efficiently carry this out, we must promote sharing, collaborating and reasoning to convert nebulous data to knowledge and actionable options.

This is at once a nightmare directly from the worst conspiracy theories, U.S. Government policy, and a top secret research project. It is clear that the projects sponsored by this office have been under way for a long time, as many of the projects listed are described as continuations of prior projects. Everything is presented as unfinished research that has not been turned into operational systems or devices. At the same time, reading between the lines suggests that some forms of these capabilities already exist.

It's important to remember that, while they say over and over that this is meant to find "foreign terrorists", applying it to anybody else is merely a matter of choice. Today the target is "foreign terrorists", but suppose that tomorrow the target becomes "Mexican Americans" because so many of them are here illegally. Remember what happened to Japanese Americans during World War II, when many were detained in the American version of concentration camps.

All the information on this page was captured from the IAO website on November 15, 2002.


Resources

  • After the big flap, the IAO a) changed their logo to be less secret-society-like, and b) expunged some things from the site, such as the bios of the lead people. Richard M. Smith [computerbytesman.com/] captured the missing pages out of the Google cache and saved them on a web site. I have captured his pages into a PDF file here.
  • NY Times
  • March 10, 2003 [nytimes.com/2003/03/11/business/11PRIV.html] "Software Pioneer Quits Board of Groove": Mitchell D. Kapor ... has resigned from the board of Groove Networks after learning that the company's software was being used by the Pentagon as part of its development of a domestic surveillance system. ... a person close to Mr. Kapor said that he was uncomfortable with the fact that Groove Networks' desktop collaboration software was a crucial component of the antiterrorist surveillance software being tested at the Defense Advanced Research Project Agency's Information Awareness Office ... The project has generated controversy since it was started early last year by Admiral Poindexter ... whose felony conviction as part of the Iran-contra scandal was reversed because of a Congressional grant of immunity. ... The project has been trying to build a prototype computer system that would permit the scanning of hundreds or thousands of databases to look for information patterns that might alert the authorities to the activities of potential terrorists. ... "Mitch cares very much about the social impact of technology," said Shari Steele, executive director of the Electronic Frontier Foundation. "It's the reason he founded E.F.F.," she said. ... On Feb. 11, House and Senate negotiators agreed that the Total Information Awareness project could not be used against Americans. Congress also agreed to restrict additional research on the program without extensive consultation with Congress. ... President Bush can keep the research alive by certifying to Congress that a halt "would endanger the national security of the United States."
  • Groove Networks home page [groove.net/] And, sure enough, there's a soldier in uniform on the front, sitting in front of a laptop computer, talking through a cell phone. Software features include collaboration, "Shared Spaces", joint file editing, and messaging.
  • The laws & agreements detailed in the article are listed at the EPIC information page (above).
  • Washington Post
  • C|NET News
  • Reuters article, March 4, 2003 [news.com.com/2100-1028-991058.html?tag=fd_top] "U.S., software maker craft Arabic tool": The article details the difficulty of understanding Arabic writing ("The grammar of Arabic makes it difficult to distinguish words because of the way that word spellings change for conjugation and pronouns"). Basis Technology is the company in question, and their software is designed to handle Arabic text despite these linguistic difficulties. While not directly stated, this technology fits squarely within the automatic language translation projects (Babylon, EARS and TIDES) listed below.
  • Basis Technology [basistech.com/] They have "Language Analysis" software for many languages (Japanese, Chinese, Korean, German and Arabic) and the Arabic version is said to have been created "in response to the needs of the U.S. Intelligence Community".
  • The Register:
  • Salon.COM
  • January 29, 2003 Total Information Awareness: Down, but not out [salon.com/tech/feature/2003/01/29/tia_privacy/] Congress may have put the brakes on the most ambitious government surveillance program ever. But for citizens worried about their privacy, TIA still means trouble.
  • Terrorism Information Awareness (TIA) [darpa.mil/iao/TIASystems.htm]: This is the initial morphing of the project as mentioned in the previous item.

Projects

This DARPA research project is massive in both breadth and depth. Overall, the effect is to create something akin to a "search engine" and to apply it to the whole "information space" available around the planet. The information space covered by the project is not just what's on the Internet, but also business transactions, biometric identification (via face recognition) of people walking through public spaces, automatic speech recognition and language translation of captured telecommunications traffic, searching for "potential terrorists" and preventing "potential attacks", and more. In other words, they are working to create the technology and means to monitor pretty much everything happening around the planet, and to sift through massive quantities of data looking for patterns indicating threats. The purpose is to identify possible threats and to improve the ability to choose among a variety of response options.

Map of the projects, links to project home pages, and goal statements quoted from their pages:

Each entry below gives the program, their mission statement and goals, and my translation.
Total Information Awareness (TIA) System

Their mission statement and goals: The goal of the Total Information Awareness (TIA) program is to revolutionize the ability of the United States to detect, classify and identify foreign terrorists – and decipher their plans – and thereby enable the U.S. to take timely action to successfully preempt and defeat terrorist acts. To that end, the TIA program objective is to create a counter-terrorism information system that: (1) increases information coverage by an order of magnitude, and affords easy future scaling; (2) provides focused warnings within an hour after a triggering event occurs or an evidence threshold is passed; (3) can automatically queue analysts based on partial pattern matches and has patterns that cover 90% of all previously known foreign terrorist attacks; and, (4) supports collaboration, analytical reasoning and information sharing so that analysts can hypothesize, test and propose theories and mitigating strategies about possible futures, so decision-makers can effectively evaluate the impact of current or future policies and prospective courses of action.

My translation: Automatically sift through the "information space" looking for patterns of interesting activity. It is expected to notice these patterns within an hour. This could be anything happening within the "information space", or in other words, anything happening around the world. For example, a pattern to look for could be activity in certain bank accounts, coupled with telephone calls to certain phone numbers, purchases of airline tickets, and purchases of certain chemicals. Such a pattern could indicate a possible attack on an airliner: sneaking a canister of nerve gas aboard and releasing it in mid-flight, killing all aboard and leaving a derelict airplane flying uncontrolled until it runs out of fuel and crashes at some random location.

The phrase "automatically queue analysts" suggests that the computer systems will be constantly scanning, and when something is detected it will send a notice to Military Intelligence Analysts. The military analysts would have a "job queue" and when each analyst finishes one task, another task could be waiting in their queue.

This is no doubt similar to "call center" systems, of the type used for customer service centers in large corporations that field large quantities of customer requests. In such a system, when you call the customer service phone number, your call is placed in a queue and your call is answered by the first available customer service agent.
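
The queueing idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the indicator names, the 0.75 threshold, and the functions `match_score`, `enqueue_alerts` and `next_task` are all hypothetical and do not come from the IAO pages. It shows a partial pattern match passing a threshold and the resulting alerts waiting in a first-in, first-out job queue.

```python
from collections import deque

# Hypothetical indicators for one illustrative pattern (not from the IAO pages).
PATTERN = {"bank_activity", "phone_calls", "airline_tickets", "chemical_purchase"}
THRESHOLD = 0.75  # assumed fraction of indicators that triggers an alert


def match_score(observed):
    """Fraction of the pattern's indicators present in the observed events."""
    return len(PATTERN & set(observed)) / len(PATTERN)


def enqueue_alerts(cases, queue):
    """Append every case whose partial match reaches the threshold."""
    for case_id, observed in cases.items():
        if match_score(observed) >= THRESHOLD:
            queue.append(case_id)


def next_task(queue):
    """Hand the oldest waiting alert to whichever analyst is free."""
    return queue.popleft() if queue else None


cases = {
    "case-1": ["bank_activity", "phone_calls", "airline_tickets"],
    "case-2": ["phone_calls"],
    "case-3": ["bank_activity", "phone_calls", "airline_tickets", "chemical_purchase"],
}
job_queue = deque()
enqueue_alerts(cases, job_queue)
first = next_task(job_queue)   # oldest alert in the queue
second = next_task(job_queue)  # next alert
third = next_task(job_queue)   # queue is empty by now
```

As in a call center, the analyst who becomes free simply takes whatever has been waiting longest; a real system would presumably also rank alerts by urgency.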

Babylon

Their mission statement and goals: The goal of the Babylon program is to develop rapid, two-way, natural language speech translation interfaces and platforms for the warfighter for use in field environments for force protection, refugee processing, and medical triage. Babylon will focus on overcoming the many technical and engineering challenges limiting current multilingual translation technology to enable future full-domain, unconstrained dialog translation in multiple environments.

My translation: This appears to be the same as the pocket translator devices widely used by international travellers. The unit pictured has the typical high-impact-resistant design of military devices. The page mentions "The Babylon program will focus on low-population, high-terrorist-risk languages that will not be supported by any commercial enterprise. Mandarin and Arabic were selected based on immediate and intermediate needs.", acknowledging that this kind of device exists for "popular languages" (for example, Japanese/English) but not for the unpopular ones.

Imagine the typical "Grunt" going into a foreign country. How will they communicate? This is it.

What's interesting is to contrast this with the EARS program below. The need is the same (instant translation of information between a variety of languages); only the application is different.

Bio-Surveillance

Their mission statement and goals: The goal of the Bio-Surveillance program is to develop the necessary information technologies and resulting prototype capable of detecting the covert release of a biological pathogen automatically, and significantly earlier than traditional approaches. The key to mitigating a biological attack is early detection. Given the availability of appropriate medications, as many as half the expected casualties could be prevented if an attack is detected only a few days earlier than it would have otherwise been identified. For contagious biological agents, early detection is also clearly paramount. The Bio-Surveillance program will dramatically increase DoD's ability to detect a clandestine biological warfare attack in time to respond effectively and so avoid potentially thousands of casualties.

My translation: As it says, they want devices which can automatically detect biological attacks. Many biological agents are odorless and tasteless and, given their long incubation periods, could cause widespread infection before being detected. If we recall the anthrax incidents in the fall of 2001, the initial days were full of confusion over the nature of the incident, where the anthrax came from, and so forth.

The detection period shown in the picture is measured in days. With current devices, detection takes 4-8 days, and they want to reduce that to 1-3 days.

Communicator

Their mission statement and goals: The specific goal of the Communicator program is to develop and demonstrate “dialogue interaction” technology that enables warfighters to talk with computers, such that information will be accessible on the battlefield or in command centers without ever having to touch a keyboard. The Communicator Platform will be wireless and mobile, and will function in a networked environment. Software enabling dialogue interaction will automatically focus on the context of a dialogue to improve performance, and the system will be capable of automatically adapting to new topics so conversation is natural and efficient.

My translation: This is to make interaction with battlefield computer devices more fluid. You can imagine the chaos in most battlefield situations, and how computer keyboards would be difficult to use. You want something to talk to, just like they have in the movies.

Effective, Affordable, Reusable Speech-to-Text (EARS)

Their mission statement and goals: The Effective Affordable Reusable Speech-To-Text (EARS) program is developing speech-to-text (automatic transcription) technology whose output is substantially richer and much more accurate than currently possible. This will make it possible for machines to do a much better job of detecting, extracting, summarizing, and translating important information. It will also enable humans to understand what was said by reading transcripts instead of listening to audio signals.

My translation: The picture associated with this shows a communication tower and radio signals going to "Rich Transcription" and becoming a transcript. This ties speech-to-text conversion (speech recognition technology, and conversion to written language) to automated language translation. The source of the speech being recognized, and then translated, is likely any means of telecommunication (broadcast radio & TV, satellite phones, cellular phones, landlines, etc.).

The capability would exist to, for example, tap into a radio broadcast and, regardless of the language being spoken, convert the speech to written text, and then translate that text into whatever language the military analyst speaks (presumably English).

Also see TIDES below and Babylon above.

Evidence Extraction and Link Discovery (EELD)

Their mission statement and goals: The goal of the Evidence Extraction and Link Discovery (EELD) program is development of technologies and tools for automated discovery, extraction and linking of sparse evidence contained in large amounts of classified and unclassified data sources. EELD is developing detection capabilities to extract relevant data and relationships about people, organizations, and activities from message traffic and open source data. It will link items relating potential terrorist groups or scenarios, and learn patterns of different groups or scenarios to identify new organizations or emerging threats.

My translation: Consider again the "information space" concept. The system proposed here is able to suck in vast quantities of data. What they want to look for is threads of data scattered around the cloud of all available information.

Think about the data leading to the people behind the September 11, 2001 attacks. They bought one-way airplane tickets using cash, an event which could stand out on its own because most people buy round-trip tickets using credit cards. This might be innocent, but when 5 people on the same flight all buy one-way airplane tickets using cash, you begin to wonder. Then, if you notice that on three other flights there are five people on each flight buying one-way airplane tickets using cash, the activity becomes more interesting still. As the next step, you look at who these people are and any prior knowledge of them, and you might find that some of them know each other, some have been enrolled in flight training schools, some have stayed at the same hotels at the same times as others, etc. By the time these pieces of data are tied together, the potential threat level should be pretty high.
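
The first step of the reasoning above (several one-way cash tickets on the same flight) can be sketched as a simple grouping query. This is a minimal illustration, not EELD's actual method: the `Purchase` record, the `suspicious_flights` function, the sample data, and the group size of five are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    """Hypothetical ticket purchase record (fields chosen for illustration)."""
    passenger: str
    flight: str
    one_way: bool
    paid_cash: bool


def suspicious_flights(purchases, min_group=5):
    """Return flights with at least min_group one-way cash purchases."""
    by_flight = defaultdict(set)
    for p in purchases:
        if p.one_way and p.paid_cash:
            by_flight[p.flight].add(p.passenger)
    return {
        flight: sorted(names)
        for flight, names in by_flight.items()
        if len(names) >= min_group
    }


# Five one-way cash buyers on F100; only one on F200.
purchases = [Purchase(f"P{i}", "F100", True, True) for i in range(5)]
purchases += [
    Purchase("Q1", "F200", True, True),
    Purchase("Q2", "F200", False, False),
]
flagged = suspicious_flights(purchases)
```

A single one-way cash purchase is ignored here; only the cluster on one flight is flagged. Linking the flagged passengers to one another (shared hotels, flight schools) would be a further step on top of this.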

FutureMap

Their mission statement and goals: The DARPA FutureMAP (Futures Markets Applied to Prediction) program is a follow-up to a current DARPA SBIR, Electronic Market-Based Decision Support (SB012-012). FutureMAP will concentrate on market-based techniques for avoiding surprise and predicting future events. Strategic decisions depend upon the accurate assessment of the likelihood of future events. This analysis often requires independent contributions by experts in a wide variety of fields, with the resulting difficulty of combining the various opinions into one assessment. Market-based techniques provide a tool for producing these assessments.

There is potential for application of market-based methods to analyses of interest to the DoD. These may include analysis of political stability in regions of the world, prediction of the timing and impact on national security of emerging technologies, analysis of the outcomes of advanced technology programs, or other future events of interest to the DoD. In addition, the rapid reaction of markets to knowledge held by only a few participants may provide an early warning system to avoid surprise.

My translation: The picture shows the throwing of darts at a dartboard, and indeed predicting the future is tricky. But the pattern of darts indicates something interesting:
  • "General Poll" and "Poll of Experts" are way off target.
  • "Analysis Reports" are better, but still wide of the mark.
  • "Delphi Methods" are better still.
  • The "Market Method" is implied to be highly accurate.

Genisys

Their mission statement and goals: Program Genisys is a FY02 new-start program. The Genisys program’s goal is to produce technology enabling ultra-large, all-source information repositories. To predict, track, and preempt terrorist attacks, the U.S. requires a full-coverage database containing all information relevant to identifying: potential foreign terrorists and their possible supporters; their activities; prospective targets; and, their operational plans. Current database technology is clearly insufficient to address this need.

My translation: This is the information storage and retrieval system. It includes distributed databases spread around the world, support for complex queries, and more.

Genoa

Their mission statement and goals: Project Genoa, in the process of concluding, provides the structured argumentation, decision-making and corporate memory to rapidly deal with and adjust to dynamic crisis management.

My translation: Computer-aided decision making: the computer stores, references and links vast quantities of information. "Corporate memory" means storing all the information and actions taken by an organization, and making them available later so that the organization can learn and grow from what happened in the past.

Genoa II

Their mission statement and goals: Genoa II is a FY02 new-start program. It will focus on developing information technology needed by teams of intelligence analysts and operations and policy personnel in attempting to anticipate and preempt terrorist threats to US interests. Genoa II’s goal is to make such teams faster, smarter, and more joint in their day-to-day operations. Genoa II will apply automation to team processes so that more information will be exploited, more hypotheses created and examined, more models built and populated with evidence, and in the larger sense, more crises dealt with simultaneously.

My translation: Bigger and better than Genoa.

The picture shows multiple levels of information query systems and decision trees. That is, if you have a highly complex problem, how do you interpret everything and come up with an answer? For example, say you have a derelict airplane flying across the U.S., not responding to radio calls, just flying. Say, at the same time, there are international tensions. Is the unresponsive airplane a simple accident, or is it related to the international tensions? Is it an innocent act, or an attack? To work this out, you might have to refer to dozens of pieces of data scattered all around, and analyze many possible interpretations of every piece of raw data.

Human ID at a Distance (HumanID)

Their mission statement and goals: The goal of the Human Identification at a Distance (HumanID) program is to develop automated biometric identification technologies to detect, recognize and identify humans at great distances. These technologies will provide critical early warning support for force protection and homeland defense against terrorist, criminal, and other human-based threats, and will prevent or decrease the success rate of such attacks against DoD operational facilities and installations. Methods for fusing biometric technologies into advanced human identification systems will be developed to enable faster, more accurate and unconstrained identification of humans at significant standoff distances.

My translation: They mention "Face Recognition", "Gait Recognition" and "Iris Recognition", and the ability to do this at a distance. The distance is initially 25-150 feet, with a target of a 500-foot range by 2004. They also want a system that can operate 24/7 (all day long, every day).

Identification is going to be based on the person's face, how they walk, and their eyes. Presumably, when someone wants to sneak through an area known to be monitored, they would disguise themselves, but it would be very hard to disguise how they walk, or their eyes.

Translingual Information Detection, Extraction and Summarization (TIDES)

Their mission statement and goals: The Translingual Information Detection, Extraction and Summarization (TIDES) program is developing advanced language processing technology to enable English speakers to find and interpret critical information in multiple languages without requiring knowledge of those languages.

My translation: Two other projects (Babylon and EARS) are involved with language translation. Obviously there are lots of languages in the world, and to understand others you have to understand their language. For the whole system to work, it needs to automatically recognize threats in any language.

Wargaming the Asymmetric Environment (WAE)

Their mission statement and goals: The goal of the Wargaming the Asymmetric Environment (WAE) program is the development and demonstration of predictive technology to better anticipate and act against terrorists. WAE is a revolutionary approach to identify predictive indicators of attacks by and the behavior of specific terrorists by examining their behavior in the broader context of their political, cultural and ideological environment.

My translation: This is the use of mathematical techniques, Wargaming Theory, to predict attacks. The picture mentions "Pre-attack behavior" and "post attack behavior", indicating that a group about to stage an attack will follow some pattern of activities that might tip someone off to the impending attack (if they knew what to look for).

Admiral Poindexter

Yes, this is the same Admiral Poindexter who was National Security Adviser to President Reagan, and was convicted of lying to Congress over the Iran/Contra affair. The convictions were later overturned on a technicality.

Decoding the Terminology

Information Space: This is a mathematically related term referring to the vast amount of available information. Contemplate having kerjillions of pieces of information ("hundreds of millions of documents"), some of which are related to one another, where the information you want may be snippets of data scattered around. It's easy to see this as a "cloud" of information connected in multidimensionally complex ways; the sum total of this information is the "information space".

Queue: A queue is the same thing as a "waiting line", such as you experience when buying tickets at a movie theater. When "waiting in line" you are "in a queue".

"Leave-behind prototypes": This phrase is not entirely clear. It appears that the prototype implementations created by the research organizations will be "left behind" so that the contracting agencies can continue to use the technology. For example "...as well as develop a series of increasingly powerful leave-behind prototypes that both provide immediate value to the Intelligence Community and stimulate feedback to guide follow-on research".

Data model: When designing a computer system, part of what you design is the information which the system processes. Computer designs always "abstract" or "model" the processed information, because computers cannot directly process the real world. The phrase "data model" refers to the abstract form information takes when inside a computer.
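
As a small illustration of the idea, here is a sketch of a data model for a telephone call. The `PhoneCall` record and its fields are hypothetical, chosen only for this example: the real event involves people, voices and equipment, while the model keeps just the few facts a program needs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PhoneCall:
    """Abstract form of a real-world call: only the facts the system keeps."""
    caller: str   # calling number
    callee: str   # number that was called
    day: date     # date the call took place
    minutes: int  # duration, rounded to whole minutes


# One real-world call, reduced to its modeled form (made-up sample values).
call = PhoneCall(caller="555-0100", callee="555-0199", day=date(2002, 11, 15), minutes=12)
```

Everything the model leaves out (what was said, who else was in the room) is invisible to any program built on it, which is why the choice of data model matters.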