Friday, June 3, 2005

Another brick in building the Total Information Awareness system

The Total Information Awareness system was plainly a plan to create Big Brother. While the project ran into political difficulties and was "canceled", the individual pieces have evidently continued to be developed. I am tracking them in this thread on my blog.

Forging an anti-terrorism search tool (Published: June 2, 2005, 5:52 PM PDT, By Stefanie Olsen, Staff Writer, CNET News.com)

This article discusses a project that fills in part of the Total Information Awareness picture.

The technology, released as a prototype in recent weeks, is designed to mine a corpus of documents for associated ideas or connections--connections between two unrelated concepts, for example, that would otherwise go unseen or would take countless hours of investigative work to discover. The project was specifically funded for anti-terrorism efforts and initially was used for searching over data within the 9/11 Commission report and public Web pages related to the suicide bombings carried out by terrorists who hijacked three U.S. commercial planes.

"Say you have the kind of question that connects these two people that we don't know about. You could start reading through all those documents. But our system is designed to look specifically for those evidence trails" that connect those two people, said Rohini Srihari, UB professor of computer science and engineering.

The research is being conducted at the Center of Excellence for Document Analysis and Recognition (CEDAR) at the University at Buffalo.
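To make the "evidence trail" idea concrete, here is a minimal sketch of how such a search could work. It is only an illustration, not the UB system's actual method, which the article does not describe in detail. The sketch assumes the documents have already been reduced to lists of the names they mention, links any two names that appear in the same document, and then uses a breadth-first search to find the shortest chain of documents between two people. The function names and the sample data are made up.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Link two entities whenever they are mentioned in the same document.

    `docs` maps a document id to the list of entity names found in it
    (entity extraction itself is assumed to have happened already).
    """
    graph = defaultdict(dict)
    for doc_id, entities in docs.items():
        for a, b in combinations(sorted(set(entities)), 2):
            # Remember the first document that links each pair.
            graph[a].setdefault(b, doc_id)
            graph[b].setdefault(a, doc_id)
    return graph


def evidence_trail(graph, start, goal):
    """Breadth-first search for a shortest chain of documents linking two entities.

    Returns a list of (entity, document, entity) steps, or None if no chain exists.
    """
    if start == goal:
        return []
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for neighbor in graph.get(node, {}):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    if goal not in parent:
        return None
    # Walk back from the goal to the start, collecting the linking documents.
    trail = []
    node = goal
    while parent[node] is not None:
        prev = parent[node]
        trail.append((prev, graph[prev][node], node))
        node = prev
    trail.reverse()
    return trail


# Hypothetical sample data: three documents and the names each one mentions.
docs = {
    "doc1": ["Alice", "Bob"],
    "doc2": ["Bob", "Carol"],
    "doc3": ["Carol", "Dave"],
}
graph = build_cooccurrence_graph(docs)
trail = evidence_trail(graph, "Alice", "Dave")
for left, doc_id, right in trail:
    print(f"{left} and {right} both appear in {doc_id}")
```

Note that nothing in the search cares who the names belong to or why someone is asking. It will find a trail between any two people who turn up in the documents it is given.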

CEDAR Overview

A wide variety of documents are encountered by each of us everyday. They cover all spheres of our lives including commerce, education, law, health, religion, music and entertainment. Some of these documents have a simple and predictable structure such as a page in a printed book. Others have much more complex structure such as those involving figures, tables, logos, signatures, handwriting, etc. Discovering methods and algorithms for analyzing the structure and content of complex documents, and their generalization to related domains, is the focus of research at CEDAR.

From a computer science research and education perspective the work spans the areas of pattern recognition, machine learning, information retrieval and computational linguistics. There is a continuum from the analysis of scanned document images to areas such as text mining and information retrieval. There is also a continuum from analyzing documents for forensic purposes to the analysis of fingerprint images and other areas of biometrics.

Having received sustained funding from the United States Postal Service (USPS) since 1984 CEDAR was designated the Center of Excellence for Document Analysis and Recognition (CEDAR) by the USPS in 1991. Today, while postal-related document analysis problems continue to be of interest at CEDAR, a number of projects have emerged with other sponsors.

CEDAR has been a conduit for scholarly collaboration as well as individual accomplishment for many scientists and students. Over a period of 20 years faculty, students and research scientists at CEDAR have published more than a thousand scientific papers. CEDAR has supported more than 500 students, resulting in the award of several hundred master's degrees and more than 40 doctoral degrees.

They've already developed and fielded a few technologies:

  • Name and Address Block Reader
  • Handwritten Address Interpretation
  • Handwriting Verification and Recognition

All of these would be useful in the automated Big Brother picture that the TIA team put together.

Remember that Total Information Awareness was the original name, and that the program was renamed Terrorism Information Awareness in 2003, after the September 11, 2001 attacks had made anti-terrorism a priority. However, just because they say they're focusing on "terrorism" doesn't mean the system can't be used for other purposes. Information is information, and once you have this kind of data mining system you can use it to search for anything.

For example, I see a trend in the public discussion that says "Fundamental Christianity is the only valid spiritual path". That idea is fine until it becomes enshrined in the Government and the Government starts enforcing it. Given that government leaders are voicing that idea more and more frequently, it's quite possible they'll begin using Government agencies to enforce it. So, would we find this system employed to find connections among people in the "new age" movement?