Scamseek - The Implications of Text Mining Technology

Scamseek is the result of a 15-month collaborative project between the Australian Government, the University of Sydney and Macquarie University. It is an industrially viable system that retrieves potential internet scams and classifies them according to risk. The system will be used by the Australian Securities and Investments Commission (ASIC) to detect scams more efficiently, which should lead to speedier convictions of scammers and thus savings for potential victims.

During the 15-month trial, Scamseek correctly identified potential scam internet sites and documents 80% of the time. The trial also resulted in the quick detection and conviction of one scammer. Based on the trial, and on the fact that the system operates 24/7, ASIC estimates that the savings in human effort spent monitoring internet scams will be on the order of 100 to 1.

The technology driving Scamseek uses a semantic model of language, Systemic Functional Grammar, to model the semantics of the scammers. The team building Scamseek consisted of 3 linguists, 2 computational linguists and 3 software engineers. The lead researcher, Professor Jon Patrick, drew on his work as a psychotherapist in developing Scamseek, approaching language from a psychotherapy perspective. "Scams are about camouflaging motivation through language, which is similar to patients disguising their issues in psychotherapy sessions. We looked at what strategies and language constructs are used to disguise motivation."

Scamseek has shown that Systemic Functional Grammar can be used to identify documents or information on the internet that fit carefully crafted grammatical models. The implications for expanded use of this technology are vast. Related uses could include the tracking of terrorist activity and the identification and tracking of crackers and those involved in internet crime and fraud.
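To make the idea of matching documents against crafted language models and ranking them by risk a little more concrete, here is a deliberately simplified sketch. It is not how Scamseek works: it swaps Systemic Functional Grammar for plain regular-expression cues, and the cue names, patterns, weights and thresholds are all invented for illustration.

```python
import re

# Hypothetical cue patterns and weights, invented for this illustration.
# A real system such as Scamseek models meaning far more deeply than this.
CUES = {
    "guaranteed_return": (re.compile(r"\bguaranteed\b.*\b(return|profit)s?\b", re.I), 3),
    "urgency": (re.compile(r"\b(act now|limited time|today only)\b", re.I), 2),
    "no_risk": (re.compile(r"\b(risk[- ]free|no risk)\b", re.I), 3),
    "secrecy": (re.compile(r"\b(secret|insider)\b", re.I), 1),
}


def risk_score(text):
    """Return (total weight, names of the cues that matched) for one document."""
    matched = [name for name, (pattern, _) in CUES.items() if pattern.search(text)]
    score = sum(CUES[name][1] for name in matched)
    return score, matched


def risk_band(score):
    """Map a numeric score to a coarse risk band (thresholds are assumptions)."""
    if score >= 5:
        return "high"
    if score >= 2:
        return "medium"
    return "low"


if __name__ == "__main__":
    sample = "Guaranteed returns of 40%! Act now, this risk-free offer ends soon."
    score, cues = risk_score(sample)
    print(risk_band(score), score, cues)
```

Even this toy version hints at why the approach cuts both ways: the same kind of pattern list could just as easily be pointed at personal details as at scam language.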

But the technology could also be used to mine personal details from the internet. That could assist fraudsters in identity theft, particularly against those who maintain an active internet presence. It could also help spammers and fraudsters identify key victims for their scams. This could result in the ultimate marketing tool!

As with many of our technological developments, the benefit to society is counterbalanced by the potential for misuse. The technology behind Scamseek will filter through to mass use in time. Knowing the technology and better understanding its capabilities and limitations will assist all net users in maintaining a level of personal and privacy protection. Any additional information or comments about the Scamseek technology would be appreciated.

Scamseek Official Website: http://www.cs.usyd.edu.au/~lkmrl/scamseek.htm

Submitted by: Mako
