Following the successful event run during ICARIS 2009, this year's conference will host two informal industry-sponsored competitions in which AIS practitioners can try out their algorithms on real-world application data.

The competitions are based on anomaly detection and data mining. Data will be made publicly available for preliminary results and fine-tuning. Additional test data will be made available at the conference. Participants may be expected to give a short, informal presentation on their methods.

Please see the descriptions of individual competitions for details on registration and accessing data.

DSTL: Anomaly Detection in Mass Spectra

Organiser: Mark Neal, University of Aberystwyth

The rapid, non-invasive detection of low levels of unusual substances in everyday environments is an extremely useful capability in many security and safety applications. Examples include:

Such detection is currently carried out by sniffer dogs, which are extremely sensitive but suffer from a number of drawbacks. Dogs are difficult, time-consuming and expensive to train. They also tire and become distracted fairly rapidly when in use (on average they are only useful for about 40 minutes at a time). They are vulnerable to deliberate distraction and are difficult to keep fed, watered and exercised in many deployment scenarios. Sensitive monitoring equipment such as mass spectrometers is ultimately seen as a way of reducing the reliance on sniffer dogs, and a robotic equivalent would be extremely valuable in a range of applications.

This workshop offers the opportunity to test and compare machine-learning systems on real-world data of varying degrees of complexity and varying levels of adulteration with anomalous substances. The test and competition data have been captured using two different instruments: a highly sensitive time-of-flight proton transfer mass spectrometer and a slightly less sensitive (but more portable) quadrupole proton transfer mass spectrometer. The test data, which are available from the FTP server, are provided in compressed ASCII format. Each data file is accompanied by a description of when anomalous substances were introduced, what they were and how “large” each anomaly is expected to be.

The challenge is to build an immune-inspired system that detects as many of the anomalies as possible with a minimum of false positives, and that can run indefinitely with as little human intervention as possible. The ability to both detect and classify anomalies that have been seen before is desirable, and some attention should be paid to how such information is presented to the user of the system. For the purposes of this competition it can be assumed that data will be available from the mass spectrometer at a rate of between 1 and 5 Hz.
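As one possible starting point, the sketch below shows negative selection, a well-known immune-inspired technique, in a deliberately tiny form. It is not the organisers' method and makes several assumptions: each sample is reduced to two hypothetical normalised channel intensities in the range 0 to 1, candidate detectors come from a regular grid rather than being generated at random (purely so the example is deterministic), and the names `train_detectors`, `is_anomalous` and `RADIUS` are illustrative only.

```python
import itertools
import math


def train_detectors(self_samples, radius, steps=10):
    """Keep the grid candidates that match no 'self' (normal) sample."""
    dim = len(self_samples[0])
    axis = [i / steps for i in range(steps + 1)]
    detectors = []
    for candidate in itertools.product(axis, repeat=dim):
        # Censoring step: discard any candidate lying within `radius`
        # of a normal sample, so detectors only cover "non-self" space.
        if all(math.dist(candidate, s) > radius for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_anomalous(sample, detectors, radius):
    """Flag a sample when at least one surviving detector matches it."""
    return any(math.dist(sample, d) <= radius for d in detectors)


# Made-up "normal background" readings (assumption: two normalised channels).
normal = [(0.20, 0.20), (0.25, 0.20), (0.20, 0.30)]
RADIUS = 0.15
detectors = train_detectors(normal, RADIUS)

print(is_anomalous((0.22, 0.21), detectors, RADIUS))
print(is_anomalous((0.90, 0.90), detectors, RADIUS))
```

A grid of candidates does not scale to the number of channels in a real mass spectrum, and this sketch does nothing about instrument drift or classification of previously seen anomalies; a competition entry would need to address both.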

Example scenario

You are asked to imagine your system deployed on a number of sensors distributed around a city, with the role of detecting dangerous substances that have been released into the air either deliberately or accidentally. Your system must deal with natural diurnal variations in air quality as well as potential variations caused by road traffic emissions, passersby, weather events and any number of other “background” phenomena. Unfortunately, the mass spectrometer itself will also gradually change its properties over time, leading to apparent shifts in the mass spectra. Your system will have access to data sets gathered over extended periods of “normal” background without any anomalous events, although these data are very unlikely to cover all of the possible “normal” background states of the environment. The job of the immune-inspired system is to alert the authorities to any anomalous events that are detected, in such a way as to allow them to respond appropriately to each event. Information about the nature of the anomaly will therefore be very useful, in particular information that can give answers (or partial answers) to questions such as those listed below.

Registration and Data

Please register your interest in taking part by email so that we have a rough idea of the number of participants. Contact Mark Neal for more information about this specific competition.

The training data and instructions can be downloaded here.

UReason: Rationalizing, Analyzing and Visualizing Alarms

Martin Robbins, UReason

The Problem

In a major plant, whether it’s an oil refinery, water pumping station or nuclear power plant, thousands of operations may be happening per minute. When something goes wrong, alarms are generated, and as a problem escalates the alarms may begin to cascade, swamping operators with information.

UReason is a company that specializes in trying to deal with what happens when things go wrong in industry. We build software to monitor complex systems through a variety of sensors and data streams, rationalize alarms, and predict beforehand what’s likely to happen next.

In this competition, we’re looking to find ways of analysing alarm data, to find interesting or useful patterns, and to see if alarms can be predicted ahead of time.

The Data

A typical plant will produce data for many months or years at a time, resulting in many gigabytes of data. Alarm rates will likely vary, with a scale-free distribution ranging from a few per minute to many hundreds per minute. Alarm data is broadly kept in the following form:

Alarm { Start Time | End Time | Acknowledged Time | Priority (1-5) }

The four variables are fairly self-explanatory – they record the time that the alarm started, the time it ended, the time that an operator acknowledged they had seen it, and the priority, on a scale from 1 (low) to 5 (run as fast as you can).
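For participants who want to start experimenting before the data is released, the record described above could be modelled roughly as follows. This is only a sketch: the actual file format has not been published here, so the field names, the use of `None` for an alarm that has not yet ended or been acknowledged, and the sample timestamps are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Alarm:
    start: datetime
    end: Optional[datetime]           # assumption: None while still active
    acknowledged: Optional[datetime]  # assumption: None if never acknowledged
    priority: int                     # 1 (low) to 5 (highest)

    def __post_init__(self):
        if not 1 <= self.priority <= 5:
            raise ValueError("priority must be between 1 and 5")

    def time_to_acknowledge(self) -> Optional[timedelta]:
        """Delay between the alarm starting and an operator acknowledging it."""
        if self.acknowledged is None:
            return None
        return self.acknowledged - self.start


def alarms_per_minute(alarms):
    """Count how many alarms start within each calendar minute."""
    counts = Counter(a.start.replace(second=0, microsecond=0) for a in alarms)
    return dict(counts)


# Invented sample records, for illustration only.
t0 = datetime(2010, 3, 1, 12, 0, 0)
sample = [
    Alarm(t0, t0 + timedelta(seconds=90), t0 + timedelta(seconds=20), 2),
    Alarm(t0 + timedelta(seconds=30), t0 + timedelta(seconds=45), None, 5),
    Alarm(t0 + timedelta(seconds=70), None, None, 1),
]
rates = alarms_per_minute(sample)
```

Per-minute counts like `rates` are one simple way to look at the variation in alarm rate mentioned earlier; acknowledgement delays per priority level are another obvious quantity to explore.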

What We Want

Sample Data

Data from a real plant sourced from an industry partner will be made available by early March. Please register with us (see below) to make sure that you get the data as soon as it’s available.


The winner and runner-up will receive a trophy, along with a modest prize.

Those with solutions we find useful or interesting will be invited to discuss possible commercial applications with us.

Registration and Data

Please register your interest in taking part by email so that we have a rough idea of the number of participants. To be kept updated with data releases and information, or if you have any further questions, please contact us by e-mail.

The training data and instructions can be downloaded here.