This project addresses the real-time detection of threats in video streams from crowded soft-target environments. Video analysis is often the first line of defense in such environments because a spatially distributed network of cameras covers a wide area at both far and close range. Anomalous behavior or events in the video may clearly signify a threat (e.g., a fight breaking out) or indicate something out of place that should be pursued via human intervention or another sensor modality (e.g., a line of people moving the wrong way through a crowd). This project will attack the video analysis problem at several levels. First, we will use both models and observations of the environment and of human and vehicle behavior to dynamically determine the tasking and deployment of cameras that best detects threats. Second, we will use long-term video collected over days to months of observation to determine what behavior is natural and expected for a given hour, day, and location. This model will allow the detection of anomalous behaviors in real time, even in the presence of dense crowds. Third, we will build models for the classification of individual actions and behaviors to detect anomalies and threats, with an emphasis on models that are both fair and explainable.
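To make the second level concrete, the following is a minimal sketch of a baseline-and-deviation detector, not the project's actual method. It assumes each video segment has already been reduced to a single scalar activity measure (for example, an estimated person count), and it swaps in a simple z-score test in place of whatever learned model the project develops. The class name `BaselineModel`, its parameters `min_samples` and `threshold`, and the location label are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, stdev


class BaselineModel:
    """Hypothetical per-(location, weekday, hour) baseline of a scalar activity measure."""

    def __init__(self, min_samples=5, threshold=3.0):
        self.history = defaultdict(list)  # context key -> past observations
        self.min_samples = min_samples    # assumed minimum history before flagging
        self.threshold = threshold        # assumed z-score cutoff

    @staticmethod
    def _key(location, timestamp):
        # Context is the camera location, day of week, and hour of day.
        return (location, timestamp.weekday(), timestamp.hour)

    def observe(self, location, timestamp, value):
        """Add one historical observation to the baseline for its context."""
        self.history[self._key(location, timestamp)].append(value)

    def is_anomalous(self, location, timestamp, value):
        """Return True if value deviates strongly from the baseline for its context."""
        samples = self.history.get(self._key(location, timestamp), [])
        if len(samples) < self.min_samples:
            return False  # too little history to judge
        mu = mean(samples)
        sigma = stdev(samples)
        if sigma == 0:
            return value != mu
        return abs(value - mu) / sigma > self.threshold


# Usage: eight consecutive Mondays at 08:30 for one (made-up) location.
model = BaselineModel()
start = datetime(2024, 1, 1, 8, 30)  # 2024-01-01 is a Monday
for week, count in enumerate([40, 42, 38, 41, 39, 43, 40, 37]):
    model.observe("concourse_a", start + timedelta(weeks=week), count)

next_monday = start + timedelta(weeks=8)
typical = model.is_anomalous("concourse_a", next_monday, 41)
unusual = model.is_anomalous("concourse_a", next_monday, 120)
```

Keying the baseline on location, weekday, and hour mirrors the idea in the text that what counts as expected depends on where and when it is observed. A real system would need richer features than a single count and a model that copes with dense crowds.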

Academic computer vision research generally relies on datasets containing videos that are (1) not representative of soft targets or crowded DHS environments of interest, and (2) quite short (at most a few minutes long). Research that addresses a large system of surveillance cameras observing a wide-area environment is also rare. The proposed project goes beyond typical computer vision research by focusing from the outset on long-time-scale data from wide-area multi-camera surveillance networks, as would be common in the soft-target environments of interest to SENTRY (e.g., schools, airports, subway stations, sports stadia). The proposed research pushes the technical state of the art by developing models and algorithms that are robust and accurate, explicitly model uncertainty, explain the reasoning behind their decisions in a way human end-users can understand, operate in highly crowded environments, and are fair and unbiased. Together, these properties will enable proactive interdiction by the Virtual Sentry rather than only after-the-fact response.