AI-driven aggressive action detection: Taking the fight against street crimes to the next level using human to object interaction

Written by Dr Vishnu Monn, School of Information Technology

Video surveillance is fundamentally a passive system: recorded or archived footage is used primarily as evidence of criminal activity that has already taken place. To improve deterrence against crimes such as snatch theft and robbery, autonomous or active surveillance represents a viable alternative to passive systems. An active surveillance system analyses the content of the video feed in real time to recognise spontaneous and suspicious behaviours that may indicate criminal intent. In this study, we formulate, model, and implement a human-to-object interaction system using deep neural networks. It enables software to autonomously detect, in real time, the presence of a person wielding a weapon in a surveillance camera feed, and thus constitutes an AI-based aggressive action detection system.

Achieving a reliable human-to-object interaction model for an aggressive action detection system is by no means free of challenges. Studies show that approximately 22% to 36% of robberies or homicides in the United States alone are associated with firearms. A similar trend is observed globally with the use of other forms of weapons. Because weapons feature so prominently in these crimes, modelling the relationship between the object (i.e., the weapon) and its subject (i.e., the person or persons) is critical to accurately identifying a person-to-weapon interaction and filtering out false classifications of actions.
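To make the idea of relating an object to its subject concrete, the sketch below pairs each detected weapon with the person whose bounding box contains most of it. This is only an illustrative geometric heuristic, not the deep neural network model developed in this study. The function names, the box format `(x1, y1, x2, y2)`, and the `min_overlap` threshold are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def overlap_fraction(weapon: Box, person: Box) -> float:
    """Return the fraction of the weapon box area that lies inside the person box."""
    ix1, iy1 = max(weapon[0], person[0]), max(weapon[1], person[1])
    ix2, iy2 = min(weapon[2], person[2]), min(weapon[3], person[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = (weapon[2] - weapon[0]) * (weapon[3] - weapon[1])
    return inter / area if area > 0 else 0.0


def pair_weapons_with_people(
    people: List[Box], weapons: List[Box], min_overlap: float = 0.5
) -> List[Tuple[int, int, float]]:
    """Assign each weapon to the person box that covers most of it.

    Returns (person_index, weapon_index, overlap) tuples. A weapon whose best
    overlap is below min_overlap (a hypothetical threshold) is left unpaired,
    so a weapon-like object far from anyone is not reported as an interaction.
    """
    pairs = []
    for w_idx, weapon in enumerate(weapons):
        best_idx, best = -1, 0.0
        for p_idx, person in enumerate(people):
            frac = overlap_fraction(weapon, person)
            if frac > best:
                best_idx, best = p_idx, frac
        if best_idx >= 0 and best >= min_overlap:
            pairs.append((best_idx, w_idx, best))
    return pairs
```

A heuristic like this shows why the subject matters: a weapon detection with no nearby person produces no pair and can be treated as a likely false alarm, whereas a learned interaction model would be expected to handle harder cases such as occlusion or crowded scenes.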

In stage one, we started with the Monash Automatic Gun Detection System (MAGTS), which won Gold at Malaysia's 31st International Invention, Innovation & Technology Exhibition (ITEX 2020). It is a computer vision-based object detection model that can accurately detect handguns in surveillance camera footage in real time. This motivated us to address the second challenge of this study: formulating a human-to-weapon interaction model. Such a model would enable reliable real-time detection of aggressive human actions. It could re-envision how AI is used to strengthen law enforcement and further deter criminal activity, enhancing public safety through an active video surveillance framework.

This study is funded by the Malaysia Fundamental Research Grant Scheme (FRGS) and the Monash University Malaysia Advanced Engineering Platform Cybersecurity AI (ψ^2) cluster. We have published part of this research at the 32nd British Machine Vision Conference (BMVC 2021), and a full-length journal manuscript is currently under review. The product also won Gold at Malaysia's 32nd International Invention, Innovation & Technology Exhibition (ITEX 2021). We will continue fine-tuning and optimising the model to increase its reliability in more dynamic video surveillance environments. Monash University Malaysia plays a pivotal role in supporting us towards realising these outcomes, including by providing the compute facilities crucial for designing and implementing the AI model. The hard work and conscientiousness of Marcus Lim Jun Yi (PhD postgraduate student) and Arthur Lam Jian Shun (Honours student) have made the research results possible.