A real-time two-input stream multi-column multi-stage convolution neural network (TIS-MCMS-CNN) for efficient crowd congestion-level analysis

Tripathi, S.K.; Srivastava, R.

dc.contributor.author	Tripathi, S.K.
dc.contributor.author	Srivastava, R.
dc.date.accessioned	2020-11-20T06:30:54Z
dc.date.available	2020-11-20T06:30:54Z
dc.date.issued	2020-10-01
dc.identifier.issn	09424962
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/953
dc.description.abstract	Crowd congestion-level analysis (CCA) is one of the most important tasks in crowd analysis and helps to control crowd disasters. Existing state-of-the-art approaches utilize either spatial features or spatial–temporal texture features to implement CCA. State-of-the-art deep-learning approaches utilize a single-column convolutional neural network (CNN) to extract deep spatial features to solve the objective function, and they perform better than traditional approaches. However, performance still needs improvement, as these models cannot capture features invariant to perspective change. The proposed work is mainly based on two intuitions. First, both deep spatial and temporal features are required to improve the performance of the model. Second, a multi-column CNN with different kernel sizes is capable of capturing features invariant to perspective and scene change. Based on these intuitions, we propose a two-input stream multi-column multi-stage CNN with parallel end-to-end training for CCA. The two streams extract spatial and temporal features from the scene, and a fusion layer then combines them to enhance the discriminative power of the model. We conducted experiments on the publicly available PETS-2009, UCSD, and UMN datasets. We manually annotated 22K frames, assigning each to one of five crowd congestion levels: Very Low, Low, Medium, High, and Very High. The proposed model achieves accuracies of 96.97%, 97.21%, 98.52%, 98.55%, and 97.01% on PETS-2009, UCSD-Ped1, UCSD-Ped2, UMN-Plaza1, and UMN-Plaza2, respectively. The model processes nearly 30 test frames per second and is therefore suitable for real-time applications. The proposed model outperforms some of the existing state-of-the-art techniques. © 2020, Springer-Verlag GmbH Germany, part of Springer Nature.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Springer	en_US
dc.relation.ispartofseries	Multimedia Systems;Vol. 26 Issue 5
dc.subject	Crowd analysis	en_US
dc.subject	Crowd congestion-level classification	en_US
dc.subject	Multi-stage multi-column CNN	en_US
dc.subject	Feature fusion	en_US
dc.title	A real-time two-input stream multi-column multi-stage convolution neural network (TIS-MCMS-CNN) for efficient crowd congestion-level analysis	en_US
dc.type	Article	en_US