A Review of Pedestrian Information Retrieval Research
Abstract
Pedestrian information search (PIS) has attracted attention for its wide range of practical applications. The main objective of PIS is to retrieve a target pedestrian from a set of scene images or videos given a query. Early work on PIS focused on image-based search; with the advent of deep neural networks, PIS is no longer restricted to image queries, and other modalities such as text can serve as the search source. A systematic study of PIS is therefore warranted. In this paper, we review PIS research across different modalities, covering the origin of the PIS task, its development history, and the methods used to train and evaluate PIS models. We select the better-performing models for experiments, then summarize and comparatively evaluate the experimental results. Finally, we discuss open problems in PIS and promising directions for future research.
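As context for the evaluation methods discussed above, retrieval-style tasks such as PIS are conventionally scored with Rank-k accuracy and mean average precision (mAP). The sketch below is a minimal, hedged illustration of those two metrics; the function name `rank_k_and_map` and its array inputs are our own assumptions, not an interface from any of the reviewed works.

```python
import numpy as np

def rank_k_and_map(similarity, query_ids, gallery_ids, k=1):
    """Compute Rank-k accuracy and mean average precision (mAP)
    from a query-by-gallery similarity matrix.

    similarity  : (num_query, num_gallery) score matrix
    query_ids   : identity label of each query
    gallery_ids : identity label of each gallery item
    Assumes every query has at least one true match in the gallery.
    """
    num_query = similarity.shape[0]
    hits, aps = 0, []
    for q in range(num_query):
        order = np.argsort(-similarity[q])            # gallery ranked by descending score
        matches = gallery_ids[order] == query_ids[q]  # boolean relevance of each rank
        if matches[:k].any():                         # Rank-k: any true match in top k
            hits += 1
        ranks = np.flatnonzero(matches) + 1           # 1-based positions of true matches
        precisions = np.arange(1, len(ranks) + 1) / ranks
        aps.append(precisions.mean())                 # average precision for this query
    return hits / num_query, float(np.mean(aps))
```

For example, with two queries whose highest-scored gallery items carry the correct identity, both metrics come out at 1.0; a misranked match lowers mAP even when Rank-k still registers a hit at a larger k.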
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.