Abstract
Social scientists have shown that visual perceptions of urban attributes, such as how safe, wealthy, or beautiful a city appears, are highly correlated with residents' behaviors and quality of life. Despite their significance, measuring visual perceptions of urban attributes is challenging for three reasons: (1) visual perceptions are subjective and comparative rather than absolute; (2) perception comparisons between image pairs are usually conducted region by region and depend strongly on the specific urban attribute; and (3) urban attributes carry both shared and attribute-specific information. To address these problems, in this article we present a Deep inteRActive Multi-task leArning scheme, DRAMA for short. DRAMA comparatively quantifies the perceptions of urban attributes by jointly integrating pairwise comparisons, regional interactions, and urban attribute correlations within a unified deep scheme. In DRAMA, each urban attribute is treated as a task, so that both task-sharing and task-specific information are fully explored. Extensive experiments on a public large-scale benchmark dataset demonstrate that the proposed DRAMA scheme outperforms several state-of-the-art baselines. We also apply the pairwise comparisons produced by the DRAMA model to further quantify the urban attributes and hence rank cities with respect to the given attributes. In addition, we have released the code and parameter settings to facilitate further research.
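As a rough illustration of the idea of learning from pairwise comparisons with one task per urban attribute, the following minimal sketch combines a shared weight vector with attribute-specific weight vectors and scores each image pair with a logistic (RankNet-style) loss. It is not the DRAMA architecture: the linear scorer, the logistic loss, the feature vectors, and all names (`score`, `pairwise_loss`, `multitask_loss`, `shared_w`, `task_ws`) are assumptions made for illustration only, and regional interactions are not modeled here.

```python
import math

def score(features, shared_w, task_w):
    """Linear score from shared plus task-specific weights (hypothetical scorer)."""
    return sum(f * (s + t) for f, s, t in zip(features, shared_w, task_w))

def pairwise_loss(score_a, score_b, a_wins):
    """Logistic loss for one comparison; a_wins is 1 if image A was preferred, else 0."""
    p_a = 1.0 / (1.0 + math.exp(-(score_a - score_b)))  # modeled P(A preferred over B)
    p = p_a if a_wins else 1.0 - p_a
    return -math.log(p)

def multitask_loss(comparisons, shared_w, task_ws):
    """Average pairwise loss over (attribute, feat_a, feat_b, a_wins) tuples."""
    total = 0.0
    for attr, feat_a, feat_b, a_wins in comparisons:
        s_a = score(feat_a, shared_w, task_ws[attr])
        s_b = score(feat_b, shared_w, task_ws[attr])
        total += pairwise_loss(s_a, s_b, a_wins)
    return total / len(comparisons)

# Toy data: two attributes (tasks), two-dimensional made-up image features.
shared_w = [0.5, -0.2]
task_ws = {"safe": [0.1, 0.0], "wealthy": [0.0, 0.3]}
comparisons = [
    ("safe", [1.0, 0.0], [0.0, 1.0], 1),
    ("wealthy", [0.0, 1.0], [1.0, 0.0], 0),
]
loss = multitask_loss(comparisons, shared_w, task_ws)
```

In this sketch the shared weights receive a training signal from every comparison, while each attribute's own weights are touched only by comparisons made for that attribute, which is one simple way to separate task-sharing from task-specific information.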
Index Terms
- Urban Perception: Sensing Cities via a Deep Interactive Multi-task Learning Framework