xNet+SC: Classifying Places Based on Images by Incorporating Spatial Contexts

Authors Bo Yan, Krzysztof Janowicz, Gengchen Mai, Rui Zhu


Author Details

Bo Yan
  • STKO Lab, University of California, Santa Barbara, USA
Krzysztof Janowicz
  • STKO Lab, University of California, Santa Barbara, USA
Gengchen Mai
  • STKO Lab, University of California, Santa Barbara, USA
Rui Zhu
  • STKO Lab, University of California, Santa Barbara, USA

Cite As

Bo Yan, Krzysztof Janowicz, Gengchen Mai, and Rui Zhu. xNet+SC: Classifying Places Based on Images by Incorporating Spatial Contexts. In 10th International Conference on Geographic Information Science (GIScience 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 114, pp. 17:1-17:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Abstract

With recent advancements in deep convolutional neural networks, researchers in geographic information science have gained access to powerful models for addressing challenging problems such as extracting objects from satellite imagery. However, because the underlying techniques are essentially borrowed from other research fields, e.g., computer vision or machine translation, they are often not spatially explicit. In this paper, we demonstrate how utilizing the rich information embedded in spatial contexts (SC) can substantially improve the classification of place types from images of their facades and interiors. By experimenting with different types of spatial contexts, namely spatial relatedness, spatial co-location, and spatial sequence pattern, we improve the accuracy of state-of-the-art models such as ResNet, which are known to outperform humans on the ImageNet dataset, by over 40%. Our study raises awareness of the value of leveraging spatial contexts, and domain knowledge in general, in advancing deep learning models, thereby also demonstrating that theory-driven and data-driven approaches are mutually beneficial.
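
The general idea of letting spatial context adjust an image-based prediction can be illustrated with a small Bayesian re-weighting sketch. This is not the method from the paper: the function name `rerank_with_spatial_prior`, the place types, and all probabilities are hypothetical and chosen only for illustration. It assumes an image classifier has already produced per-type probabilities and that a prior over place types has been estimated from nearby places.

```python
def rerank_with_spatial_prior(image_probs, spatial_prior, eps=1e-9):
    """Multiply image-based probabilities by a spatial prior and renormalize.

    image_probs: dict mapping place type -> probability from an image classifier.
    spatial_prior: dict mapping place type -> prior derived from nearby places.
    eps: small fallback prior for types missing from spatial_prior (assumption).
    """
    scores = {t: p * spatial_prior.get(t, eps) for t, p in image_probs.items()}
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}


# Hypothetical numbers, not taken from the paper.
image_probs = {"restaurant": 0.40, "bar": 0.35, "library": 0.25}
spatial_prior = {"restaurant": 0.30, "bar": 0.60, "library": 0.10}

posterior = rerank_with_spatial_prior(image_probs, spatial_prior)
image_only_best = max(image_probs, key=image_probs.get)
best = max(posterior, key=posterior.get)
```

In this toy setting the image alone slightly favors "restaurant", while the combined score favors "bar" because the assumed neighborhood prior makes that type more likely.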

Subject Classification

ACM Subject Classification
  • Computing methodologies → Computer vision tasks
  • Computing methodologies → Neural networks
  • Theory of computation → Bayesian analysis
Keywords
  • Spatial context
  • Image classification
  • Place types
  • Convolutional neural network
  • Recurrent neural network



