Improving Discovery of Open Civic Data

Authors: Sara Lafia, Andrew Turner, Werner Kuhn

Author Details

Sara Lafia
  • Department of Geography, University of California, Santa Barbara, USA
Andrew Turner
  • Esri DC, Office of Research and Development, Arlington, VA, USA
Werner Kuhn
  • Department of Geography, University of California, Santa Barbara, USA

Cite As

Sara Lafia, Andrew Turner, and Werner Kuhn. Improving Discovery of Open Civic Data. In 10th International Conference on Geographic Information Science (GIScience 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 114, pp. 9:1-9:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Abstract

We describe a method and system design for improved data discovery in an integrated network of open geospatial data that supports collaborative policy development between governments and local constituents. Metadata about civic data (such as thematic categories, user-generated tags, geo-references, or attribute schemata) primarily rely on technical vocabularies that reflect scientific or organizational hierarchies. By contrast, public consumers of data often search for information using colloquial terminology that does not align with official metadata vocabularies. For example, citizens searching for data about bicycle collisions in an area are unlikely to use the search terms with which organizations like Departments of Transportation describe relevant data. Users may also search with broad terms, such as "traffic safety", and will then not discover data tagged with narrower official terms, such as "vehicular crash". This mismatch raises the question of how to bridge the users' ways of talking and searching with the language of technical metadata. In similar situations, it has been beneficial to augment official metadata with semantic annotations that expand the discoverability and relevance recommendations of data, supporting more inclusive access. Adopting this strategy, we develop a method for automated semantic annotation, which aggregates similar thematic and geographic information. A novelty of our approach is the development and application of a crosscutting base vocabulary that supports the description of geospatial themes. The resulting annotation method is integrated into a novel open access collaboration platform (Esri's ArcGIS Hub) that supports public dissemination of civic data and is in use by thousands of government agencies.
Our semantic annotation method improves data discovery for users across organizational repositories and has the potential to facilitate the coordination of community and organizational work, improving the transparency and efficacy of government policies.
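
The mismatch between a broad search term and narrower official tags, described in the abstract, can be illustrated with a minimal query-expansion sketch. It is not the authors' annotation method or any ArcGIS Hub API. The vocabulary entries, the catalog, and the function names expand_query and search are all hypothetical and chosen only to mirror the "traffic safety" and "vehicular crash" example.

```python
# Hypothetical base vocabulary: broad colloquial term -> narrower official terms.
# These entries are illustrative assumptions, not taken from the paper.
BASE_VOCABULARY = {
    "traffic safety": ["vehicular crash", "bicycle collision"],
    "bicycle collision": ["vehicular crash"],
}

# Hypothetical catalog: dataset title -> official metadata tags.
CATALOG = {
    "DOT crash reports 2017": {"vehicular crash"},
    "Street tree inventory": {"urban forestry"},
}


def expand_query(term, vocabulary):
    """Return the term plus every narrower term reachable from it."""
    expanded = set()
    pending = [term.lower()]
    while pending:
        current = pending.pop()
        if current in expanded:
            continue  # already visited; also guards against cycles
        expanded.add(current)
        pending.extend(vocabulary.get(current, []))
    return expanded


def search(term, catalog, vocabulary):
    """Return sorted titles of datasets whose tags match the expanded query."""
    terms = expand_query(term, vocabulary)
    return sorted(title for title, tags in catalog.items() if tags & terms)


if __name__ == "__main__":
    # Without expansion (empty vocabulary) the broad term matches nothing.
    print(search("traffic safety", CATALOG, {}))
    # With expansion the narrower official tag is reached.
    print(search("traffic safety", CATALOG, BASE_VOCABULARY))
```

In this sketch the expansion happens at query time. The abstract instead describes annotating the metadata itself, which would move the same lookup to indexing time.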

Subject Classification

ACM Subject Classification
  • Information systems → Digital libraries and archives

Keywords
  • data discovery
  • metadata
  • query expansion
  • interoperability



