Auriol Degbelo, Benjamin Risse
Creative Commons Attribution 4.0 International license
The exponential growth of interactive geovisualizations on the Web has underscored the need for automated techniques to enhance their findability. In this paper, we present the Geovicla dataset (2.5K instances), constructed through the harvesting and manual labelling of webpages from a broad range of domains. The webpages are categorized into three groups: "interactive visualisation", "interactive geovisualisation" and "no interactive visualisation". Using this dataset, we compared three approaches for interactive (geo)visualization classification: (i) a heuristic-based approach (i.e. using manually derived rules), (ii) a feature-engineering approach (i.e. hand-crafted feature vectors combined with machine learning classifiers) and (iii) an embedding-based approach (i.e. automatically generated large language model (LLM) embeddings with machine learning classifiers). The results indicate that LLM embeddings, when used in conjunction with a multilayer perceptron, form a promising combination, achieving up to 74% accuracy for multiclass classification and 75% for binary classification. The dataset and the insights gained from our empirical comparison offer valuable resources for GIScience researchers aiming to enhance the discoverability of interactive geovisualizations.
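To make the heuristic-based approach (i) more concrete, the following is a minimal sketch of a rule-based webpage classifier over the three categories named in the abstract. The abstract does not list the rules used in the paper, so the keyword patterns, the names `GEO_PATTERNS` and `VIS_PATTERNS`, and the function `classify_webpage` are all assumptions made for illustration, not the authors' method.

```python
import re

# Hypothetical rules: the paper's actual heuristics are not given in the abstract.
# Each pattern looks for a library name that commonly appears in page source.
GEO_PATTERNS = [r"leaflet", r"openlayers", r"mapbox-?gl", r"maps\.googleapis\.com"]
VIS_PATTERNS = [r"d3(\.min)?\.js", r"chart(\.min)?\.js", r"plotly", r"highcharts"]


def classify_webpage(html: str) -> str:
    """Assign one of the three dataset labels to raw HTML using keyword rules."""
    text = html.lower()
    # Geo rules are checked first because map pages often also load chart libraries.
    if any(re.search(pattern, text) for pattern in GEO_PATTERNS):
        return "interactive geovisualisation"
    if any(re.search(pattern, text) for pattern in VIS_PATTERNS):
        return "interactive visualisation"
    return "no interactive visualisation"


if __name__ == "__main__":
    page = '<script src="https://unpkg.com/leaflet/dist/leaflet.js"></script>'
    print(classify_webpage(page))
```

Rules of this kind are easy to inspect but brittle: a page that merely mentions a library name in its text would be misclassified, which is one reason to compare them against the learned approaches (ii) and (iii).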
@InProceedings{huffer_et_al:LIPIcs.GIScience.2025.10,
author = {H\"{u}ffer, Phil and Degbelo, Auriol and Risse, Benjamin},
title = {{Geovicla: Automated Classification of Interactive Web-Based Geovisualizations}},
booktitle = {13th International Conference on Geographic Information Science (GIScience 2025)},
pages = {10:1--10:12},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-378-2},
ISSN = {1868-8969},
year = {2025},
volume = {346},
editor = {Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.10},
URN = {urn:nbn:de:0030-drops-238397},
doi = {10.4230/LIPIcs.GIScience.2025.10},
annote = {Keywords: spatial information search, geovisualization search, findable interactive geovisualization, webpage classification}
}