Multilingual Trend Detection in the Web

Author Jan Stutzki




File

OASIcs.SCOR.2014.16.pdf
  • Filesize: 0.53 MB
  • 9 pages



Cite As

Jan Stutzki. Multilingual Trend Detection in the Web. In 4th Student Conference on Operational Research. Open Access Series in Informatics (OASIcs), Volume 37, pp. 16-24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014) https://doi.org/10.4230/OASIcs.SCOR.2014.16

Abstract

This paper presents results from our ongoing research project in the field of foresight. The goal of the project is to develop web-based tools that automatically detect activity and trends related to given keywords. This knowledge can enable decision makers to react proactively to emerging challenges.

At present we can detect trends worldwide in more than 60 languages and assign them to over 100 nation states. To achieve this we use the major search engines, whose core competence is determining the relevance of a document to a search query. The search engines allow the results to be sliced by language and country.

In the next step we download some of the proposed documents for analysis. The amount of information required takes us into the field of Big Data, so extra effort is made to ensure the scalability of the application.

We introduce a new approach to activity and trend detection that combines the data collection and detection methods. To detect trends in the gathered data we use data mining methods that are independent of the language a document is written in. The input to these methods is the text of the downloaded documents and a specially prepared index structure containing metadata and other information that accumulates during the collection of the documents.

We show that we can reliably detect trends and activities in highly active topics and discuss future research.

Subject Classification

Keywords
  • Information Retrieval
  • Web Mining
  • Trend Detection
