
Classification of colloquial Arabic tweets in real-time to detect high-risk floods

  • Waleed Alabbas
  • Haider M. Al-Khateeb
  • ,
  • Gregory Epiphaniou
  • Ingo Frommholz
Research Output: Chapter in Book/Report/Conference proceeding Conference contribution Peer-review

Abstract

Twitter has eased real-time information flow for decision makers and is one of the key enablers of Open-source Intelligence (OSINT). Tweet mining has recently been used in incident response to estimate the location of, and the damage caused by, hurricanes and earthquakes. We investigate the detection of a specific type of high-risk natural disaster that occurs frequently and causes casualties in the Arabian Peninsula, namely floods. In particular, we study how to achieve accurate classification of short, informal (colloquial) Arabic text of the kind typically used on Twitter, which is highly inconsistent and has received very little attention in this field. First, we provide a thorough technical demonstration consisting of the following stages: data collection (Twitter REST API), labelling, text pre-processing, data division and representation, and model training. This was implemented in R for our experiment. We then evaluate classifier performance through four experiments that measure the impact of different stemming techniques on the following classifiers: SVM, J48, C5.0, NNET, NB and k-NN. The dataset consisted of 1434 tweets in total. Our findings show that the Support Vector Machine (SVM) stood out in terms of accuracy (F1 = 0.933). Furthermore, McNemar's test shows that using SVM without stemming on colloquial Arabic is significantly better than using it with stemming techniques.
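
The closing claim of the abstract relies on McNemar's test, which compares two classifiers evaluated on the same test items by looking only at the items where exactly one of them is correct. The following is a minimal sketch of the exact (binomial) form of that test. It is written in Python rather than the R used in the study, the function name `mcnemar_exact` and the toy labels are hypothetical, and it is not the authors' code or data; which variant of the test the authors applied is not stated here.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts items that only classifier A
    labels correctly and c counts items that only classifier B labels
    correctly. Items both get right or both get wrong are ignored.
    """
    b = c = 0
    for truth, a, other in zip(y_true, pred_a, pred_b):
        a_ok = a == truth
        other_ok = other == truth
        if a_ok and not other_ok:
            b += 1
        elif other_ok and not a_ok:
            c += 1

    n = b + c
    if n == 0:
        # The classifiers never disagree in correctness: no evidence either way.
        return b, c, 1.0

    # Under the null hypothesis the discordant items split 50/50, so the
    # smaller count follows Binomial(n, 0.5); double the lower tail.
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy, made-up labels (1 = flood-related, 0 = not); NOT the paper's data.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
no_stemming = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
with_stemming = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

b, c, p = mcnemar_exact(y_true, no_stemming, with_stemming)
print(b, c, p)
```

A small p-value indicates that the two configurations differ in their error patterns more than chance would explain. With larger discordant counts, a chi-squared approximation is often used in place of the exact binomial tail.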

Publication Information

Original language

English

Pages from-to (Number of pages)

Pages 1-8 (8 pages)

Publication milestones

  • Published - 06/10/2017

Publication status

Published - 06/10/2017

Publisher

Institute of Electrical and Electronics Engineers Inc., United States

Publication series

  • Publication series name: 2017 International Conference On Social Media, Wearable And Web Analytics, Social Media 2017
    Volume: 2017-June

ISBN (Electronic)

9781509050574

External Publication IDs

  • handle.net: 10547/624252
  • Scopus: 85044829111

Host publication title

2017 International Conference On Social Media, Wearable And Web Analytics, Social Media 2017