If you are interested in one or more of the following Arabic multi-dialect text corpora, e-mail me at kaa846 <@> cs.bham.ac.uk.
- Arabic Text Corpora
a) Gulf Text Corpus
b) Levantine Text Corpus
c) Egyptian Text Corpus
d) North Africa Text Corpus
- Analysis of Text Corpora
a) Classified Words
b) Gulf Corpus; Distinct tokens with frequency
c) Levantine Corpus; Distinct tokens with frequency
d) Egyptian Corpus; Distinct tokens with frequency
e) North Africa Corpus; Distinct tokens with frequency
f) All four corpora; Distinct tokens with frequency
g) All four corpora; Tokens with frequency
Reference:
K. Almeman and M. Lee, "Automatic Building of Arabic Multi Dialect Text Corpora by Bootstrapping Dialect Words", in The First International Conference on Communications, Signal Processing, and their Applications (ICCSPA’13), Sharjah, UAE, 12-14 Feb. 2013, IEEE, 2013.