2. Seemingly sober tweets: Neither the human
annotators nor our classifier could identify
‘Will you take her on a date? But really she
does like you’ as a drunk tweet, although the
author of the tweet had marked it so.
This example also highlights the difficulty of
drunk-texting prediction.
3. Pragmatic difficulty: The tweet ‘National
dress of Ireland is one’s one vomit.. my fam-
ily is lovely’ was correctly identified by our
human annotators as a drunk tweet. This
tweet contains an element of humour and a
change of topic, but our classifier could not
capture them.
7 Conclusion & Future Work
In this paper, we introduce automatic drunk-
texting prediction as the task of classifying a tweet
as drunk or sober. First, we justify the need for
drunk-texting prediction as a means of identifying
risky social behavior arising out of alcohol abuse,
and the need to build tools that prevent privacy leaks
due to drunk-texting. We then highlight the challenges
of drunk-texting prediction, one of which is the
selection of negative examples (sober tweets). Using
hashtag-based supervision, we create three datasets
annotated with drunk or sober
labels. We then present SVM-based classifiers
that use two sets of features: N-gram and stylistic
features. Our drunk-texting prediction system obtains
a best accuracy of 78.1%. We observe that our
stylistic features add negligible value to N-gram
features. We use our held-out dataset to compare
our system's performance against that of human
annotators. While human annotators achieve an accuracy
of 68.8%, our system comes reasonably close, with
a best accuracy of 64%.
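
The hashtag-based supervision summarized above can be illustrated with a
minimal sketch. It is not our implementation: the hashtag sets and the
function names label_tweet and build_dataset are hypothetical, chosen only
for illustration. The sketch assigns a label from the hashtags the author
used, discards tweets whose hashtags are absent or conflicting, and removes
the label-bearing hashtags so that a classifier trained on the result cannot
simply read the label off the text (this removal step is an assumption of
the sketch).

```python
import re

# Hypothetical hashtag sets: the concrete hashtags are an assumption
# made for illustration, not taken from the text above.
DRUNK_TAGS = {"#drunk", "#imdrunk"}
SOBER_TAGS = {"#sober", "#notdrunk"}
LABEL_TAGS = DRUNK_TAGS | SOBER_TAGS

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text):
    """Return (label, cleaned_text), or None if no usable label is found."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    is_drunk = bool(tags & DRUNK_TAGS)
    is_sober = bool(tags & SOBER_TAGS)
    # Skip tweets with no label hashtag or with conflicting label hashtags.
    if is_drunk == is_sober:
        return None
    # Remove only the label-bearing hashtags; other hashtags are kept.
    cleaned = HASHTAG_RE.sub(
        lambda m: "" if m.group(0).lower() in LABEL_TAGS else m.group(0),
        text,
    )
    cleaned = " ".join(cleaned.split())  # collapse leftover whitespace
    return ("drunk" if is_drunk else "sober", cleaned)


def build_dataset(tweets):
    """Label every tweet and keep only those with an unambiguous label."""
    labelled = (label_tweet(tweet) for tweet in tweets)
    return [pair for pair in labelled if pair is not None]
```

The resulting (label, text) pairs could then be passed to any N-gram
feature extractor and an SVM classifier.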
Our analysis of the task and experimental find-
ings make a case for drunk-texting prediction as a
useful and feasible NLP application.