Posts made on social media platforms represent a novel source of data for drug addiction epidemiology. Often, however, the language used in such forums comprises slang and jargon, which complicate knowledge extraction. No software tools currently exist that allow scientists to analyze the esoteric language of social media drug-use subcultures. Drug Use Insights (DUI) is a public, open-source web application developed to address this deficiency. Given one or more posts from social media forums such as Reddit and Twitter, DUI can be used to identify constituent terms related to drug use and to conduct epidemiologically relevant content analysis.
DUI is underpinned by a hierarchical taxonomy encompassing 84 addiction-related categories and over 9,000 drug-use terms, where each category comprises a set of semantically related terms. The current set of categories and terms was established using thematic analysis in conjunction with term embeddings generated from 7,472,545 Reddit posts made by 1,402,017 redditors in 117 recreational drug use subreddits and 29 drug addiction recovery subreddits between December 31, 2010 and June 27, 2020. The taxonomy is updated annually to keep it current.
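
To make the idea of category-based term identification concrete, the following is a minimal sketch of how a post could be matched against a taxonomy that maps categories to sets of terms. It is not DUI's implementation: the three-category toy taxonomy, its terms, and the function names `build_term_index`, `tag_post`, and `category_counts` are all hypothetical and chosen only for illustration, and the greedy longest-match strategy is an assumption.

```python
import re
from collections import Counter

# Hypothetical miniature taxonomy (category -> terms). The real DUI taxonomy
# has 84 categories and over 9,000 terms; these entries are illustrative only.
TOY_TAXONOMY = {
    "opioids": {"oxy", "fent", "heroin"},
    "withdrawal": {"dope sick", "cold turkey"},
    "routes of administration": {"snort", "iv"},
}


def build_term_index(taxonomy):
    """Invert the taxonomy into a term -> category lookup table."""
    index = {}
    for category, terms in taxonomy.items():
        for term in terms:
            index[term.lower()] = category
    return index


def tag_post(text, term_index, max_ngram=2):
    """Return (term, category) pairs found in a post.

    Assumed strategy: lowercase the text, split it into word tokens, and at
    each position prefer the longest n-gram that appears in the index.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(max_ngram, 0, -1):
            window = tokens[i:i + n]
            candidate = " ".join(window)
            if len(window) == n and candidate in term_index:
                matches.append((candidate, term_index[candidate]))
                i += n
                break
        else:
            # No term starts at this token; move on to the next one.
            i += 1
    return matches


def category_counts(matches):
    """Count how many matched terms fall into each category."""
    return Counter(category for _, category in matches)


if __name__ == "__main__":
    index = build_term_index(TOY_TAXONOMY)
    found = tag_post("Went cold turkey off oxy and now I'm dope sick", index)
    print(found)
    print(category_counts(found))
```

Per-category counts of this kind are one simple form that content analysis of a post could take. Exact string matching as shown here would miss misspellings and novel slang, which is one reason a taxonomy of this kind benefits from periodic updates.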