Social Media Tools and Market Intelligence Reports for Understanding Tobacco Product Use and Dissemination

To improve the Food and Drug Administration's (FDA) surveillance of social media data for trends in tobacco product use and perception, AIR built two distinct social media analysis tools. These tools leveraged machine learning, text analytics, and natural language processing (NLP) to scrape and analyze content on social network websites. Information and findings from the study were presented to the FDA in 12 market intelligence reports, which covered a wide variety of tobacco-related topics.

The Challenge

To protect the public’s health, the 2009 Family Smoking Prevention and Tobacco Control Act (Tobacco Control Act) gave the Food and Drug Administration, and specifically its Center for Tobacco Products (CTP), the authority to regulate tobacco products and educate the public about the dangers of tobacco use. Currently, approximately 443,000 Americans die each year from tobacco-related illnesses, such as cancer and heart disease.

To improve CTP’s surveillance and analysis of social media data, tools must be developed using data science methods that allow for more accurate and nuanced surveillance in the future. These tools will help refine future reporting done through monthly market intelligence reports and other means. In addition, regular reporting on the tobacco consumer and industry landscape is crucial to furthering CTP’s understanding of emerging trends and informing regulatory, communications, and research activities.

While these reports may not always integrate these tools consistently, data science tools can support more nuanced reporting in the future. These reports may gather data from social media, secondary datasets, or other sources, and will provide valuable insights more quickly than traditional primary data collections (e.g., surveys).

Our Role

To provide insights as quickly as possible, AIR leveraged its experience with data science tools and traditional qualitative and quantitative research to design and build two social media analysis tools for the FDA. The first tool extracts X (formerly Twitter) social media data and conducts social network analyses to help track topic trends and influencers and detect social network communities. The second tool conducts content analyses using NLP and topic modeling on the extracted data to better understand the content of the posts themselves and their impacts. Together these tools can be used, for example, to assess youth perceptions and industry advertising tactics around Electronic Nicotine Delivery Systems (ENDS) products.
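As a rough illustration of the kind of social network analysis described above, the following minimal Python sketch builds a mention graph from posts, ranks accounts by how often they are mentioned (a simple proxy for influence), and groups accounts into connected components (a crude stand-in for community detection). It is not AIR's implementation; the sample posts, account names, and function names are all hypothetical, and a production tool would likely use more robust methods and real platform data.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample posts as (author, text) pairs; not real data.
POSTS = [
    ("ana", "@vapeshop new flavors look good"),
    ("ben", "@vapeshop is this legal? @ana"),
    ("cat", "@dan quitting is hard"),
]

MENTION = re.compile(r"@(\w+)")


def mention_edges(posts):
    """Return (author, mentioned_account) pairs found in the post text."""
    return [
        (author, name.lower())
        for author, text in posts
        for name in MENTION.findall(text)
    ]


def top_mentioned(edges, n=3):
    """Rank accounts by how often other accounts mention them (in-degree)."""
    return Counter(dst for _, dst in edges).most_common(n)


def communities(edges):
    """Group accounts into connected components of the undirected mention graph."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(graph[current] - group)
        seen |= group
        groups.append(group)
    return groups


if __name__ == "__main__":
    edges = mention_edges(POSTS)
    print(top_mentioned(edges, n=1))
    print(communities(edges))
```

In this sketch, in-degree and connected components are deliberately simple choices; centrality measures and modularity-based community detection are common alternatives when the graph is larger or denser.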

Additionally, AIR publishes its findings, alongside social media data obtained through Brandwatch, in quarterly market intelligence reports on a variety of tobacco-related topics. These topics include perceptions and impacts of the 2021 menthol cigarette ban, cigar use among African-American youth, and perceptions of tobacco products among bilingual and Spanish speakers. Collectively, the information gathered from the tools and reports helps to inform decision makers about the life cycles of content on social media platforms and its impact on consumers.


The information and insights provided through AIR’s work with the FDA mean that organizations will be better equipped to research and understand the spread of information about tobacco products on social media and related platforms.

Considerations for Equity and Algorithmic Bias

For the market intelligence reports, AIR considers how perceptions of tobacco products differ by race and ethnicity. To guide how it presents these materials and to avoid introducing bias into their presentation, AIR uses its Culturally and Linguistically Appropriate Standards for Projects, Research and Operations as guidelines.

Timothy Hill
Senior Vice President, Health