Quick Summary
This article provides a comprehensive overview of data science applications in public health, highlighting insights from 112 projects supported by the CDC’s Data Science Upskilling program from 2019 to 2023. The findings show the use of artificial intelligence (AI) and machine learning (ML) rising from 33% of projects in 2019 to 56% in 2023, emphasizing the need for ongoing workforce development in these areas.
Key Details
- Timeframe: 2019-2023
- Organization: Centers for Disease Control and Prevention (CDC)
- Total Projects: 112 data science projects
- Key Focus Areas: COVID-19, infectious diseases, vaccines
- Tools Used: RStudio, Jupyter Notebooks, Power BI, Python, R
Key Takeaways
- AI and ML usage increased from 33% in 2019 to 56% in 2023.
- Data visualization was employed in 54% of projects, indicating a strong focus on decision support.
- Statistics were utilized in 51% of projects, showcasing traditional analytical methods.
- 42% of projects incorporated AI and ML methodologies.
- Programming Languages: Python (56%) and R (55%) were the most commonly used.
- 52% of projects aimed to support decision-making processes.
- 22% of projects focused on improving processes and programs.
- Visualization tools like dashboards were prioritized for effective communication of data insights.

Background
The integration of data science into public health is becoming increasingly vital as organizations recognize its potential to enhance decision-making and improve health outcomes. However, there remains a gap in understanding how these technologies are being applied in real-world settings. This study aims to bridge that gap by analyzing projects from the CDC’s Data Science Upskilling program, which has been instrumental in fostering data science capabilities within public health.
Study
The study involved a thorough review of applications and final presentations from the CDC’s Data Science Upskilling program over five years. Researchers analyzed the projects based on seven characteristics, including the public health domain, data science methods, and tools used. This comprehensive approach provides valuable insights into the evolving landscape of data science in public health.
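Because a single project can use more than one method or tool, the share reported for each category need not sum to 100%. The sketch below shows one way such a multi-label tally could be computed. The project records, the field name `methods`, and the helper `tag_shares` are hypothetical illustrations, not the study's data or code.

```python
from collections import Counter

# Hypothetical records: each project may carry several method tags.
projects = [
    {"id": 1, "methods": {"data visualization", "statistics"}},
    {"id": 2, "methods": {"AI/ML"}},
    {"id": 3, "methods": {"data visualization", "AI/ML"}},
    {"id": 4, "methods": {"statistics"}},
]


def tag_shares(records, field):
    """Return the percentage of records that carry each tag in `field`."""
    counts = Counter(tag for rec in records for tag in rec[field])
    total = len(records)
    return {tag: 100 * n / total for tag, n in counts.items()}


shares = tag_shares(projects, "methods")
for tag, pct in sorted(shares.items()):
    print(f"{tag}: {pct:.0f}%")
```

In this toy data each method appears in half of the projects, so the shares add up to more than 100%, which is the same pattern seen in the reported method percentages.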
Results
The analysis revealed that the CDC supported a total of 112 data science projects across five annual cohorts. Projects addressing the COVID-19 pandemic made up 13% of the total, as did projects on infectious diseases, while vaccine projects accounted for 11%. AI and ML use rose from 33% of projects in 2019 to 56% in 2023, indicating a shift towards more advanced analytical techniques.
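The article reports rounded percentages rather than project counts. As a rough aid for reading them, the sketch below lists the counts out of 112 that would round to a given percentage, and separates the percentage-point change in AI and ML use from the relative change. The helper names and the half-up rounding rule are assumptions made for illustration; the study's own rounding convention is not stated.

```python
import math

TOTAL_PROJECTS = 112  # total reported in the article


def possible_counts(percent, total=TOTAL_PROJECTS):
    """Counts n whose share of `total` rounds (half-up, assumed) to `percent`."""
    return [
        n for n in range(total + 1)
        if math.floor(100 * n / total + 0.5) == percent
    ]


def change(start_pct, end_pct):
    """Return (percentage-point change, relative change in percent)."""
    points = end_pct - start_pct
    relative = 100 * points / start_pct
    return points, relative


print(possible_counts(54))  # data visualization
print(possible_counts(42))  # AI and ML
points, relative = change(33, 56)
print(f"{points} percentage points, about {relative:.0f}% relative increase")
```

Under these assumptions, 54% corresponds to either 60 or 61 projects and 42% to 47 projects. The 2019 and 2023 figures are per-cohort shares, so they cannot be converted to counts without the cohort sizes, which this summary does not give.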
Impact and Implications
The findings underscore the importance of data visualization and advanced analytics in public health decision-making. As organizations continue to embrace data science, there is a clear need for enhanced infrastructure and training in these areas. The increasing reliance on AI and ML suggests that workforce development strategies must evolve to keep pace with technological advancements, ensuring that public health professionals are equipped with the necessary skills to leverage these tools effectively.
Conclusion
This study highlights the transformative potential of data science in public health, particularly through the use of AI and ML. As the field continues to evolve, it is crucial for public health organizations to prioritize workforce development and data modernization efforts. By doing so, they can optimize the use of data science technologies to improve health outcomes and respond effectively to emerging public health challenges. The future of public health is undoubtedly intertwined with the advancements in data science!
Decoding Data Science Upskilling: Insights From 5 Years of Data Science Projects at the Centers for Disease Control and Prevention, 2019-2023.
Abstract
CONTEXT: Public health organizations are increasingly recognizing the value and potential of data science. However, a gap remains in understanding how data science is being applied in public health.
OBJECTIVE: This article provides a comprehensive overview of data science applications in real-world public health settings. By describing the characteristics of projects supported by the Centers for Disease Control and Prevention’s Data Science Upskilling (DSU) program during 2019-2023, we seek to guide future efforts in public health data science workforce development and data modernization.
METHODS: We manually reviewed DSU applications and final presentations about the projects compiled during 2019-2023. We analyzed projects based on 7 characteristics, including public health domain and task, data science topic and method, data modality, tools, and programming languages used.
RESULTS: DSU supported 112 data science projects across 5 annual cohorts (2019-2023). Many projects addressed the COVID-19 pandemic (13%), infectious diseases (13%), and vaccines (11%). Approximately half the projects used data visualization (54%) and statistics (51%), with 42% employing artificial intelligence (AI) and machine learning (ML). Furthermore, 52% of projects were designed to support decision making, and 22% sought to improve processes and programs. Learners primarily used RStudio (50%), Jupyter Notebooks (41%), and Power BI (26%), along with Python (56%) and R (55%). AI and ML use increased from 33% of projects in 2019 to 56% in 2023, demonstrating an evolving focus on advanced methodologies.
CONCLUSIONS: Many teams prioritized data visualization, such as dashboards and visualization tools to support decision making, indicating opportunities for additional infrastructure and training in this area. We observed increasing use of AI and ML, suggesting a need for staff upskilling in these domains. Optimally leveraging data science technologies will require workforce development strategies and data modernization efforts to keep pace with the rapidly evolving field.
Authors: Antoine M, Ojo AI, Bertulfo MC, Okomo-Adhiambo M, Kirkcaldy RD
Journal: J Public Health Manag Pract
Citation: Antoine M, et al. Decoding Data Science Upskilling: Insights From 5 Years of Data Science Projects at the Centers for Disease Control and Prevention, 2019-2023. J Public Health Manag Pract. 2026; 32:260-267. doi: 10.1097/PHH.0000000000002284