Student Story

Students Focus Data Science Skills on COVID-19 Project

Division of Mathematics, Computing, and Statistics (MCS) students participated in Broadstreet’s COVID-19 Data Project Internship. According to its website, Broadstreet “provides public good tools that allow professionals to spend their time and resources on improving community health.” The project was a collaboration among “approximately 200 students, statisticians, epidemiologists, healthcare experts, data scientists and other passionate professionals who are committed to having the most accurate, community level data about the COVID-19 positive tests and fatality rates.”

“I took part in the initial 8-week timeline and I expect to stay on and extend that through the summer,” said participating student Natalie Starczewski ’22. “I became involved after applying with the link Nanette Veilleux sent us in March. I started in March as one of the individuals doing daily data entry.”

The goal of the project is to create a centralized data set of COVID cases per county for every day of the pandemic in the U.S. “I was assigned some states in the western U.S. and would add their daily total case and death numbers by the county to a central spreadsheet,” added Starczewski.

As the number of volunteers grew, Starczewski was moved from data entry to quality assurance for the West Coast region, and was eventually made the region’s team leader for data validation/quality assurance. “I now manage a team of seven students from across the U.S. My team checks our dataset against comparison data sets from places like the New York Times and we work as a team to research discrepancies in the data and correct errors in the trend. The dataset produced by this project will stand out because of our rigorous data validation/quality assurance system. It has been a super positive experience and it feels good to be doing something purposeful in these crazy times.”
 
“This is hopefully a once-in-a-lifetime opportunity to track a pandemic in real-time,” said Charlie Repaci ’21, “and I wanted to explore if I enjoyed working with epidemiological data.”

Repaci is also collecting county-level data, and noted the difficulties of doing so. “All states have their unique challenges and you can see which ones are organized and handling the crisis even without looking at the counts.” Repaci will also participate in a special project to detect data anomalies, which should help streamline the process of checking the project’s numbers against various sources, including The New York Times and the Johns Hopkins info board.

Students were keen to use data science skills they have learned in class. “I saw this as an opportunity to be involved with the current pandemic and to help out in any way that I can,” said Richney Chin-Chap ’22.

In addition to data entry and fact-checking, Chin-Chap is also a part of the data visualization team. The team’s visuals display the daily trend of coronavirus cases and deaths in each state and county, the total confirmed cases in the U.S., how fast the virus is spreading, and how well the country is doing at flattening the curve.

“My experience with this internship has been nothing but positive as I work with amazing people in such a vast and supportive community,” said Chin-Chap.
