Open collaboration on COVID-19

Source: GitHub Blog | Author: Martin Woodward

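Amid the uncertainty and gravity of COVID-19, we've been inspired to see the global community of scientists, government officials, journalists, programmers, and concerned citizens come together to collaborate on a variety of projects with a common goal: to understand COVID-19 and coordinate the best response. Many of these projects aren't traditional software projects, but the same collaborative development model applies to curated datasets, DIY instruction sets, and more.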

Here are some of the most impactful open source projects we've seen so far for tracking, understanding, and responding to COVID-19.

Tracking the pandemic collaboratively

One of the most cited open COVID-19 datasets is provided by Johns Hopkins University (JHU). Epidemiologists, journalists, and statisticians from around the world are treating this as one of the canonical sources of data on the outbreak. The data is also used to power this interactive dashboard, which tracks reported cases of COVID-19 in real time. As they explain in their article in The Lancet, the Johns Hopkins Center for Systems Science and Engineering developed the dashboard “to provide researchers, public health authorities, and the general public with a user-friendly tool to track the outbreak as it unfolds.”

The data is pulled in from various sources (primarily DXY) and verified by cross-referencing other sources such as the WHO. This dashboard has generally been faster than the WHO in reporting countries’ first cases. The JHU team believes the dashboard is especially useful in providing essential information for appropriate responses in the earliest stages of a viral outbreak.[1]

Another high-quality dataset made available to the public is the nCoV2019 dataset by the Institute for Health Metrics and Evaluation at the University of Washington. The data is presented in this dashboard. The dataset contains detailed patient-level data, such as date of symptom onset, date of laboratory confirmation, and more. It’s intended to aid in calculating key statistics of COVID-19 such as reproduction number, incubation period, and other important factors.[2]
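
As a rough illustration of the kind of statistic a patient-level line list supports, the Python sketch below computes the delay between symptom onset and laboratory confirmation. The records and the field names (`onset`, `confirmed`) are hypothetical and are not taken from the nCoV2019 dataset's actual schema.

```python
from datetime import date
from statistics import median

# Hypothetical line-list records; the field names are illustrative only
# and do not reflect the real nCoV2019 column names.
records = [
    {"onset": date(2020, 1, 10), "confirmed": date(2020, 1, 15)},
    {"onset": date(2020, 1, 12), "confirmed": date(2020, 1, 14)},
    {"onset": date(2020, 1, 20), "confirmed": date(2020, 1, 28)},
    {"onset": None, "confirmed": date(2020, 1, 30)},  # onset date missing
]


def onset_to_confirmation_delays(rows):
    """Return the delay in days for every row that has both dates."""
    return [
        (row["confirmed"] - row["onset"]).days
        for row in rows
        if row["onset"] is not None and row["confirmed"] is not None
    ]


delays = onset_to_confirmation_delays(records)
median_delay = median(delays)
print(f"{len(delays)} usable records, median delay {median_delay} days")
```

Rows with a missing date are skipped rather than imputed, which keeps the sketch simple but would bias the estimate if missingness is not random.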

Tracking cases in the US

The most comprehensive data source on US testing and infection rates is the COVID-19 Tracker project.[3,4] The project’s numbers are available on a web page and Google sheet, and via a public API. This project was started in early March, led by a partnership of The Atlantic and the founder of Related Sciences, out of concern about the lack of testing information being provided by the CDC. The partners put out a call for volunteers, who quickly developed a collection of software packages to crawl state websites, aggregate the data, and make the dataset available to the public via APIs. The project was developed quickly, and the team shared its source code and datasets. Our World in Data’s page on COVID-19 testing used to list COVID-19 Tracker numbers alongside CDC numbers, but now only reports the COVID-19 Tracker numbers.
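
The aggregation step described above can be sketched in Python as follows. This is not the project's code: the per-state records, the field names, and the `missing_values` counter are all hypothetical, chosen only to show how state reports might be rolled up into national daily totals while keeping track of gaps.

```python
from collections import defaultdict

# Hypothetical per-state daily reports; a value of None means the state
# did not publish that figure on that day.
state_reports = [
    {"state": "NY", "date": "2020-03-20", "positive": 100, "negative": 400},
    {"state": "WA", "date": "2020-03-20", "positive": 50, "negative": 250},
    {"state": "NY", "date": "2020-03-21", "positive": 150, "negative": 600},
    {"state": "WA", "date": "2020-03-21", "positive": 60, "negative": None},
]


def national_totals(reports):
    """Sum state figures per date and count how many values were missing."""
    totals = defaultdict(
        lambda: {"positive": 0, "negative": 0, "missing_values": 0}
    )
    for report in reports:
        day = totals[report["date"]]
        for field in ("positive", "negative"):
            if report[field] is None:
                day["missing_values"] += 1
            else:
                day[field] += report[field]
    return dict(totals)


totals = national_totals(state_reports)
for day, figures in sorted(totals.items()):
    print(day, figures)
```

Counting missing values alongside the sums makes it visible when a national total is an undercount because some states did not report.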

Volunteer computing for large scale research

Scientific work is also being carried out on COVID-19, both for epidemiological research and in the hopes of finding a vaccine or a cure. Folding@home is a distributed-computing project that uses the personal computers of volunteers to model molecular dynamics for, among other things, computational drug design. They have started an effort focused on COVID-19 to find potentially druggable protein targets. Data for this effort is stored in this repository. Folding@home is an open-source project, and all of its datasets and software are available.

Helping the public

The World Health Organization app collective is building a mobile application to help people around the world cope with COVID-19. The team, led by Dr. Daniel Kraft, is rapidly putting together a first version of the app. Their goal is to have the app provide local information for people and feed their data back to public health officials to improve accuracy for other users.

Faster application of the scientific method

Nextstrain is an open-source project for tracking and analyzing pathogen genomes. They run a dashboard of the genomic epidemiology of COVID-19. The dashboard shows the evolutionary relationships of the mutations of the HCoV-19 viruses, which can help to trace the origins of the virus. Nextstrain’s goal is to aid epidemiological understanding of viruses to improve outbreak response. They state explicitly on their website that “current scientific publishing practices hinder the rapid dissemination of epidemiologically relevant results,” and they are dedicated to providing high-quality data quickly to minimize the damage done by pandemic outbreaks. Nextstrain’s COVID-19 dashboard sources its data from GISAID, which has strict sharing guidelines, but its software is all open source.

Smaller scientific datasets abound, such as this repository of chest X-ray images, aimed at developing AI to improve diagnostic accuracy and predict the infection.

Data visualization

There are numerous smaller-scale scientific visualization projects on COVID-19. The Novel Coronavirus Infection Map provides visualizations of infection histories globally or broken down by country. It’s the work of the Humanistic GIS Lab at the University of Washington and pulls in data from numerous government and public health organizations.

COVID-19 Scenarios is a COVID-19 outbreak simulator designed to determine strain on the health care systems in various regions as the outbreak unfolds.

COVID-19 Dashboards is a set of interactive visualizations of the Johns Hopkins COVID-19 data built in Jupyter Notebooks and converted to blog posts with fastpages. GitHub Actions are used to keep the COVID-19 Dashboards dataset up to date, so the visualizations are always current. This entire site is open source and has been built by a group of volunteer programmers and data scientists. The site includes predictions as well as visualizations, and so is well suited to an open source approach where the source code of the predictive model can be directly examined (fastpages presents the source code directly embedded in the generated web page).

In a similar vein, Predict COVID-19 (repository) allows users to compare the number of COVID-19 cases between different countries, which gives an idea of how the epidemic might progress in the coming days.
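
One common way to make such cross-country comparisons is to align each country's cumulative case curve on the day it first crossed a fixed case count. The Python sketch below assumes that approach; the threshold of 100, the function name, and the sample figures are hypothetical and are not taken from Predict COVID-19's repository.

```python
def days_since_threshold(cumulative_cases, threshold=100):
    """Trim a cumulative series so index 0 is the first day at or above threshold.

    Returns an empty list if the threshold was never reached.
    """
    for i, count in enumerate(cumulative_cases):
        if count >= threshold:
            return cumulative_cases[i:]
    return []


# Hypothetical cumulative case counts, one value per day.
series = {
    "Country A": [20, 60, 120, 250, 500, 900],
    "Country B": [5, 10, 40, 110, 230],
}

aligned = {name: days_since_threshold(cases) for name, cases in series.items()}

for name, cases in aligned.items():
    print(name, cases)
```

After alignment, day 0 means the same thing for every country, so a country that is earlier in its outbreak can be read against the later days of one that is further along.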

A number of projects like this one have been developed to simplify programmatic access to COVID-19 data. This API, serving out the Johns Hopkins data, drives numerous COVID-19 visualization sites, including almost 20 responsive live visualizations.

Nations, states, municipalities, communities

The country of Italy is sharing all its latest COVID-19 data. This data is used to power a dashboard, which tracks infections throughout the country in real time. In this same vein, various metropolitan areas like Tokyo and Zurich are storing and sharing real-time infection information via GitHub repositories.

The Wuhan2020 community project is a self-organized, open source community project aimed at “establishing a data service for real-time synchronization of hospitals, factories, procurement and other information, and convening all those who want to contribute to this fight against viruses”.

DIY devices

Finally, the Low-Cost Open-Source Ventilator project gives extensive instructions on how to build a low-cost ventilator, which may save lives if hospitals’ supplies of standard ventilators become exhausted.

Need GitHub’s help for a COVID-19 project?

We’re inspired to see this community of contributors come together with such a robust response to the COVID-19 outbreak. We are already donating 60,000 computing hours/day to Folding@home, and we’ve reached out to other projects to offer support. If you’re on a team that needs access to any GitHub products or services for a project related to COVID-19, send us a note with information about your project and how GitHub can help.

[1] Johns Hopkins: COVID-19 Map FAQs
[2] Our World in Data: nCoV-2019 Data Working Group data
[3] Talking Points Memo: Key Source of COVID-19 Testing and Infection Data
[4] The Atlantic: How to Understand Your State’s Coronavirus Numbers
