COVID-ARC helps address the immediate need to understand the spread and impact of COVID-19 with a platform of networked and centralized archives that store, curate, visualize, and disseminate multimodal data related to the disease. The platform includes patients’ demographics, clinical evaluations, vitals, EKGs, and imaging data, such as CT, X-Ray, PET, ultrasound and MRI.
COVID-ARC data, together with a variety of analytic tools, are shared broadly with the world-wide scientific community. This will help maximize the potential for research progress by uniting scientists from diverse fields, including medicine, public health, and artificial intelligence.
COVID-19 Data
COVID-ARC is a data archive that stores multimodal (i.e., demographic information, clinical outcome reports, imaging scans) and longitudinal data related to COVID-19 and provides various statistical and analytic tools for researchers. This archive provides access to data along with user-friendly tools for researchers to perform analyses to better understand COVID-19 and encourage collaboration on this research. The COVID-19 pandemic is spreading rapidly across the world, and governments are imposing travel bans, quarantine laws, business and school closings, and many other restrictions in efforts to contain the virus and limit the spread. However, much is still unknown about COVID-19. There is an urgent need for scientists around the world to work together to model the virus, study how the virus has changed and will change over time, understand how it spreads, and discover a vaccine. The work from this project can also prepare scientists for future pandemics by putting the infrastructure in place to enable researchers to aggregate data and perform analyses quickly in the event of an emergency.
The approach is to develop a platform of networked and centralized web-accessible data archives to store multimodal data related to COVID-19 and make them broadly available and accessible to the world-wide scientific community to expedite research in this area due to the urgent nature of the COVID-19 pandemic. By leveraging previous work in developing data repositories and archival capabilities at the Laboratory of Neuro Imaging at the USC Mark and Mary Stevens Neuroimaging and Informatics Institute, COVID-ARC aims to provide an efficient and secure data repository platform that facilitates data access and analysis. COVID-ARC provides tools for researchers to visualize and analyze various types of data as well as a website with tools for training, announcements, virtual information sessions, and a knowledgebase wherein researchers post questions and receive answers from the community.
An efficient, secure, HIPAA-compliant data repository platform.
A multi-center data review and assessment system that preserves data quality, fidelity and provenance.
Mechanisms and regular training sessions that ease data aggregation and the downloading of large datasets.
Spatially normalize data and create subject cohorts to search, compare and download data.
Integrated processing on an extensible framework of protocols that can integrate with modules from any other software suite.
COVID-ARC accommodates a wide variety of data types related to COVID-19, including clinical evaluation
(symptoms), vitals (spirometry, temperature, respiration rate, heart rate, etc.), demographic,
geolocation, EKG, EEG, CT, X-ray, PET, and MRI, in order to create a comprehensive picture of the
virus and its spread. The following table provides an overview of data types currently accepted,
but due to its adaptable design, other formats will be integrated according to end-user needs.
LONI has more than 18 PB of storage capacity.
Data Categories | Data Types | File Formats* |
---|---|---|
Imaging | Structural MRI, resting state fMRI, DTI, PET, CSM, CT, ultrasound | DICOM, NIFTI, NII, MGZ |
Clinical Data | Symptoms, vitals, patient history, medical history, cognitive assessments, demographic, geolocation | Python scripts, R, C code, Excel, CSV, NPY* |
Temporal Recordings | EEG, ECoG, multi- or single-unit microelectrode recording, EMG, TMS. | Python scripts, R, C code, Excel, MP4, AVI, WAV, CSV, NPY |
*The file formats listed provide a snapshot of COVID-ARC’s capabilities but do not represent a comprehensive list. Other formats will be integrated over time according to end-user needs.
QUALITY
Some data, such as EEG, are subject to amplifier noise, technical artifacts (e.g. poor electrode location, issues with electrode impedance), and physiological artifacts (e.g. eye movement). In order to ensure consistent data quality, COVID-ARC uses range checking, signal-to-noise ratio checks, artifact removal techniques, power spectrum analysis, and various filtering methods to inspect for noise. The LONI Quality Control System (LONI QC) is used for all modalities of imaging data and regularly reviewed by participating collaborators.
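Two of the checks named above, range checking and a signal-to-noise ratio check, can be illustrated with a minimal Python sketch. This is not COVID-ARC's or LONI QC's actual implementation; the function names, the example trace, and the ±200 microvolt bounds are all assumptions chosen for illustration.

```python
import math


def range_check(samples, low, high):
    """Return the indices of samples that fall outside [low, high]."""
    return [i for i, x in enumerate(samples) if not (low <= x <= high)]


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, from the mean power of each list."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise power is zero; SNR is undefined")
    return 10.0 * math.log10(p_signal / p_noise)


# Hypothetical EEG-like trace in microvolts containing one amplifier spike.
trace = [12.0, -8.5, 3.2, 950.0, -4.1]

# Assumed plausibility bounds; real limits depend on the recording setup.
flagged = range_check(trace, low=-200.0, high=200.0)
print("out-of-range sample indices:", flagged)
```

In practice the noise estimate passed to `snr_db` would come from a baseline or reference segment of the recording; how that segment is chosen is outside the scope of this sketch.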
PROVENANCE
All information pertaining to acquisition, QC, pre-processing, and analyses is captured and retained, providing a comprehensive history and provenance to the data. When algorithms are executed within the LONI Pipeline, provenance is captured in the form of machine- and human-readable XML files.
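To give a sense of what a machine- and human-readable XML provenance record could look like, here is a minimal Python sketch using the standard library. The element names, attributes, dataset identifier, and tool names are hypothetical; they do not reproduce the schema that the LONI Pipeline actually writes.

```python
import xml.etree.ElementTree as ET


def build_provenance(dataset_id, steps):
    """Build an XML provenance record (illustrative element names only)."""
    root = ET.Element("provenance", {"dataset": dataset_id})
    for order, (stage, tool, version) in enumerate(steps, start=1):
        step = ET.SubElement(root, "step", {"order": str(order), "stage": stage})
        ET.SubElement(step, "tool").text = tool
        ET.SubElement(step, "version").text = version
    return root


# Hypothetical processing history for one dataset.
record = build_provenance(
    "example-ct-001",
    [
        ("acquisition", "scanner-export", "1.0"),
        ("qc", "range-check", "0.1"),
        ("preprocessing", "resample", "0.3"),
    ],
)

# Serialize to text so the record can be stored alongside the data.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Recording each stage in order, together with the tool and version that produced it, is what allows a later reader to retrace how a derived file was obtained.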
CONTROL
COVID-ARC relies on an infrastructure comprised of fail-safe, redundant, and secure components to store both raw and processed data. In the event of a single system failure, redundant web, application, and database servers ensure service continuity, while data backup mechanisms are in place to protect the integrity of data. Investigators may choose to use centralized, federated, or cloud-based solutions to store their data. Data providers will maintain control over their data at all times; COVID-ARC simply provides a user-friendly tool to facilitate the storage, management, and sharing of those data.
Case Summaries
Johns Hopkins University Coronavirus Resource Center
Comprehensive global COVID-19 case tracker with critical trend evaluation
Coronavirus (COVID-19) Data in the United States
The New York Times is compiling time series data of confirmed and probable COVID-19 cases and deaths at the federal, state, and county level
Coronavirus (COVID-19) Cases Worldwide
Worldwide confirmed tested cases of COVID-19 and the number of deaths and recoveries from the disease sourced from Johns Hopkins University Center for Systems Science and Engineering
Novel Coronavirus 2019 Dataset
Day-level information on COVID-19 cases worldwide extracted from Johns Hopkins Coronavirus Resource Center
Citation format for each dataset in COVID-ARC database
COVID-19 CT Image Dataset – Site 1
Public chest CT image dataset of 2,168 COVID-19 positive images, 1,247 images from patients with other pulmonary illnesses, and 758 images of healthy patients from Sao Paulo, Brazil.
COVID-19 Collective CT Image Dataset – Site 2
Chest CT dataset with 349 COVID-19 positive images and 463 COVID-19 negative images from various hospitals in China.
COVID-19 CT Image and Lung Segmentation Dataset – Site 3
Chest CT dataset containing 100 COVID-19 positive images and lung segmentation masks for images provided by the Italian Society of Medical and Interventional Radiology (SIRM).
COVID-19 CT and CX Image Dataset – Site 5
Large dataset containing 6,687 CT, 7,377 CR, and 9,463 DX studies from the Valencian Region Medical ImageBank (BIMCV).
COVID-19 CT Image Database – Site 6
CT Scan database containing 1,110 COVID-19 positive cases and 50 lung segmentation masks from the Moscow Center of Diagnostics and Telemedicine.
COVID-19 CT Image Collection – Site 7
COVID-19 CT dataset containing 63,849 images from 377 patients from the Negin Medical Center in Sari, Iran.
COVID-19 Image Data Collection – Site 8
Public dataset of chest X-ray and CT images of COVID-19 positive patients and patients suspected of COVID-19 or other viral/bacterial pneumonias.
UESTC-COVID-19 Dataset – Site 9
Dataset was constructed for the purpose of pneumonia lesion segmentation and it contains CT scans (3D volumes) of 120 patients diagnosed with COVID-19.
2019nCoVR – Site 10
Dataset of CT images and metadata constructed from cohorts of the China Consortium of Chest CT Image Investigation (CC-CCII). All CT images are classified into novel coronavirus pneumonia (NCP) due to SARS-CoV-2 virus infection, common pneumonia, and normal controls.
COVID-19 Image Repository – Site 11
Chest X-ray dataset from the Institute for Diagnostic and Interventional Radiology, Hannover Medical School, Hannover, Germany. The public dataset contains 243 images of COVID-19 positive patients and includes extensive metadata for each image.
COVID-19 Radiography Database – Site 12
Database of chest X-ray images with 3,616 COVID-19 positive images, 1,345 viral pneumonia images, 6,012 non-COVID lung infection images, and 10,192 normal images.
COVID Data Saves Lives Clinical Dataset – Site 13
Private dataset containing clinical information for 2,157 COVID-19 patients from HM Hospitales in Madrid, Spain. Information available includes diagnoses, treatment, inpatient care, ICU stay, discharge, laboratory results, and medical imaging features.
COVID-19 Day Level Information Dataset – Site 14
Time series data on worldwide COVID-19 cases, including number of daily affected cases, deaths and recovery.
Data Science for COVID-19 – Site 15
Case, patient, and time series data based on more than 10,000 COVID-19 cases reported in South Korea.
COVID-19 Ultrasound Dataset – Site 16
Ultrasound dataset including 22 images and 50 videos from COVID-19 patients. This dataset also includes 57 videos and 22 images from patients with other bacterial or viral pneumonias as well as 73 videos and 15 images from healthy patients.
CT Images in COVID-19 – Site 17
Retrospective dataset of chest CT images from 632 COVID-19 patients.
Chest Imaging Dataset of Rural COVID-19 Population – Site 18
Dataset of 31,935 CT, CR, and DX images from 105 COVID-19 patients provided by The University of Arkansas for Medical Sciences (UAMS) Translational Research Institute.
RSNA International COVID-19 Open Radiology Database – Site 19
Dataset of 120 chest CT scans of 110 COVID-19 patients from four international sites. This dataset also includes detailed segmentations and diagnostic labels.
RSNA Pneumonia Imaging Dataset – Site 20
Chest X-Ray dataset of 29,684 images from patients with pneumonia.
Pneumonia X-ray Image Collection – Site 21
X-ray dataset from Guangzhou Women and Children’s Medical Center containing 5,863 images from healthy patients and patients diagnosed with bacterial/viral pneumonias.
MRI dataset – Site 22
Neurologic Complications of COVID-19 – Site 23
Clinical data of 581 patients from New York City who experienced neurological complications with COVID-19.
China Consortium of Chest X-Ray Image Investigation COVID-19 Dataset – Site 24
Chest X-Ray dataset of patients with COVID-19, patients with other bacterial and viral pneumonias, and healthy controls from the China Consortium of Chest X-Ray Image Investigation (CC-CXRI).
Open-Source COVID-19 CT Dataset – Site 25
A dataset of 62 submillimetric CTs from 50 COVID-19 patients in addition to automatic lung tissue classification and corresponding clinical scores.
ECG Images of Cardiac and COVID-19 Patients – Site 26
ECG images acquired from various cardiac institutions around Pakistan. ECG images include COVID-19 patients (250), healthy controls (859), and patients with cardiovascular diseases such as Myocardial Infarction (77), abnormal heartbeats (548), and patients with a history of MI (203).
UCSD Pneumonia Dataset – Site 27
A dataset of X-ray images acquired from 976 patients. This includes 1,276 total images, with 524 pneumonia positive images from around 400 patients.
ADDITIONAL DATASETS:
COVID-19 Case Surveillance Dataset
How to request access?
Data are made available for limited use upon completion of the registration
information and data use restrictions agreement (RIDURA).
To access the restricted dataset, please email: eocevent394@cdc.gov and include the completed RIDURA.
Access will be granted through https://github.com/cdc-data.
What variables are included in the dataset?
The public dataset includes the following variables: Initial case report date to CDC, Date of first
positive specimen collection, Symptom onset date, if symptomatic, Case status, Sex, Age group (0-9,
10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80+ years), Race and ethnicity (combined),
Hospitalization status, ICU admission status, Death status and Presence of underlying comorbidity or
disease. In addition to variables in the public dataset, the restricted access dataset also includes
the following variables: State of residence, County of residence, County FIPS code, Healthcare worker
status, Pneumonia present, Acute respiratory distress syndrome (ARDS) present, Abnormal chest x-ray
(CXR) present, Mechanical ventilation (MV)/intubation status and Presence of each of the following
symptoms: fever, subjective fever, chills, myalgia, rhinorrhea, sore throat, cough, shortness of
breath, nausea/vomiting, headache, abdominal pain, diarrhea.
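The age groups listed above are fixed ten-year bands with an open-ended top band. The following Python sketch shows how a raw age could be mapped to one of those bands; it is an illustration only, and the label strings are assumptions taken from the list above, which may not match the exact formatting used in the CDC files.

```python
def age_group(age):
    """Map an age in whole years to a ten-year band, with 80+ as the top band."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 80:
        return "80+"
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


# Example: bin a few hypothetical ages.
for years in (4, 37, 79, 92):
    print(years, "->", age_group(years))
```

Binning ages this way before sharing keeps exact ages out of the released records while preserving enough resolution for stratified analysis.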
Oxford COVID-19 Government Response Tracker (OxCGRT)
Oxford COVID-19 Government Response Tracker (OxCGRT) provides a systematic way to track government responses to COVID-19 across countries and sub-national jurisdictions over time. OxCGRT can be used to describe variation in government responses, explore whether the government response affects the rate of infection, and identify correlates of more or less intense responses.
Platform to aggregate and summarize open datasets related to COVID-19
Public Data Lake for Analysis of COVID-19 Data
Centralized repository of up-to-date and curated COVID-19 datasets hosted by Amazon Web Services
Collection of dataverses and datasets related to COVID-19 cases in China and the United States hosted by China
Open-Access Data and Computational Resources to Address COVID-19
Synthesized open-access data and computational resources freely available to researchers aggregated by the National Institutes of Health
Collection of open datasets about COVID-19 hosted by AMiner
COVID-19 Public Datasets Program
Repository of public datasets related to COVID-19 and the spread of COVID-19 hosted on Google Cloud Platform
Electron Microscopy Public Image Archive
EMPIAR database provides public microscopy data for several types of viruses, including SARS-CoV-2. Currently, there are 16 entries relating to SARS-CoV-2, providing raw 2D electron microscopy data that can be used to create a 3D volume. Different datasets consist of images, videos, or both. To access these datasets, search SARS-CoV-2 from the home page and all datasets will be listed and available to download.
All COVID-19 Data Archive (COVID-ARC) data are shared through a secure research data repository at the Laboratory of Neuro Imaging (LONI). Interested researchers may obtain access to COVID-19 imaging, clinical, or demographic data along with user-friendly tools for the purpose of scientific research, teaching, or clinical studies.
Each application is reviewed to verify investigator affiliation with a scientific or educational institution and on the basis of the proposed research or data use. Incomplete applications will be asked to provide additional material. We will respond to requests within one week via email. Approved applicants will receive login information to access and download COVID-19 data from COVID-ARC.
If utilizing data from Site 17, 18, or 19, please do not sign the Data Use Agreement below in order to comply with the Data Usage Policies and Restrictions of the original data provider.
Apply for access to data: To request access to COVID-ARC datasets, please complete and sign the appropriate Data Use Agreement.
Data upload uses secure encryption and is HIPAA compliant. Once data are uploaded, they are securely stored and regularly backed up.
ASPERA is a HIPAA-compliant software utility from IBM for transferring large datasets. It is an extremely fast and light file transfer client and is not subject to the limitations that exist in web browsers. This upload method requires providers to install ASPERA on their local machines and request the connection and host credentials from COVID-ARC, which include the correct file paths and storage locations on COVID-ARC servers. Once installed, providers simply log in and select files to transfer. No file structure or naming requirements are involved and data can be deleted from COVID-ARC at any time. View ASPERA instructions and ASPERA HIPAA Compliance.
LONI PIPELINE
The LONI Pipeline is a free workflow application primarily for computational scientists. With the LONI Pipeline, users can quickly create workflows that combine tools developed in various programming languages and apply them to neuroimaging, genomics, bioinformatics, and other related data.
LONI QUALITY CONTROL
LONI Quality Control (LONI QC) is an imaging data review and assessment platform for human imaging research studies involving either one or multiple centers. LONI QC allows users to anonymously download imaging data from COVID-ARC and run a standardized quality control check via an automated preprocessing system using a number of metrics. Users then receive a detailed report of the image quality.
Principal Investigator
Dominique Duncan, PhD
Assistant Professor of Neurology, Biomedical Engineering, and Neuroscience, Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California.
Dr. Duncan has developed novel analytic tools for analyzing multimodal data, including imaging and electrophysiology. Her interests lie at the intersection of data analysis, signal processing, and machine learning, particularly applied to traumatic brain injury and epilepsy. She has led and co-led large-scale multimodal databasing projects that are linked with visualization and analytic tools, aiming to encourage collaboration across multiple fields. She has also developed virtual reality tools to optimize the process of analyzing neuroimaging data and to improve neuroscience education among K-12 students.
Postdoctoral Scholar
Marianna La Rocca, PhD
Dr. La Rocca received her PhD in Physics applied to Neuroscience from Bari University. Her research involves the use of neuroimaging techniques and computational methods to study biomarkers of epileptogenesis after traumatic brain injury using multimodal data. She has been developing and applying complex network-based quantitative methods and machine learning techniques to electrophysiology and imaging data.
Postdoctoral Scholar
Sana Salehi, M.D.
Sana received her MD degree from Iran in 2017 and started working as a researcher in the field of diagnostic and interventional radiology after graduation. Her research interests include cardiothoracic imaging, neuroradiology, global health, and patient safety. She has also been working on applications of artificial intelligence in diagnostic imaging. She has focused her research on COVID-19 since February 2020 and joined Duncan Lab in 2021 to continue her COVID-19 research with other team members. Sana is passionate about translating scientific research to clinical applications.
Project Specialist
Rachael Garner, BA
Rachael Garner received her Bachelor of Arts in Cognitive Science from the University of California, Berkeley. She conducts multimodal data analysis on human and rodent imaging and electrophysiology data. She also works on multimodal databasing, including data curation and harmonization.
Project Assistant
Yujia Zhang, MS
Yujia received her B.S. in Chemical Engineering from Michigan State University in 2018 and her M.S. in Chemical Engineering from the University of Southern California in 2020. Her undergraduate research was focused on single-enzyme kinetics using the Atomic Force Microscope. She is currently working on multimodal data analysis and applying various machine learning methods on COVID-ARC data.
Instructor
Michael Sinclair, PhD
Michael Sinclair has been a technology teacher and coordinator at Bravo Medical Magnet High School since 1999 and has been in education since 1984. He has written and coordinated a number of grants and has co-created three academies. Since 2017, he has served as a California Department of Education-appointed Specialized Secondary Programs Mentor overseeing the implementation of programs throughout Southern California.
Instructor
Glendy Ramirez-De La Cruz, BS
Glendy is a Life Science and Career Technical Education (CTE) teacher at Bravo. Since 2012, she has had overall coordinating and management responsibility for the day-to-day operations of the STAR (Science Technology and Research) & EHA (Engineering Academy for Health) biotechnology programs at Bravo. She is the primary liaison between USC laboratories, principal investigators and graduate student mentors for the STAR & EHA capstone class.
Alexis Bennett
Alexis is currently a fourth year at the University of Southern California pursuing her Bachelor of Science in Computational Neuroscience and her Master of Science in Biomedical Engineering with a focus in Neuroengineering. She works as a student research assistant and performs manual segmentations on human neuroimaging data to contribute to the findings of potential biomarkers of epileptogenesis after traumatic brain injury.
Azrin Khan
Azrin is an incoming freshman at the University of Southern California’s Viterbi School of Engineering majoring in Electrical and Computer Engineering. She analyzes human imaging data of patients with traumatic brain injury using an interactive software application to identify biomarkers of epileptogenesis.
Jiaju Liu
Jiaju is an incoming freshman at Stanford University, where he plans to study symbolic systems. His research interests include using signal processing and unsupervised learning methods on EEG data to detect and classify high frequency oscillations, putative biomarkers of epileptogenesis. He has developed an automated high frequency oscillation classifier that solves novel machine learning problems regarding fast oscillating data.
Noor Nouaili
Noor is a rising freshman at Yale University. She recently graduated from the Marlborough School in Los Angeles and plans to major in neuroscience and global affairs. Noor has been conducting research for the Epilepsy Bioinformatics Study for Antiepileptogenic Therapy (EpiBioS4Rx) for the past year, focusing on performing MRI brain segmentations.
Aubrey Martinez
Aubrey Martinez is currently pursuing a Bachelor of Science in Neuroscience degree at the University of Southern California. She works on the analysis of human and rodent neuroimaging data, using manual and automated segmentation methods, to identify potential biomarkers of post-traumatic epileptogenesis.
Alexander Bruckhaus
Alex is from the San Francisco Bay Area and is currently a third-year Computer Science/Business student at USC. He is an undergraduate research assistant working on COVID-ARC and is using data science to study the spread and severity of COVID-19 among different populations. Alex hopes to coalesce his interests in Computer Science, Data Science, and Healthcare to further the understanding and progression of our contemporary world while helping others. He loves to eat ramen, travel, and explore the cultures of the world.
OUTREACH
As part of our goal to disseminate outreach material on a wider scale, slide decks from previous webinars for high school students are included below for download. To view videos of the presentations, please use the links under NEWS AND EVENTS.
Association Between ABO Blood Types and COVID-19
COVID-19 Correlational Analysis
How Science can Directly Relate to Public Health Guidelines and Health Inequities
Machine Learning and Artificial Intelligence
Machine Learning for COVID-19 Diagnosis
Threshold Based Segmentation for COVID-19
Transfer Learning for COVID-19
What Will I Get Out Of Doing Research
NEWS AND EVENTS
Bravo Public Health & Innovative Technology Data Science Workshop
Monday, March 6, 2023
New research sheds light on COVID-19 vaccine inequities in California
Wednesday, January 12, 2022
January 2022 COVID Research Webinar: Lightning Talks + Q&A
Wednesday, January 12, 2022
Tuesday, September 28, 2021
Bravo Summer Bridge 2021 - USC INI Presentation Webinar
Thursday, July 29, 2021
USC INI Presentation Webinar: Flyer
COVID-ARC Introduction to Research Webinar for High School Students
Friday, May 28, 2021
COVID-19 Research Lightning Round: Flyer
COVID-ARC Webinar for Los Angeles and San Diego High School Students and Teachers
Friday, January 22, 2021
COVID-19 Research Lightning Round: Flyer
COVID-ARC Webinar for Francisco Bravo Medical Magnet High School Students and Teachers
Friday, November 13, 2020
COVID-19 Research Lightning Round: Flyer
COVID-19 Research Lightning Round: Webinar and Q&A
Wednesday, September 16, 2020
COVID-19 Research Lightning Round: Video Recording
The COVID Information Commons brings together a group of researchers studying wide-ranging aspects of the current pandemic, to share their research and answer questions from our community. The first monthly webinar included talks by the following researchers.
Erick Jones, University of Texas at Arlington
EAGER: AI-Enabled Optimization of the COVID-19 Therapeutics Supply Chain to Support Community Public Health.
Howard Stone, Princeton University
Flow Asymmetry in Human Breathing and the Asymptomatic Spreader.
Michael Pazzani, University of California San Diego
RAPID: Explainable Machine Learning for Analysis of COVID-19 Chest CT.
Ashok Srinivasan, University of West Florida
Collaborative: RAPID: Leveraging New Data Sources to Analyze the Risk of COVID-19 in Crowded Locations.
Dominique Duncan, University of Southern California
RAPID: COVID-ARC (COVID-19 Data Archive).
Debbie Kim, University of Chicago
RAPID: Pandemic Learning Loss in U.S. High Schools: A National Examination of Student Experiences.
Nora Garza, Laredo College
RAPID: Using real life COVID-19 Data to teach quantitative reasoning skills to undergraduate Hispanic STEM students.
Ajitesh Srivastava, University of Southern California
RAPID: ReCOVER: Accurate Predictions and Resource Allocation for COVID-19 Epidemic Response.
Scientists launch data archive to bolster research on COVID-19
August 19th, 2020
FAQ
How do the centralized and federated storage models differ?
Data providers who choose to store their data under a centralized model will transfer their data to be stored at the USC Stevens Neuroimaging and Informatics Institute. Under the federated model, data providers will keep their data stored locally at their sites. If a COVID-ARC user is granted permission by the data provider to access any of the federated data, the data will be transferred to the user from the data provider’s home institution’s database. COVID-ARC’s central server will only have a description of the federated data but will not store the data.
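The difference between the two models can be pictured as a catalog in which every dataset has a description, but only centralized datasets have a storage location on the central server. The Python sketch below is a hypothetical illustration of that idea, not COVID-ARC's actual data model; the class, field names, paths, and `example.org` endpoint are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogEntry:
    dataset_id: str
    description: str                          # always held by the central server
    storage_model: str                        # "centralized" or "federated"
    central_path: Optional[str] = None        # set only for centralized data
    provider_endpoint: Optional[str] = None   # set only for federated data


def resolve_source(entry, provider_approved):
    """Return the location an approved user's download would be served from."""
    if entry.storage_model == "centralized":
        return entry.central_path
    if entry.storage_model == "federated":
        # Federated data stays at the provider's site until access is granted.
        if not provider_approved:
            raise PermissionError("data provider has not granted access")
        return entry.provider_endpoint
    raise ValueError(f"unknown storage model: {entry.storage_model}")


central = CatalogEntry("site-a", "Chest CT images", "centralized",
                       central_path="/archive/site-a")
federated = CatalogEntry("site-b", "Clinical records", "federated",
                         provider_endpoint="https://provider.example.org/site-b")

print(resolve_source(central, provider_approved=False))
print(resolve_source(federated, provider_approved=True))
```

The point of the sketch is that a federated entry never carries a central storage path: the central server can describe and locate the data, but the transfer itself originates at the provider's institution.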
How can I transfer data to COVID-ARC?
Data can be securely imported to COVID-ARC through the file transfer client Aspera. Since Aspera is a general file transfer client, any data format is transferable. More information can be found at https://asperasoft.com/software/client-options/desktop-client/ and in the Aspera Tutorial.
What happens if a network connection is lost during data transfer?
Both COVID-ARC and Aspera have tools in place to minimize the burden of lost connections during data upload. Aspera pauses the transfer and can resume once the network connection is reestablished.
How long will data transfers take?
File transfer time depends largely on the uploader’s network connection. Aspera uses a proprietary data transfer protocol developed by IBM that makes it faster than traditional file transfer clients.
After COVID-ARC’s funding period concludes, what will happen with the archive?
COVID-ARC is committed to the security and persistence of data shared by COVID-ARC researchers and data providers. All data uploaded to COVID-ARC will remain securely stored at the USC Stevens Neuroimaging and Informatics Institute.
What types of projects would be most suited for federated storage?
The federated model can accommodate the data needs of any project, but it is particularly effective for large datasets that are collected continuously over long periods of time. That way, data can be queried as it is collected without the need for team members to upload large datasets repeatedly.
COVID-ARC is powered by:
Laboratory of Neuro Imaging
USC Stevens Neuroimaging and Informatics Institute
Keck School of Medicine of USC
University of Southern California
2025 Zonal Avenue
Los Angeles, CA 90033
Dominique Duncan, Principal Investigator
dduncan@loni.usc.edu
Tel: (323) 865-1754
Fax: (323) 442-0137