Overview

The JHU CSSE COVID-19 data repository, established in 2020 by the Johns Hopkins University Center for Systems Science and Engineering (CSSE), provides publicly accessible data on the global COVID-19 pandemic. It became a critical resource for understanding the spread and impact of the virus. The repository collects, processes, and publishes global data on confirmed cases, deaths, and recoveries, organized by geographical location. JHU stopped collecting new data on March 10, 2023; the repository remains available as a historical archive.

The data is primarily distributed through CSV files hosted on GitHub, which allows for direct download and integration into various data processing pipelines. While it does not offer a traditional REST API, the structured nature of the CSVs facilitates parsing and programmatic access. This approach makes the data particularly well-suited for developers, researchers, and data scientists who require direct access to raw data for custom analysis, model building, and visualization projects.

The repository excels in scenarios requiring historical time series data and daily updates for tracking pandemic trends. It is widely utilized in academic research for epidemiological studies, in public health analysis for situational awareness and policy formulation, and by journalists for data-driven reporting. Organizations such as Our World in Data have also leveraged this data to create their own visualizations and analyses, demonstrating its utility as a foundational dataset for global health monitoring.

The JHU CSSE data is recognized for its broad geographic coverage: country level throughout, state/province level for many countries, and county level for some regions. This detail enables localized analysis and comparative studies across administrative divisions. The project's commitment to public availability and regular updates kept it relevant throughout the pandemic.
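
Because some countries are reported at state/province level, a country-level figure often has to be summed from several rows. The sketch below shows one way to do that with Pandas. The column names `Province_State` and `Country_Region` are assumed to match the later daily-report layout (earlier files use different headers), and the rows are made-up placeholder values, not real figures.

```python
import io

import pandas as pd

# Illustrative rows only: invented places and numbers, with column names
# assumed to follow the later daily-report layout.
SAMPLE = """Province_State,Country_Region,Confirmed,Deaths
North,Exampleland,100,2
South,Exampleland,250,5
,Samplestan,40,0
"""


def country_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Sum province-level rows into one row per country."""
    return df.groupby("Country_Region", as_index=False)[["Confirmed", "Deaths"]].sum()


daily = pd.read_csv(io.StringIO(SAMPLE))
totals = country_totals(daily)
print(totals)
```

The same function can be applied to a DataFrame loaded from a real daily report, provided that file uses these column names.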

Key features

  • Global COVID-19 Case Data: Provides confirmed cases, deaths, and recoveries across countries and regions globally.
  • Daily Reports: Offers daily snapshots of aggregated COVID-19 data, one CSV file per report date.
  • Time Series Data: Includes historical data points for cases, deaths, and recoveries, enabling trend analysis and longitudinal studies.
  • Geographic Granularity: Data is often available at country, state/province, and sometimes county levels for detailed analysis.
  • CSV File Distribution: Data is published as structured CSV files on GitHub, facilitating direct download and programmatic parsing.
  • Publicly Available: All data is free to access and use under the Creative Commons Attribution 4.0 International (CC BY 4.0) license stated in the repository README, which permits both commercial and non-commercial use with attribution.
  • Data Documentation: Comprehensive documentation on the GitHub repository explains data fields, methodologies, and update schedules.
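
The time series files store one column per date, so trend analysis usually starts by reshaping them into one row per location and date. The following is a minimal sketch of that step, assuming the global time series layout (`Province/State`, `Country/Region`, `Lat`, `Long`, then date columns such as `1/22/20`). The sample rows are invented, and the helper names `to_long` and `add_daily_new` are hypothetical.

```python
import io

import pandas as pd

# Illustrative rows only: invented places and counts in the assumed wide layout.
SAMPLE_TS = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Exampleland,10.0,20.0,0,3,7
North,Samplestan,30.0,40.0,1,1,4
"""

ID_COLS = ["Province/State", "Country/Region", "Lat", "Long"]
KEYS = ["Country/Region", "Province/State"]


def to_long(wide: pd.DataFrame) -> pd.DataFrame:
    """Reshape one-column-per-date data into one row per location and date."""
    long = wide.melt(id_vars=ID_COLS, var_name="date", value_name="confirmed")
    long["date"] = pd.to_datetime(long["date"], format="%m/%d/%y")
    # Country-level rows have no province; use "" so grouping keeps them.
    long["Province/State"] = long["Province/State"].fillna("")
    return long.sort_values(KEYS + ["date"], ignore_index=True)


def add_daily_new(long: pd.DataFrame) -> pd.DataFrame:
    """Derive daily new counts from cumulative counts, per location."""
    out = long.copy()
    # The first date of each location has no previous value; it is set to 0 here.
    out["new_confirmed"] = (
        out.groupby(KEYS)["confirmed"].diff().fillna(0).astype(int)
    )
    return out


result = add_daily_new(to_long(pd.read_csv(io.StringIO(SAMPLE_TS))))
print(result[["Country/Region", "date", "confirmed", "new_confirmed"]])
```

Treating the first date as zero new cases is a design choice for this sketch; an analysis that needs the initial count may prefer to keep the cumulative value instead.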

Pricing

The JHU CSSE COVID-19 data is entirely free to access and use. All data files are publicly available in the project's GitHub repository and require no subscription, API key, or payment.

Feature | Cost (as of 2026-05-28) | Notes
Access to all historical and daily data | Free | No charges for data download or usage.
Updates | Free | Regular updates provided without cost.
Support | Community-driven | Support primarily through GitHub issues and community engagement.

For the most current information regarding data access, refer to the JHU CSSE COVID-19 Data README.

Common integrations

The JHU CSSE data is integrated by downloading and processing its CSV files. Its structured format works with a range of data analysis and visualization tools:

  • Python Data Analysis: Developers commonly use libraries like Pandas to read and process the CSV files for data cleaning, transformation, and analysis.
  • R Statistical Computing: R users integrate the data for statistical modeling, epidemiological analysis, and creating interactive dashboards.
  • Business Intelligence (BI) Tools: Tools such as Tableau, Power BI, or Qlik Sense can import the CSVs for creating dashboards and reports.
  • Geographic Information Systems (GIS): GIS platforms like ArcGIS can ingest the data for spatial analysis and mapping, leveraging the geographic coordinates provided in some datasets. The ArcGIS Developer documentation provides resources for working with spatial data.
  • Custom Web Applications: Data can be fetched and parsed by backend services to power custom web applications and interactive maps.
  • Database Ingestion: The CSV files can be loaded into relational or NoSQL databases for long-term storage and complex querying.
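
As one illustration of the database ingestion path listed above, the sketch below loads a small DataFrame into an in-memory SQLite database and queries it with SQL. The table name `daily_report`, the helper `load_into_sqlite`, and the sample rows are all hypothetical; a real pipeline would read an actual CSV file and likely use a persistent database.

```python
import io
import sqlite3

import pandas as pd

# Illustrative rows only: invented places and numbers.
SAMPLE = """Country_Region,Confirmed,Deaths
Exampleland,120,3
Samplestan,45,1
"""


def load_into_sqlite(df: pd.DataFrame, conn: sqlite3.Connection,
                     table: str = "daily_report") -> int:
    """Write the DataFrame to a SQLite table and return the stored row count."""
    df.to_sql(table, conn, if_exists="replace", index=False)
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
row_count = load_into_sqlite(pd.read_csv(io.StringIO(SAMPLE)), conn)

# Query the loaded table for the location with the most confirmed cases.
top = conn.execute(
    "SELECT Country_Region, Confirmed FROM daily_report "
    "ORDER BY Confirmed DESC LIMIT 1"
).fetchone()
print(row_count, top)
```

Using `if_exists="replace"` keeps the sketch repeatable; appending each daily file with a report-date column would suit long-term storage better.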

Getting started

To begin working with the JHU CSSE COVID-19 data, download the CSV files directly from the project's GitHub repository. The following Python example fetches a daily report and loads it into a Pandas DataFrame, a common first step for data analysis.

import pandas as pd

# Define the base URL for the raw data on GitHub
base_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/"

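# Example: Fetch data for a specific date (e.g., May 27, 2021).
# Daily reports exist only for dates within the collection period;
# JHU stopped updating the repository on March 10, 2023.
date_str = "05-27-2021" # Format: MM-DD-YYYY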
file_url = f"{base_url}{date_str}.csv"

try:
    # Read the CSV file directly into a Pandas DataFrame
    df = pd.read_csv(file_url)
    print(f"Successfully loaded data for {date_str}. Shape: {df.shape}")
    print(df.head()) # Display the first few rows of the DataFrame
except Exception as e:
    print(f"Error loading data for {date_str}: {e}")
    print("Please ensure the date is valid and the file exists at the specified URL.")

# To access time series data, you would use a different path:
# time_series_confirmed_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"
# df_ts_confirmed = pd.read_csv(time_series_confirmed_url)
# print("\nGlobal Confirmed Time Series Data:")
# print(df_ts_confirmed.head())

This script loads the daily report CSV for a specific date. Change the date_str variable to fetch other dates within the collection period. For details on the data structure and available files, refer to the README in the JHU CSSE COVID-19 GitHub repository.
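
Since a daily report file exists only for dates in the collection period, it can help to build the URL from a date object and reject out-of-range dates before making a request. The helper below is a hypothetical sketch, not part of the repository. The first report date (January 22, 2020) matches the start of the time series, while the last report date used here (March 9, 2023) is an assumption that should be checked against the repository's file listing.

```python
from datetime import date

BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports/"
)
FIRST_REPORT = date(2020, 1, 22)
LAST_REPORT = date(2023, 3, 9)  # Assumption: verify against the repository.


def daily_report_url(day: date) -> str:
    """Return the raw CSV URL for a report date, using the MM-DD-YYYY file name."""
    if not FIRST_REPORT <= day <= LAST_REPORT:
        raise ValueError(
            f"{day.isoformat()} is outside the assumed range "
            f"{FIRST_REPORT.isoformat()} to {LAST_REPORT.isoformat()}"
        )
    return f"{BASE_URL}{day.strftime('%m-%d-%Y')}.csv"


print(daily_report_url(date(2021, 5, 27)))
```

The returned string can be passed straight to pd.read_csv, as in the script above.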