SDKs overview

USGS Water Services provides access to hydrological and water quality data through REST web services. Developers can call these services directly, or use Software Development Kits (SDKs) and libraries that wrap the common request patterns, handle authentication where a service requires it (USGS services are generally open access), and parse the responses. This lets developers focus on application logic rather than low-level HTTP requests. The main purpose of these SDKs is programmatic access to data from systems such as the National Water Information System (NWIS) and the Water Quality Portal (WQP).
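
To make the "direct REST call" option concrete, the sketch below builds a daily values query URL using only the Python standard library. The base URL and the query parameter names (format, sites, parameterCd, startDT, endDT) are assumptions based on the public NWIS daily values service and should be checked against the current service documentation; build_dv_url is a hypothetical helper, not part of any USGS package.

```python
from urllib.parse import urlencode

# Assumed base URL of the NWIS daily values service; verify before relying on it.
NWIS_DV_URL = "https://waterservices.usgs.gov/nwis/dv/"


def build_dv_url(site, parameter_code, start, end):
    """Build a daily values query URL (hypothetical helper).

    Dates are ISO strings in YYYY-MM-DD form.
    """
    params = {
        "format": "json",
        "sites": site,
        "parameterCd": parameter_code,
        "startDT": start,
        "endDT": end,
    }
    # urlencode keeps the insertion order of the dict and escapes values as needed
    return f"{NWIS_DV_URL}?{urlencode(params)}"


url = build_dv_url("01646500", "00060", "2023-01-01", "2023-01-31")
print(url)
```

The resulting URL can be passed to any HTTP client. An SDK performs this step, plus the request and the response parsing, behind a single function call.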

The official and community-contributed libraries provide functions for common retrieval tasks, such as querying streamflow, groundwater levels, or water quality parameters for specific time ranges and locations. These tools are particularly useful for researchers, environmental scientists, and application developers who build on USGS water data.

Official SDKs by language

The USGS maintains official libraries primarily in Python and R, the languages most common in scientific and data analysis work. Both provide a consistent interface to the data endpoints offered by USGS Water Services.

| Language | Package name | Description | Maturity |
|---|---|---|---|
| Python | dataretrieval | A Python package for retrieving data from USGS and EPA web services, including NWIS and WQP. It simplifies data queries and returns data in a pandas DataFrame format. | Stable |
| R | dataRetrieval | An R package for obtaining water quality and hydrological data from USGS and EPA data sources. It provides functions to access NWIS and WQP data, commonly used in hydrological analysis. | Stable |

Installation

Both official SDKs install through the standard package manager for their language. The following instructions cover the common installation methods.

Python: dataretrieval

The dataretrieval package for Python can be installed using pip, the standard package installer for Python. It is recommended to install it within a Python virtual environment to manage dependencies effectively.

pip install dataretrieval

After installation, you can verify the package by importing it in a Python interpreter:

import dataretrieval as dr
print(dr.__version__)

R: dataRetrieval

The dataRetrieval package for R is available on CRAN (Comprehensive R Archive Network) and can be installed using R's built-in install.packages() function.

install.packages("dataRetrieval")

To load the package and verify its installation:

library(dataRetrieval)
packageVersion("dataRetrieval")

Quickstart example

This quickstart demonstrates how to retrieve daily streamflow data for a specific USGS gauging station using both the Python dataretrieval and R dataRetrieval libraries.

Python example: Retrieving daily streamflow data

This Python snippet uses dataretrieval to fetch daily mean streamflow (parameter code 00060) for a USGS station (e.g., 01646500) over a specified date range. The call returns a pandas DataFrame together with a metadata object describing the request.

import dataretrieval.nwis as nwis

# Define station ID, parameter code for streamflow, and date range
station_id = "01646500"  # Example: Potomac River near Washington, DC (Little Falls Pump Station)
parameter_code = "00060"  # Discharge, cubic feet per second
start_date = "2023-01-01"
end_date = "2023-01-31"

# Retrieve daily values; get_dv returns a (DataFrame, metadata) tuple
df_daily, metadata = nwis.get_dv(sites=station_id, parameterCd=parameter_code,
                                 start=start_date, end=end_date)

# Print the first few rows of the DataFrame
print(f"Data for station {station_id} from {start_date} to {end_date}:")
print(df_daily.head())

# Access the streamflow column. In the Python package the daily mean is
# typically named '00060_Mean'; check df_daily.columns if it differs.
flow_column = "00060_Mean"
if not df_daily.empty and flow_column in df_daily.columns:
    print(f"Average streamflow in January 2023: {df_daily[flow_column].mean():.2f} cfs")

R example: Retrieving daily streamflow data

This R snippet performs the same task using the dataRetrieval package, fetching daily mean streamflow for the same station and date range. The result is an R data frame.

# Load the dataRetrieval package
library(dataRetrieval)

# Define station ID, parameter code for streamflow, and date range
station_id <- "01646500" # Example: Potomac River near Washington, DC (Little Falls Pump Station)
parameter_code <- "00060" # Discharge, cubic feet per second
start_date <- "2023-01-01"
end_date <- "2023-01-31"

# Retrieve daily data
df_daily <- readNWISdv(siteNumbers = station_id, 
                       parameterCd = parameter_code, 
                       startDate = start_date, 
                       endDate = end_date)

# Print the first few rows of the data frame
cat(paste0("Data for station ", station_id, " from ", start_date, " to ", end_date, ":\n"))
print(head(df_daily))

# Access specific columns, e.g., the streamflow value (often 'X_00060_00003')
if (!is.null(df_daily) && "X_00060_00003" %in% names(df_daily)) {
  cat(paste0("Average streamflow in January 2023: ", round(mean(df_daily$X_00060_00003, na.rm = TRUE), 2), " cfs\n"))
}

Community libraries

Beyond the official offerings, the open-source community contributes various libraries and tools that interact with USGS Water Services. These libraries often cater to specific use cases, integrate with other data analysis ecosystems, or offer alternative interfaces. While not officially maintained by the USGS, they can provide valuable functionality and are often developed by active users of the data.

  • JavaScript/Node.js Libraries: Developers working in web environments might find community-contributed JavaScript libraries that wrap USGS API endpoints. These can simplify client-side or server-side data fetching for web applications. For example, some libraries might focus on fetching real-time streamflow data for interactive maps.
  • GIS Integration Tools: Given the geospatial nature of water data, several community projects focus on integrating USGS data with Geographic Information Systems (GIS) software. These tools might provide direct connectors or scripts to import data into platforms like ArcGIS or QGIS, facilitating spatial analysis and visualization. Esri's ArcGIS API for Python, although a vendor product rather than a community library, can likewise be used to bring geospatial datasets, including those from USGS, into ArcGIS workflows.
  • Other Language Bindings: While Python and R are primary, developers in other languages (e.g., Java, C#, Go) may have created their own wrappers or helper functions to interact with the well-documented USGS Water Services REST APIs. These can often be found on platforms like GitHub by searching for relevant keywords such as "USGS API" or "NWIS client" in the respective language.
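
A wrapper of the kind described in the last item mostly does two things: build a query and flatten the response. The sketch below illustrates the flattening step in Python. It assumes the JSON layout commonly returned by the NWIS services (a value.timeSeries list whose entries carry sourceInfo.siteCode and nested values[].value[] points); verify this against a real response before relying on it. The extract_daily_values helper is hypothetical, and the sample payload is synthetic: its numbers are placeholders, not real measurements.

```python
import json


def extract_daily_values(payload):
    """Flatten an NWIS-style JSON payload into (site, date, value) tuples.

    Hypothetical helper; the key names reflect an assumed response layout.
    """
    rows = []
    for series in payload.get("value", {}).get("timeSeries", []):
        site = series["sourceInfo"]["siteCode"][0]["value"]
        for block in series.get("values", []):
            for point in block.get("value", []):
                date = point["dateTime"][:10]  # keep only YYYY-MM-DD
                rows.append((site, date, float(point["value"])))
    return rows


# Synthetic payload for illustration only; the values are placeholders.
sample = json.loads("""
{
  "value": {
    "timeSeries": [
      {
        "sourceInfo": {"siteCode": [{"value": "01646500"}]},
        "values": [
          {
            "value": [
              {"value": "100.0", "dateTime": "2023-01-01T00:00:00.000"},
              {"value": "200.0", "dateTime": "2023-01-02T00:00:00.000"}
            ]
          }
        ]
      }
    ]
  }
}
""")

rows = extract_daily_values(sample)
for site, date, value in rows:
    print(site, date, value)
```

When evaluating a community library, comparing its parsing code against a sketch like this one is a quick way to see how it handles missing series or empty value lists.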

When considering community libraries, it is advisable to review their documentation, community activity, and maintenance status. Checking the project's GitHub repository for recent commits, open issues, and pull requests can provide insight into its ongoing support and reliability.