SDKs overview
The British National Bibliography (BNB) provides comprehensive bibliographic data for publications in the UK and Ireland. Some modern API services offer self-service Software Development Kits (SDKs) for direct API interaction. Access to BNB data, by contrast, is primarily managed through licensing agreements for data formats such as MARC records and Linked Open Data (LOD) (see British National Bibliography information). Developers therefore typically work with data files or streams rather than making direct HTTP requests to a real-time API endpoint through a dedicated SDK.
Consequently, the British Library does not provide official SDKs, in the traditional sense, for a generic, publicly accessible BNB API. Instead, developers use existing libraries and tools designed to parse, process, and interact with standard bibliographic data formats such as MARC 21 and the various Linked Open Data serializations (e.g., RDF/XML, Turtle, JSON-LD). These tools, often community-driven or general-purpose, enable programmatic interaction with the data once it has been acquired through a licensing agreement with the British Library's bibliographic services team.
The flexibility of these standard data formats allows for integration across various programming languages and environments, enabling applications ranging from library cataloging systems to academic research tools. Developers needing to integrate BNB data into their applications will focus on selecting appropriate parsers and processing libraries for MARC or LOD, depending on the specific data format licensed.
Official SDKs by language
As noted, the British National Bibliography does not provide official, language-specific SDKs for a direct, self-service API. BNB data is primarily distributed under licensing agreements in standardized formats such as MARC 21 and Linked Open Data (see British National Bibliography services). Developers who need programmatic access to this data typically use third-party or community-developed libraries designed to handle these formats.
For example, to work with MARC 21 records, developers might use libraries that implement the ISO 2709 standard for bibliographic record exchange (see the MARC 21 specifications from the Library of Congress). Similarly, for Linked Open Data, general-purpose RDF libraries are used to parse and query data. The table below lists common approaches and relevant libraries by programming language. None of them is an official BNB SDK.
| Language | Common Package/Approach | Description | Maturity |
|---|---|---|---|
| Python | pymarc | A library for reading, writing, and manipulating MARC records. | Stable, actively maintained |
| Java | marc4j | A Java framework for working with MARC 21 records. | Stable, community-supported |
| Ruby | ruby-marc | Ruby library for parsing and manipulating MARC records. | Stable, community-supported |
| JavaScript (Node.js) | marc-json, rdflib.js | Libraries for converting MARC to JSON or handling RDF data. | Varies by library; generally active |
| PHP | php-marc | A PHP library for reading and writing MARC records. | Stable, community-supported |
| General (LOD) | RDF libraries (e.g., Apache Jena, rdflib) | Frameworks for parsing, querying, and serializing RDF data in various formats (Turtle, JSON-LD, RDF/XML). | Highly mature, widely used |
Installation
Since there are no official BNB-specific SDKs, installation involves integrating general-purpose libraries for MARC or Linked Open Data processing into your development environment. The specific installation steps depend on the chosen programming language and package manager. Below are examples for common languages, illustrating how to add these types of libraries.
Python
For Python, pymarc is a widely used library for handling MARC records. Installation is typically done via pip:
pip install pymarc
To work with Linked Open Data, rdflib is a robust choice:
pip install rdflib
Java
For Java projects, marc4j can be included as a dependency using Maven or Gradle. For Maven, add the following to your pom.xml:
<dependency>
    <groupId>org.marc4j</groupId>
    <artifactId>marc4j</artifactId>
    <version>2.8.0</version> <!-- Check for the latest version -->
</dependency>
For Linked Open Data, Apache Jena is a comprehensive framework. Its core libraries can be added via Maven:
<dependency>
    <groupId>org.apache.jena</groupId>
    <artifactId>jena-core</artifactId>
    <version>4.9.0</version> <!-- Check for the latest version -->
</dependency>
<dependency>
    <groupId>org.apache.jena</groupId>
    <artifactId>jena-arq</artifactId>
    <version>4.9.0</version> <!-- Check for the latest version -->
</dependency>
Ruby
For Ruby, the ruby-marc library is published on RubyGems under the gem name marc. It can be installed directly or through Bundler:
gem install marc
If using Bundler, add to your Gemfile:
gem 'marc'
Then run bundle install.
JavaScript (Node.js)
For Node.js environments, libraries like marc-json (for MARC data conversion) or rdflib.js (for RDF/LOD) can be installed via npm:
npm install marc-json
npm install rdflib
Quickstart example
This quickstart demonstrates how to parse a MARC 21 record using Python's pymarc library, assuming you have obtained a MARC file (e.g., bnb_records.mrc) from your British Library data license. This example will read records from a file and print specific fields, such as the title and author.
Prerequisites
- Python installed.
- pymarc library installed (pip install pymarc).
- A MARC 21 formatted file (e.g., bnb_records.mrc) containing BNB data.
Python example: Reading MARC records
from pymarc import MARCReader


def process_bnb_marc_file(filename):
    """Reads a MARC file and prints basic bibliographic information."""
    try:
        with open(filename, 'rb') as fh:
            reader = MARCReader(fh)
            print(f"Processing MARC file: {filename}")
            record_count = 0
            for record in reader:
                if record is None:
                    # pymarc yields None for a record it could not parse
                    continue
                record_count += 1
                print(f"\n--- Record {record_count} ---")

                # Extract title (field 245, subfields a and b).
                # Membership is checked before indexing: depending on the
                # pymarc release, indexing a missing field or subfield either
                # returns None or raises KeyError.
                if '245' in record:
                    title_field = record['245']
                    title = title_field['a'] if 'a' in title_field else ''
                    if 'b' in title_field:  # Subtitle
                        title += ' ' + title_field['b']
                    print(f"Title: {title.strip()}")

                # Extract author (field 100, subfield a)
                if '100' in record and 'a' in record['100']:
                    print(f"Author: {record['100']['a'].strip()}")

                # Extract publication year (field 264 or 260, subfield c)
                pub_year = None
                if '264' in record and 'c' in record['264']:
                    pub_year = record['264']['c']
                elif '260' in record and 'c' in record['260']:
                    pub_year = record['260']['c']
                if pub_year:
                    # Often includes punctuation like '.' or '©'
                    print(f"Publication Year: {pub_year.strip().replace('.', '').replace('©', '')}")

                # Example: Accessing a specific subfield directly
                # ISBN (field 020, subfield a)
                if '020' in record and 'a' in record['020']:
                    print(f"ISBN: {record['020']['a'].strip()}")
    except FileNotFoundError:
        print(f"Error: File '{filename}' not found. Please ensure the MARC file is in the correct directory.")
    except Exception as e:
        print(f"An error occurred: {e}")


if __name__ == "__main__":
    # Replace 'bnb_records.mrc' with the path to your actual MARC file
    process_bnb_marc_file('bnb_records.mrc')
Explanation
- Import MARCReader: This imports the necessary class from pymarc to read MARC files.
- Open File: The code opens the MARC file in binary read mode ('rb').
- Iterate Records: MARCReader acts as an iterator, yielding one Record object for each MARC record in the file.
- Access Fields: Each record object allows access to MARC fields by their three-digit tag (e.g., record['245'] for the title field).
- Access Subfields: Within a field, subfields are accessed by their single-character code (e.g., title_field['a'] for the main title).
- Error Handling: Basic try-except blocks are included to handle a potential FileNotFoundError and other general exceptions.
This example provides a foundation for parsing and extracting data. More complex applications might involve filtering records, transforming data into other formats (e.g., JSON, XML), or integrating with databases.
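The cleanup in the quickstart removes only '.' and '©' from the 260/264 subfield c value, but date statements can carry other cataloguing punctuation. The following is a minimal sketch, not part of pymarc, that pulls the first plausible four-digit year out of such a string with a regular expression. The helper name extract_year and the sample inputs are illustrative assumptions, not values taken from BNB data.

```python
import re

# Matches a standalone four-digit year from 1000 to 2099.
# The year range is an assumption chosen for this sketch.
YEAR_PATTERN = re.compile(r"(?<!\d)(1\d{3}|20\d{2})(?!\d)")


def extract_year(date_statement):
    """Return the first four-digit year in a date statement, or None."""
    if not date_statement:
        return None
    match = YEAR_PATTERN.search(date_statement)
    return int(match.group(1)) if match else None


# Hypothetical subfield c values for illustration only
samples = ["2019.", "©2020", "[2018]", "c1998.", "[n.d.]"]
years = [extract_year(value) for value in samples]
print(years)
```

Returning an integer or None keeps the result easy to filter or sort on. Ranges and uncertain dates would need additional handling that this sketch does not attempt.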
Community libraries
Because BNB data is distributed in standard formats rather than through a proprietary API with dedicated SDKs, community-developed libraries play a central role in enabling developers to work with it. The formats themselves are language-independent; each library implements them for one programming language and exposes a developer-friendly interface.
MARC 21 processing libraries
MARC 21 (Machine-Readable Cataloging) is a widely adopted standard for the representation and communication of bibliographic and related information (see Library of Congress MARC Standards). Numerous community libraries exist across various programming languages to parse, validate, and manipulate MARC 21 records. These libraries abstract away the complexities of the ISO 2709 record structure, allowing developers to interact with records, fields, and subfields programmatically.
- Python: pymarc is a well-established and actively maintained library for Python developers, offering functionalities for reading, writing, and modifying MARC records. It supports MARCXML conversion and various encoding schemes.
- Java: marc4j provides a comprehensive framework for Java applications to process MARC 21 records, including parsing from various input streams and serializing to different output formats.
- Ruby: ruby-marc is a popular choice for Ruby developers, providing methods to parse and build MARC records, often used in conjunction with library management systems.
- PHP: Libraries like php-marc offer similar capabilities for PHP-based applications, enabling server-side processing of MARC data.
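To make concrete what these libraries hide, the sketch below decodes the ISO 2709 layout by hand using only the Python standard library: a 24-byte leader, a directory of 12-byte entries (tag, field length, start offset), and variable fields separated by control characters. The function names build_record and parse_record are hypothetical helpers written for this illustration, and the sample record is invented. The sketch assumes UTF-8 data and performs no validation, so it is not a substitute for pymarc or the other libraries listed above.

```python
FIELD_END = b"\x1e"   # field terminator
RECORD_END = b"\x1d"  # record terminator
SUBFIELD = b"\x1f"    # subfield delimiter


def build_record(fields):
    """Serialize [(tag, body_bytes), ...] into one record (demo helper)."""
    directory = b""
    data = b""
    for tag, body in fields:
        body += FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    directory += FIELD_END
    base = 24 + len(directory)        # base address of the data area
    total = base + len(data) + 1      # +1 for the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + data + RECORD_END


def parse_record(raw):
    """Return [(tag, indicators, value), ...] for one raw record."""
    base = int(raw[12:17].decode("ascii"))   # leader positions 12-16
    directory = raw[24:base - 1]             # skip the directory terminator
    parsed = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        body = raw[base + start:base + start + length - 1]
        if tag.startswith("00"):
            # Control fields have no indicators or subfields
            parsed.append((tag, None, body.decode("utf-8")))
        else:
            subfields = [
                (chunk[:1].decode("ascii"), chunk[1:].decode("utf-8"))
                for chunk in body[2:].split(SUBFIELD)[1:]
            ]
            parsed.append((tag, body[:2].decode("ascii"), subfields))
    return parsed


# Invented sample data, not a BNB record
sample_fields = [
    ("001", b"demo0001"),
    ("245", b"10\x1faAn example title :\x1fba subtitle"),
]
raw = build_record(sample_fields)
parsed = parse_record(raw)
for tag, indicators, value in parsed:
    print(tag, indicators, value)
```

Real records also involve character-set handling (MARC-8 versus UTF-8), repeated fields, and malformed input, which is where a maintained library earns its place.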
Linked Open Data (LOD) libraries
The British National Bibliography also provides data as Linked Open Data, which adheres to W3C standards for RDF (Resource Description Framework) (see W3C RDF Primer). Working with LOD involves parsing RDF triples, querying with SPARQL, and serializing data into formats like Turtle, N-Triples, RDF/XML, or JSON-LD. Community and open-source libraries for LOD are typically generic: they are designed to handle any RDF data, which makes them directly applicable to BNB's LOD offerings.
- Python: rdflib is a powerful Python library for working with RDF, supporting various parsers and serializers, SPARQL querying, and graph manipulation.
- Java: Apache Jena is a comprehensive open-source Java framework for building Semantic Web applications, including tools for RDF, RDFS, OWL, and SPARQL. It provides robust capabilities for handling large-scale LOD datasets.
- JavaScript: For client-side or Node.js applications, libraries like rdflib.js or N3.js enable parsing, querying, and serializing RDF data, facilitating integration with web-based interfaces.
- C#: DotNetRDF is a robust library for .NET developers, offering extensive support for RDF, SPARQL, and various serialization formats.
When selecting a community library, developers should consider factors such as active maintenance, community support, documentation quality, and compatibility with the specific version and nuances of the MARC or LOD data provided by the British National Bibliography. Direct engagement with the British Library's bibliographic services team is recommended to clarify data access specifics and format details relevant to any licensed dataset.