British Library Images Online

The BritishLibraryImagesOnline class scrapes metadata and images from British Library Images Online.

Usage

Create a querier for British Library Images Online:

python
from libquery_extensions import BritishLibraryImagesOnline

directory = "./output/british-library-images-online"
querier = BritishLibraryImagesOnline(
    query_path=f"{directory}/queries/queries.jsonl",
    metadata_path=f"{directory}/metadata/metadata.jsonl",
    img_dir=f"{directory}/imgs",
)

Fetch metadata for a list of search URLs:

python
base_urls = [
    "https://imagesonline.bl.uk/search/?searchQuery=chart",
    "https://imagesonline.bl.uk/search/?searchQuery=map",
]
querier.fetch_metadata(base_urls=base_urls)
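
When there are many search terms, the URL list can be generated instead of typed by hand. The following is a minimal sketch: `build_search_urls` is a hypothetical helper, not part of `libquery_extensions`, and it assumes the site accepts standard form encoding (spaces as `+`) in the `searchQuery` parameter.

```python
from urllib.parse import urlencode

# Search endpoint copied from the example URLs above.
SEARCH_ENDPOINT = "https://imagesonline.bl.uk/search/"


def build_search_urls(keywords):
    """Return one search URL per keyword.

    Hypothetical helper for illustration; not part of libquery_extensions.
    """
    return [
        f"{SEARCH_ENDPOINT}?{urlencode({'searchQuery': keyword})}"
        for keyword in keywords
    ]


# The resulting list can be passed as base_urls to fetch_metadata.
base_urls = build_search_urls(["chart", "map"])
```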

Query images:

python
querier.fetch_image()

Metadata Schema

Each metadata entry is stored as:

typescript
interface SourceData {
    assetName: string
    downloadUrl: string
    Source?: string
    Caption?: string
    'Title of Work'?: string
    Shelfmark?: string
    'Place and date of production'?: string
    Credit?: string
    'Artist/creator'?: string
    Author?: string
}

interface MetadataEntry {
    uuid: string
    url: string
    source: 'British Library'
    idInSource: string
    accessDate: string
    sourceData: SourceData
}