Uploaded by nw13

Establishing the Connection: Creating a Linked Data Version of the BNB

The document summarizes the British Library's process of creating a linked data version of metadata from the British National Bibliography (BNB). It describes establishing an open metadata strategy, initial steps taken in 2010 to develop linked data capabilities, and the current status. It then details the journey of migrating BNB MARC records to RDF, including selecting data to link to, matching approaches used, and the MARC to RDF conversion workflow.

1. Establishing the Connection: Creating a Linked Data Version of the BNB

  Neil Wilson, Head of Metadata Services

2. Changing Expectations: Public Sector Metadata

  • The Web has accelerated development of a collaboration culture & fostered expectations that information & content should be as freely available as the Internet itself
  • Many wider benefit arguments have been advanced for public bodies to make their data freely available
  • 2009 saw an increasing Government commitment to the principle of opening up public data for wider re-use
  • The "Putting the Frontline First: Smarter Government" report required "the majority of government-published information to be reusable, linked data by June 2011"

3. Developing an Open Metadata Strategy: Choices and Challenges

  When developing an open metadata strategy we wanted to:

  • Try to break away from library specific formats, e.g. MARC, and use more cross domain XML based standards, e.g. DC, RDF etc
  • Develop the new formats with communities using the metadata
  • Get some form of attribution while also adopting a licensing model appropriate to the widest re-use of the metadata
  • Adopt a multi track approach addressing the needs of:
      • Traditional libraries
      • Researchers wanting to ‘data mine’ catalogues
      • New linked data developers & users
  • ... And deliver the above with decreasing resources

4. First Steps Toward An Open Metadata Strategy

  During 2010 we:

  • Developed a capability to supply metadata using RDF/XML standards used in the wider web community
  • Conducted trials with a range of new users including the UK Intellectual Property Office & UNESCO
  • Developed a free Z39.50 MARC record download service for libraries to assist with derived cataloguing etc
  • Hosted a linked data workshop with 40 representatives from key international organisations

5. Current Status

  Since August 2010 we have:

  • Created a new email enquiry point for BL metadata issues: [email_address]
  • Signed up nearly 400 organisations worldwide to the free MARC21 Z39.50 service
  • Worked with JISC, Talis & other linked data implementers on technical challenges, standards & licensing issues
  • Begun to offer sets of RDF/XML metadata under a Creative Commons 0 (CC0) license
  • Supplied multi-million record sets to organisations including the Open Bibliography Project, the Open Library & Wikimedia Commons

6. Library Metadata & The Promise of Linked Data

  • Traditional library metadata uses a self contained, proprietary document based model
  • The Semantic Web uses a more dynamic data based model to establish relationships between data elements via links
  • By migrating from traditional models libraries could begin to:
      • Integrate their resources in the web, increasing visibility & reaching new users
      • Offer users a richer resource discovery experience
      • Transition from costly specialist technologies & suppliers & widen their choice of options

  ‘Semantic’ Metadata Properties:

  • Open standards
  • Dynamic/reactive
  • Links to external resources
  • Micro portal: interacts with users & systems in response to queries
  • Offers options for further inquiry

  Traditional Library Metadata Properties:

  • Proprietary, library specific standards
  • Passive
  • Self contained
  • Linear text: ‘read’ by users as result of database query
  • Offers end result

7. Our Linked Data Journey... What to Offer?

  Wanted to offer data allowing useful experimentation & advancing discussions from theory to practice.

  Why BNB?

  • General database of published output and not an institutional catalogue of unique items
  • Mass produced works on all subjects, many with internationally recognised identifiers e.g. ISBN
  • Reasonably uniform format across 60 years of publication
  • Significant amount of data: 3 million records in various languages

8. Our Linked Data Journey... What do we need to get there?

  Wanted to undertake the work as an extension of existing activities and as an opportunity to develop expertise using:

  • Existing staff (librarians rather than IT experts)
  • As many pre-existing tools or technologies as possible
  • Standard PC hardware for conversion
  • Library MARC21 data as a starting point
  • Established linked data resources to connect to
  • A proven platform that would enable us to concentrate on the data issues

9. Our Linked Data Journey... First stage: How To Migrate the Metadata?

  From a flat catalogue card model to something more appropriate...

  Preliminaries:

  • Staff training in linked data modelling concepts & increased familiarisation with RDF & XML concepts
  • Experience of working with JISC, the Open Bibliography Project & others
  • Feedback on initial MARC to XML conversion work

  Incremental approach adopted:

  • Open data license
  • RDF/XML format
  • Add external links
  • Re-model
  • Create linked data

10. Our Linked Data Journey... Second stage: Selecting trusted resources to link to

  To begin placing library data in a wider context & supplement or replace literal values in records.

  Looked for library sites:

  • Dewey Info
  • LCSH SKOS
  • VIAF

  Plus more general sites:

  • GeoNames
  • Lexvo
  • RDF Book Mashup

11. Our Linked Data Journey... Third stage: Matching and Generating Links

  Three main approaches used:

  • Automatic generation of URIs from elements in records, e.g. DDC
  • Matching of text in records with linked data dumps to identify URIs, e.g. personal names to VIAF & subjects to LCSH
  • Two stage crosswalk/matching process for some coded information, e.g. MARC country & language codes for GeoNames

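
The three approaches above can be sketched roughly as follows. This is a minimal illustration, not the British Library's actual code: the function names, the Dewey URI pattern, the lookup tables and the placeholder URIs are all assumptions made for the example.

```python
import re

# Assumed URI pattern for Dewey classes; the pattern used in practice may differ.
DDC_BASE = "http://dewey.info/class/"

# Illustrative lookup tables with placeholder URIs (not real identifiers).
NAME_INDEX = {"dickens, charles, 1812-1870": "http://viaf.org/viaf/EXAMPLE-ID"}
MARC_TO_ISO = {"enk": "GB"}  # MARC country code -> ISO 3166 code
ISO_TO_URI = {"GB": "http://sws.geonames.org/EXAMPLE-ID/"}


def ddc_uri(ddc_number):
    """Approach 1: build a URI directly from a value already in the record."""
    return f"{DDC_BASE}{ddc_number.strip()}/"


def normalise(label):
    """Lower-case, collapse whitespace and drop trailing punctuation."""
    label = re.sub(r"\s+", " ", label.strip().lower())
    return label.rstrip(" .,;:/")


def match_label(label, index):
    """Approach 2: look up normalised record text in an index built from a dump."""
    return index.get(normalise(label))


def crosswalk(marc_code, code_to_iso, iso_to_uri):
    """Approach 3: two stage mapping, MARC code -> intermediate code -> URI."""
    iso = code_to_iso.get(marc_code)
    return iso_to_uri.get(iso) if iso else None


print(ddc_uri("823.8"))
print(match_label("Dickens, Charles, 1812-1870.", NAME_INDEX))
print(crosswalk("enk", MARC_TO_ISO, ISO_TO_URI))
```

Exact matching on a normalised string is the simplest possible choice here; a real matching run would probably need to handle variant name forms and ambiguous matches as well.
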
12. Our Linked Data Journey... MARC to RDF Conversion Workflow

  MARC to RDF conversion consists of multiple automated steps using a range of tools:

  1) Selection
      • Tools: in-house utilities / MARC Report
      • Exclusions (CIP; multiparts; serials)
  2) Pre-processing
      • Tools: MARC Global
      • Normalise data values
      • Remove trailing punctuation
      • Move/copy data values to improve machine matching/transformation
  3) Character set conversion
      • Tools: in-house utilities
      • Decomposed UTF-8 converted to precomposed for conformance with W3C recommendations
  4) URI creation
      • Tools: in-house utilities
      • Create BL URIs in MARC fields
      • Harvest URIs from external sources
  5) Data transformation
      • Tools: MARC Report & MARC 21/RDF XSLT
      • Convert to RDF & insert URI prefixes

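
As an illustration of steps 2 and 3, the sketch below strips trailing punctuation from a field value and then converts decomposed Unicode to its precomposed form. It assumes that "precomposed" corresponds to Unicode Normalization Form C (NFC); the function names and the set of punctuation characters are hypothetical, and the deck's own tooling (MARC Global and in-house utilities) is not shown.

```python
import unicodedata

# Assumed set of trailing ISBD-style punctuation to remove; a real profile
# would need to be tuned per field (full stops are left alone here because
# they can belong to abbreviations).
TRAILING_PUNCTUATION = " /:;,"


def strip_trailing_punctuation(value):
    """Step 2 (sketch): remove trailing whitespace and punctuation."""
    return value.rstrip().rstrip(TRAILING_PUNCTUATION)


def to_precomposed(value):
    """Step 3 (sketch): convert decomposed characters to precomposed (NFC)."""
    return unicodedata.normalize("NFC", value)


def preprocess(value):
    """Apply both steps in the order given in the workflow."""
    return to_precomposed(strip_trailing_punctuation(value))


decomposed = "Cafe\u0301 society :"  # 'e' followed by a combining acute accent
cleaned = preprocess(decomposed)
print(cleaned, len(decomposed), len(cleaned))
```

Normalising before matching matters because a decomposed and a precomposed spelling of the same name look identical on screen but compare as different strings.
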
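13. Our Linked Data Journey... MARC to RDF Conversion Workflow (diagram not reproduced)

14. Our Linked Data Journey... Which took us from here... (image not reproduced)

15. Our Linked Data Journey... Via here... (image not reproduced)

16. Our Linked Data Journey... To here... (image not reproduced)
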
17. bnb.data.bl.uk Preview

  Options:

  • bnb.data.bl.uk/sparql
  • bnb.data.bl.uk/describe
  • bnb.data.bl.uk/search

  Includes:

  • BNB Books 2005-11
  • 485,000 records
  • 18,000,000 RDF triples

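
A SPARQL endpoint such as the one listed above is normally queried by passing the query text in a `query` parameter, as defined by the SPARQL protocol. The sketch below only builds such a request URL and sends nothing. The `http://` scheme, the use of `dct:title` in the data, and the function name are assumptions for illustration, and the preview service may no longer be available at this address.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint named on the slide; the scheme is an assumption.
ENDPOINT = "http://bnb.data.bl.uk/sparql"

# Illustrative query; whether the dataset uses dct:title is an assumption.
QUERY = """
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?book ?title
WHERE { ?book dct:title ?title . }
LIMIT 5
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in the standard 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query.strip()})}"


url = build_query_url(ENDPOINT, QUERY)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```
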
18. bnb.data.bl.uk Sample ‘Labelled Concise Bound Description’ (image not reproduced)

19. Our Linked Data Journey... Journey’s End...Point?

  Preview details at: http://www.bl.uk/bibliographic/datafree.html

  Roadmap for next steps includes:

  • Staged release over coming months for books, serials, multi-parts etc
  • Aiming to update on a monthly basis once complete
  • Documentation & further refinement of data model
  • Looking at RDF triple dump option
  • What else might be offered?

20. Lessons Learned on the Journey: General

  • It is a new way of thinking
  • Legacy data wasn’t designed for this purpose so starting can be problematic
  • There are many opinions...but few real certainties
  • Everyone is learning & multiple solutions exist so you may be the best judge
  • Don’t reinvent the wheel...there are often tools or experience you can use
  • Start simple & develop in line with evolving staff expertise
  • Give careful thought to data modelling & sustainability issues, e.g.
      • Where possible use cross domain standards, e.g. ISO codes in data
      • Select relevant & stable targets when providing links, if you are doing so

21. Lessons Learned on the Journey: Data Issues

  • Reality check by offering samples for feedback to wider groups
  • Be prepared for some technical criticism in addition to positive feedback & try to continually improve in response
  • Conversion inevitably identifies hidden data issues...& creates new ones!
  • ... But it’s often better to release an imperfect something than a perfect nothing!

22. Lessons Learned Along The Way: Staff and Resource Issues

  It can be a steep learning curve so:

  • Look for training opportunities to develop staff skills to support new open metadata standards
  • Cultivate a culture of enquiry & innovation among staff to widen perspectives on new possibilities
  • Look into collaborative pilot projects with peer organisations to share resources & expertise
  • See what tools are already out there that can save you development time or assist in checking data

23. Final Thoughts... For Others Contemplating a Similar Journey

  • It’s never going to be perfect first time
  • We expect to make mistakes
  • We aim to learn from them
  • We hope others will learn something too
  • ... and that everyone benefits from the experience

  So if anyone is thinking of undertaking a similar journey... Just do it!

24. Any Questions...?

  • bnb.data.bl.uk/sparql
  • bnb.data.bl.uk/describe
  • bnb.data.bl.uk/search

  Images from


Editor's Notes

  • Slide 12: Note character set issues
  • Slide 20: It's another step on the journey rather than an end result
