Aims, objectives and target outputs (see project plan for specific deliverables)

1) Further contribution to Open Bibliography
  • Cambridge University Library’s Voyager catalogue contains approximately 15 million bibliographic records including records derived from OCLC, RLUK and the British Library in addition to locally-created records. The initial aim of this project will be to identify and release a substantial record set to an external platform under an open license (Public Domain Dedication License) as MARC 21.
    • For OCLC-derived bibliographic records, data will be released in a fashion compliant with OCLC’s WorldCat Rights and Responsibilities for the OCLC Cooperative. The inclusion of OCLC metadata brings the number of records which will be released to over 2,200,000.
    2) Linked data creation and publication
    • The project then aims to deploy and test a number of technologies and methodologies for releasing open bibliographic data, including XML, RDF, SPARQL, and JSON.
    • It will investigate use of the Open Knowledge Foundation’s ORDF Python library, built upon RDFLib for working with RDF data, as a platform for sharing data.
    • This project will also examine ways to link the above output to OCLC’s data enrichment services for assigning FAST and VIAF headings. It also hopes to examine other linking mechanisms and opportunities as they arise throughout the project.
    • Additional metadata derived from Cambridge University Library published under a Public Domain Dedication License as MARC-21
    • The above dataset accessible as RDF/XML and in other notations
    • Data derived from OCLC released in a similar fashion under a license compliant with the rights and responsibilities of the co-operative
    • A working RDF / triplestore with SPARQL endpoint containing the above data
    • Full documentation on work described above
    • A full investigation into examining and determining provenance in MARC-21 data for the UK HE community
    • An investigation of the licensing issues for data relevant to library operation
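
    As a rough illustration of the MARC-to-RDF step described above, the sketch below turns a simplified record into N-Triples using only the Python standard library. It is not the project's pipeline and does not use ORDF or RDFLib; the record layout, the example.org base URI, and the choice of Dublin Core terms for MARC fields 245, 100 and 260 are all assumptions made for this example.

```python
# Illustrative only: a simplified MARC-like record, not real Voyager output.
DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/bib/"  # hypothetical base URI for record identifiers

# Assumed mapping from MARC tags to Dublin Core properties.
FIELD_MAP = {
    "245": DCTERMS + "title",      # title statement
    "100": DCTERMS + "creator",    # main entry, personal name
    "260": DCTERMS + "publisher",  # publication information
}


def escape_literal(value):
    """Escape characters that N-Triples does not allow raw inside a literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def record_to_ntriples(record):
    """Return one N-Triples line for each mapped field of the record."""
    subject = "<%s%s>" % (BASE, record["id"])
    lines = []
    for tag, value in record["fields"]:
        predicate = FIELD_MAP.get(tag)
        if predicate is None:
            continue  # unmapped fields are skipped in this sketch
        lines.append('%s <%s> "%s" .' % (subject, predicate, escape_literal(value)))
    return lines


record = {
    "id": "0001",
    "fields": [
        ("245", 'An example "title"'),
        ("100", "Author, Example"),
        ("999", "local note"),  # no mapping, so no triple is produced
    ],
}

for line in record_to_ntriples(record):
    print(line)
```

    Plain string literals are used for every field to keep the sketch short; a fuller mapping would need to handle subfields, repeated fields and typed or language-tagged values.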

    Wider Benefits to Sector & Achievements for Host Institution
     The project will bring value to the wider community by contributing substantially to the implementation of the Resource Discovery Task Force vision of open metadata through release of over 2.2 million bibliographic records under an open licence. The release of this volume of data will itself be of value to the community but the engagement of a major content provider in this process and in facilitating data linking brings added value to the project.

    An investigation into linking to FAST and VIAF headings will provide an exemplar of the potential usefulness of a structured semantic approach to data. The project will look at the value data enrichment offers for resource discovery in the context of the RDTF vision.

    Staff of both the University Library and CARET will develop further skills in dealing with structured data, testing alternative technologies and approaches: JSON and XML, RDF/SPARQL. Involvement of both library and VLE developers will test the potential of open metadata and linked data in two different communities holding distinct and useful sets of user data.

     The project team have an excellent record of documenting and presenting their work in a way which encourages take-up by other institutions, and the same approach will be applied to this project.

    Examples of this approach can be found at the library’s API developers' portal and at the CUL widgets development.

    Risk Analysis and Success Plan  

    Risk: Action to Prevent/Manage Risk
    • Retention of project staff: Specialists with similar range of skills available within existing teams.
    • Failure to meet schedule in workplan: Experienced project director and board will be appointed.
    • Problems with metadata linking: Able to call upon OCLC for advice and assistance with linking to FAST and VIAF headings and other Cambridge expertise where required.
    • Intellectual property rights: University Legal Services team will be consulted where required. The team will observe JISC and OCLC advice on IPR issues in relation to metadata release.
    • Licenses cannot be obtained to permit intended use: Project takes into account scope of existing licenses and previous work with IPR holders.
    • Breach of permitted uses: Take-down policy.

    IPR issues
    In addition, the project will provide documentation and 'how-to' guides for assessing the provenance of bibliographic data from a technical viewpoint, supplementing the information available at the JISC Open Bibliographic Data guide.

    Project Team Relationships and End User Engagement 
    The project will not recruit new staff but will make use of existing staff in the Library and CARET.

    Ed Chamberlain, Systems Development Librarian, will contribute 0.5 FTE for technical work and will undertake management of the project. He brings extensive experience of project management on cross-departmental projects, particularly with CARET and other libraries of the university, and was responsible for releasing and documenting existing APIs to library services.

    Dan Sheppard, Senior Research Associate at CARET, will also contribute 0.5 FTE as software developer. The project will also call upon the expertise of two further members of library staff:

    Hugh Taylor, Head of Collection Development and Description, for bibliographic data and related data ownership and licensing issues, and Huw Jones, Digital Library Metadata Specialist, who will contribute particularly towards identifying and evaluating service innovations based on the open metadata and linking to library location and membership data.

    A project board, comprising representatives of Cambridge University Library and the University of Cambridge Centre for Applied Research in Educational Technologies, will take responsibility for overseeing the project:

    Patricia Killiard, Head of Electronic Services and Systems, Cambridge University Library
    Hugh Taylor, Head of Collection Development and Description, Cambridge University Library
    John Norman, Director, Centre for Applied Research in Educational Technologies, University of Cambridge
    The team will engage end users primarily through this blog as an initial entry point to a deeper set of online documentation hosted at lib.cam.ac.uk/api. This will follow on from the documentation created by the JISC-funded Cambridge widgets project.

    Projected Timeline, Workplan, Deliverables & Overall Project Methodology

    WP1 Project management and communication
    Writing a detailed project plan. Creation of a project blog to be updated regularly. Documenting technical approaches, methodologies, solutions, and problems.

    o        Project plan
    o        Project blog with notes of meetings, team discussions, and final project report.
    o        Documentation on technologies and methodologies.

    February-July 2011
    WP2 Data release

    o        Export of bibliographic data from Cambridge University Library Voyager catalogue to an external data store with appropriate API and semantic interoperability.
    o        Investigate use of the Open Knowledge Foundation’s ORDF Python library built upon RDFLib for working with RDF Data.

    o        Set of around 2 million bibliographic records openly available as structured data (XML/RDF, JSON/SPARQL)
    o        Skills development of team in above technologies
    o        Skills development of team around use of the ORDF Python library

    February-April 2011
    WP3 Linking to OCLC FAST and VIAF headings

    o        Assign FAST and VIAF headings to open metadata using a linked data approach.
    o        Use of OCLC’s Linked Data Framework

    o        Enriched metadata set with FAST and VIAF
    o        Experience in linking to OCLC/external data and use of the framework

    April-June 2011
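
    One way to picture the enrichment step in WP3 is to match a name heading from a record against a table of known identifiers and, where a match is found, emit a link to the external URI. The sketch below is an illustration under stated assumptions, not OCLC's service or the project's code: the lookup table, the placeholder identifier, the example.org record URI and the normalisation rule are all hypothetical.

```python
VIAF_BASE = "http://viaf.org/viaf/"
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical lookup table. The identifier is a placeholder, not a real VIAF ID.
VIAF_LOOKUP = {
    "author, example": "0000001",
}


def normalise(heading):
    """Lower-case, strip trailing punctuation and collapse runs of whitespace."""
    return " ".join(heading.lower().rstrip(" .,").split())


def link_creator(record_uri, heading):
    """Return an N-Triples line linking the record to VIAF, or None if unmatched."""
    viaf_id = VIAF_LOOKUP.get(normalise(heading))
    if viaf_id is None:
        return None  # leave unmatched headings for manual review
    return "<%s> <%s> <%s%s> ." % (record_uri, DCTERMS_CREATOR, VIAF_BASE, viaf_id)


record_uri = "http://example.org/bib/0001"  # hypothetical record URI
print(link_creator(record_uri, "Author,  Example."))
print(link_creator(record_uri, "Unknown, Person"))
```

    Exact string matching after normalisation is the simplest possible rule and will miss variant forms of a name; it is used here only to show where a link would be attached.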
    WP4 Linking to library location and membership data

    •   Create RDF links to library location and membership data
    •   Explore and trial services which can be built around RDF triples and linking to local data

    • RDF links to local data
    • Team skills in creating and using RDF triples
    • Pilot user services or service specifications

    April-June 2011
    WP5 Intellectual property rights
    • Liaison with the RDTF management framework project to discuss schemas and licensing
    • Identify and document bibliographic data ownership for records in library’s Voyager catalogue.
    • Identify and document IPR status of record sets with respect to release as open data.
    • Initiate discussion with selected IPR owners on issues around open release

    • Team understanding of RDTF requirements with respect to schemas and licensing
    • Summary documentation on data ownership
    • Report on IPR issues in relation to release of further records as open data

    February-June 2011
    WP6 Evaluation and Sustainability

    • Investigate options for data refresh
    • Peer evaluation through blog posts and feedback
    • Participate in JISC Programme’s evaluation and align reporting to its evaluation requirements

    • Blog posts on options explored throughout project
    • White paper on sustainability and data refresh
    • Report

    April-July 2011


    Directly Incurred                         August 10–July 11   August 11–July 12   TOTAL £
    Total Directly Incurred Staff (A)         £32,042             £0                  £32,042

    Non-Staff                                 August 10–July 11   August 11–July 12   TOTAL £
    Travel and expenses                       £800                £0                  £800
    Hardware/software                         £500                £0                  £500
    Dissemination                             £500                £0                  £500
    Other – Contingency                       £1,000              £0                  £1,000
    Total Directly Incurred Non-Staff (B)     £2,800              £0                  £2,800
    Directly Incurred Total (C)               £34,842             £0                  £34,842

    Directly Allocated                        August 10–July 11   August 11–July 12   TOTAL £
    Academic Grade 11 - UL, sp64, 5 days      £1,684              £0                  £1,684
    Academic Grade 12 - CARET, sp69, 5 days   £1,960              £0                  £1,960
    Academic Grade 7 - UL, sp46, 5 days       £975                £0                  £975
    Estates                                   £5,655              £0                  £5,655
    Directly Allocated Total (D)              £10,274             £0                  £10,274

    Indirect Costs (E)                        £38,628             £0                  £38,628
    Total Project Cost (C+D+E)                £83,744             £                   £83,744
    Amount Requested from JISC                £40,000             £                   £40,000
    Institutional Contributions               £43,744             £                   £43,744
    Percentage Contributions over the life of the project JISC
    No. FTEs used to calculate indirect and estates charges, and staff included 1 FTE All Directly incurred Staff
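
    The subtotals in the budget above can be cross-checked mechanically. The short script below copies the first-year figures from the budget and recomputes each total; the variable names are illustrative and nothing here goes beyond the published figures.

```python
# First-year figures in GBP, copied from the budget; second-year figures are zero.
staff = 32042  # Total Directly Incurred Staff (A)

non_staff = {
    "Travel and expenses": 800,
    "Hardware/software": 500,
    "Dissemination": 500,
    "Other - Contingency": 1000,
}

directly_allocated = {
    "Academic Grade 11 - UL": 1684,
    "Academic Grade 12 - CARET": 1960,
    "Academic Grade 7 - UL": 975,
    "Estates": 5655,
}

indirect = 38628       # Indirect Costs (E)
jisc_request = 40000   # Amount Requested from JISC

total_b = sum(non_staff.values())           # Directly Incurred Non-Staff (B)
total_c = staff + total_b                   # Directly Incurred Total (C)
total_d = sum(directly_allocated.values())  # Directly Allocated Total (D)
total_cost = total_c + total_d + indirect   # Total Project Cost (C+D+E)
institutional = total_cost - jisc_request   # Institutional Contributions

for label, value in [("B", total_b), ("C", total_c), ("D", total_d),
                     ("C+D+E", total_cost), ("Institutional", institutional)]:
    print("%-14s £%d" % (label, value))
```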