Encoding Standards Subcommittee Report: MLA 2016

Encoding Standards Subcommittee Business Meeting
Friday, March 4, 2016
4:00-5:30 PM

Members present: Anne Adams, Catherine Busselen, Ralph Hartsock, Chris Holden, Karla Jurgemeyer, Keith Knop, Lisa McFall, Karen Peters, Jim Soe Nyun, Hermine Vermeij, Jay Weitz, Steve Yusko

Members absent (excused): Thom Pease, Matthew Wise

Non-members present: Margaret Corby, Kathy Glennan, Jean Harden, Chris Hertzog, Kevin Kishimoto, Nancy Lorimer, Deb Morris, Tomoko Shibuya, Tracey Snyder

1. Minutes of the 2015 joint meeting of the Metadata and MARC Formats Subcommittees were approved without change.

2. Outgoing members Lisa McFall and Ralph Hartsock, who are rotating off the Subcommittee at the end of this meeting, were thanked for their service. This was followed by a call for new members, with applications due by 11:00 AM Saturday, March 5, 2016.

3. Report on previous year’s MARC activity: work was done on one Proposal and three Discussion Papers, which were presented at MARC Advisory Committee (MAC) meetings at ALA Midwinter:

  • MARC Proposal 2016-20, defining $r and $t, and redefining $3 in the (Bibliographic and Authority) MARC 382 Field.
  • MARC Discussion Paper 2016-DP01, defining $3 and $5 for the (Bibliographic) MARC 382 Field. Some concern was expressed that $3 has been used in North America to record descriptive metadata (eye-readable labels) as opposed to recording structural metadata intended to link fields. It was further noted that $8 will become available later this year for recording structural metadata.
  • MARC Discussion Paper 2016-DP02, clarifying code values in (Bibliographic) MARC Field 008/20 (Format of Music). It was suggested that an additional code (perhaps “p”) be added for piano scores, rather than using code “z” for these.
  • MARC Discussion Paper 2016-DP03 (co-sponsored by OLAC), adding a 1st indicator 6 in the (Bibliographic) MARC 028 Field to record distributor numbers for music and moving image materials, and making corresponding clarifying changes to the MARC 037 (Source of Acquisition) Field.

MAC approved the Proposal, and the Discussion Papers will be returning this summer as Proposals, with few changes recommended. Volunteers will be needed to turn the Discussion Papers into proposals, including addressing any issues raised. The Chair has posted a detailed report on the CMC Blog at: https://drive.google.com/file/d/0BxViFaIR72G1eUpOOGxxVVRfSzQ/view

4. Report on ALA Midwinter (from the ALCTS Metadata Interest Group):

  • ALA Metadata Standards Committee (under LITA/ALCTS/RUSA) is working on Principles for Evaluating Metadata Standards, a set of best practices that includes criteria for critiquing the standards. The Principles will be presented in final form this summer, with some testing.
  • The Digital Public Library of America (DPLA) is putting forward some rights vocabularies; note that these are not something that they are looking to try with BIBFRAME.

More details on both of these, including links to the draft Principles for Evaluating Metadata Standards, are included in the Chair’s CMC Blog post (mentioned above).

5. Report on previous year’s metadata work:

  • Work has been done on the Music Metadata Requirements (MMR) site: http://www.musiclibraryassoc.org/mpage/cmc_meta_resources . Noted that we should broaden the site’s scope to include MARC.
  • Feedback given for PBCore 2.1: rewrite of schema, reworking of documentation, and creation of documentation where none had previously existed. Version 2.1 has now been released.
  • Feedback given to Christy Crowl regarding ProMusicDB.
  • Feedback given on MADS/RDF update: noted that RDA elements in MARC were being accommodated for name entities but not so much for titles/musical elements of titles (such as opus numbers). Requests that these elements be included in the schema were received too late to be included in this update, but they will be included in the next version (timetable unknown, but should be fairly soon).

6. Update on MLA BIBFRAME work and plans for future workflow to provide input to LC:

  • Final BIBFRAME Task Force meeting was held on March 3.
  • MLA’s future work on BIBFRAME will likely best be done by one or more Working Groups (more informal, with fewer constraints) rather than Task Forces. These would consist of a core of people with expertise (likely drawn from members of the BIBFRAME Task Force) together with others from within and outside of CMC (who will develop expertise). The work will build on that done by LC’s LD4P (Linked Data for Production) and Linked Data for Performed Music Initiative. LC is awaiting approval next week of a grant that would fund an LD4P Ontology product. The grant has four components, of which the development of use cases is where MLA/CMC has the best potential to make a contribution. The other components are remediation for conversion, ontology development (including development of an external vocabulary for medium of performance, possibly hosted by MLA?), and a PCC component for profile development (possibly with ARSC?). This will be discussed further tomorrow, and Tracey will be issuing calls for participation after the grant is approved (and even if it’s not).

7. Update on work on PBCore 3.0: report submitted by Thom Pease (see Addendum I below). PBCore primarily impacts those who deal with recorded audio and TV/radio. An AMIA/PBCore subcommittee recommended that PBCore 3.0 not be pursued for at least three years: the EBUCore standard is already doing a lot of what they were intending to do with PBCore, so the intention now is to develop something complementary to EBUCore work. They are making a list of PBCore users and will send out a survey to assess the state of PBCore use and ongoing use requirements. Thom Pease also provided a brief update on the RPTF (see Addendum II). At some point, we will be asked to comment on these, but not so much this year.

8. New business:

  • Ideas for further MARC development/MARC-related issues:
    i. 382 is not perfect, but has been tweaked a lot of late. Best to leave alone for now?
    ii. Investigate the hold-up (LC?) in implementing 382 $e in authority records.
    iii. Discussed whether form of music notation should be moved from the 546 $b to the 348. Various opinions on the matter suggest that a Discussion Paper may be in order; needs to be referred to a discussion group.
    iv. Let Kathy Glennan know if the MARC mapping in the RDA Toolkit is not correct.
    v. Discussed machine generation of 382 and 655 fields from existing 650 fields. In at least some cases, these may need human evaluation; suggested using a different indicator (or other means) to indicate that these were machine generated and may need human evaluation. Issue should be referred to a discussion group (including Casey Mullin). Noted that MARC discussion paper 2016DP-12 (Designating Matching Information in the MARC21 Authority Format) is coming back as a proposal and might be of use with this issue?
    vi. Mapping 245 $c in BIBFRAME. Jay Weitz: OCLC has done a lot of work parsing 245 $c to determine whether it contains additional title information, in order to match things correctly; some of that work could be adapted to future conversion. There is an ALA group looking at removing ISBD punctuation from subfield borderlines to make this information easier to differentiate. Some way is needed to differentiate primary title information from other titles. Nancy Lorimer: add 700s to cover works in 245 $c, and see how this pans out into MARC proposals. Kathy Glennan: hesitant to take the lead on this, but discussing it might not be a bad idea in case someone else puts it forward. Jim Soe Nyun: we’ll put it on our “Watch List.”
    vii. Conversion issue: parsing 245 $c in BIBFRAME when (additional) works are listed there. A group at OCLC has done a lot of work trying to parse 245 $c in such cases, and is looking at the possibility of defining new subfields for subsequent statements of responsibility. Also, there is an ALA group looking at removing ISBD punctuation from subfield borderlines to make this information easier to differentiate. Further discussion suggested in preparation for suggestions/ideas from other bodies. The Chair indicated that this issue should be put on our “watch list.”
  • Stay tuned for the possibility of engagement with further developments in PBCore. Thom Pease is currently involved in these discussions.
  • Gauging interest in reviewing/evaluating various non-BIBFRAME schemas such as Schema.org, MusicBrainz, and Music Ontology. Should definitely review Schema.org: it may not do everything we want, but it’s beginning to be used more and more, and extension schemas can be applied (could develop if we want to). We know now that there is no intention that BIBFRAME cover every aspect of bibliographic data, so it may be worthwhile to look at other options, e.g. Europeana. We could look at MusicBrainz in conjunction with BIBFRAME, too. Important not to reinvent the wheel if something worthwhile is already out there.
  • Other topics from the floor/table
    i. Could look at Discogs, too, in regard to the review of non-BIBFRAME schemas (above).
    ii. Kevin Kishimoto noted that a lengthy article on music ontology has been published in the journal Knowledge Organization, and he has promised to send a reference. (Citation: Madalli, Devika P., B. Preedip Balaji, and Amit Kumar Sarangi. “Faceted Ontological Representation for a Music Domain.” Knowledge Organization 42, no. 1 (February 2015): 8-24. Computers & Applied Sciences Complete, EBSCOhost (accessed March 5, 2016).)

Meeting adjourned at 5:21 p.m.

Respectfully submitted,
Karen Peters, with grateful acknowledgement to Lisa McFall for her helpful input.


I. PBCore Update by Thom Pease

At AMIA in Portland, there were updates by all the groups involved with the AMIA PBCore subcommittee, including the website, schema, documentation, controlled vocabularies, education, and communication working groups. There was much rejoicing about the launch of PBCore 2.1, and they thanked everyone who provided input. As a result of that input, not only was the schema improved, but also the documentation.

It was decided not to pursue PBCore 3.0 for at least three years. There will be more work supporting mapping towards an EBUCore schema in RDF, which would let PBCore piggyback on that work, since the two standards are so similar.

On February 19, I participated in a conference call with a couple of members of the PBCore Outreach Working Group, of which I am a member.

We were just formed from the AMIA PBCore Subcommittee, and we have a number of initiatives to query the users of PBCore. Outreach would be to producers of audio and audio-visual content, as well as public and private organizations, libraries and archives which hold this content, and organizations/associations. First, we’re making a list of known PBCore users, a list of PBS and other media organizations, and other entities. We’re looking to create a survey, send it out, and get results back before AMIA in Pittsburgh. This survey will assess the state of PBCore use and ongoing user requirements and, with the cooperation of AMIA, signal the intent for broader PBCore community outreach.

II. Radio Preservation Task Force Report by Thom Pease

The Radio Preservation Task Force conference was organized under the auspices of the Library of Congress’s 2012 National Recording Preservation Plan. While metadata and access were frequent topics of conversation throughout the conference, the metadata session was a particularly interesting forum, consisting of short presentations by a number of presenters. Casey Davis (WGBH) and Rachel Curtis (Library of Congress) talked about the American Archive of Public Broadcasting and their use of PREMIS and PBCore. They described their use of Minimum Viable Cataloging, which should allow teams of graduate student interns to spend 15 minutes at a time enhancing metadata for the records representing 40,000 hours within their Archival Management System (AMS). They expect this will take six years. The AMS is the back end behind the public catalog, and is built on a Blacklight/SOLR index. There will be a number of National Digital Stewardship residents at various public media stations over the course of this next year. One of them, Mary Kidd, who is working at New York Public Radio (WNYC), talked about analyzing production workflows in her organization, which is both producing lots of new content and archiving decades of historic material. She is scanning hard drives and identifying various information silos throughout the organization.

The common thread of each presentation was the effort to get all of the information about audio content throughout the organization into PBCore XML and into their datastore.

There was an interesting session on podcast archiving. Jeremy Morris and Andrew Bottomley from University of Wisconsin-Madison talked about how to give access and still respect copyright owners’ rights, and what users in the future might want in terms of design and metadata from such a database. Particularly interesting was the dynamic nature of RSS feeds and the overwriting of content, such as when podcasts are updated; users may want the original ads or underwriting credits.

William Vanden Dries is tasked by the NRPB with creating a collection database of institutions and collectors that hold radio content. He is doing this under the aegis of ARSC and Indiana University, with which he is affiliated. One of the biggest takeaways of the session was that the best thing people working with metadata can do is create tools that support federated search (the example cited was footage.net for stock footage) and APIs that can be run on large sets of data.