CMC Chair: ALA Midwinter Report 2018

Reported by Tracey Snyder (Cornell University), Chair, MLA Cataloging and Metadata
Committee (CMC)

PCC At Large and PCC Participants Meeting (combined session)

Due to factors related to the federal government shutdown, the PCC At Large program and the PCC Participants Meeting were combined into one session, and several usual attendees from the Library of Congress were not able to attend, sending their regrets to the membership. PCC Chair Lori Robare began the session with a few announcements. (1) The new PCC Directory is complete, has been tested, and will have a phased rollout beginning in May or June. (2) The volume of correspondence in the NACO email account has been very difficult to manage, especially in light of the federal government shutdown, and participants are asked to be patient and avoid re-sending messages. Messages should include the LCCN and the 1XX of the authority record in question. The PCC Secretariat would like to recruit more NACO reviewers to help keep up with the workflow. Those interested should email NACO with the subject line “NACO review interest.” (3) Best Practices recommendations from the PCC Ad Hoc Task Group on Gender in Name Authority Records were approved in principle but are not yet policy; a survey will be issued and the results considered before policy is implemented.

A large part of the session was devoted to a review of the draft document outlining PCC’s strategic directions for 2018 through 2021. After the meeting, comments from the membership (including the MLA CMC chair and the CMC’s PCC funnel coordinators) were submitted, and the draft document was made final and published with a date of February 23, 2018. Whereas the previous strategic plan (2015-2017) set the stage for moving away from record-based work and toward statement-based work and identity management, the current document emphasizes the transition to a Linked Data environment and seeks to ensure success for the PCC as an organization. The Vision, Values, and Mission sections have been updated with this in mind.

The session also featured a presentation by Kathy Glennan, Chair-Elect of the RDA Steering Committee (RSC), about RDA, Linked Data, and the 3R project. Kathy’s presentation is online, as are many other RDA-related presentations from this and other conferences. Kathy reviewed the different “flavors” of RDA: RDA Reference (for developers), RDA Vocabularies (for developers), RDA Registry (for application developers), RDA Toolkit (for catalogers, the only component that is not freely available), and RIMMF (for trainers). Linked Data is one of several possible implementation scenarios of RDA and is enabled by the RDA Registry. Kathy mentioned the connection between RDA and other standards, including the IFLA LRM and CIDOC CRM, and reviewed the goals of the 3R project, whereby the new version of the RDA Toolkit coming in June 2018 will implement new LRM entities and offer an improved interface and more generalized and flexible instructions. Kathy provided details on several new concepts coming to the Toolkit:

Recording methods (unstructured description, structured description, identifier, and/or IRI, used as appropriate for a given entity)

Manifestation statement (a quite literal transcription of data from a resource with attention to punctuation, capitalization, etc.)

Reconceptualizing serials via LRM as diachronic works (where each issue is an aggregate manifestation of articles and each issue has a whole/part relationship with the serial work)

Redefinition of person (a real person who is living or has lived, who may nonetheless have other bibliographic identities that can continue to be described as such in name authority records)

Relationships preferred over attributes (for example, in the Nomen element, an appellation relationship between a work and its title, or a corporate body and its name, etc.)

Kathy also said that events could be modeled in relation to LRM entities Place, Timespan, Agent, and Nomen, as appropriate.

Finally, Kathy reviewed the new governance structure, which will include the new NARDAC (North American RDA Committee), and mentioned potential next steps for PCC, including revising LC-PCC Policy Statements and developing training on LRM and the new RDA Toolkit.

Linked Library Data Interest Group

The slides from this session are available at:

MJ Han (University of Illinois at Urbana-Champaign) spoke about a project funded by a grant from the Andrew W. Mellon Foundation involving name reconciliation for digitized special collections. In creating knowledge cards for the search results page on the library’s Linked Open Data website, two questions arose. (1) How can we identify and reconcile named entities already described in established linked open data sources? (2) How can we best manage unique names often found in local special collections’ data, but not found elsewhere? For the Motley Collection, which includes names of actors, authors, carpenters, composers, dancers, set designers, and others active in theater, URIs for such entities were retrieved manually from sources supporting linked data (such as the LC NAF, VIAF, IMDb, and Wikipedia) and other sources on the Web (such as Encyclopedia Britannica and various other websites). For the Kolb-Proust Archive, which includes names of family members, friends, politicians, journalists, and others related to Marcel Proust, the many entities who were not represented by URIs in sources such as Wikipedia and VIAF were tracked locally in a simple three-column spreadsheet and coded with tags (such as schema:birthDate, schema:deathDate, etc.) for publication as linked open data. MJ noted that special collections require using a variety of sources for name authority information. She also noted that when publishing local names as linked open data, adding local links to other linked data sources such as Wikipedia improves the visibility of local collections and contributes unique information to the Web.
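As a rough illustration of how a three-column spreadsheet of tagged local names might be turned into linked open data, the following Python sketch groups rows into JSON-LD nodes. The column layout (id, property, value), the sample row values, and the base URI are all assumptions made for illustration; the presentation did not describe the actual spreadsheet columns or the publication pipeline.

```python
import csv
import io
import json

# Hypothetical sample rows: (local identifier, schema.org property, value).
# The values are invented placeholders, not data from the Kolb-Proust Archive.
SAMPLE = """id,property,value
person/0001,schema:name,Example Person
person/0001,schema:birthDate,1850
person/0001,schema:deathDate,1920
"""

BASE = "https://example.org/"  # placeholder base URI, not a real service


def rows_to_jsonld(text, base=BASE):
    """Group property/value rows by identifier into JSON-LD nodes."""
    nodes = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Create the node on first sight of an identifier, then add the property.
        node = nodes.setdefault(
            row["id"], {"@id": base + row["id"], "@type": "schema:Person"}
        )
        node[row["property"]] = row["value"]
    return {
        "@context": {"schema": "http://schema.org/"},
        "@graph": list(nodes.values()),
    }


if __name__ == "__main__":
    print(json.dumps(rows_to_jsonld(SAMPLE), indent=2))
```

Keeping one property per row means a new tag can be added for any person without changing the spreadsheet's shape, which may be part of the appeal of a simple three-column layout.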

Anna Neatrour and Jeremy Myntti (University of Utah) spoke about a project funded by a grant from IMLS to develop a collaborative, regional authority file (Western Name Authority File) of personal and corporate names from digital collections from institutions in the western United States. After reviewing several data models, the participants chose to use EAC-CPF (which is also used in the SNAC project at the University of Virginia). They used OpenRefine to de-duplicate variant forms of the same name, reconcile thousands of names with the LC NAF, and retrieve URIs. After reviewing several possible tools for creating and maintaining the regional authority file, the participants chose to use Omeka S and loaded a subset of EAC-CPF. A sample record (for Maurice Abravanel) illustrated how the regional authority file lists variant forms of a person’s name, collections where that person is mentioned, and links to LC NAF. The regional authority file is being piloted and assessed.
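To give a sense of what de-duplicating variant forms of a name can involve, here is a minimal Python sketch of key-based clustering, loosely modeled on the idea behind fingerprint keying. It is not OpenRefine's code, and the report does not say which OpenRefine method the project used. The variant spellings in the usage note are made up for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name):
    """Build a normalized key: no accents, lowercase, no punctuation, sorted unique tokens."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    tokens = re.sub(r"[^\w\s]", "", ascii_name.lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster(names):
    """Return groups of names that share a fingerprint (candidate duplicates)."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

For example, `cluster(["Abravanel, Maurice", "Maurice Abravanel", "ABRAVANEL, MAURICE.", "Smith, John"])` groups the three Abravanel forms together and leaves the fourth name unclustered. Groups found this way are only candidates; a person would still need to confirm that the forms refer to the same entity before merging them.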

Xiping Liu, Anne Washington, and Andrew Weidner (University of Houston) spoke about Cedar, a locally developed linked data authority service that works with a local program to mint and resolve identifiers. The principles guiding the work in developing and populating Cedar relate to authority control (beyond the LC NAF), workflow efficiency, and application of linked data. The top-level concepts in Cedar’s hierarchy are Agent, Collection, Concept, Place, and Time Period. Cedar has been populated by bulk import as well as individual entry of names and other entities. URIs from the LC NAF and VIAF are added to the local records. New terms can be created in the system for use instead of equivalent or roughly equivalent terms that exist in LCSH (for example, “LGBTQ people” and “LGBTQ communities” instead of “Sexual minorities”). Cedar has been put to use in cataloging and metadata cleanup for the university’s electronic dissertations and theses, in conjunction with OpenRefine for name remediation.

CaMMS Forum

This forum was called “Cooperatively Conscientious Cataloging,” and attendees spent the bulk of the session in small groups discussing several questions posed by the organizers before reporting out. The organizers aim to incorporate the results of the discussions into the development of a cataloging code of ethics, a set of best practices for ethical cataloging, and/or other documents deemed appropriate. There was a sense in the room that both a set of broad values and a practical document illustrating some ethical dilemmas and solutions guided by those values are desired. It is yet to be determined whether such documents will be drafted by a task force and then opened up for comments and contributions from the larger community, crowdsourced initially and then refined by a task force, or produced through some other arrangement. Attendees articulated a need to get broad input and avoid being overly prescriptive on behalf of different communities. Much of the small-group discussion centered on personal information in name authority records and failings of subject heading systems. One sentiment expressed was that it is difficult to be truly neutral, whether due to a tendency to over-describe materials relevant to one’s own personal community or a tendency to under-describe materials that one either finds personally offensive or is unfamiliar with.

