Reported by: Rebecca Belford (Oberlin College), Chair, MLA CMC Vocabularies Subcommittee
SAC Subcommittee on Faceted Vocabularies
1. OLAC video game genre/form terms (Rosemary Groenwald)
- Completed; indication of use: landing page for terms has over 1000 views
- OLAC board entertaining continuing maintenance
- $2 olacvggt validates in OCLC
- Rosemary’s library [Mt. Prospect Public Library, Illinois] is working on retrospective addition, currently approximately 30% complete. University of Washington will be undertaking a project to add retrospectively, locally and in OCLC.
2. OCLC Music Toolkit and MLA retrospective implementation algorithm (Rebecca Belford)
- Q: Are there presentations available? A (Casey Mullin): Yes, slides from MLA 2018, YouTube screencast, and background documents all available on the MLA CMC blog.
3. LC PSD report (Janis Young)
- Cancellations: terms of the form Filmed x, Televised x, and X radio programs were canceled. High-level terms (filmed performances, televised performances, etc.) were retained, as were some that don’t have corresponding genres (e.g. Filmed baseball games).
- Scope Notes for the “Broadest Terms” in LCGFT: For eleven LCGFT broadest terms, scope notes
indicating that they could be used only for collections “that are composed of multiple genres and/or forms to which more specific headings” were removed. Those scope notes restate the general principle of specificity, covered in the LCGFT draft manual.
This includes “Sound recordings,” which is available for use for music even if a narrower non-sound-recording genre term is present (e.g. Symphonies). LC is neutral as to whether this may be interpreted as permission vs. instruction to use. LC will be watching for best practices from the music cataloging community to use in developing their instruction sheet. TBD: usage in MARC 380.
“Pictures” retained this scope note, for differentiation from “Art” and Art NTs
- Moratorium on LCDGT proposals is still in effect. LC still examining structural and sustainability issues.
- The full report is available from LC’s ALA site, https://www.loc.gov/librarians/american-library-association/midwinter/lc-update/ under “Cataloging Policy and Standards.”
4. CEAL (Council on East Asian Libraries) (Charlene Chou)
- Conference in March.
- Discussion based on CJK terms in addition to subject/genre terms. They are hoping for input from the East Asian library community and more communication between public services and catalogers. An example of such a term is the recently approved term for Japanese light novels (“Light novels”).
- Adam Schiff: Regional training material on faceted vocabs is available for use
PCC Policy Committee response to SAC white paper
Note: As of this writing, response not yet published.
PCC response highlights: support for being able to incorporate faceted vocabs in PCC work; including standards and training; appointing of PCC liaison to SSFV as standing position; understanding that task work SSFV/PCC may cross lines to collaborate/reduce redundancy. Fits with PCC strategic direction of linked data.
Additional responses to white paper:
- National Library of Medicine: concerns over assigning comprehensive demographic data to people and resources (ethical concerns)
- CC:DA: Generally supportive. White paper recommendation to create more authority records may not be practical at scale. Suggestion to think about an external registry for works as a more open solution. The practice of assigning a broader term in addition to a narrower term (in the spirit of collocation) violates specificity but provides both functions. Retrospective application: literature is a good candidate as the first area. Special considerations for aggregates, which are not a new problem but are amplified in a faceted landscape.
Review SSFV charge
The current charge is from 2017:
The Subcommittee on Faceted Vocabularies will work as a coordinator between the Library of Congress and the library community and will follow the progress of the LCMPT implementation steered by MLA. It will identify problematic areas concerning the application of the new faceted vocabularies and will make recommendations, providing examples and special cases, if appropriate. The work will be divided in two parts.
In Part 1, the subcommittee will focus on the development of clear guidelines for the application of LCGFT and LCDGT with the aim of providing a document of best practices for using these vocabularies. Contemporaneously, the subcommittee will facilitate the continuing progress of the Working Group on Genre/Form of Video Games until its completion.
In Part 2, the subcommittee will elaborate on the examples provided in the above best practices document and will enhance the utility of the faceted vocabularies with new training materials produced in close coordination with the Library of Congress. Additionally, the subcommittee will continue to monitor the project of the genre/form of video games until its completion. The subcommittee will report on its progress to SAC at the ALA Midwinter and Annual Conferences.
–as of 2/3/19, on the SSFV website
Need an operational definition for “faceted vocabulary.” LC’s definition, from LCGFT Introduction: “a fully faceted vocabulary [is one] in which each term represents a single concept, and multiple terms are assigned to bring out multiple concepts.” Discussion:
- Terminology precision: “Facet” vs the application of suite of faceting terms, i.e., faceted vocabularies vs vocabularies used for faceting
- Relationship to FAST work
- Display and indexing aspect: Make sure SSFV does not operate completely separately; at the least, be in collaboration with those addressing this.
- Active liaising with OCLC. Having PCC endorsement might help with batch application.
- Survey of who is indexing/displaying various facets
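As an illustration of the LC definition quoted above (each term represents a single concept, and multiple terms are assigned to bring out multiple concepts), the following is a minimal sketch in Python. The record identifiers and term assignments are hypothetical and are not drawn from any catalog; the function name `select` is likewise an assumption made for this example.

```python
# Hypothetical records: each assigned term carries exactly one concept,
# and a record receives as many terms as it has concepts to bring out.
RECORDS = {
    "rec1": {"Symphonies", "Sound recordings"},
    "rec2": {"Symphonies", "Scores"},
    "rec3": {"Light novels"},
}


def select(records, wanted):
    """Return sorted ids of records whose terms include every wanted term."""
    wanted = set(wanted)
    return sorted(rid for rid, terms in records.items() if wanted <= terms)


if __name__ == "__main__":
    # Narrowing by one facet value, then by two combined values.
    print(select(RECORDS, ["Symphonies"]))
    print(select(RECORDS, ["Symphonies", "Sound recordings"]))
```

Because no single term bundles several concepts, a display or index can combine terms at search time instead of relying on pre-combined headings.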
Additional discussion related to charge:
- Can we use recommendations from the white paper? These include training, work-level authority records, retrospective implementation, indexing and display of faceted data, authorities.
- Desire for best practices (BPs). These should be an extension of LC rather than simultaneous/parallel.
- Is training contingent on LC practices and resulting BPs?
PSD reminder: LCDGT not necessarily stable from LC perspective; may make sense to form BPs after standard practice created.
Q: What is most likely to change?
A: Occupational, social, national/regional are the three problematic categories; also mindful of application ethical issues as raised by NLM. Not clear what will be an application change vs term change vs hierarchical changes. Working on a big set of changes and explanations, rather than piecemeal. Timeframe not established.
Q: examples of problematic hierarchies?
A: Families. Example term: “Children” currently defined as an age range, but is also a relationship within a family.
Chair raised concerns and ethical issues based on correspondence with colleagues:
- Concerns for living creators included:
- Gender in authority records: consider as antecedent; already written about
- Strong preference for specific terms, e.g. Indian/Native American
- Preference beyond privacy and safety
- Changes in identification over time
- Additional or other considerations for non-living creators include:
- Considerations of contemporary definitions for retrospective application (for example, the concept/range of “childhood” has changed over centuries)
- Justification for assigning terms and the basis for evidence
For SSFV work, this deep thinking about ethical considerations won’t be tied to a specific LC hierarchy so it isn’t dependent on full completion of LCDGT. This can be an opportunity to promote awareness among public services colleagues. This is a good mesh with LC work because PSD does not necessarily have the time to engage in this meta-level work. Outside work or collaborations to consider: PCC gender task force, Wikipedia guidelines for biographies for living persons, ALA roundtables (GLBT, Social Responsibilities).
SSFV general administrative considerations/directions
- Scope of tasks and subcommittee size. SSFV members alone cannot cover everything. Could expand into subgroups, chunk out tasks, possibly bring in outside people for specific subgroup work
- Thematic subgroups: genre-form (retrospective), geographic, temporal, demographic, (later: display/indexing/faceting)
- Need clarification from SAC on term length for SSFV members
- For discussion with co-chairs of SAC: method of getting broader feedback/sharing while in development, if we want a larger body of feedback than SAC
Subject Analysis Committee Business Meetings
Meeting dates: Sunday, January 27 and Monday, January 28. Full written reports are posted to ALA Connect; see Chairs’ Report, (15) below, regarding access issues.
- Sears List of Subject Headings (Maria Hugger)
No report; following EBSCO’s sale of the Sears List of Subject Headings, the new owner has not yet hired any librarians to manage the subject vocabulary.
- Policy and Standards Division of LC (Janis Young)
[See notes from the SSFV meeting.] The full report is available from LC’s ALA site, https://www.loc.gov/librarians/american-library-association/midwinter/lc-update/ under “Cataloging Policy and Standards.”
- CC:DA Liaison (Robert Maxwell). Excerpts from the written report:
CC:DA. Due to the RDA Steering Committee’s (RSC) 3R Project (RDA Restructure and Redesign), CC:DA has had no discussion papers or proposals to work on since ALA. However, two CC:DA Task Forces have been in operation between July and December, 2018.
- 3R Task Force. This Task Force is charged with providing feedback and assistance to the ALA Representative to the North American RDA Committee (NARDAC) on issues related to the 3R project including, but not limited to, feedback on the beta RDA Toolkit site. The TF has held virtual discussions via e-mail since Annual.
- Virtual Participation Task Force. This Task Force is charged with exploring ways to lower the barriers to participation in CC:DA, especially for representatives from liaison groups and from specialist communities. … In its November 15 report the TF recommended testing procedures in which CC:DA would hold mixed in-person and virtual meetings, beginning with the 2019 Midwinter meeting, in which two or three participants will test attending the meeting virtually. CC:DA agreed to try this.
- NARDAC (North American RDA Committee). NARDAC has completed the first year of its existence. Stephen Hearn will replace Kathy Glennan in 2019 as one of the ALA representatives, with a three-year term. NARDAC spent most of the last year reviewing the text of the developing new RDA Toolkit, and has met virtually four times since ALA Annual.
- RDA Steering Committee. Gordon Dunsire’s term as chair ended December 31, 2018. Kathy Glennan is now the RSC chair. Gordon began a two-year term as Technical Team Liaison Officer in January. The Beta Toolkit was released in June 2018 (https://beta.rdatoolkit.org). Another formal release is expected in January. The English text is planned to be stable by April 2019. The projected date for completion of the project is now December 2019, including translations and policy statements.
- SAC Research and Presentation Working Group (Jennifer Bromley). Excerpts from the written report:
The SAC Research and Presentation Working Group has speakers lined up for both Midwinter 2019 and Annual 2019. The SAC Annual Meeting Program will take place on Monday, June 24, 2019, 1:00-2:00 pm, in Washington, D.C. The topic will be subject metadata in WorldCat, and our presenter is Professor Oksana Zavalina from University of North Texas. Both Jennifer and Paromita are cycling off in June, and we need a new chair and members. It’s a great way to develop contacts and stay informed about the research in the field. If you are interested in the group let Rocki and Chris, our co-chairs know.
- Music Library Association (present: Casey Mullin on behalf of Rebecca Belford)
[Summary of relevant MLA activities. See written report.]
- Art Libraries Society of North America (ARLIS/NA) (Sherman Clarke). Excerpts from the written report:
The main project that the Cataloging Advisory Committee is working on is an update of Cataloging exhibition publications: best practices. It was issued in 2008 and was compliant with AACR2. It is being updated to RDA. The guidelines have a section on subject headings with mention of genre/form. The genre/form section will be expanded to reflect the LCGFT visual works terms. We hope to move forward quickly after the release of 3R. The current version is available at https://arlisna.org/publications/arlis-na-research-reports/147-cataloging-exhibition-publications-best-practices. ARTFRAME is an extension of BIBFRAME for art objects. It was based at Columbia and was part of the LD4P grants that ended in 2018. The ACRL Rare Books and Manuscripts Section, Society for American Archivists, and ARLIS/NA have joined to sponsor a steering committee and working group to continue the work. The ARLIS/NA Book Art Special Interest Group has developed a thesaurus of terminology for indexing artists’ books with emphasis on the physical features. A first version is available at http://allisonjai.com/abt/vocab/index.php. The 2019 conference will be held in Salt Lake City in March.
- American Association of Law Libraries (AALL) (Cate Kellett). Excerpts from the written report:
Deep Dive Program: FCIL Basics for Metadata Professionals. In July, TS-SIS collaborated with the Foreign, Comparative and International Law Special-Interest-Section (FCIL-SIS) to cosponsor the program FCIL Basics for Metadata Professionals: Collaborating to Ensure Access to Foreign and International Legal Materials. The speakers aimed to introduce seasoned catalogers to the peculiarities of cataloging foreign, comparative, and international law materials while also serving as a broad overview of cataloging legal materials for reference librarians and other professionals who weren’t familiar with the specifics of foreign and international law cataloging practices. A significant portion of the presentation was dedicated to subject access. This included an overview of key differences between common law and civil law systems as well as tips and tricks for finding subject headings in foreign language catalog records.
LCGFT for Law Materials: Work is ongoing to develop a cohesive plan to retrospectively apply Library of Congress Genre/Form Terms to law materials.
- SAC Subcommittee on Faceted Vocabularies (Casey Mullin)
[See conference report from the SSFV meeting.]
Excerpts from the written report:
Past Chair Lia Contursi decided to step down as chair. Casey Mullin was appointed as the new chair. Lia continues to serve SSFV as a regular member, and we thank her for her service and leadership. SSFV would like to expand its ranks of faceted vocabulary experts, and will work with the SAC Co-Chairs to solicit volunteers and make additional appointments (e.g., a PCC representative), in accordance with proper protocols.
- Report on new FAST steering committee (Judy Jeng)
[See conference report from the Faceted Subject Access Interest Group.]
- Dewey Classification Editorial Policy Committee (EPC) liaison (Deborah Rose-Lefmann)
Editorial rule changes: 1) Policy for when terms should be removed from the Relative Index because they are considered offensive to some group, and when they are not removed. New guideline: if the term is considered offensive by some in the affected group and the correct term is in reference sources, the offensive term should be deleted. In other situations the term may be considered offensive by some members of the group but preferred by others, in which case it remains. 2) Common forms of personal names allowed in the index, meaning spellings of common forms can be used even though they do not match ALA Romanization. Table 2 updates: local option to allow prioritization by Indigenous group over topic.
Excerpts from the written report:
- Meeting 141 of the Decimal Classification Editorial Policy Committee (EPC) was held Oct. 15-16, 2018 at OCLC, Dublin, Ohio.
- EPC approved the following updates to Table 2 (Geographic notation):
- Table 2 notation for Indigenous nations, which we have been working on for years, was tabled in favor of an official option in the Introduction for prioritizing Indigenous groups in knowledge classification. This is the result of discussions with Indigenous knowledge organization professionals.
- Changes to administrative units in England
- EPC approved selected updates in the following schedules:
- Literature (800s): Changes to the period tables under Specific forms of Latin and Greek literature
- History and geography (930s-990s): Open-ended administrations for 15 countries were updated.
- The Committee requested revision of some proposals; they will probably be voted on in meeting 141A or 142.
- EPC requested more detail in a proposal that laid out an optional, chronological notation in the 200s with less emphasis on Christianity. This notation expands upon the option given in the Manual note 220-290 Optional arrangement for Bible and specific religions. We welcome any feedback on current use of, or interest in, this option.
- Dewey Section liaison (Caroline Saccucci, via written report)
The full report is available from LC’s ALA site, https://www.loc.gov/librarians/american-library-association/midwinter/lc-update/ under “Cataloging in Publication (CIP) and Dewey Programs.”
- Dewey Decimal Classification and OCLC Dewey Services (present: Violet Fox on behalf of Alex Kyrios). Excerpts from the written report:
- Since ALA Annual last year, the Dewey editorial team has published changes approved by the Editorial Policy Committee (EPC) in Meeting 141. Such changes cover topics including Indigenous knowledge organization, terminology for groups of people, Orthodox Christianity, endangered languages, transgender identity, and autonomous vehicles.
- Greater collaboration with PANSOFT, the German company that created and runs WebDewey, has resulted in an increased ability to make fixes and enhancements to the system. Regular overnight publication capability has been restored; for some time, this functionality was unavailable, and publication occurred generally only once a month.
- Over the summer, we again benefited from the contributions of Rachel Maxwell, our returning intern, who reviewed LCSH mappings from People, Places & Things (2001) that needed updating.
- The editorial team continues to prioritize increasing community involvement in maintenance of the schedules. We are pursuing a pilot project in collaboration with a graphic designer to improve the DDC’s coverage of graphic design topics, and we welcome all interested parties who want to get involved in similar projects—whether your library uses Dewey or not! We are also increasing collaboration with Dewey classifiers at the Library of Congress.
- MARC Advisory Committee (MAC) (Stephen Hearn)
Brief summary of MAC discussions at Midwinter. This is Stephen’s last SAC meeting (as MAC liaison) before moving into the role of NARDAC representative.
- IFLA liaison (George Prager). Excerpts from the written report:
- The 2018 IFLA Conference was held in Kuala Lumpur, Malaysia, from August 24-30, 2018.
- All IFLA Conference presentations are freely available in the IFLA Library.
- The Bibliography, Cataloguing, and SA&A Sections held a joint session entitled “Metadata Reports” on August 28. Chris Oliver, chair of the group formerly known as the FRBR Review Group, reported that the group’s name has been officially changed to the Bibliographic Conceptual Models Review Group (BCM Review Group), to better reflect the scope of its work. The group is working on aligning the object-oriented version of FRBR, FRBRoo 2.4, with the IFLA Library Reference Model, published August 2017. The new version will be named LRMoo. A draft document should be ready in 2019.
- Massimo Gentili-Tedeschi, chair of the ISBD Review Group, reported that his group had just received permission from the IFLA Standards Committee to start revision of the ISBD standard, basing it on the LRM model. It will first focus on the manifestation entity. More information on this revision is available on the IFLA Cataloguing Section webpages.
- Agnese Galeffi reported that the latest revision of the Statement of International Cataloguing Principles (ICP) is almost ready for publication on the IFLA web site, incorporating new entities in the conceptual model; the 13 principles will remain unchanged. It will retain LRM terminology, and try to explain the relationship between nomen, name, and authorized access points.
- The Multilingual Dictionary of Cataloguing Terms and Concepts (MulDiCat) is a tool in 26 languages for translators for translating library texts. Melanie Roche, chair of the MulDiCat Review Group, reported that the group has finished consolidating a final list of new terms. (The August 2012 version is still the latest one publicly posted).
- Clement Oury, member of the PRESSoo Review Group, stated that there have been no recent changes to PRESSoo, an extension of FRBRoo designed to represent bibliographic information about continuing resources, and more specifically about serials. It will be revised once FRBRoo has been revised as LRMoo.
- Report of the co-chairs of SAC (Chris Long (present)/Rocki Strader)
- Casey Mullin is the new chair of SSFV
- Judy Jeng will represent SAC on FAST
- Cate Kellett is the new AALL representative
- Congratulations to Stephen Hearn on NARDAC appointment and thank you for SAC service
- SAC voted yesterday [Jan. 27, 2019] to disband video game working group–work is completed.
- ALA Connect administration
- The ALA default setting, requiring a login and access, does not fit SAC’s idea of open access to the committee space.
- In response, ALA has assigned an IT person to work on making it so anyone who wants access can join the space
- Documents will be organized before ALA Annual 2019
- Written report will be added to ALA Connect after the meeting
- PCC SACO (Paul Frank)
Reminder that PCC’s proposal of optional removal of descriptive punctuation does not affect subject access points. The change would affect descriptive data elements only.
- Approaching Grey House Publishing regarding a representative for the Sears headings? Reorganization resulted in no Sears rep to SAC.
- SAC is seeking a new representative from the MARC Advisory Committee: this is a call to volunteer, or volunteer others.
- Illegal aliens update. In summer 2018, CaMMS did not want to forward a request for an update. Any different/new thinking now that the midterm elections have passed? Discussion of whether we can now ask for an update. Procedurally, any request needs to go through CaMMS and the ALA office.
ALCTS CaMMS Cataloging Norms Interest Group
Meeting date: Saturday, January 26.
There were three presentations.
“Lower the Barrier and Be Empowered: Creating and Including Linked Data Vocabularies for Digital Collections” Sai (Sophia) Deng, University of Central Florida
Note: Slides are available, without login, via ALA Connect.
Deng began with a summary of different digital repository platforms (Islandora, Samvera, CONTENTdm, Omeka, DSpace) and their linked data capabilities. Options vary. RDF is a common feature of those that currently accommodate linked data. UCF’s IR, “STARS,” is configured with extra fields for Linked Data links. Theses and dissertations were selected as a linked data project in STARS. To obtain the links, staff used OpenRefine with an added column to fetch URLs from a web service. The procedure included:
- Use OpenRefine’s LC Reconciliation service (42% initial match) to match names to LC and VIAF; review, edit; add URLs to names matching LC or VIAF; add link to existing Wikidata names if not in VIAF
- In Wikidata, create Wikidata entry for advisor names not in VIAF or existing Wikidata; add URL
- Combine names
- Add LC subject links to IT links
- Add FAST headings by fetching URLs
- Closing slides offered ideas for implementing one’s own linked data project and a resource list.
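The matching order in the procedure above can be sketched in a few lines of Python. This is a minimal illustration only, not the presenter's OpenRefine workflow: the lookup tables stand in for reconciliation results, the names and URIs are placeholders, and trying LC before VIAF is an assumption (the notes say only that names were matched to LC and VIAF, with Wikidata used for names not in VIAF).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup tables standing in for reconciliation results.
# The names and URIs are placeholders, not real authority data.
LC_NAMES: Dict[str, str] = {"Doe, Jane": "https://example.org/lc/n0001"}
VIAF_NAMES: Dict[str, str] = {"Roe, Richard": "https://example.org/viaf/0002"}
WIKIDATA_NAMES: Dict[str, str] = {"Poe, Pat": "https://example.org/wikidata/Q0003"}


def resolve_name(name: str) -> Optional[str]:
    """Return the first URI found, trying LC, then VIAF, then Wikidata."""
    for source in (LC_NAMES, VIAF_NAMES, WIKIDATA_NAMES):
        if name in source:
            return source[name]
    return None


def reconcile(names: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split names into those with a URI and those needing a new entry."""
    matched: Dict[str, str] = {}
    needs_entry: List[str] = []
    for name in names:
        uri = resolve_name(name)
        if uri is None:
            # No match anywhere: flag for manual creation of a new entry.
            needs_entry.append(name)
        else:
            matched[name] = uri
    return matched, needs_entry


if __name__ == "__main__":
    found, todo = reconcile(["Doe, Jane", "Poe, Pat", "Moe, Max"])
    print(found)
    print(todo)
```

Names returned in the second list correspond to the step of creating a Wikidata entry by hand; in practice every automated match would still pass through the review-and-edit step described in the first bullet.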
Q: Please expand on process for adding faculty to Wikidata
A: Names are added one at a time. Wikidata requires additional permissions for bulk. The time required to prepare a full entry for each faculty member makes adding names individually logical.
“Enhancing Metadata and Improving Discoverability for Digital Collections”
Dave Van Kleeck (presenting) and Chelsea Dinsmore, University of Florida
Note: Slides are available, without login, via ALA Connect.
This presentation follows the project the authors presented at ALA 2018 Annual in their talk, “Toward Solving Legacy Metadata Issues and Improving Discoverability in Digital Collections.” The UF digital collections include a body of 300+ collections, 13 million pages, and 540,000+ metadata records, on a local platform (SobekCM).
Metadata challenges include the length of time information has been input (1990s onward), the variety of input standards over time (such as spreadsheets input by various noncatalogers), inconsistent curatorial oversight, maintenance of hundreds of collections, and platform limitations. UF partnered with Access Innovations on a pilot project for automated metadata generation for approximately 29,000 ETDs, including born digital and digitized print. Their hypothesis was that they could enhance subject access by adding JSTOR terms based on OCR [Optical Character Recognition]. After selecting JSTOR for a thesaurus, pilot staff ran OCR on the full text and existing metadata, then used DataHarmony to run MAIstro (machine indexer) to extract metadata and enhance records. Workflow steps include staff review, exporting information to OCLC if desired, and using the opportunity to clean up metadata in the staff system. Staff review includes addition of Florida-specific narrow terms.
To test the effect, they examined search results, comparing number of results for LCSH + title to JSTOR terms + keyword. There was almost no difference in the number of search results. This uncovered indexing areas for improvement.
- Quality of text OCR variable, some had to be redone.
- All collections are special; each needs to be tested.
- MAI is machine assisted. Expert review is needed.
- Standardizing METS fields in the SobekCM platform is essential.
- To move forward they need a Florida-specific taxonomy, which will take longer than anticipated.
- Process is iterative.
In the near future, they will use their newspaper collections to develop Spanish- and French-language thesauri. Longer term plans are to apply the process to all new collections, retrospectively enhance existing collections, and work on cleanup and improved rule-building. They envision a broader application to print materials.
Q: Are you testing if subject enhancement helps users find things?
A: Goals include additional qualitative testing; not much testing yet.
Q: Followup on less than expected improvement in search results after OCR as being a SobekCM [platform] problem. Can you expand on what those problems were?
A: Mapping from text files back to SobekCM, including data mapping to unexpected fields; SobekCM term processing was not reading all mapped fields. Since the programmer left UF, they do not currently have an in-house developer.
Q: Do you have plans to incorporate Florida-specific terms into NACO?
A: They are thinking about it. The long-term goal is to add to the LC Name Authority File, but they are still at the stage of getting everything up and running. They are a member of all the PCC programs, so NACO fits with their goals.
“Responsibilities & workflows: keeping agile in a rapidly changing environment”
Tricia Mackenzie and Kimberley Edwards, George Mason University
Note: Slides are available, without login, via ALA Connect.
The speakers presented a case study of platform migration at George Mason University (GMU) Libraries.
Pre-migration planning: George Mason was on a standalone instance of Voyager, while the Washington Research Library Consortium (WRLC), of which GMU is a member, was planning to migrate to Alma. In fall of 2016, GMU decided to migrate to Alma with the rest of the consortium, with a live date of 2018. Following the decision, they began preparing data as quickly as possible.
Test environment: Access to the Alma test environment in December 2017 allowed them to see how the data would look and to test out workflows in the new system. They had problems with both. Errors in migration were sufficient to prompt a second test load. System limitations, like not being able to make bulk changes to the system or data, made testing workflows challenging. In house, access to Test was not widely granted, affecting available feedback. Before going live, there was a four-week freeze on all ILS work.
Live: The first step after going live was to set up permissions and user roles. Updating workflows and training required the system to be live to decide how to share records in practice.
Workflow impact: There was a cycle of testing, documentation, and training for each task, admittedly not necessarily an orderly or logical process. Workflows in each Alma zone and their interrelationships needed to be determined and tested:
- IZ – institution zone – holdings, items, acquisitions
- NZ – network zone – shared bib records
- CZ – community zone – ExLibris catalog, global authorities, e-resources
They began in categories: 1) rush cataloging; 2) GMU-only; 3) shared records, categorizable as either “theirs is better” or “ours is better,” which needed to meet consortium cataloging principles and required communication with the other institutions that created the records. Acquisitions and cataloging engaged in cross-training in functional areas (e.g., cataloging taught acquisitions how to find good OCLC records so fewer dummy records got created), created documentation and templates for each other, and created forms to facilitate work of other departments.
Recent work: Incorporating linked data into individually brought-in records had to fit into workflows. GMU decided to do this at the point of import from OCLC, which required adjusting of Alma ‘roles’ to allow bulk sets of records for those with cataloger roles. In the consortium, George Washington and GMU committed to including and collaborating on linked data, beginning with finding a way to maintain linked data in records that already have it. The current communication method is a Google form to submit the Alma record ID, then periodic retrieval, addition, and upload.
Takeaways: flexibility and adaptability, communications, accept mistakes will happen, teamwork.
Faceted Subject Access Interest Group
Meeting date: Saturday, January 26. Roundtable discussions followed a presentation.
Part 1: Presentation, Judy Jeng (co-chair, FPOC)
Jeng detailed the establishment of the FAST Policy and Outreach Committee (FPOC) as a body and highlighted aspects of FAST (Faceted Application of Subject Terminology). The committee was formally established September 2018 with 12 members (4 from OCLC, 8 implementers), and one of its early tasks was to craft a vision statement and articulate the role of the committee. The role of the committee (quoted from their regulations) will be to:
- Serve as an outreach and advisory body
- Establish editorial policies regarding terms, including scope, definitions and application
- Oversee the community engagement, term contribution and procedures
- Set priorities for ongoing development of the vocabulary
- Engage the community in developing and maintaining documentation; oversee the process
- Facilitate the dissemination and collocation of training resources, documentation, and guidance for use
- Establish framework for testing of interfaces and services by the community
- Recommend and prioritize directions and goals for development/improvements
- Provide recommendations for tools and conversion services when appropriate
- Inform strategic directions to support the FAST business model
- Provide or facilitate outreach, promotion and marketing to different communities as appropriate
- Establish and oversee working groups as needed to accomplish these or other tasks
Activities so far include meeting with the OCLC FAST team to learn more about delivery, maintenance, and conversion; taking a field trip to LC with a goal of learning about LCSH/SACO and the potential bidirectional relationship between FAST and LCSH; and outreach to ALA and IFLA.
FAST has been considered experimental for 20 years. The target for a production server is now March 2019 (production mode offers 24/7 support from OCLC). A FAST advantage is the ability for implementers to select the desired level of granularity in a vocabulary compatible with LCSH. Current implementations include the British Library for materials for which full LCSH work is too expensive, Brown University for dissertations, and University of Michigan for subject knowledge cards. Desires from the implementation community thus far include the capability to propose new terms or add scope notes, an enhanced update mechanism, the ability to insert headings, and convertibility into/from other vocabs. Next steps include developing features and priorities to use as a business case for future development of FAST, and developing a list of requirements for a minimally viable product. Attendees are encouraged to get involved by joining the email list, submitting comments, or applying for a working group or even the FPOC.
Part 2: Roundtables
Following the update, the audience split up into small groups to participate in roundtable discussions. Groups then summarized their discussions for the reconvened full audience. Discussions raised more questions than they answered.
- Prompt: Faceted vocabularies, do they matter to serials?
- Not consistent: some institutions add for serials, some do not use for serials
- Discovery layer does not necessarily mean serials LCGFT are useful, because discovery layer already provides access to serials; others maintain and use
- If an LCSH $v and an LCGFT term are the same, do catalogers apply both?
- Prompt: Implementing faceted vocabularies in digital repositories
- Varies by institution, platform, planned conversion or not to MARC
- Considerations of what is a user-friendly display (facet vs full string)
- Prompt: Evaluating the use of faceted subject terminology in a cataloging environment that lacks a discovery layer
- What does it mean to “turn on” access to facets/terms?
- Indexing and display discussions instead of discussing facet configuration
- Retrospective addition can seem fruitless but if you migrate to a rich discovery layer, the data will be there
- Vendors can provide some terms
- Prompt: Practical and scalable approaches to implement new facets in discovery systems
- What is the threshold of records at which point it makes sense to turn on?
- Retrospective options?
- Potential algorithms by discipline
- PCC/OCLC cooperation could offer potential to work through OCLC
- U Chicago: adding FAST retrospectively to local cat (Kuali OLE); adding and using as facet options but not displayed because subject specialists did not want to see terms duplicated in single bibliographic records
- Prompt: Genre form terms in cataloging
- Multiple vocabs (LCGFT, RBMS, AAT, etc.): How do libraries prioritize usage?
- Question about overarching policies
- Most commonly used: LCGFT, AAT, FAST
- Discussion of duplication between FAST and LCGFT; depends on discovery layer
- Adding terms is easy and helpful, depending on the discipline
- Public libraries use local terms for collections, e.g., anime or manga
- Decisions about which thesaurus to use; depends on collection
- Decisions about retrospective work: if, how much, impact of turning on display, facet, etc.
- Everything about FAST
- Mostly questions for Judy Jeng
- What does “production” mean to OCLC?
- How does it work for specialized libraries like art, medical?
- Can you convert in both directions?
- Does FAST inherit philosophy of LCSH related to specificity?
- Prompt: Retrospective conversion of FAST/Adding FAST to local catalogs
- Using FAST converter locally vs updating SH in OCLC for future algorithm to generate properly (local vs. network level decisions)
- Display and indexing decisions – institutional approaches vary
- Delete vs. leave in
- You can obtain MARC records from HathiTrust by changing the end of a record URL to “.mrc” (e.g. https://catalog.hathitrust.org/Record/0123456/Home to
- Article recommendation: Nelson & Turney, “What’s in a Word: Rethinking Facet Headings in a Discovery service,” ITAL 34/2 (2015). https://doi.org/10.6017/ital.v34i2.5629
ALCTS CaMMS Heads of Cataloging Departments Interest Group
Meeting date: Monday, January 28. There were two presentations. As of this writing, slides have not yet been posted to ALA Connect.
“One Record at a Time: Simply Starting Linked Data at a Mid-Sized University”
Jodene R. Pappas, Stephen F. Austin State University
The main message is: you can start from zero. Pappas described a learning-by-doing journey to apply linked data using a local collection from the perspective of a traditional cataloger at a mid-sized university library. The papers of former U.S. Representative Charles Wilson were selected for the linked data pilot project. In order to underscore the importance of the project (and metadata in general), Pappas matched the project to goals in the university’s strategic plan, specifically, fostering innovation in reaching students, increasing connections for a stellar learning experience, and embracing strengths (in this case, the library). Increased discoverability meets searchers where they begin (i.e., the web), keeps the library relevant, and connects collections to students and faculty.
The Wilson collection contains unique materials, which may benefit the most from increased access through enhanced data. Basic conceptual questions framed this project: Why do we organize things, or, why catalog? Where do our users search first? If users aren’t finding our resources, what is the point? The collection includes data in multiple formats and locations: MARC and finding aids, the university’s catalog and the public library’s (in a shared catalog). Beginning steps were to transform MARC to MARC XML, then to use the LC BIBFRAME editor to transform the MARC XML records and the EAD XML records. Participants then explored hosting possibilities and links to URIs. Future work will get these records to the web. Assessment is included in this project: they will compare usage of the pilot collection before, after, and over time. Each step afforded learning opportunities and challenges. The Linked Data Competency Framework and the Competency Index for Linked Data served as learning tools, and they used relevant ALA e-courses. Challenges included time, the learning curve, money, and a lack of urgency. Recommendations and steps are: 1) start where you are, 2) focus on learning, 3) choose a unique collection, 4) try your own project, and 5) figure out the answers.
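For readers unfamiliar with the first step mentioned above (MARC to MARC XML), the following is a minimal sketch of what the target format looks like. It is not the workflow Pappas used; in practice a MARC library or conversion utility would read the binary record. The function name `to_marcxml`, the placeholder leader, and the sample title are assumptions made for illustration. Only the element names and namespace come from the MARCXML "slim" schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the MARCXML "slim" schema.
MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def to_marcxml(leader, control_fields, data_fields):
    """Serialize a simple in-memory record to a MARCXML string.

    control_fields: list of (tag, value)
    data_fields:    list of (tag, ind1, ind2, [(subfield_code, value), ...])
    This input shape is hypothetical, chosen only to keep the sketch short.
    """
    ET.register_namespace("", MARCXML_NS)  # emit MARCXML as the default namespace
    record = ET.Element(f"{{{MARCXML_NS}}}record")
    ET.SubElement(record, f"{{{MARCXML_NS}}}leader").text = leader
    for tag, value in control_fields:
        field = ET.SubElement(record, f"{{{MARCXML_NS}}}controlfield", {"tag": tag})
        field.text = value
    for tag, ind1, ind2, subfields in data_fields:
        field = ET.SubElement(
            record,
            f"{{{MARCXML_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        for code, value in subfields:
            sub = ET.SubElement(field, f"{{{MARCXML_NS}}}subfield", {"code": code})
            sub.text = value
    return ET.tostring(record, encoding="unicode")


# Placeholder record; the leader, control number, and title are illustrative only.
sample_xml = to_marcxml(
    leader="00000npc a2200000 a 4500",
    control_fields=[("001", "example-0001")],
    data_fields=[("245", "0", "0", [("a", "Sample collection title")])],
)
print(sample_xml)
```

The resulting XML is the kind of input that downstream MARC-to-BIBFRAME tooling expects, which is why this conversion comes first in the sequence described.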
“African Academic Print Journal Project: Producing and Sharing Article-Level Metadata for Print-Only African Academic Journals”
Erin Freas-Smith, Library of Congress
The Africa Section of the African, Latin American & Western European Division, Acquisitions and Bibliographic Access Directorate at the Library of Congress embarked upon a pilot project in 2018 to explore the value and practicality of creating article-level metadata for the Library’s vast collection of print-only African journals. A 2017 study by MSU indicated that article usage in Africa remains primarily print-based. Many published articles contain local case studies and are rich research resources, but the only way to identify and locate them is by examining an entire print run. The utility of databases may be diminished by technology trends in Africa toward handheld devices.
Similar projects and predecessors to LC’s project include LAPTOC (Latin American Periodicals Tables of Contents), now part of LARRP (Latin Americanist Research Resources Project), archived at Vanderbilt; SALToC (South Asian Language Journals Cooperative Table of Contents Project), hosted at NYU; the Quarterly Index to African Periodical Literature (ceased 2011); and the LC Handbook of Latin American Studies.
The Library’s holdings represent 54 countries (24 with former forms of name) and approximately 36,000 journal titles, primarily from Nigeria, South Africa, Egypt, and Kenya. Journals handled by the DC office were selected for the project, reducing the scope to approximately 13,000 titles from 15 countries. Publication irregularities affect statistics, with many journal titles having fewer than 4 issues. Irregular schedules and numbering inconsistencies further complicate matters. New materials coming in to the office are flagged in MARC 955$a with text indicating the project.
Once received, the work is done by interns, primarily undergraduate students. They are responsible for:
- Scanning covers and tables of contents pages
- Typing article metadata into a spreadsheet
- Maintaining a master list of titles and volumes held and completed, which will be posted online, where other institutions can fill in gaps
Intern training takes approximately a half-day per student, and students are able to customize their workflows as long as basic parameters (such as file naming) are met. Students are encouraged to suggest further training ideas or needs. Students work independently with a librarian on hand for assistance. Lessons learned from the first group of students are that students need to physically move (breaks from spreadsheets) and that librarians had to check work more often than they had anticipated. A resulting benefit was enabling the serials cataloger to fully catalog titles using the information input by the students. There will be four interns working in the summer of 2019, continuing the work from the previous year.
The question of access remains. Without a database, what are options for providing this information? Perhaps ILS integration for ceased titles, the MSU database, or something yet to be discovered. Backup or alternate capture methods, such as RefWorks, Mendeley, etc. are under consideration.
Q: NACO work for authors?
A: No. These articles are often the authors’ only publication, and NACO requires much work.
Q: Have you considered serving as an agent for DOI registration? That would get info into an ILS or CrossRef.
A: Will investigate. LC did receive an inquiry from CRL to serve their data, but LC does not want a paywall. Researchers in Africa should be able to access the data.
Q: Did OCR work?
A: That was the first thing they tried. The technology isn’t yet where it needs to be. The problems are made worse by poor paper quality and different scripts. They are scanning TOCs in hopes that OCR will work someday.
Q: Challenges with foreign language materials or diacritics?
A: For now, working on English. Anything Unicode works, including Arabic.
Q: Criteria for hiring interns?
A: Interest in Africa and/or technical aptitude.
Q: Unexpected suggestions from interns?
A: Mostly tech-related, mostly small suggestions that they will incorporate next year.
Angela Kinney (LC), Interest Group co-chair, closed the session with a tantalizing list of activities LC is preparing for ALA Annual 2019. Anyone with suggested topics for Annual should contact Angela, or David Van Kleeck. Slides from this session will be posted to ALA Connect.
SUBJECT ANALYSIS COMMITTEE FORUM
Meeting date: Monday, January 28. There were two presentations. As of this writing, slides have not yet been posted to ALA Connect.
“Intersection of Subject Headings and Linked Data”
Jodi Williamschen, Library of Congress
RDA, even the Beta version, does not offer much guidance on subject. However, one can relate the four recording methods to subject elements in MARC:
- Unstructured. MARC 653: any uncontrolled term
- Structured. Authorized Access Point in MARC with an indication of thesaurus
- Identifier. MARC with URI in $0
- IRI. $1 in MARC
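The mapping in the list above can be sketched as a small classifier. This is an illustration under stated assumptions, not LC's or RDA's logic: the function name `recording_methods` and the input shape are hypothetical, the "indication of thesaurus" is read from the second indicator (coded values 0, 1, 2, 3, 5, 6) or from $2 when the indicator is 7, and a heading with no thesaurus indication is treated as unstructured, which goes beyond what the presentation stated. The URIs are placeholders.

```python
# Second-indicator values in 6XX fields that name a specific thesaurus
# (e.g. 0 = LCSH, 2 = MeSH). Value 7 means "source specified in $2".
CODED_THESAURI = set("012356")


def recording_methods(tag, ind2, subfields):
    """Return the RDA recording methods a MARC subject field could represent.

    subfields is a list of (code, value) pairs. Hypothetical helper for
    illustration; real records need fuller validation.
    """
    codes = {code for code, _ in subfields}
    has_thesaurus = ind2 in CODED_THESAURI or (ind2 == "7" and "2" in codes)

    methods = set()
    if tag == "653" or not has_thesaurus:
        methods.add("unstructured")   # uncontrolled term
    else:
        methods.add("structured")     # authorized access point with thesaurus
    if "0" in codes:
        methods.add("identifier")     # URI recorded in $0
    if "1" in codes:
        methods.add("IRI")            # URI recorded in $1
    return methods


# Placeholder URIs; not real authority identifiers.
print(recording_methods("653", " ", [("a", "linked data")]))
print(recording_methods("650", "0", [("a", "Law"), ("0", "http://example.org/id/1")]))
print(recording_methods("650", "7", [("a", "Law"), ("2", "fast"), ("1", "http://example.org/rwo/1")]))
```

One field can carry more than one method at once, which is why the function returns a set rather than a single label.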
LC subjects are available as linked data using the MADS schema, obtainable from id.loc.gov. Individual records offer export options including RDF/XML, and users can also bulk export the entire file.
BIBFRAME defines “subject” as “subject describing a resource.” It also includes a separate set of properties and classes for genre/form. The conversion software generally maps MARC 650 to subject and MARC 655 to genre/form, and maps byte values in MARC 007 and 008 to individual genre/form terms, e.g. 008/33 to fiction. The BIBFRAME editor offers three options for subject input/output:
- Subject string lookup to search LCSH and NAF; searches as you start typing; punctuation is indexed per wishes of the developers. Output: one URI.
- Search each component individually; form is searching subdivision records not indexed in BIBFRAME. Output: multiple URIs; order is preserved.
- Directly input a label. Output: text string, no URI.
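The general conversion mapping described before the editor options (MARC 650 to subject, MARC 655 to genre/form, 008/33 to fiction) can be sketched as follows. This is a simplified illustration, not LC's conversion software, which applies many more rules. The function names are hypothetical, the 008/33 check assumes a books-format record where the value 1 means fiction, and the returned label "Fiction" is illustrative only.

```python
# MARC tag -> BIBFRAME property, covering only the two cases named in the report.
TAG_TO_PROPERTY = {
    "650": "bf:subject",
    "655": "bf:genreForm",
}


def property_for_tag(tag):
    """Return the BIBFRAME property for a MARC tag, or None if not covered here."""
    return TAG_TO_PROPERTY.get(tag)


def genre_from_008(field_008):
    """Derive a genre/form label from byte 33 of a books-format 008 field.

    Assumption: position 33 is the literary form byte and "1" means fiction.
    Other values and other material formats are ignored in this sketch.
    """
    if len(field_008) > 33 and field_008[33] == "1":
        return "Fiction"
    return None


# A 40-character 008 with only byte 33 filled in, for demonstration.
sample_008 = " " * 33 + "1" + " " * 6

print(property_for_tag("650"))     # bf:subject
print(property_for_tag("655"))     # bf:genreForm
print(genre_from_008(sample_008))  # Fiction
```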
Future considerations now that editor, ontology, and pilot have reached a point of stability at LC:
- Geographic headings. The same record with the same URI could display as “Oakland (Calif.)” or “California–Oakland”. Is this a change on the input side? Display change? Better left for post-MARC?
- Faceting and loss of context. For example, a set of four facets from a single LCSH string can have different meanings, and without a meaningful order, the intended meaning is unclear. Unordered, these could represent a history of German law dictionaries or a dictionary of German law history:
- Dictionaries | German | History | Law
- Form and genre as subdivisions. Do they remain subject strings or separate headings?
- Agents as subjects
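The loss-of-context concern in the list above can be shown in a few lines. This is a minimal sketch: the two ordered strings are hypothetical orderings of the four facets from the example, not validated LCSH headings, and the `--` separator and the function name `facets` are assumptions for illustration.

```python
def facets(heading, separator="--"):
    """Split a precoordinated heading into an unordered set of facets."""
    return frozenset(part.strip() for part in heading.split(separator))


# Two hypothetical strings whose component order implies different meanings.
reading_one = "Dictionaries--German--History--Law"
reading_two = "Law--German--History--Dictionaries"

# As ordered strings they differ...
print(reading_one == reading_two)                  # False
# ...but once faceted, the order (and so the distinction) is gone.
print(facets(reading_one) == facets(reading_two))  # True
print(" | ".join(sorted(facets(reading_one))))     # Dictionaries | German | History | Law
```

Any system that stores only the facet set cannot reconstruct which reading was intended, which is the question the presentation raised about faceting LCSH strings.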
Q: Can you represent component concepts in single URI?
A: This will come up in BIBFRAME to MARC conversion, finding a happy medium between some that need to be one string, others that can be represented by components. Will be an interesting challenge.
“Improving Subject Access to Resources on Oregon Indian Tribes”
Richard E. Sapon-White, Oregon State University Libraries
Inspired by the 2017 ALCTS preconference on incorporating diversity, equity, and inclusion into all aspects of technical services, Oregon State University (OSU) embarked on a project to ensure that library resources about Oregon Indian tribes are discoverable in online systems. The project has two parts: preparing LC subject proposals to establish headings, and analyzing subject headings in bibliographic records for resources about Oregon Indian tribes for accuracy and specificity to add or remove headings as appropriate.
Background information: There is a difference between pre-contact tribes (numbering approximately 500) and federally recognized tribes (nine, some of which consist of many tribes). Federally recognized tribes are established as corporate names in the Name Authority File; historical tribes are established as ethnic groups in LCSH as “[Tribe] Indians.”
To identify candidate names, OSU catalogers compiled a list of 55 tribes and bands based on reference sources, dating from the 1907 Handbook of American Indians North of Mexico through Wikipedia entries. Of those 55, 19 were not established in LCSH. Next, catalogers searched WorldCat to see if there were works about those groups. Challenges included location of term (keyword/title), variant spellings, and recall for names that are also names of geographic areas. The goal was to find at least one record for each. All retrieved records were printed to prepare for the second phase of adding and possibly deleting subject headings. From their WorldCat search, they identified 13 names with sufficient warrant for LCSH proposals. Many have extensive UF terms, with 25 for the Cathlamet Indians. Preempting a frequently asked question, OSU did contact several tribes and some responses were cited in the proposals. In one case, the name used by the tribe in the proposal was not retained as the preferred heading (Applegate Indians vs. LCSH Dakubetede Indians). There are six remaining proposals, including for the Willamette Valley Indians (Kalapuya Indians), where some names will be proposed as established narrower terms instead of UF references, and the Indians on the Warm Springs reservation (Tenino).
The second phase, record review, enriched 41 records. Many lacked specificity, using “Indians of North America–Oregon” to describe works about one specific tribe. Others had errors for related tribes. Records were corrected and headings added as they were established. As next steps, they will do another round of searches for resources not found the first time, and revise bib records for the newer proposals. With the procedure established, they can repeat the process for other states or countries.
- This sounds like the Wikimedia movement’s Wikidata project on indigenous peoples.
- Language proposals will be coming from University of Washington. Another project possibility might be on languages in the Pacific Northwest.
- There is a similar project at University of Alberta.
- Plug for the LIAPA [Latin American and Indigenous Peoples of the Americas] SACO funnel, a model where people with subject expertise will help catalogers and vice versa.
- Janis Young, LC: PSD at LC encourages people embarking on projects to please let LC know [ahead] so they [PSD] understand context and sources. This helps them look at proposals in total. Proposals that are submitted as a batch can end up being received separately. Please collaborate with LC if you are working on projects; it makes them easier to approve. Generally, email email@example.com and indicate it is a project that you want PSD to be aware of.
Q: Have you collaborated with collection development people to add material to collections?
A: Impression is that not much else has been written but can pursue or correspond with tribal libraries.
Q: How successful were you getting answers from tribes?
A: The Confederated Tribes of Grand Ronde sent a reply from their cultural affairs department that supplied the official names of component tribes. Inquiries to other tribes received no responses. There was some reliance on tribes’ websites as an indication of preferred names.
Q: Any concerns about using older resources for names?
A: If it is the older handbook and there is additional justification from websites or other current information from the tribe, not a concern.
Q: What happens when this gets really complicated?
A: Some are saved for later. Names and organizational characteristics vary. There is not always a clear distinction between dialects and languages, pre-contact ideas of boundaries were not the same as today’s, and historically a name might have referred to both a village and a language.