Batch Processes introduction & links
First very rough draft of general links and tasks useful to Batch & Sierra regularly occurring duties
started 9/17/2023; updating continually
See also: Working schedule of tasks
General Principles
Most batch files are for e-resources, but some batch records for tangibles are also loaded: print DDA, vendor shelf-ready books (usually loaded by Acquisitions staff), and government documents and map files (loaded by Government Information staff).
Notes for general principles:
Track each file as soon as possible
Use the number of records in each batch file as a quality-control check.
If that number changes after a batch is processed, something probably went wrong, and the file or the process needs to be checked for problems.
The cause can be as simple as duplicate bibs in the file, but at other times it may indicate one or more broken fields/records, missing fields/subfields, extra subfield markers, random dollar signs, or other errors that are trickier to find.
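As a rough illustration of the record-count check, here is a minimal Python sketch. It assumes the file is binary MARC 21 (.mrc), where every record ends with the record terminator byte 0x1D, so it will not work on a .mrk text file. The function names are made up for this example and are not part of any existing workflow.

```python
from pathlib import Path

# In binary MARC 21 (ISO 2709) every record ends with this byte.
RECORD_TERMINATOR = b"\x1d"


def count_marc_records(data: bytes) -> int:
    """Count records in binary MARC data by counting record terminators."""
    return data.count(RECORD_TERMINATOR)


def count_marc_file(path: str) -> int:
    """Read a .mrc file from disk and count its records."""
    return count_marc_records(Path(path).read_bytes())


def counts_match(before: int, after: int) -> bool:
    """True when the count after processing equals the count before."""
    return before == after
```

Count the file as downloaded, count it again after editing, and compare the two numbers before loading. If they differ, check the file or the process before going any further.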
Avoid loading files out of order at all costs (the order generally follows the date in the file name, but some dates will need to be added as you download the files).
If possible, keep up with the weekly loads to keep the possibility of error to a minimum.
For most collection batch files, when the files have the same date, first load the new/additions files, then the delete files (exceptions are noted in the workflows). Always check the dates, and load files with the earliest date before the latest.
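The ordering rule above can be sketched in Python. This is only an illustration: it assumes each file name contains its date as YYYYMMDD and that delete files have "del" somewhere in the name. Both are made-up naming conventions, and the actual vendor file names (and the exceptions noted in the workflows) may differ.

```python
import re
from datetime import date

# Assumed convention: the file name contains a date written as YYYYMMDD.
DATE_PATTERN = re.compile(r"(\d{4})(\d{2})(\d{2})")


def file_date(name: str) -> date:
    """Pull the YYYYMMDD date out of a file name."""
    match = DATE_PATTERN.search(name)
    if match is None:
        raise ValueError(f"no YYYYMMDD date found in {name!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def is_delete_file(name: str) -> bool:
    """Assumed convention: delete files have 'del' in the file name."""
    return "del" in name.lower()


def load_order(names):
    """Earliest date first; on the same date, new/additions before deletes."""
    return sorted(names, key=lambda n: (file_date(n), is_delete_file(n), n))
```

For example, given a deletes file and a new file both dated 20231002, `load_order` puts the new file first, and any file dated 20230925 comes before both.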
When loading into Sierra, always use Review Files, then Test the batch, and if all looks well, then Load.
001 fields are important in Sierra as a management tool: each MARC record in a batch file needs some kind of unique control number that is not duplicated
(We may need to overlay or pull entire collections out.) Some collections will already have good 001s as supplied by the vendor, and all current batch collections should already have documented methods to make unique 001 numbers.
For any completely new batch collection where the 001 fields are missing or contain bare numbers, we will need to devise a method for attaching a new, non-duplicating prefix (or otherwise creating a 001) and add it to a new workflow for the collection.
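A small Python sketch of the two 001 checks described above: finding duplicated control numbers, and putting a prefix on bare numbers. It works on a plain list of 001 values that have already been pulled out of the file (for example, exported from MarcEdit). The function names and the prefix used in the example are hypothetical; each collection's real prefix is whatever its workflow documents.

```python
from collections import Counter


def duplicate_control_numbers(control_numbers):
    """Return the 001 values that appear more than once, sorted."""
    counts = Counter(control_numbers)
    return sorted(value for value, seen in counts.items() if seen > 1)


def add_prefix(control_number: str, prefix: str) -> str:
    """Prefix a bare (digits-only) 001; leave other values unchanged."""
    value = control_number.strip()
    if not value:
        # A missing 001 needs its own method, so flag it rather than guess.
        raise ValueError("record has no 001 value")
    if value.isdigit():
        return prefix + value
    return value
```

With a made-up prefix "demo", `add_prefix("12345", "demo")` gives "demo12345", while a value that already carries letters, such as "ocn12345", is returned as it is.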
Hooks: a series of fields and codes is applied to each batch collection, making the records easier to batch edit or otherwise manipulate, or to remove if we lose access to the group.
The current master document of hooks is kept in USU’s Box files; updates will need to be added to the document and distributed to various areas of the library (such as cataloging, LIT, etc.) as old batch collections are changed or deaccessioned, or new ones are added.
The most current Master List E-Resource Hooks is kept here: https://usu.app.box.com/file/1326778691763.
Field 856 URL will usually need a special prepend added (exceptions are noted in the workflows) and a public note (as co-developed by our Public Services Librarians and Cheryl)
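As a sketch only, the 856 prepend step might look like this in Python. It assumes the prepend is a fixed string placed in front of the vendor URL, and that a URL which already starts with it should be left alone. The prefix in the example is a placeholder, not the library's real prepend, and the exceptions noted in the workflows still apply.

```python
def prepend_url(url: str, prefix: str) -> str:
    """Put the prefix in front of the URL unless it is already there."""
    cleaned = url.strip()
    if cleaned.startswith(prefix):
        return cleaned
    return prefix + cleaned


# Placeholder prefix for illustration only (hypothetical value).
EXAMPLE_PREFIX = "https://proxy.example.edu/login?url="
```

Checking for the prefix first means the same file can be run through the step twice without doubling the prepend.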
General useful links:
LC’s Understanding MARC: https://www.loc.gov/marc/umb/um01to06.html https://www.loc.gov/marc/umb/um07to10.html https://www.loc.gov/marc/umb/um11to12.html
OCLC MARC: https://www.oclc.org/bibformats/en.html
MARC21 Library of Congress: https://www.loc.gov/marc/bibliographic/
MarcEdit general tutorials: https://marcedit.reeset.net/tutorials
MarcEdit 101 CARLI tutorials: https://marcedit.reeset.net/marcedit-101-workshop
Training plan for Discovery Services work:
9/17/2023-12/20/2023
Book Copy Cataloging and MARC refresher:
Copy Cataloging manual: Copy Cataloging Training (Circulating Materials – Library of Congress Classification)
Sierra overview: Sierra Procedures
Start with single record cataloging for Wageningen titles, then weekly record loads of Cambridge EBA, Taylor & Francis EBA, YBP discovery records and delete files, and the Safari/O'Reilly subscription, and then move on to the more complex Academic Complete update records, etc.
(One thing to remember is to watch the dates on the ebook discovery files and deletes, and make sure you load them in order.
Also, with one exception, don't load deletes until you have loaded the new discovery records with earlier or identical dates.)
Make sure you use the tracking sheet versions in the “Current” folder
Preliminary training list with trackers in Box and workflows in Confluence:
Wageningen: cataloged individually & tracked against the purchase invoice & Wageningen site itself.
* a-Cambridge EBA: https://usu.app.box.com/file/1318202855178
workflow:2-Cambridge EBA YBP/GOBI (weekly) workflows 2
b-Taylor & Francis EBA: https://usu.app.box.com/file/1318195064664
YBP JSTOR/Proquest discovery records (and deletes): https://usu.app.box.com/file/1321317102096
workflow:
Deletes for GOBI-YBP-JSTOR or Proquest (some overlap with 4, below)
pDDA deletes via email from Kevin and Tyler (Cheryl hasn’t tracked; I’ve been just keeping delete statistics counted as single items)
Safari/O’Reilly subscription: https://usu.app.box.com/file/1318206846659
Streaming video emails from Gaby: https://usu.app.box.com/file/1323615493212 (keep up with regular emails of adds/edits/deletes)
workflow started here: Streaming Video Record (multiple) processes
Academic Complete update records--part of https://usu.app.box.com/file/1321317102096
University Press of Colorado and USU press ebooks: https://usu.app.box.com/file/1321432292725
Follett (Moore books; create lists used to create order records by Acq staff) - https://usu.app.box.com/file/1323623260743
other, more difficult batch processing and duties: https://usu.app.box.com/folder/227422955517
Further note from Cheryl: just fyi, I'm trying to include any tracking files you might need eventually. A lot of what I'm adding currently are NOT pressing and could be ignored until you have a day where you want a challenge.
WEST archived and non-archived annual uploads - tracking in Workflow
upload workflow: https://usulibrary.atlassian.net/wiki/spaces/ULC/pages/1668153345
archived edits workflow: https://usulibrary.atlassian.net/wiki/spaces/ULC/pages/1670512662
Sierra weeding, moving, and batch cataloging projects
sample workflow uploaded: https://usulibrary.atlassian.net/wiki/spaces/ULC/pages/1728512020
see also batch projects for examples
see also list of tasks above (some will be LIT tasks--discuss?)
see also Annual Sierra reports (Discovery Librarian or LIT discussion?)
The documentation and file tracking of Cheryl, the former Integrated Systems Librarian, will remain in Box here: https://usu.app.box.com/folder/227209648067?s=q531024vtone9ogkwa6cv2uh41sq56qx and here for less common files and tasks: https://usu.app.box.com/folder/227422955517
FTP folders overview:
There are two different logins for ftp://ftp.ybp.com and you should see different folders in each.
For UALC, there are just 2 folders:
(login ualc, P#37sb)
dda (ebook discovery records for the UALC demand-driven acquisitions we share with the U of Utah)
ecat (MARC records for UALC dda purchased titles. Not usually needed because we already have the discovery record in Sierra and we just edit the hooks in that record)
For USU (utahstate), there are more folders but just two that you will use
(login utahstate, M&45rt)
dda (ebook discovery records for the USU-only demand-driven acquisitions)
eba (ebook discovery records for the Cambridge and the Taylor&Francis evidence-based acquisitions)
These other folders are used by Tyler only, with one exception:
ecat
export - this was for the print demand-driven acquisitions program that has been halted since Robert left
orders
ordrsp
Cheryl Adams Batch/Sierra list of tasks:
Original Integrated Systems Librarian’s list of tasks
Cheryl Adams – Integrated Systems Librarian work and responsibility list, October 31, 2023
Essential skills:
Communication and collaboration
Extreme logical thinking and attention to detail
Knowledge of cataloging and MARC records and the structure of the Sierra database
Routinely work closely with CMS/Cataloging, CMRS, Circulation, and often Government Information.
Manage MARC records in the Sierra database for numerous resources.
Includes initial configuration/customization, ongoing batch loading of discovery records and files of deleted titles, processing records for purchased titles for some services.
A Discovery Services Librarian will be much more focused on dealing with small to large-bulk records for purchases, usage from Sierra, and Sierra batch weeding, moving, and retro-batch cataloging projects (depending on skill set)
Demand-driven/patron-driven/evidence-based acquisitions
Ebook EBA, DDA via YBP/GOBI: discovery records for Cambridge/Taylor&Francis, USU (ProQuest and JSTOR) and UALC shared DDA (ProQuest) and purchased titles from UALC account.
Project MUSE ebooks: discovery records and purchased titles
Print DDA via YBP/GOBI: discovery records and occasional manual deletion
Purchased/perpetual ebooks and print books
Follett MARC records for Edith Bowen shelf ready.
University Press of Colorado ebooks (MARC files from CSU Library)
Subscription collections
ProQuest Academic Complete ebooks: monthly additions and deletions files
AVON streaming video: monthly additions & deletions files and annual purchased titles
Psychotherapy.net streaming videos
Licensed streaming videos
Manage Sierra catalog records for streaming videos. Work closely with Gaby on all streaming video titles.
Add records from multiple vendors for new purchases, track license expirations, delete/suppress expired titles.
Edit renewed titles.
Generate reports of expiration dates for Gaby.
Provide data from the Sierra database
Weekly SQL report of items checked out in Sierra from SCA. (Tracks usage for Jennifer.)
SQL report for Circulation staff of in-office renewal due dates and individual faculty checkout reports.
Annually, run Sierra Reports to pull extensive collection data for the ACRL/IPEDS survey. Manipulate and compile complex data to meet established criteria.
Annually, provide check-out data from the entire Sierra database. Includes system-wide data and numerous individual reports as requested by branch libraries and individual collection librarians.
Create reports such as shelf lists for various collections, circulation data, format-based reports. Craft complex queries to pull the requested data, often requiring deep knowledge of the MARC record structure and local cataloging practices and the details of the Sierra database.
Create detailed reports as above for collection weeding and moving projects. Collaborate with Circulation, Cataloging, and Collections on moving/weeding projects. Batch delete weeded titles, batch update moved collections, provide extensive documentation for the entire project.
Sierra/WebPac/Encore management (some or all of this will now be done by LIT systems)
Sys-Admin level annual Rapid Update for roll-over of Year-To-Date checkout numbers to Last-Year-Circ field. (Provide circulation reports as mentioned above as part of this process.)--Joe or LIT?
Create, edit, and delete Sierra user accounts. LIT
Edit load tables when needed. (Rarely needed now.)
Occasional tweak to WebPac configuration (and very occasionally, Encore). LIT
Monitor the Sierra discussion list?
Open Innovative support tickets when issues or questions arise. LIT
Identify and correct issues and problems in the database. Provide solutions for requested improvements and changes. LIT
What statistics should new Discovery Services Librarian (temp. and permanent) collect:
From: Liz Woolcott
To: Melanie Shaw; Becky Skeen
snip>>>>>>>
Yes, Cheryl was reluctant to put stats in our joint base – mostly because she kept them differently. I am very appreciative that you are thinking about how to build them in. It has been the missing piece we have needed for a long while to demonstrate the cataloging work done in the library.
I’ve included my suggested answers to your questions below in red. Becky, what do you think?
Thanks again for all your deep thinking on this Melanie!
Liz
From: Melanie Shaw <melanie.shaw@usu.edu>
Sent: Thursday, November 2, 2023 2:50 PM
To: Liz Woolcott <liz.woolcott@usu.edu>; Becky Skeen <becky.skeen@usu.edu>
Subject: Discovery Services batch statistics
I asked Cheryl about this, but it turns out she did not keep these kind of stats <<<
I have been keeping my own statistics of all the batch files, which come in New, Overlay (changes), and Deletes, usually (with the Deletes being just a variation of the Overlay load, that I then use to Batch Delete the records – usually “discovery” eBooks that we never owned, from the patron driven delivery loads).
So, there’s several questions I have about what statistics the Discovery Services Librarian (and me, in the meantime) should keep.
First, I have been counting the Delete loads as records loaded, but I then remove them. So, maybe a new category? Or possibly not, since we no longer base our numbers of things in the catalog from our cataloging stats.
[Liz:] Yes! These definitely need to be a different category. You can use the “MARC Batch-Deleted Records” in the stats. That is what Carol has been using for her batch removed records.
I also have gotten a steady stream of emails with single deletions to be done manually, where I simply erase a PDDA record that we have gotten in another format. I’ve counted those as part of the batch stuff, since again, I haven’t cataloged them individually (just deleted them).
[Liz:] This one is a hard one. The closest thing we have is the “Sierra-Items Deleted” column. A while back, Barb made the call to collapse the “titles deleted” and the “items deleted” into one category to make it easier to record weeding projects. We decided the larger number was probably more useful. The PDDA records are odd because we never owned the item, it was always just a discovery record. So, you have a few options to consider: 1) Could make a new category for “records deleted” so that the PDDA number could be reported more accurately (but it might be complicated by the fact that Barb doesn’t report records deleted for her work – and assumptions might be made that this field includes all records deleted) or 2) you could use the “Sierra-Items Deleted” for these few one offs (complication is that this could be interpreted as materials deaccessioned) or 3) could create its own category for “PDDA records deleted” or 4) Option I can’t think of at the moment… 😊
I also download individual records from the streaming video sites for Gaby, again these are vendor records, and I edited them the same as the batch records. So, I could count them as individually cataloged titles, but I don’t do much check (except for the link) and I add in the “hooks”, so I’ve been uncertain how to count them. Right now they are in my Batch stats, because, vendor records and process is the same.
[Liz:] I would recommend “Genstacks-CopyCat-Electronic” for this. As long as you are doing them one-by-one that is. If you are pulling them individually but compiling them into groups to manipulate and load records, then “GenStacks-Batch-Modified”.
Melanie
Melanie will add her statistics method to each workflow as soon as possible… but we should probably also discuss with new hire for ideas, methods, etc.
For historical use & older tracking:
Trello website: Historical procedures & notes, but contains earliest logs of Batch record loads
https://trello.com/b/CWYisa6s/cheryl-and-melanie-batch-record-loading