Batch Processes introduction & links

First, very rough draft of general links and tasks useful for regularly occurring Batch & Sierra duties

started 9/17/2023; updating continually

See also: Working schedule of tasks

General Principles

Most batch files are for e-resources, but some batch records for tangibles are also loaded: print DDA, vendor shelf-ready books (usually loaded by Acquisitions staff), and government documents and map files (loaded by Government Information staff).

  1. Track each batch file as early in the process as possible

    • I copy/paste each file's title into the tracker as soon as I download it

  2. Use the number of records in a file as a quality check throughout processing and loading

  3. Avoid loading a vendor collection’s files out of date order:

    • Most collections need New files loaded first, then Deletes (if the date is the same), but exceptions are noted in their workflows

  4. In Sierra, always: lfts, prep, Use Review Files, then Test, and only if it looks okay: Load

  5. Each vendor/collection needs a unique 001 field for future manipulation as a group

  6. Hooks: each collection has a set of fields and codes

  7. Field 856 will usually need a special prepend added to the URL (exceptions noted in workflows) and a public note (as noted in workflows)
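
The record-count quality check listed above can be sketched in a few lines of Python. This is only an illustration, assuming the batch file is binary MARC (.mrc), where every record ends with the ISO 2709 record terminator byte. The function names and the before/after comparison are hypothetical helpers, not part of any existing workflow, and MarcEdit's own record counts remain the usual way to check.

```python
RECORD_TERMINATOR = b"\x1d"  # ISO 2709 end-of-record byte in binary MARC


def count_marc_records(data: bytes) -> int:
    """Count records in binary MARC data by counting end-of-record bytes."""
    return data.count(RECORD_TERMINATOR)


def count_records_in_file(path: str) -> int:
    """Read a .mrc file from disk and return its record count."""
    with open(path, "rb") as handle:
        return count_marc_records(handle.read())


def check_counts(before: bytes, after: bytes) -> str:
    """Compare record counts before and after a processing step."""
    n_before = count_marc_records(before)
    n_after = count_marc_records(after)
    if n_before == n_after:
        return f"OK: {n_after} records"
    # A changed count suggests duplicates, broken records, or a failed step.
    return f"CHECK FILE: {n_before} records before, {n_after} after"
```

If the count changes between download, processing, and the Sierra Test step, stop and inspect the file before loading.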

Notes for general principles:

  1. Track each file as soon as possible

  2. Use the number of records in each batch file as a quality-control check

    • If that number changes after a batch process runs, then something probably went wrong, and the file or the process needs to be checked for problems

    • These problems can be as simple as duplicate bibs in the file, but at other times may indicate one or more broken fields/records, missing fields/subfields, extra subfield marker(s), random dollar signs, and other errors that are trickier to find

  3. Avoid loading files out of order at all costs (generally, order is by the date in the file name, but some dates will need to be added as you download the files)

    • If possible, keep up with the weekly loads, to keep the possibility of error to a minimum

    • For most collection batch files, when the files have the same date, load new/additions files first, then delete files (exceptions are noted in these workflows). Always check the date, and load files with the earliest date before later ones.

  4. When loading into Sierra: always Use Review Files, then Test the batch, and if all looks well, then Load.

  5. 001 fields are important in Sierra as a management tool: each MARC record in a batch file needs some kind of unique control number that is not duplicated

    1. We may need to overlay or pull entire collections out. Some collections will already have good 001s as supplied by the vendor, and all current batch collections should already have documented methods to make unique 001 numbers.

    2. For any completely new batch collection, where the 001 fields are missing or contain bare numbers, we will need to devise a method for attaching a new, non-duplicating prefix (or for creating a 001), and add it to a new workflow for the collection.

  6. Hooks: a series of fields and codes are applied to each batch collection, making the records easier to batch edit or otherwise manipulate, or to remove if we lose access to the group

    • The current master document of hooks is kept in USU’s Box files; updates will need to be added to the document and distributed to various areas of the library (such as cataloging, LIT, etc.) as old batch collections are changed or deaccessioned, or new ones are added.

    • The most current Master List E-Resource Hooks is kept here: https://usu.app.box.com/file/1326778691763.

  7. Field 856 URL will usually need a special prepend added (exceptions are noted in the workflows) and a public note (as co-developed by our Public Services Librarians and Cheryl)
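
As a rough illustration of principles 5 and 7, the sketch below adds a collection prefix to each 001 and a prepend to each 856 $u, and reports duplicate 001 values. It assumes the file has been converted to MarcEdit mnemonic text (.mrk), where each field is one line such as `=001  12345`. The prefix and prepend strings used here are placeholders only; the real values for each collection are in its workflow, and the function names are hypothetical.

```python
from collections import Counter


def add_prefix_and_prepend(mrk_text: str, prefix: str, url_prepend: str) -> str:
    """Prefix every 001 and prepend every 856 $u in MarcEdit mnemonic text."""
    out = []
    for line in mrk_text.splitlines():
        if line.startswith("=001  "):
            value = line[6:]
            if not value.startswith(prefix):  # do not prefix twice
                line = "=001  " + prefix + value
        elif line.startswith("=856  ") and "$u" in line:
            # Only the first $u on the line is handled in this sketch.
            head, _, rest = line.partition("$u")
            if not rest.startswith(url_prepend):  # do not prepend twice
                line = head + "$u" + url_prepend + rest
        out.append(line)
    return "\n".join(out)


def duplicate_001s(mrk_text: str) -> list[str]:
    """Return 001 values that appear more than once in the file."""
    counts = Counter(
        line[6:].strip()
        for line in mrk_text.splitlines()
        if line.startswith("=001  ")
    )
    return sorted(value for value, n in counts.items() if n > 1)
```

Running `duplicate_001s` before loading gives an early warning when a vendor file would break the one-unique-001-per-record rule.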


General useful links:

LC’s Understanding MARC: https://www.loc.gov/marc/umb/um01to06.html https://www.loc.gov/marc/umb/um07to10.html https://www.loc.gov/marc/umb/um11to12.html

OCLC MARC: https://www.oclc.org/bibformats/en.html

MARC21 Library of Congress: https://www.loc.gov/marc/bibliographic/

MarcEdit general tutorials: https://marcedit.reeset.net/tutorials

MarcEdit 101 CARLI tutorials: https://marcedit.reeset.net/marcedit-101-workshop


Training plan for Discovery Services work:

9/17/2023-12/20/2023

Book Copy Cataloging and MARC refresher:

Copy Cataloging manual: Copy Cataloging Training (Circulating Materials – Library of Congress Classification)

Sierra overview: Sierra Procedures

Start with single-record cataloging for Wageningen titles, then weekly record loads of Cambridge EBA, Taylor & Francis EBA, YBP discovery records and delete files, and the Safari/O'Reilly subscription, and then move on to the more complex Academic Complete update records, etc.

  • One thing to remember is to watch the dates on the ebook discovery files and deletes and make sure you load them in order

  • Also, with 1 exception, don't load deletes until you have loaded the new discovery records with earlier/identical dates

  • Make sure you use the tracking sheet versions in the “Current” folder
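
The load-order rule in the bullets above (earliest date first, and New before Deletes when the dates match) can be expressed as a sort. This is a minimal sketch: the `BatchFile` structure and the file names are hypothetical, and the collection-specific exceptions noted in the workflows are not modeled.

```python
from datetime import date
from typing import NamedTuple


class BatchFile(NamedTuple):
    name: str        # file name as recorded in the tracker (hypothetical)
    file_date: date  # date from the file name, or added at download time
    kind: str        # "new" or "delete"


# On the same date, new/additions files load before delete files.
KIND_ORDER = {"new": 0, "delete": 1}


def load_order(files: list[BatchFile]) -> list[BatchFile]:
    """Sort files by date, then New before Deletes within a date."""
    return sorted(files, key=lambda f: (f.file_date, KIND_ORDER[f.kind]))


pending = [
    BatchFile("deletes_week2", date(2023, 10, 9), "delete"),
    BatchFile("new_week2", date(2023, 10, 9), "new"),
    BatchFile("deletes_week1", date(2023, 10, 2), "delete"),
]
ordered_names = [f.name for f in load_order(pending)]
```

Here `ordered_names` puts the week 1 deletes first, then the week 2 new file, then the week 2 deletes.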

Preliminary training list with trackers in Box and workflows in Confluence:

  1. Wageningen: cataloged individually & tracked against the purchase invoice & Wageningen site itself.

  2. * a-Cambridge EBA: https://usu.app.box.com/file/1318202855178

  3. YBP JSTOR/Proquest discovery records (and deletes): https://usu.app.box.com/file/1321317102096

    • workflow:

    • Deletes for GOBI-YBP-JSTOR or Proquest (some overlap with 4, below)

  4. pDDA deletes via email from Kevin and Tyler (Cheryl hasn’t tracked; I’ve been just keeping delete statistics counted as single items)

  5. Safari/O’Reilly subscription: https://usu.app.box.com/file/1318206846659

  6. Streaming video emails from Gaby: https://usu.app.box.com/file/1323615493212 (keep up with regular emails of adds/edits/deletes)

  7. Academic Complete update records--part of https://usu.app.box.com/file/1321317102096

  8. University Press of Colorado and USU press ebooks: https://usu.app.box.com/file/1321432292725

  9. Follett (Moore books; create lists used to create order records by Acq staff) - https://usu.app.box.com/file/1323623260743

  10. other, more difficult batch processing and duties: https://usu.app.box.com/folder/227422955517

    • Further note from Cheryl: just fyi, I'm trying to include any tracking files you might need eventually.  A lot of what I'm adding currently are NOT pressing and could be ignored until you have a day where you want a challenge. 

  11. WEST archived and non-archived annual uploads - tracking in Workflow

  12. Sierra weeding, moving, and batch cataloging projects


FTP folders overview:

 

There are two different logins for ftp://ftp.ybp.com and you should see different folders in each.

 For UALC, there are just 2 folders:

  • (login ualc, P#37sb)

  •  dda (ebook discovery records for the UALC demand-driven acquisitions we share with the U of Utah)

  •  ecat (MARC records for UALC dda purchased titles.  Not usually needed because we already have the discovery record in Sierra and we just edit the hooks in that record)

 

For USU (utahstate), there are more folders but just two that you will use

  • (login utahstate, M&45rt)

  • dda (ebook discovery records for the USU-only demand-driven acquisitions)

  • eba (ebook discovery records for the Cambridge and the Taylor&Francis evidence-based acquisitions)

 These other folders are used by Tyler only, with one exception:

  • ecat

  • export - this was for the print demand-driven acquisitions program that has been halted since Robert left

  • orders

  • ordrsp
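
A possible way to spot files that have not yet been tracked is to list an FTP folder and compare it with the names already in the tracker. The sketch below is an assumption-laden illustration: it assumes plain FTP (as the ftp:// address suggests), takes the login from arguments rather than hard-coding it, and treats the tracker as a simple list of file names. The function names and the environment variable names in the usage comment are hypothetical.

```python
from ftplib import FTP


def list_folder(host: str, user: str, password: str, folder: str) -> list[str]:
    """Return the file names in one folder of the FTP site."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.cwd(folder)
        return ftp.nlst()


def untracked_files(remote_names: list[str], tracked_names: list[str]) -> list[str]:
    """Return remote file names that are not yet in the tracker."""
    return sorted(set(remote_names) - set(tracked_names))


# Example usage (not run here; the environment variable names are made up):
#   import os
#   names = list_folder("ftp.ybp.com", os.environ["YBP_FTP_USER"],
#                       os.environ["YBP_FTP_PASSWORD"], "dda")
#   print(untracked_files(names, tracked_names=[]))
```

Keeping the login out of the script means the code can be shared without exposing either account.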

 

 

Cheryl Adams Batch/Sierra list of tasks:

Original Integrated Systems Librarian’s list of tasks

  • Cheryl Adams – Integrated Systems Librarian work and responsibility list, October 31, 2023

Essential skills:

  • Communication and collaboration

  • Extreme logical thinking and attention to detail

  • Knowledge of cataloging and MARC records and the structure of the Sierra database

  • Routinely work closely with CMS/Cataloging, CMRS, Circulation, and often Government Information.

Manage MARC records in the Sierra database for numerous resources. 

Includes initial configuration/customization, ongoing batch loading of discovery records and files of deleted titles, processing records for purchased titles for some services.

  • A Discovery Services Librarian will be much more focused on dealing with small to large-bulk records for purchases, usage from Sierra, and Sierra batch weeding, moving, and retro-batch cataloging projects (depending on skill set)

  1. Demand-driven/patron-driven/evidence-based acquisitions

    1. Ebook EBA, DDA via YBP/GOBI: discovery records for Cambridge/Taylor&Francis, USU (ProQuest and JSTOR) and UALC shared DDA (ProQuest) and purchased titles from UALC account.

    2. Project MUSE ebooks: discovery records and purchased titles

    3.  Print DDA via YBP/GOBI: discovery records and occasional manual deletion

  2.  Purchased/perpetual ebooks and print books

    1. Follett MARC records for Edith Bowen shelf ready.

    2. University Press of Colorado ebooks (MARC files from CSU Library)

  3.  Subscription collections

    1.  ProQuest Academic Complete ebooks: monthly additions and deletions files

    2. AVON streaming video: monthly additions & deletions files and annual purchased titles

    3. Psychotherapy.net streaming videos

  4. Licensed streaming videos

    1. Manage Sierra catalog records for streaming videos.  Work closely with Gaby on all streaming video titles.

    2. Add records from multiple vendors for new purchases, track license expirations, delete/suppress expired titles. 

    3. Edit renewed titles. 

    4. Generate reports of expiration dates for Gaby.

  5. Provide data from the Sierra database

    1. Weekly SQL report of items checked out in Sierra from SCA.  (Tracks usage for Jennifer.)

    2. SQL report for Circulation staff of in-office renewal due dates and individual faculty checkout reports.

    3. Annually, run Sierra Reports to pull extensive collection data for the ACRL/IPEDS survey.  Manipulate and compile complex data to meet established criteria.

    4. Annually, provide check-out data from the entire Sierra database.  Includes system-wide data and numerous individual reports as requested by branch libraries and individual collection librarians.

    5. Create reports such as shelf lists for various collections, circulation data, format-based reports.  Craft complex queries to pull the requested data, often requiring deep knowledge of the MARC record structure and local cataloging practices and the details of the Sierra database. 

    6. Create detailed reports as above for collection weeding and moving projects.  Collaborate with Circulation, Cataloging, and Collections on moving/weeding projects.  Batch delete weeded titles, batch update moved collections, provide extensive documentation for the entire project. 

  6. Sierra/WebPac/Encore management (some or all of this will now be done by LIT systems)

    1. Sys-Admin level annual Rapid Update for roll-over of Year-To-Date checkout numbers to Last-Year-Circ field. (Provide circulation reports as mentioned above as part of this process.)--Joe or LIT?

    2. Create, edit, and delete Sierra user accounts. LIT

    3. Edit load tables when needed. (Rarely needed now.)

    4. Occasional tweak to WebPac configuration (and very occasionally, Encore). LIT

    5. Monitor the Sierra discussion list?

    6. Open Innovative support tickets when issues or questions arise. LIT

    7. Identify and correct issues and problems in the database.  Provide solutions for requested improvements and changes. LIT



What statistics should the new Discovery Services Librarian (temp. and permanent) collect?

From: Liz Woolcott

To: Melanie Shaw; Becky Skeen

snip>>>>>>>

Yes, Cheryl was reluctant to put stats in our joint base – mostly because she kept them differently.  I am very appreciative that you are thinking about how to build them in.  It has been the missing piece we have needed for a long while to demonstrate the cataloging work done in the library.

I’ve included my suggested answers to your questions below in red. Becky, what do you think?

Thanks again for all your deep thinking on this Melanie!

Liz

From: Melanie Shaw <melanie.shaw@usu.edu>
Sent: Thursday, November 2, 2023 2:50 PM
To: Liz Woolcott <liz.woolcott@usu.edu>; Becky Skeen <becky.skeen@usu.edu>
Subject: Discovery Services batch statistics

 

I asked Cheryl about this, but it turns out she did not keep these kind of stats <<<

I have been keeping my own statistics of all the batch files, which come in New, Overlay (changes), and Deletes, usually (with the Deletes being just a variation of the Overlay load, that I then use to Batch Delete the records – usually “discovery” eBooks that we never owned, from the patron driven delivery loads).

 So, there’s several questions I have about what statistics the Discovery Services Librarian (and me, in the meantime) should keep.

First, I have been counting the Delete loads as records loaded, but I then remove them.  So, maybe a new category? Or possibly not, since we no longer base our numbers of things in the catalog from our cataloging stats.

Liz: Yes!  These definitely need to be a different category.  You can use the “MARC Batch-Deleted Records” in the stats.  That is what Carol has been using for her batch removed records.

I also have gotten a steady stream of emails with single deletions to be done manually, where I simply erase a PDDA record that we have gotten in another format. I’ve counted those as part of the batch stuff, since again, I haven’t cataloged them individually (just deleted them).

Liz: This one is a hard one.  The closest thing we have is the “Sierra-Items Deleted” column.  A while back, Barb made the call to collapse the “titles deleted” and the “items deleted” into one category to make it easier to record weeding projects.  We decided the larger number was probably more useful.  The PDDA records are odd because we never owned the item, it was always just a discovery record.  So, you have a few options to consider: 1) Could make a new category for “records deleted” so that the PDDA number could be reported more accurately (but it might be complicated by the fact that Barb doesn’t report records deleted for her work – and assumptions might be made that this field includes all records deleted) or 2) you could use the “Sierra-Items Deleted” for these few one offs (complication is that this could be interpreted as materials deaccessioned)  or 3) could create its own category for “PDDA records deleted” or 4) Option I can’t think of at the moment… 😊

I also download individual records from the streaming video sites for Gaby, again these are vendor records, and I edited them the same as the batch records. So, I could count them as individually cataloged titles, but I don’t do much check (except for the link) and I add in the “hooks”, so I’ve been uncertain how to count them. Right now they are in my Batch stats, because, vendor records and process is the same.

Liz: I would recommend “Genstacks-CopyCat-Electronic” for this.  As long as you are doing them one-by-one that is.  If you are pulling them individually but compiling them into groups to manipulate and load records, then “GenStacks-Batch-Modified”.

Melanie

Melanie will add her statistics method to each workflow as soon as possible… but we should probably also discuss with new hire for ideas, methods, etc.


For historical use & older tracking:

Trello website: Historical procedures & notes, but contains earliest logs of Batch record loads

https://trello.com/b/CWYisa6s/cheryl-and-melanie-batch-record-loading

 
