Collection Group Designation

| CGD | Shareable | Retention Commitment | Notes |
|---|---|---|---|
| Committed | Shareable | Yes | CGD not changed by Matching Algorithm. These are items purchased via cooperative collection agreements, identified by the vendor. Do not expect that two items will match and both have the CGD of Committed. If two items match and are submitted as Committed, both will retain the Committed designation. If a Committed item matches a Shared item, both will retain their designations as Committed and Shared. |
| Shared | Shareable | Yes | |
| Open | Shareable | No | When submitted as Open, not changed by Matching Algorithm. It is not expected that items will be submitted as Open going forward, but it has been done in the past. |
| Uncommittable | Shareable | No | CGD not changed by Matching Algorithm. |
| Private | Not shareable | No | CGD not changed by Matching Algorithm. |
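
The designations in the table above can be held in a small lookup structure. The following is a minimal sketch; the names `CgdProperties` and `CGD_TABLE` are hypothetical and are not part of SCSB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CgdProperties:
    """Hypothetical container for one row of the CGD table."""
    shareable: bool
    retention_commitment: bool


# Values transcribed from the CGD table in this page.
CGD_TABLE = {
    "Committed": CgdProperties(shareable=True, retention_commitment=True),
    "Shared": CgdProperties(shareable=True, retention_commitment=True),
    "Open": CgdProperties(shareable=True, retention_commitment=False),
    "Uncommittable": CgdProperties(shareable=True, retention_commitment=False),
    "Private": CgdProperties(shareable=False, retention_commitment=False),
}


def cgds_with_retention_commitment():
    """Return the CGD names that carry a retention commitment, sorted."""
    return sorted(
        name for name, props in CGD_TABLE.items() if props.retention_commitment
    )
```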

Ongoing Matching Algorithm

The ongoing matching algorithm is run on items with the same Material Type when:

  • new items are accessioned to SCSB with a CGD of “Shared”
  • existing items are updated to a CGD of “Shared” and are in the “cgd_no_protection” folder


The new/updated item is compared to all the existing items in the database with a CGD of Shared or Committed.


When two items match, the item that was accessioned earlier will retain the Shared status, while the newer item will be bumped to Open.
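
The outcome of a match, as described above, can be sketched as follows. This is an illustration only, not the SCSB implementation: the `Item` class and its `accessioned` field are hypothetical, and the handling of two items accessioned on the same date (the first argument keeps Shared) is an assumption, since this page does not cover that case.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Item:
    """Hypothetical item record; field names are illustrative."""
    barcode: str
    cgd: str
    accessioned: date


def resolve_match(a, b):
    """Return (a, b) with CGDs updated after the two items have matched."""
    # Only a Shared/Shared pair changes. For example, a Committed item that
    # matches a Shared item leaves both designations as they are.
    if a.cgd != "Shared" or b.cgd != "Shared":
        return a, b
    # The item accessioned earlier keeps Shared; the newer one becomes Open.
    # Assumption: on equal dates, the first argument keeps Shared.
    if a.accessioned <= b.accessioned:
        return a, replace(b, cgd="Open")
    return replace(a, cgd="Open"), b
```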


Items are considered a “match” when two of the following points match:

  • ISBN
  • ISSN
  • LCCN
  • OCLC
  • Title (first 4 words) 
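
The match rule can be sketched as a count of agreeing match points. This is a minimal sketch under stated assumptions: "two of the following points" is read as "at least two", each record carries at most one value per identifier, and titles are compared case-insensitively. None of these details are confirmed by this page, and the dictionary keys are hypothetical.

```python
# Hypothetical record keys, one per match point listed above.
MATCH_POINTS = ("isbn", "issn", "lccn", "oclc", "title")


def first_four_words(title):
    """Return the first four whitespace-separated words, case-folded (assumption)."""
    return " ".join(title.casefold().split()[:4])


def matching_points(a, b):
    """Return the names of the match points on which records a and b agree."""
    points = []
    for key in MATCH_POINTS:
        value_a, value_b = a.get(key), b.get(key)
        if not value_a or not value_b:
            continue  # a missing value can never count as a match
        if key == "title":
            value_a, value_b = first_four_words(value_a), first_four_words(value_b)
        if value_a == value_b:
            points.append(key)
    return points


def is_match(a, b):
    """Assumption: 'two of the following points' means at least two."""
    return len(matching_points(a, b)) >= 2
```

Identifiers and titles are expected to be normalized before comparison, as described in the Normalization section.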

Normalization

  • ISBN, ISSN, LCCN, OCLC - non-numeric characters are removed
  • Title - diacritics are removed
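
The two normalization rules can be sketched with the Python standard library. This is an illustration of the rules as stated, not the SCSB code; the function names are hypothetical, and stripping diacritics via Unicode decomposition is one possible technique rather than the confirmed one.

```python
import re
import unicodedata


def normalize_number(value):
    """Remove every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", value)


def normalize_title(title):
    """Remove diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", title)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

For example, `normalize_number("978-0-13-468599-1")` gives `"9780134685991"`, and `normalize_title("Les Misérables")` gives `"Les Miserables"`. Letters that have no Unicode decomposition, such as "ø", pass through this sketch unchanged.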

Exceptions

  1. Matching copies from a single institution are not compared. It is up to the submitting institutions to only submit one copy of matching items to SCSB as “Shared,” and the rest as “Open.”
  2. Two items that match on a single number but have different bib levels (for example, one is a monograph and the other a serial) are not treated as a match. That is, when the material type does not match, the matching algorithm is not applied.
  3. When the CGD of an item is changed manually to Shared using the SCSB UI, the matching algorithm is not run on the item. This means that there could be two matching items that both have the Shared CGD.
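
The exceptions above amount to guard conditions that are checked before any comparison takes place. A minimal sketch, assuming hypothetical record keys (`institution`, `material_type`) and a hypothetical `changed_via_ui` flag:

```python
def should_compare(a, b):
    """Return True if two item records are eligible for comparison."""
    # Exception 1: copies from a single institution are not compared.
    if a["institution"] == b["institution"]:
        return False
    # Exception 2: items with different material types are not compared.
    if a["material_type"] != b["material_type"]:
        return False
    return True


def should_run_matching(changed_via_ui):
    """Exception 3: a manual CGD change in the SCSB UI skips the algorithm."""
    return not changed_via_ui
```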

Reports

The following reports are generated after each run of the matching algorithm:

  • Matching Summary Report
  • Matching Serial MVM Report
  • Title Exception Report
  • CGD Round Trip Report

Matching Summary Report

  • Example file name: MatchingSummaryReport-27Jul2021080127.csv
  • Columns: Institution, Total Bibs, Total Items, Shared Items Before Matching, Shared Items After Matching, Difference of Shared Items, Open Items Before Matching, Open Items After Matching, Difference of Open Items
  • Rows: PUL, CUL, NYPL, HL

Matching Serial MVM Report

  • Includes items that were changed from Shared to Open for Serial and MVM material types
  • All items that match are changed from Shared to Open
  • Example file name: MatchingSerialMvmReport-27Jul2021080114.csv
  • Columns: OwningInstitutionId, Title, Summary Holdings, Volume Part Year, Use Restriction, BibId, OwningInstitutionBibId, Barcode
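
The example file names for the Matching Summary Report and the Matching Serial MVM Report appear to end in a run timestamp. The sketch below parses it, assuming the layout is day, abbreviated English month, year, hour, minute, second. That layout is inferred from the two examples only and is not confirmed by this page; `parse_report_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# Assumed pattern: <ReportName>-<ddMonyyyyHHMMSS>.csv
_REPORT_NAME = re.compile(
    r"^(?P<report>[A-Za-z]+)-(?P<stamp>\d{2}[A-Za-z]{3}\d{10})\.csv$"
)


def parse_report_name(file_name):
    """Split a report file name into (report name, run timestamp)."""
    found = _REPORT_NAME.match(file_name)
    if found is None:
        raise ValueError(f"unrecognized report file name: {file_name}")
    # %b expects an English month abbreviation under the default C locale.
    stamp = datetime.strptime(found.group("stamp"), "%d%b%Y%H%M%S")
    return found.group("report"), stamp
```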

Title Exception Report

  • Items that match on only one number and not on the first four words of the normalized title are included in this report. Previously, these items were considered a match and the CGD was affected, but as of v4.3 (July 2021) they are not considered matched.
  • Columns: OwningInstitution, BibId, OwningInstitutionBibId, MaterialType, OCLCNumber, ISBN, ISSN, LCCN, Title1, Title2, Title3, Title4, Title5, Title6, Title7, Title8, Title9, Title10, Title11, Title12, Title13, Title14, Title15, Title16, Title17, Title18, Title19
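
The rule that sends a pair of items to this report can be sketched as a three-way classification. This is an illustration under assumptions: `classify` and its return labels are hypothetical, and a pair that agrees on two numbers but not on the title is treated as a match, following the "two points" rule stated earlier.

```python
def classify(number_matches, title_matches):
    """Classify a pair of items from its matching points.

    number_matches: how many of ISBN, ISSN, LCCN, OCLC agree
    title_matches:  whether the first four normalized title words agree
    """
    if number_matches + int(title_matches) >= 2:
        return "match"
    if number_matches == 1 and not title_matches:
        # v4.3 and later: reported, but the CGD is left unchanged.
        return "title_exception"
    return "no_match"
```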

CGD Round Trip Report

  • A report will be created if any item’s CGD is changed by the matching algorithm.
  • All the items with a change to the CGD will be included in the report.
  • The report will be written to the SCSB AWS S3 bucket.
  • The report is institution specific and will be put into the corresponding directory for that report and institution.
  • The directory in the S3 bucket will be: reports/cgd-round-trip/<institution>/
  • The name of the report will be CGD_RoundTripReport_<timestamp>.csv (ex: CGD_RoundTripReport_20210322_185905.csv)
  • Columns: Item Barcode, Old CGD, CGD, Date of Action
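
The directory and file name convention above can be combined into a single object key. The sketch below assumes the timestamp layout is `YYYYMMDD_HHMMSS`, inferred from the one example file name, and that the institution code is used as the directory name unchanged. `round_trip_report_key` is a hypothetical helper, not an SCSB function.

```python
from datetime import datetime


def round_trip_report_key(institution, when):
    """Build the assumed S3 key for an institution's CGD Round Trip Report."""
    stamp = when.strftime("%Y%m%d_%H%M%S")  # inferred from the example name
    return f"reports/cgd-round-trip/{institution}/CGD_RoundTripReport_{stamp}.csv"
```

For example, `round_trip_report_key("PUL", datetime(2021, 3, 22, 18, 59, 5))` gives `reports/cgd-round-trip/PUL/CGD_RoundTripReport_20210322_185905.csv`.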

Prior to v4.3

  1. Prior to v4.3, the matching algorithm considered Use Restrictions. As of v4.3 (July 2021), it no longer does.


For technical details, see Technical Documentation for Matching Algorithm.