...
Step 3: ILS sends the bib-records in MARCXML or SCSBXML format
SCSB Ongoing Accession / Submit Collection (through API)
Step 4: SCSB will create a bib record tree in the SCSB Database
...
Submit Collection Process:
The following process describes the Submit Collection Process
...
The Submit Collection process is used to update the metadata of bibliographic records that already exist in SCSB. SCSB provides two methods for updating the existing collection.
Method 1: Institutions prepare MARCXML or SCSBXML files of bibliographic records and place them in the AWS S3 location allocated to them. A nightly SCSB job processes these files and updates the records as provided. Each institution has two folders: a) protection and b) no_protection. Institutions should place .xml files or .gz files (compressed .xml files) containing the bibliographic records in the “protection” folder if they do not want the CGD of the records to be modified, and in the “no_protection” folder if they want the CGD updated as given in the files. A file placed in the “no_protection” folder should not contain more than 4000 bib records.
This is the preferred method for updating SCSB bibliographic records on a daily basis.
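The file preparation for Method 1 could be sketched roughly as follows. This is only an illustration, not SCSB tooling: it assumes the records are MARCXML record elements already held in memory, and the function names, the prefix argument, and the numbered .xml.gz naming scheme are all hypothetical. It applies the 4000-record limit to both folders for simplicity, although the limit is stated above only for “no_protection”. Uploading the resulting files to the institution's S3 location is left out.

```python
import gzip
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # standard MARCXML namespace
MAX_RECORDS_PER_FILE = 4000  # limit stated for the "no_protection" folder


def chunk_records(records, size=MAX_RECORDS_PER_FILE):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_collection(records):
    """Serialize <record> elements under one MARCXML <collection> root."""
    ET.register_namespace("", MARC_NS)
    root = ET.Element(f"{{{MARC_NS}}}collection")
    root.extend(records)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_batches(records, prefix, update_cgd):
    """Return (key, gzipped_xml) pairs ready to be uploaded.

    update_cgd=True targets "no_protection" (CGD is updated from the file);
    update_cgd=False targets "protection" (CGD is left unchanged).
    """
    folder = "no_protection" if update_cgd else "protection"
    batches = []
    for index, chunk in enumerate(chunk_records(records), start=1):
        # Hypothetical naming scheme; the source does not prescribe file names.
        key = f"{folder}/{prefix}_{index:03d}.xml.gz"
        batches.append((key, gzip.compress(build_collection(chunk))))
    return batches
```

With 9000 records and update_cgd=True, for example, this produces three compressed files under no_protection/, holding 4000, 4000 and 1000 records.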
The following is the sequence diagram for this method.
...
Method 2: Institutions can update SCSB bibliographic records by calling the “SubmitCollection” API. Details are given at Submit Collection API. This method is primarily used in the rare situations where records must be updated immediately.
Advantages of Method 1:
Records are processed during off hours, so the load on the servers is not felt during the daytime.
Submit Collection reports are generated per file, so the number of reports stays limited.
Partners can see their processed files in the S3 location, so it is easy for them to go back and review what they have submitted.
Disadvantages of Method 1:
Records are processed overnight, so the changes are available only the next morning.
Advantages of Method 2:
Records are processed in real time and changes are available immediately.
Disadvantages of Method 2:
Because records are processed in real time, this method can place a significant load on the servers, depending on the number of records processed.
Because a Submit Collection report is generated for each API call, there can be millions of report files on the S3 drive within a short period of time.
Partners have to maintain the history of their Submit Collection submissions themselves.
Based on the above analysis, Method 1 is the preferred approach.