AWS S3

Amazon Web Services Simple Storage Service (AWS S3) is the storage service SCSB uses to store the files generated for each partner, with a separate folder per partner.

In AWS S3, the left side of the image below shows the AWS buckets, one for each environment: scsb-dev, scsb-test, scsb-uat, and scsb-prod. A user can access all the folders under each environment bucket. All buckets are named in the format ‘scsb-<environment name>’.
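The naming convention can be sketched as a small helper. This is only an illustration: the function name `bucket_name` and the `ENVIRONMENTS` tuple are hypothetical, and the environment names are taken from the list above.

```python
# Hypothetical helper illustrating the bucket naming convention.
# The environment names come from the bucket list in the text.
ENVIRONMENTS = ("dev", "test", "uat", "prod")


def bucket_name(env: str) -> str:
    """Return the bucket name for an environment, e.g. 'scsb-uat'."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"scsb-{env}"


if __name__ == "__main__":
    for env in ENVIRONMENTS:
        print(bucket_name(env))
```

Rejecting unknown environment names early is a design choice of this sketch; it keeps a typo from silently pointing at a bucket that does not exist.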

 

 

 

Each environment has four folders: archival, data exports, data feeds, and reports.

Archival - This folder contains files that are more than 90 days old; they can be discarded if required.
The path for this folder is <scsb-env>/archival/...

 

Data exports - This folder stores the full and incremental export files for each institution separately. SCSB generates the files and stores them in this location daily.
The path for this folder is <scsb-env>/data-exports/...

 

Data feeds - This folder is where partners and IMS institutions place feeds for the SCSB applications, i.e. initial accession and daily submit collection files. Each institution has a separate path for its files.
The path for this folder is <scsb-env>/data-feed/

Submit collection -

/scsb-{env}/data-feed/submitcollections/{institution}/cgd_protection. This is the location where the Submit Collection API polls for CGD-protected files to process.

/scsb-{env}/data-feed/submitcollections/{institution}/cgd_no_protection. This is the location where the Submit Collection API polls for files that are not CGD-protected to process.

/scsb-{env}/data-feed/request-initial-data. This is the location where SCSB processes the request initial data files for the corresponding institutions.

Note: Files deposited in these folders should be prefixed with “scsb_” (lowercase).
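A minimal sketch of how a partner-side script might build a submit collection location and check the file name prefix before uploading. The function names, the institution code "PUL", and the file name are hypothetical placeholders; the folder name is passed in by the caller exactly as documented above, and how SCSB itself handles a file without the prefix is not specified here.

```python
# Hypothetical pre-upload check for submit collection files.
REQUIRED_PREFIX = "scsb_"  # lowercase, as stated in the note above


def has_required_prefix(filename: str) -> bool:
    """True when the file name starts with the lowercase 'scsb_' prefix."""
    return filename.startswith(REQUIRED_PREFIX)


def submit_collection_path(env: str, institution: str, folder: str, filename: str) -> str:
    """Build the documented location for a submit collection file.

    `folder` is the protection folder name taken from the documentation;
    it is not validated in this sketch.
    """
    if not has_required_prefix(filename):
        raise ValueError(f"file name must start with {REQUIRED_PREFIX!r}: {filename!r}")
    return f"/scsb-{env}/data-feed/submitcollections/{institution}/{folder}/{filename}"


if __name__ == "__main__":
    # "PUL" and "scsb_items.xml" are placeholder values for illustration.
    print(submit_collection_path("uat", "PUL", "cgd_protection", "scsb_items.xml"))
```

The check is case-sensitive on purpose, since the note asks for a lowercase prefix.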

Configuration Parameter:

The parameter "submit.collection.cgd.noprotection.input.limit" sets the limit on the number of bibliographic records in a submit collection file uploaded to the cgd_no_protection folder in AWS. The value is set to 4000 and is configurable. Files containing more bibliographic records than the limit are not processed, and an exception message is sent via email.
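The limit check can be illustrated with a short sketch. How SCSB actually loads the parameter and counts records is not described here, so the `properties` dictionary and both function names are assumptions; the inclusive boundary is inferred from the phrase "more than the set limit".

```python
# Illustrative sketch of the record-limit rule; not SCSB's implementation.
LIMIT_PARAM = "submit.collection.cgd.noprotection.input.limit"
DEFAULT_LIMIT = 4000  # value quoted in the text above


def read_limit(properties: dict) -> int:
    """Read the configured limit, falling back to the documented value."""
    return int(properties.get(LIMIT_PARAM, DEFAULT_LIMIT))


def within_limit(record_count: int, limit: int) -> bool:
    """True when the file has no more records than the limit allows."""
    return record_count <= limit


if __name__ == "__main__":
    limit = read_limit({})  # no override, so the documented 4000 applies
    for count in (3999, 4000, 4001):
        status = "process" if within_limit(count, limit) else "reject and notify by email"
        print(count, status)
```

Under this reading, a file with exactly 4000 records is still processed and a file with 4001 is rejected.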

Reconciliation -

/scsb-{env}/data-feed/ims/accession-reconciliation. This is the location from which reconciliation files for accession are taken for processing.

/scsb-{env}/data-feed/ims/daily-reconciliation. This is the location from which daily reconciliation files are taken for processing.

/scsb-{env}/data-feed/ims/monthly-reconciliation. This is the location from which monthly reconciliation files are taken for processing.

Note: The files deposited in these folders should be prefixed with “scsb_” (lowercase).
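S3 client calls generally take the bucket and the key prefix as separate arguments (for example, boto3's `list_objects_v2(Bucket=..., Prefix=...)`), so a documented location such as the ones above has to be split first. The sketch below assumes the first path segment is the bucket name, which matches the bucket naming format described earlier; `split_location` is a hypothetical helper name.

```python
# Hypothetical helper that splits a documented location into bucket and prefix.
def split_location(path: str) -> tuple:
    """Split '/scsb-uat/data-feed/ims/daily-reconciliation' into
    ('scsb-uat', 'data-feed/ims/daily-reconciliation/')."""
    bucket, _, prefix = path.strip("/").partition("/")
    # A trailing slash keeps the prefix from matching sibling folders
    # whose names merely start with the same characters.
    return bucket, (prefix + "/" if prefix else "")


if __name__ == "__main__":
    for kind in ("accession", "daily", "monthly"):
        print(split_location(f"/scsb-uat/data-feed/ims/{kind}-reconciliation"))
```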

Reports - This folder stores all the reports SCSB generates, i.e. collection reports, data dump reports, data export reports, ETL reports, matching reports, the Title Exception report, reconciliation reports, request initial data reports, and Solr reports.
The path for this folder is <scsb-env>/reports/...


Matching Algorithm
/scsb-{env}/reports/matching-reports - This is the location where reports related to the matching algorithm are stored. A new report named ‘CGD change round trip report’ has been added to this location; it lets the user see which items had their CGD status changed. Whenever the CGD status of any item changes after the Matching Algorithm process, a report is generated on the AWS S3 server and the user is notified by email.

The path for this folder is /scsb-{env}/reports/matching-reports/cgd-round-trip/{institution}

 

 

Ongoing Accession
/scsb-{env}/reports/collection/ongoingAccession - This is the location where ongoing accession related reports are stored.

Solr
/scsb-{env}/reports/solr-reports - This is the location where Solr index reports are stored.

ETL
/scsb-{env}/reports/etl-reports - This is the location where the ETL (extract, transform, load) related reports are stored.
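The report locations listed above can be collected into a single lookup, which may be convenient for scripts that fetch reports. The dictionary keys and the function name are hypothetical labels chosen for this sketch; only the folder paths come from the text.

```python
# Hypothetical lookup of the report folders documented above.
REPORT_FOLDERS = {
    "matching": "reports/matching-reports",
    "ongoing_accession": "reports/collection/ongoingAccession",
    "solr": "reports/solr-reports",
    "etl": "reports/etl-reports",
}


def report_location(env: str, kind: str) -> str:
    """Return the documented location for a report type in an environment."""
    if kind not in REPORT_FOLDERS:
        raise KeyError(f"unknown report type: {kind!r}")
    return f"/scsb-{env}/{REPORT_FOLDERS[kind]}"


if __name__ == "__main__":
    for kind in REPORT_FOLDERS:
        print(report_location("uat", kind))
```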