XML S3 Bucket Pusher
Area | greenfield |
Language | Java |
Description | Tool to push XML clob/unixsd to S3 bucket |
Quality | No Sentry, no SONAR |
Upstream services | |
Upstream data | |
Downstream data | |
Packages | |
Source Code | |
Running the tool
Build the tool on a developer machine with the standard ant compile or ./bin/nbant compile
It needs a default deployment property file, and is currently set to use the developer myalter-1 property file (this may need revisiting).
From a standard developer environment, use the command line:
bin/j --cp ./java/org/crossref/qs/crmds/ org.crossref.qs.crmds.S3ClobPusher
to run the tool using all the defaults, for the entire DOI corpus.
Parameter options:
--endId # [default=0]
--startId # [default=500000000]
--awsSecretKey AWS Secret Key [default is read from QS_CRMDS_XML_BUCKET_PUSHER_AWS_SECRET_KEY]
--awsAccessKey AWS Access Key [default is read from QS_CRMDS_XML_BUCKET_PUSHER_AWS_ACCESS_KEY]
--blockSize blockSize [default=1000]
--bucket s3 bucket name to use [default=api-metadata-repository-staging]
--concurrency # [default=10]
--help
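How S3ClobPusher uses --blockSize and --concurrency internally is not documented on this page. As a minimal sketch, assuming the ID range is cut into half-open blocks of blockSize IDs and the blocks are handed to a fixed pool of concurrency worker threads, the idea looks like this. The class BlockPlanSketch and its methods plan and run are hypothetical names, and the per-block work is a placeholder rather than a real S3 upload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the actual S3ClobPusher implementation.
class BlockPlanSketch {

    /** Splits [startId, endId) into blocks of at most blockSize IDs; each entry is {from, to}. */
    static List<long[]> plan(long startId, long endId, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<long[]> blocks = new ArrayList<>();
        for (long from = startId; from < endId; from += blockSize) {
            blocks.add(new long[] {from, Math.min(from + blockSize, endId)});
        }
        return blocks;
    }

    /** Processes every block on a fixed-size pool and returns the number of IDs covered. */
    static long run(List<long[]> blocks, int concurrency) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        AtomicLong processed = new AtomicLong();
        for (long[] block : blocks) {
            // Placeholder work: a real pusher would read and upload the block here.
            pool.execute(() -> processed.addAndGet(block[1] - block[0]));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        // Values mirror the documented defaults for --blockSize and --concurrency.
        List<long[]> blocks = plan(0, 2500, 1000);
        System.out.println(blocks.size() + " blocks covering " + run(blocks, 10) + " ids");
    }
}
```

Under this assumption, a larger blockSize means fewer, bigger units of work, while concurrency caps how many blocks are in flight at once.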
Parameters can be supplied immediately after the class to run:
bin/j --cp ./java/org/crossref/qs/crmds/ org.crossref.qs.crmds.S3ClobPusher --startId 0 --endId 1000
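The precedence described in the parameter list (a command-line value if given, otherwise an environment variable, otherwise a built-in default) can be sketched as follows. This is an illustration under assumptions, not the tool's actual parsing code: OptionSketch, parse and resolve are hypothetical names, and the environment is passed in as a map so the logic can be exercised without touching real credentials.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the real tool's option handling may differ.
class OptionSketch {

    /** Collects "--name value" pairs; a flag without a value (such as --help) maps to "true". */
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("--")) {
                continue; // ignore anything that is not an option name
            }
            String name = args[i].substring(2);
            boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("--");
            opts.put(name, hasValue ? args[++i] : "true");
        }
        return opts;
    }

    /** Command-line value wins, then the environment variable, then the fallback. */
    static String resolve(Map<String, String> opts, String name,
                          Map<String, String> env, String envVar, String fallback) {
        if (opts.containsKey(name)) {
            return opts.get(name);
        }
        String fromEnv = envVar == null ? null : env.get(envVar);
        return fromEnv != null ? fromEnv : fallback;
    }

    public static void main(String[] args) {
        Map<String, String> opts = parse(args);
        Map<String, String> env = System.getenv();
        String bucket = resolve(opts, "bucket", env, null, "api-metadata-repository-staging");
        String accessKey = resolve(opts, "awsAccessKey", env,
                "QS_CRMDS_XML_BUCKET_PUSHER_AWS_ACCESS_KEY", null);
        // Never print the key itself; only report whether one was found.
        System.out.println("bucket=" + bucket + ", access key set=" + (accessKey != null));
    }
}
```

Passing the keys through the environment keeps them out of the shell history and out of any log captured from the command line.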
Running the CBC pusher
There is a second tool for the Cited By Count (CBC) pushes that is separate from the XML pushes. It runs similarly:
From a standard developer environment, use the command line:
bin/j --cp ./java/org/crossref/qs/crmds/ org.crossref.qs.crmds.S3CbcPusher
to run the tool using all the defaults, for the entire DOI corpus.
Parameter options:
--awsSecretKey secret key
--awsAccessKey access key
--s3Bucket specify bucket
--startId citation id to start with
--endId citation id to end on
You can also use tee to send output to both the screen and a log file by adding this at the end of the command: 2>&1 | tee -a 52-60Mrun.log
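When the corpus is pushed in several partial runs, it can help to generate each command line together with a matching log name. The sketch below assumes the log name 52-60Mrun.log stands for a run over IDs 52,000,000 to 60,000,000; that reading is a guess, and RunCommandSketch and command are hypothetical names. It only builds the command string from the flags documented above and does not execute anything.

```java
// Hypothetical helper; it only formats a command line and never runs it.
class RunCommandSketch {

    static final String BASE =
            "bin/j --cp ./java/org/crossref/qs/crmds/ org.crossref.qs.crmds.S3CbcPusher";

    /** Builds the command for a run over [startMillions, endMillions] million IDs. */
    static String command(long startMillions, long endMillions) {
        String log = startMillions + "-" + endMillions + "Mrun.log";
        return BASE
                + " --startId " + (startMillions * 1_000_000L)
                + " --endId " + (endMillions * 1_000_000L)
                + " 2>&1 | tee -a " + log;
    }

    public static void main(String[] args) {
        System.out.println(command(52, 60));
    }
}
```

Because tee is called with -a, rerunning the same range appends to the existing log instead of overwriting it.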