EM Data Processing Portal

Image source: Dr Matthew Weyland, Monash University Centre for Electron Microscopy - TEM FEI Tecnai T20 @ MCEM

This service provides access to high-performance GPU clusters for processing electron microscopy data using CryoSPARC and LiberTEM services.

The Portal consists of two nodes: one GPU cluster at QCIF (Queensland Cyber Infrastructure Foundation) and another GPU cluster at Monash University.

The Electron Microscopy Data Processing Portal

Notice

The EM Data Processing Portal service hosted at MeRC (Monash University) will be decommissioned on 31 January 2024. All jobs queued or running at MeRC after this date will be cancelled and all data will be deleted.

From 01 February 2024, the EM Data Processing Portal will run from QCIF only.

The service:

The EM Data Processing Portal offers data processing to the research-based electron microscopy community through access to CryoSPARC and LiberTEM. A typical workflow is:

  1. Request access to the service

  2. If running CryoSPARC, obtain a license from CryoSPARC.

    • The license is used to install your own version of CryoSPARC.

    • Install CryoSPARC.

  3. Start CryoSPARC or LiberTEM.

  4. Copy in your primary data using Globus.

  5. Process your data.

  6. Copy out your processed data using Globus.

  7. Using Globus, delete your primary and processed data so the storage can be freed up for another user.

Conditions of use of the EM Data Processing Portal:

1. Once started, a CryoSPARC or LiberTEM service can run continuously for up to 14 days, after which it is shut down automatically.

2. Available storage in the Portal is limited. After a CryoSPARC or LiberTEM service has stopped, users have 7 days to transfer their data to their home organisations. After 7 days, all data in the account is deleted automatically and cannot be recovered.

3. Each account is provided with 4.8 TB of storage. This quota covers the CryoSPARC installation, the primary (input) data uploaded by the user, and the data created by CryoSPARC or LiberTEM.
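The time limits in conditions 1 and 2 can be turned into calendar dates. The following is a minimal Python sketch, not part of the Portal itself: the function name `portal_deadlines` is hypothetical, and it assumes that the 14-day and 7-day periods are counted in whole calendar days and that the 7-day window starts on the day the service stops. The Portal may count these periods differently.

```python
from datetime import date, timedelta
from typing import Optional

RUN_LIMIT_DAYS = 14  # a started service is shut down automatically after 14 days
GRACE_DAYS = 7       # data is deleted 7 days after the service has stopped


def portal_deadlines(started: date, stopped: Optional[date] = None) -> dict:
    """Return the automatic shutdown date and the data deletion date.

    Assumption: periods are counted in whole calendar days.
    """
    shutdown = started + timedelta(days=RUN_LIMIT_DAYS)
    # If the service was stopped early, the 7-day window is assumed to start then.
    effective_stop = min(stopped, shutdown) if stopped is not None else shutdown
    return {
        "automatic_shutdown": shutdown,
        "data_deleted_after": effective_stop + timedelta(days=GRACE_DAYS),
    }


if __name__ == "__main__":
    for name, day in portal_deadlines(date(2024, 3, 1)).items():
        print(f"{name}: {day.isoformat()}")
```

For a service started on 1 March 2024 and left running, this prints 2024-03-15 as the shutdown date and 2024-03-22 as the date after which data is deleted.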

Please note:

  • the EM Data Processing Portal is a data processing service, not a storage service.

  • there is no backup provided for data kept on this service.

  • no backups of your CryoSPARC database are made.

Acknowledging the Portal

Please acknowledge the EM Data Processing Portal in your publications, including journal articles, seminars, conference posters, book chapters and theses, with the following sentence:

 “This work was undertaken with the assistance of the EM Data Processing Portal. The portal is supported by the Australian Research Data Commons (ARDC), Monash University and the Queensland Cyber Infrastructure Foundation (QCIF).”

1. Access

Access is open to all staff and students at Australian universities as well as the CSIRO and other government research agencies that are members of the Australian Access Federation.

Currently, eligible users of the Portal must be at organisations that are also members of the Educational Global Authentication Infrastructure (eduGAIN). This restriction exists because Globus v5 is used to move data in and out of the Portal. If you are unsure whether your organisation is a member of the eduGAIN global network, please contact your organisation’s IT or eResearch service.

If you do not have a user account on the system, please contact us using your institutional email address. Ensure your subject is “EM Data Processing Portal - Access”.

To assist us, please attach a screenshot of your attributes from the AAF Validator. This ensures we configure the service using the correct email address.

Once your account has been provisioned you will be notified by email.

  • Go to the ‘GPU eResearch Platform’ (GeRP):

  • Choose your service (your account has been provisioned at one of these sites)

    • EM Data Processing Portal - QCIF

    • EM Data Processing Portal - MeRC

  • Click on ‘Login’

    • Select ‘AAF’ or ‘Google’.

    • If you selected ‘AAF’, search for your home institution, select it, and click on ‘Continue to your organisation’.

  • Sign in using your credentials

  • Welcome to the ‘EM Data Processing Portal’


2. Installing CryoSPARC

  • On the left-hand side, click on ‘Install CryoSPARC’

  • Enter your CryoSPARC license ID, first name and last name.

    • Note: Please do not use spaces in your names. This currently causes an error.

  • Click on ‘Launch’

  • This will submit a job to the cluster and perform a full installation of CryoSPARC into your home account, create a user account in CryoSPARC and configure it for submitting jobs to the cluster.

  • Please only run this once. It takes approximately 1 hour to run.

  • Once the job has been completed successfully, continue below.

  • While this job is running, consider copying in your primary data using Globus, see below.
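Because spaces in the name fields currently cause the installation to fail, it can help to check the values before clicking ‘Launch’. The sketch below is illustrative only: `check_install_fields` is a hypothetical helper, not part of the Portal, and treating empty values and any whitespace (not only spaces) as problems is an assumption beyond what is stated above.

```python
def check_install_fields(first_name: str, last_name: str) -> list:
    """Return a list of problems found in the install form's name fields."""
    problems = []
    for label, value in (("first name", first_name), ("last name", last_name)):
        if not value:
            # Assumption: an empty field is also rejected by the installer.
            problems.append(f"{label} is empty")
        elif any(ch.isspace() for ch in value):
            # Spaces are documented as causing an error; tabs etc. are assumed to as well.
            problems.append(f"{label} contains whitespace: {value!r}")
    return problems


if __name__ == "__main__":
    for issue in check_install_fields("Mary Ann", "Smith"):
        print("Fix before launching:", issue)
```

An empty list means neither field contains whitespace.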


3. Running CryoSPARC

  • On the left-hand side, click on ‘CryoSPARC’

  • Click on ‘Launch’

  • When the job starts running, please wait about 2 minutes for CryoSPARC to start properly.

    • There are a few steps to start CryoSPARC. If you wish to monitor the progress, click on ‘View Log’. You will see CryoSPARC start once to update the port it runs on and then restart. Once this has completed, you can connect.

  • Click on ‘Connect’

  • At the bottom of the screen, copy the username and password (e.g. Ctrl+C or Cmd+C), and click ‘Dismiss’.

  • A new tab will appear in your browser. (If not, check your browser settings.)

  • Using the username and password, log in to CryoSPARC.


4. Running LiberTEM

  • On the left-hand side, click on ‘LiberTEM’

  • Click on ‘Launch’

  • When the job starts running, please wait about 1 minute for LiberTEM to start properly.

    • If you wish to monitor the progress, click on ‘View Log’.

  • Click on ‘Connect’

  • A new tab will appear in your browser (if not, check your browser settings)


5. Data Movement

Your account has 4.8 TB of storage space.

This is for your CryoSPARC installation, plus the primary and processed data.
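Before copying data in, it is worth estimating whether the primary data plus the expected processed output will fit in the quota. The following Python sketch shows the arithmetic. The helper names are hypothetical, and it assumes the 4.8 TB figure is in decimal terabytes (10^12 bytes) and that you supply your own estimates for the processed data and the CryoSPARC installation size, neither of which is stated here.

```python
import os

QUOTA_TB = 4.8  # per-account storage; assumed to be decimal terabytes


def local_size_tb(path: str) -> float:
    """Total size of regular files under `path`, in decimal terabytes."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks to avoid double counting
                total_bytes += os.path.getsize(full)
    return total_bytes / 1e12


def remaining_quota_tb(primary_tb: float, processed_tb: float,
                       install_tb: float = 0.0) -> float:
    """Quota left after primary data, processed data and the installation."""
    return QUOTA_TB - (primary_tb + processed_tb + install_tb)


if __name__ == "__main__":
    # Example figures only; replace them with your own measurements and estimates.
    left = remaining_quota_tb(primary_tb=2.0, processed_tb=1.5, install_tb=0.05)
    print(f"Estimated space left: {left:.2f} TB")
```

A negative result means the data is unlikely to fit, so the dataset would need to be processed in parts.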

You can perform data movement using Globus.

Please refer to these instructions for using Globus on MASSIVE.

https://docs.massive.org.au/M3/transferring-files.html#globus

This system is completely separate from MASSIVE but these instructions can be used.

  • For step 4, search for the collection where your account is (MeRC or QCIF):

    • EM Data Processing Portal - QCIF 

    • EM Data Processing Portal - MeRC

  • You will not need to perform step 6.

Copy your primary data into your home folder: /~

You will then be able to access the primary data from within CryoSPARC or LiberTEM.

Once processing has been completed, use Globus again to copy out the processed data.

This is a processing service, not a data storage service.
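The instructions above use the Globus web application. As an alternative, transfers can also be submitted with the Globus command-line interface. The Python sketch below only builds the `globus transfer` command so that it can be reviewed before it is run. It assumes the Globus CLI is installed and that you are logged in; the collection IDs shown are placeholders, not the Portal's real IDs, and the `/~/` form of the home path follows the usual Globus convention and is an assumption here.

```python
import shlex


def globus_transfer_command(src_collection: str, src_path: str,
                            dst_collection: str, dst_path: str,
                            label: str) -> list:
    """Build a recursive `globus transfer` command as an argument list."""
    return [
        "globus", "transfer",
        "--recursive",      # copy a whole directory tree
        "--label", label,   # label shown in the Globus activity list
        f"{src_collection}:{src_path}",
        f"{dst_collection}:{dst_path}",
    ]


if __name__ == "__main__":
    # Placeholder IDs: look up the real collection IDs in Globus before use.
    cmd = globus_transfer_command(
        "YOUR-INSTITUTION-COLLECTION-ID", "/data/my_dataset/",
        "PORTAL-COLLECTION-ID", "/~/my_dataset/",
        label="copy-in primary data",
    )
    print(shlex.join(cmd))  # review the command, then run it in a terminal
```

Swapping the source and destination arguments gives the command for copying processed data back out.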

For users at the University of Queensland:

Your RDM collections are not added to Globus by default. You need to submit a ticket to rcc-support@uq.edu.au.

You will then need to agree to a few conditions, for example:

“Please note, accessing your RDM collections via Globus applies to all datasets or no datasets. You cannot select certain directories. Access requires a UQ login so if a user wants others to access data via Globus they need a UQ login and the owner of the RDM needs to add them via the RDM portal as a collaborator.”

Only then will your collections be added to Globus.


6. Troubleshooting

  • You have started CryoSPARC and clicked on ‘Connect’ but are unable to access CryoSPARC.

    • Did a new browser tab open? If not, check your browser settings and allow this.

    • Did CryoSPARC start? Did you wait for it to start, or click ‘Connect’ straight away? CryoSPARC takes about 2 minutes to start up; wait and then try connecting again.

    • If it is still not running, click on ‘View Log’. This will show you the log file for the start-up process. Does the log contain any errors? If yes, contact us for assistance.

  • Your CryoSPARC installation finished very quickly. Check the log file, and ensure there are no spaces in the first name or last name fields.

  • You receive an error when trying to access the ‘EM Data Processing Portal’ collection using Globus. Contact us.


ImagingTools is an initiative of the Australian Characterisation Commons at Scale (ACCS) Project. ACCS is supported by the Australian Research Data Commons (ARDC) and the following partners. The ARDC is enabled by NCRIS.
