HPC Research Community,
Next Monday, January 23, 2017, we plan to perform crucial maintenance on the Infiniband infrastructure that supports high-speed communication on Talon2. The maintenance is required in order to finalize the upgrade to Talon3. Unfortunately, Talon2 and its Infiniband network must be temporarily taken offline to perform the maintenance. Compute services will be unavailable from Monday morning, January 23, 2017, until Wednesday afternoon, January 25, 2017.
To help minimize disruption, we will pause all jobs and restart any resumable jobs after the maintenance is complete. Note: because the Infiniband network must be stopped, it may not be possible to resume all parallel jobs or jobs that are actively writing to /storage/scratch2. In other words, jobs that have not finished before Monday morning may fail. Loss of existing data stored on /storage/scratch2 is highly improbable.
This is one of the final steps needed to bring Talon3 online. We are quite excited to get Talon3 into production to provide you with increased compute capacity, among other enhancements. Additional details will be provided as they become available. If you have concerns or questions, please send a message to firstname.lastname@example.org.