HPC Storage - File Systems

Overview

Talon 3.0 has a variety of different file systems.  It is crucial that you understand the proper use of these file systems for best performance and good stewardship of computational resources.  Misuse of the file systems could lead to the termination of your jobs or the loss of your data.

Summary of File Systems

Name      Path                   Talon 1 equiv.   Type
homedir   /home                  /users           ext4
local     /storage/local         /data/scratch    ext4
remote    /storage/remote **     /storage         NFS
research  /storage/research **   /research        NFS
scratch2  /storage/scratch2      /scratch2        Lustre

** NOTE: Directories on the Research SAN are automounted.  This means they are mapped when you access your directory, not before.
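In practice, this means a listing of the mount point may appear incomplete until a directory is accessed by name; a minimal illustration is below (the group directory name is a placeholder only):

    # The top-level listing may not show your directory yet
    ls /storage/research
    # Accessing the directory by name triggers the automount
    cd /storage/research/pi_euid123
    ls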

Details of File Systems

Home directory: When you log in, your home directory (aka $HOME) is set according to your EUID, e.g. /home/euid123.  This ext4 file system is local to the talon3 login node, and all home directories are made available to each compute node via NFS mount.  Quotas are set on /home; use the "quota" command to check yours.  Home directories are for storing input, output, codes, scripts, etc., and should never be used for runtime temporary/scratch files, as this causes undue traffic to the talon3 login node.  Additionally, each PI research group has a shared file space in /home/share/pi_euid123, where the EUID is that of the PI for the group.  The default UNIX group for each user is their PI group, and the default file permissions are set to read/write for the user and read-only for their PI group, to make it easier to share files within the same PI group.  For more on file permissions, see the online UNIX manuals (aka man pages) for chown, chgrp, and chmod.
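For example, the following commands sketch how you might check your quota and share a file with your PI group (the file name and group path are placeholders only):

    # Check your current usage and limits on /home
    quota -s
    # Copy a results file into the shared PI space (path uses your PI's EUID)
    cp results.dat /home/share/pi_euid123/
    # Make the file group-readable if it is not already
    chmod g+r /home/share/pi_euid123/results.dat
    # Verify ownership and permissions
    ls -l /home/share/pi_euid123/results.dat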

Scratch2 directory: For scratch storage, the HPC team has deployed an updated high-speed parallel file system affectionately named scratch2.  It runs Lustre 2.5.2 on a DDN SFA7700X storage appliance.  This file system has 1.4 PB of total available space and can sustain speeds of up to 10 GB/s over FDR InfiniBand.  Each compute node has the file system mounted as /storage/scratch2.  As with scratch, if your job creates many small scratch files (< 1 MB each), the parallelism of Lustre will actually cause a performance loss, and you should instead use the local storage on the compute node (/storage/local), which provides 300 GB per node.  Again, all scratch files are subject to removal after each run, so please copy any data you wish to keep back to your home directory.
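As a minimal sketch, a job script might stage its work in scratch space and copy results back to $HOME before exiting (the directory layout and file names below are examples only, not a prescribed convention):

    # Create a per-job working directory on the parallel scratch file system
    WORKDIR=/storage/scratch2/$USER/job_$$
    mkdir -p "$WORKDIR"
    cd "$WORKDIR"

    # ... run your application here, writing large files to $WORKDIR ...
    # (for many small temporary files, use the node-local /storage/local instead)

    # Copy anything worth keeping back to your home directory, then clean up
    cp results.out "$HOME/"
    cd "$HOME" && rm -rf "$WORKDIR"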

Alternative Storage: In addition, we have 575 TB of storage available via an EMC Isilon X-series SAN.  This storage provides both /storage/remote and /storage/research over a 10 GigE NFS connection.  It is intended for researchers who need to archive data sets that are too large to transfer out immediately.
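A simple way to move a finished data set onto the research SAN is to bundle it into a single archive; the sketch below assumes example directory names only:

    # Bundle a completed data set and place the archive on the research SAN
    tar -czf /storage/research/pi_euid123/mydataset.tar.gz /storage/scratch2/$USER/mydataset
    # Verify the archive is readable before removing the original from scratch
    tar -tzf /storage/research/pi_euid123/mydataset.tar.gz > /dev/null && echo "archive OK"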