Server-side encryption can be turned on either at the namespace level, where all buckets in a namespace are required to be encrypted, or at the bucket level for namespaces that do not have 'encryption required' turned on.  If you want namespace-level encryption, please inform the HPC admins when requesting a namespace.  The encryption choice must be made at namespace or bucket creation time, and cannot be changed afterwards.  (Should a change be needed after bucket creation, one can copy data from an unencrypted bucket to an encrypted one, then destroy the unencrypted bucket, or vice versa.)


Before requesting access to ROSS, in addition to reading this article, please also read the Access to the Research Object Store System (ROSS) article on this wiki, since it describes key concepts needed to understand how storage is laid out and accessed. 

To initiate access to ROSS for a new personal or group (e.g. project, lab, department) namespace, please send a request via e-mail to  In your request,

  • Please indicate whether this is for a personal namespace (e.g. primary storage for processing data on Grace), or for a group (shared storage). 
    • For a personal namespace, please indicate
      • your name
      • your e-mail
      • your departmental affiliation
      • the login name (not your password) and domain you will use to access ROSS's administrative interface, e.g. (for personal namespaces we prefer that you use your HPC username)
    • For a group namespace, please give
      • a name and brief description for the group
      • the primary e-mail contact for the group
      • the departmental affiliation of the group
      • who will be the namespace administrators - we need
        • their names
        • their e-mail address
        • their login name (not their password) and domain e.g. (for group namespaces we generally prefer campus Active Directory usernames (e.g. the name the namespace administrator might use to login to Outlook Mail))
        • you may ask for more than one namespace administrator
        • if all the members of a particular campus AD group should be namespace administrators, you could also just give us the name and domain of the group instead of their individual names
  • Please estimate how much storage you or your group intend to use in ROSS for the requested namespace, divided into local and replicated amounts.  The HPC Admins will use this information in setting the initial quotas and for capacity planning.  You will be allowed to request increases in quota if needed and space is available. 
  • We would also appreciate a brief description of what you will be using the storage for.  The "what it is used for" assists us in drumming up support (and possibly dollars) for expanding the system. 

A particular username@domain can only be a namespace administrator for one namespace.  If a particular person needs to be the namespace administrator in more than one namespace, for example a personal namespace as well as a group namespace, they must use different login names and domains.  This is why we suggest using your HPC credentials for personal namespaces, and other credentials for group namespaces.

Don't forget that a namespace administrator is not the same as an object user.  See Access to the Research Object Store System (ROSS) for the difference between a namespace administrator and an object user.  An object user has a different API-specific set of credentials.

If you wish to access an existing group namespace as an object user, please contact the namespace administrator for that namespace and ask to be added as an object user for that namespace.

Using ROSS

In experiments we have noticed that write times to ROSS are considerably slower than read times, and slower than on many POSIX file systems.  In other words, it takes longer to store new data into ROSS than to pull existing data out of ROSS.  We also notice (as is typical of most file systems) that transfers of large objects can go significantly faster than transfers of tiny objects.  Please keep these facts in mind when planning your use of ROSS - ROSS favors reads over writes and big things over little things.
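Since large objects transfer more efficiently than many tiny ones, one option is to bundle small files into a single archive before storing them.  The following is a minimal Python sketch using only the standard library; the function name bundle_directory and any paths you pass to it are hypothetical, and whether bundling suits your workflow depends on how you need to read the data back.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir: str, archive_path: str) -> int:
    """Pack every regular file under src_dir into one tar archive.

    Returns the number of files added. Paths inside the archive are stored
    relative to src_dir so the tree can be restored anywhere.
    Note: archive_path should live outside src_dir, otherwise the archive
    would try to include itself.
    """
    src = Path(src_dir)
    count = 0
    # Mode "w" writes an uncompressed tar; "w:gz" would also compress it.
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=str(path.relative_to(src)))
                count += 1
    return count
```

The resulting single archive can then be moved to ROSS as one large object instead of many small ones.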

In theory, a bucket can hold millions of files.  We have noticed, though, that with path prefix searching, the more objects that have a particular prefix, the longer the search takes.  This is not dissimilar to POSIX filesystems, where the more files there are in a directory, the longer it takes to do a directory lookup.  But because an object store is a flat namespace (no directory hierarchy), such searches can be much slower.  The total number of objects in a bucket also has a mild, though not dramatic, impact on the lookup speed of particular objects in a bucket.  
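To judge whether a planned key layout would crowd too many objects under one prefix, it can help to count keys per prefix before uploading.  This is a small illustrative sketch; the function name objects_per_prefix and the sample keys are made up for the example, and "/" is assumed as the delimiter.

```python
from collections import Counter


def objects_per_prefix(keys, depth=1, delimiter="/"):
    """Count how many keys share each prefix of `depth` path components."""
    counts = Counter()
    for key in keys:
        parts = key.split(delimiter)
        if len(parts) <= depth:
            # Too few components to have a prefix of this depth:
            # count the key under the empty (top-level) prefix.
            prefix = ""
        else:
            prefix = delimiter.join(parts[:depth]) + delimiter
        counts[prefix] += 1
    return counts


# Hypothetical object keys, for illustration only.
keys = [
    "run1/sample_a.dat",
    "run1/sample_b.dat",
    "run2/sample_a.dat",
    "notes.txt",
]
print(objects_per_prefix(keys).most_common())
```

Prefixes with very large counts are the ones where prefix searches are likely to be slowest.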


If you are comfortable with the data restrictions and prepared to cover costs that might be incurred for using ROSS, there are steps that must be completed before you can actually move data between ROSS and Grace.  First, you must request or be assigned to a namespace.  Second, decide which APIs you might use.  Third, your namespace administrator may wish to use the ECS Portal to pre-create the buckets that you might use.  Creating buckets in the ECS Portal can be more convenient than creating them using APIs or tools.  Finally, get access credentials for the APIs that you might use.

What tools do you recommend for accessing ROSS?

The bottom line - any tool that uses any of the supported protocols theoretically could work with ROSS.  There are a number of free and commercial tools that work nicely with object stores like ROSS.  Like most things in life, there are pros and cons to different tools.  Here is some guidance as to what we look for in tools.

Because ROSS is inherently accessible in parallel (remember the 23+ storage nodes), the best performance is gained when operations proceed in parallel.  Keep that in mind if building your own scripts (e.g. using ECS's variant of s3curl) or when comparing tools.  More parallelism (i.e. multiple transfers happening simultaneously across multiple storage nodes) generally yields better performance, up to a limit.  Eventually the parallel transfers become limited by other factors such as memory or network bandwidth constraints.  Generally there is a 'sweet spot' in the number of parallel transfer threads that can run before they start stepping on each other's toes. 
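One way to look for the sweet spot in your own scripts is to time the same batch of transfers at several thread counts and compare.  The sketch below is only an illustration: transfer_one stands in for whatever function moves a single object (it is a placeholder, not part of any ROSS tool), and the candidate thread counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(transfer_one, items, threads):
    """Run transfer_one(item) for every item using `threads` workers.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(transfer_one, items))
    return time.perf_counter() - start


def find_sweet_spot(transfer_one, items, candidates=(1, 2, 4, 8, 16, 32)):
    """Time the same batch at each thread count; return (best, timings)."""
    timings = {n: timed_run(transfer_one, items, n) for n in candidates}
    best = min(timings, key=timings.get)
    return best, timings
```

Timings vary from run to run and with the load on the network and storage nodes, so it is worth repeating the measurement on a representative sample of data before settling on a thread count.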

The S3 protocol includes support for multipart upload of objects, where a large object upload can be broken into multiple pieces and transferred in parallel.  Tools that support multipart upload likely will have better performance than tools that don't.  In addition, tools that support moving entire directory trees and that work in parallel will have better performance than tools that move files one object at a time.
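To make the idea of multipart upload concrete, the following sketch shows how a large object could be divided into numbered parts that might then be uploaded in parallel.  It only computes the byte ranges and performs no upload; the function name plan_parts and the 64 MiB default part size are assumptions chosen for illustration, and real tools choose their own part sizes.

```python
def plan_parts(object_size: int, part_size: int = 64 * 1024 * 1024):
    """Split an object of object_size bytes into (part_number, offset, length).

    Part numbers start at 1, as in S3 multipart uploads. Every part is
    part_size bytes long except possibly the last one.
    """
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

Each tuple describes one independent piece, which is why a tool that supports multipart upload can keep several storage nodes busy with a single large object.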

After evaluating several tools, the HPC admins settled on 2 tools, ecs-sync and rclone, as 'best of breed' for moving data between ROSS and Grace's cluster storage system, where the /home, /scratch, and /storage directories live.  The ecs-sync program is the more efficient and the faster of the two for bulk data moves.  It consumes fewer compute resources and less memory than rclone, and when properly tweaked for number of threads (i.e. when the sweet spot is found) it moves data significantly faster than rclone.  The rclone program has more features than ecs-sync, including ways to browse data in ROSS, to mount a ROSS bucket as if it were a POSIX file system, and to synchronize content using a familiar rsync-like command syntax.  While ecs-sync is great for fast, bulk moves, rclone works very well for nuanced access to ROSS and small transfers.


The ecs-sync program is specifically designed for moving data from one storage technology to another.  It comes from the Dell/EMC support labs, and is what EMC support engineers use for migrating data.  Due to security issues, we do not offer the job-oriented service mode described in the ecs-sync documentation.  (If the ecs-sync maintainers ever fix the security holes, we could reconsider.)  Instead we only support running ecs-sync in what its documentation calls "Alternate (legacy) CLI execution", where a user runs ecs-sync as a command, instead of queuing up a job.  This allows the command to be run in the context of the user's login, honoring permissions.  One side effect is that the copied data could end up being owned by the user running the command, regardless of who owned the source data, which might not be desirable in all cases.

A command alias exists on Grace's login and compute nodes for running ecs-sync.  The alias actually runs the command on a data transfer node, so it does not bog down the node on which the command is issued.  In other words, feel free to use ecs-sync from the login node, from the command prompt in Grace's Open OnDemand portal, or even from a compute job, if needed, though it is somewhat wasteful of resources to run ecs-sync from a compute job. 

The syntax for calling ecs-sync interactively on Grace is:

Interactive ecs-sync command:

    ecs-sync --xml-config <config-file>.xml

The <config-file>.xml file is the set of instructions describing what is to move where, in how many threads, using which storage nodes.  Of course, replace <config-file> with a name of your choosing.  We will describe the XML config contents later.

When running interactively, you see all the messages coming back from ecs-sync as it does its job, giving instant feedback.  But if the shell dies, the command may stop.  To avoid this, you can use nohup to run ecs-sync as a background job that continues after you log out, and redirect standard out/err into a file, for example by using the following syntax:

Running ecs-sync in the background:

    nohup ecs-sync --xml-config <config-file>.xml > <log-file>.log 2>&1 &

Again, replace <config-file> and <log-file> with any name you wish.

The most complicated task in using ecs-sync is setting up the xml config file.  The config file consists of three parts: 

  • the global settings define various parameters of how the sync should be run, such as buffer sizes, number of retries, number of parallel threads, etc. that are not specific to either the source or the target
  • the source settings define where the data to be copied is coming from, including the type of storage, the location of the data, and access information
  • the target settings define where the data being copied should be stored, including the type of storage, the destination bucket or directory, and other access related information

The basic layout for the xml config file is as follows:

Overview of config file:

    <syncConfig xmlns="">
        <options>
            <!-- ... the global settings ... -->
        </options>
        <source>
            <!-- ... file copy settings ... -->
        </source>
        <target>
            <!-- ... ecs s3 copy settings ... -->
        </target>
    </syncConfig>

Note that the contents of <source> and <target> are interchangeable, and the order of the blocks doesn't matter.  So the direction of the copy can be switched by simply swapping the <source> and <target> keywords.
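Swapping the direction of a copy can also be done programmatically.  The sketch below uses Python's standard XML library to exchange the <source> and <target> tag names in a config.  The function name swap_direction is hypothetical, and the inner element names and values in the sample (filesystemConfig, ecsS3Config, path, bucketName, /scratch/mydata, mybucket) are assumptions used only for illustration; consult the ecs-sync documentation for the actual element names.

```python
import xml.etree.ElementTree as ET


def swap_direction(config_xml: str) -> str:
    """Return the config with the <source> and <target> tags exchanged.

    Only the tag names of the top-level blocks are swapped; their contents
    are left untouched. XML comments in the input are not preserved.
    """
    root = ET.fromstring(config_xml.strip())
    swap = {"source": "target", "target": "source"}
    for child in root:
        # ElementTree writes namespaced tags as "{namespace}name";
        # without a namespace, ns and sep are simply empty strings.
        ns, sep, local = child.tag.rpartition("}")
        if local in swap:
            child.tag = ns + sep + swap[local]
    return ET.tostring(root, encoding="unicode")


# Illustrative config only: element names and values are assumptions.
example = """
<syncConfig>
    <source><filesystemConfig><path>/scratch/mydata</path></filesystemConfig></source>
    <target><ecsS3Config><bucketName>mybucket</bucketName></ecsS3Config></target>
</syncConfig>
"""
print(swap_direction(example))
```

After the swap, the ROSS bucket block sits under <source> and the filesystem block under <target>, so the same file describes a copy in the opposite direction.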

Although ecs-sync knows about other types of storage, this guide is for moving data between Grace's storage and ROSS, so we will only describe the filesystem (for Grace's storage) and ecsS3 (for ROSS's storage) types, since those are the 2 needed.  If you are interested in other storage types, please consult the ecs-sync documentation.

The easiest way to create an xml config file is to copy an existing template, and just change the pertinent values.  As a guide, here are examples of the 3 sections with descriptions of the parameters.

Global Options

This example shows the available global options.  The values shown are typical, and may be changed.  The important options are deleteSource, forceSync, recursive, retryAttempts, threadCount, verify, and verifyOnly.

            <vdcs>uams(,,,,,,,,,,,,,,</vdcs>         <smartClientEnabled>true</smartClientEnabled>