9  File Transfer

Learning Objectives
  • Move files in and out of the HPC storage using Filezilla or rsync/scp.

9.1 Moving Files

There are several options to move data between your local computer and a remote server. We will cover three possibilities in this section, which vary in their ease of use.

A quick summary of these tools is given in the table below.

                      Filezilla   SCP            Rsync
Interface             GUI         Command line   Command line
Data synchronisation  yes         no             yes

9.1.1 Filezilla (GUI)

This program has a graphical interface, which some users prefer, and it is relatively intuitive to use.

To connect to the remote server (see Figure 3):

  1. Fill in the following information on the top panel:
     • Host: login.hpc.cam.ac.uk
     • Username: your HPC username
     • Password: your HPC password
     • Port: 22
  2. Click “Quickconnect”. The files in your “home” directory should appear in a panel on the right side.
  3. Navigate to your desired location by either clicking through the folder browser or typing the directory path in the “Remote site:” box.
  4. Drag and drop files between the left panel (your local filesystem) and the right panel (the HPC filesystem) to transfer them in either direction.

Figure 3: Example of a Filezilla session. Red arrows highlight the connection panel at the top, the file browser panels in the middle, and the transfer progress panel at the bottom.

9.1.2 scp (command line)

This is a command line tool that can be used to copy files between two servers. One thing to note is that it always transfers all the files in a folder, regardless of whether they have changed or not.

The syntax is as follows:

# copy files from the local computer to the HPC
scp -r path/to/source_folder <user>@login.hpc.cam.ac.uk:path/to/target_folder

# copy files from the HPC to a local directory
scp -r <user>@login.hpc.cam.ac.uk:path/to/source_folder path/to/target_folder

The -r option copies directories recursively, so that source_folder and all of its sub-directories are transferred. Without it, scp only copies individual files.

9.1.3 rsync (command line)

This program is more advanced than scp and has options to synchronise files between two directories in multiple ways. The cost of its flexibility is that it can be a little harder to use.

The most common usage is:

# copy files from the local computer to the HPC
rsync -auvh --progress path/to/source_folder <user>@login.hpc.cam.ac.uk:path/to/target_folder

# copy files from the HPC to a local directory
rsync -auvh --progress <user>@login.hpc.cam.ac.uk:path/to/source_folder path/to/target_folder
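
  • the option -a (archive mode) copies directories recursively and preserves file attributes such as permissions and modification times, while -u skips any file that is newer in the target folder; together they ensure that only files that are new or have changed in the source folder are transferred
  • the options -vh give detailed information about the transfer and human-readable file sizes
  • the option --progress shows the progress of each file being transferred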
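
The effect of -u can be checked locally before relying on it for a real transfer. The following is a minimal sketch, assuming rsync is installed on your computer; the scratch directory and the file name notes.txt are made up for illustration and do not refer to anything on the HPC.

```shell
# Scratch directories for the demonstration (all names are illustrative)
demo=$(mktemp -d)
mkdir -p "$demo/source_folder" "$demo/target_folder"

# The same file exists on both sides, with different content
echo "old version" > "$demo/source_folder/notes.txt"
echo "edited on the target" > "$demo/target_folder/notes.txt"

# Make the source copy older than the target copy
touch -t 202001010000 "$demo/source_folder/notes.txt"

# With -u, rsync skips notes.txt because the target copy is newer
rsync -auvh "$demo/source_folder/" "$demo/target_folder"

cat "$demo/target_folder/notes.txt"   # edited on the target
```

Without -u, the same command would overwrite the target copy with the older source version.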
Warning

When you specify the source directory as path/to/source_folder/ (with / at the end) or path/to/source_folder (without / at the end), rsync will do different things:

  • path/to/source_folder/ will copy the files within source_folder but not the folder itself
  • path/to/source_folder will copy the actual source_folder as well as all the files within it
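
The difference is easiest to see by trying both forms on a pair of throwaway local directories. This is a small sketch, assuming rsync is installed locally; the names with_slash, without_slash and a.txt are invented for the example.

```shell
# Create a scratch area with a small source folder (names are illustrative)
demo=$(mktemp -d)
mkdir -p "$demo/source_folder" "$demo/with_slash" "$demo/without_slash"
echo "hello" > "$demo/source_folder/a.txt"

# Trailing slash: copies the contents of source_folder
rsync -a "$demo/source_folder/" "$demo/with_slash"

# No trailing slash: copies source_folder itself, along with its contents
rsync -a "$demo/source_folder" "$demo/without_slash"

# Compare the two results
ls "$demo/with_slash"       # a.txt
ls "$demo/without_slash"    # source_folder
```

The same rule applies when either side is a remote location such as <user>@login.hpc.cam.ac.uk:path/to/source_folder.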
Tip

To check which files rsync would transfer without actually transferring them, add the --dry-run option. This is useful to confirm that you’ve specified the right source and target directories and options.

9.2 Summary

Key Points
  • To transfer files to/from the HPC we can use Filezilla, which offers a user-friendly interface to synchronise files between your local computer and a remote server.
  • Transferring files can also be done from the command line, using tools such as scp and rsync (rsync is the more flexible of the two, but also the more advanced).