Migrating Users Across Servers With RSync

Date: 2024-01-03 14:33:37
Categories: Software Development
Tags: linux, rsync
Type: posts
Draft: false

I recently needed to migrate some user data from one Ubuntu server to another. It was not possible for me to clone the full disk so I opted to copy the user data and re-create the user accounts on the other machine.

I used rsync to copy all the user data while preserving ownership of the files. That requires sudo access on both machines.

In this article I refer to the new machine as the target onto which we want to copy our data and the old machine as the source of the data we want to copy.

Preparing The Data

Firstly, check what data you want to clone. I made liberal use of du -sh /home/* to see how much space each of the affected user directories was taking up, and worked with the users to tidy up their home directories where necessary (lots of junk in hidden places like .local and .cache). A couple of the users had large projects that they were able to purge before we did the copy, so I was able to significantly reduce the amount of data I needed to transfer.
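As a rough sketch of that survey, the helper below prints a total for each directory in KiB, largest first, so the heaviest home directories stand out. The name report_usage is my own invention for illustration, and the base path is an argument so you can try it on a scratch directory first.

```shell
# Hypothetical helper: list each directory under a base path with its size
# in KiB, largest first. The base path defaults to /home.
report_usage() {
    base="${1:-/home}"
    # -s prints one total per argument; -k forces KiB so sort -n compares correctly
    du -sk "$base"/*/ 2>/dev/null | sort -rn
}
```

Reading other users' directories usually needs root, so in practice you would probably run the same pipeline directly under sudo, e.g. sudo du -sk /home/*/ | sort -rn.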

Create the Users

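For each of the users on the old machine, I created an account on the new machine using sudo useradd -m. If a user needs supplementary groups like sudo or docker you can add them at this point with -G, e.g. sudo useradd -m james -G sudo,docker. The groups must already exist on the new machine, otherwise useradd will refuse to create the account.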

The -m flag creates the user home directory so if you do an ls /home you should see one directory per user in there.
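With more than a handful of users it can help to generate the commands rather than type them. The sketch below is one way to do that on the new machine: it reads usernames on standard input and prints (without running) a useradd command for each name that has no account yet. The function name is hypothetical, and it assumes each home directory on the old machine is named after its user.

```shell
# Hypothetical helper for the new machine: read usernames on stdin and print
# (not run) a useradd command for every name that has no account here yet.
print_missing_useradd() {
    while IFS= read -r name; do
        # ignore blank lines
        [ -n "$name" ] || continue
        # id -u exits non-zero when the user does not exist
        if ! id -u "$name" >/dev/null 2>&1; then
            printf 'sudo useradd -m %s\n' "$name"
        fi
    done
}
```

One possible use, with old-server.example.com standing in for the old machine's real address: ssh james@old-server.example.com ls /home | print_missing_useradd. Review the printed commands, add any -G options you need, then run them.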

Set Up Passwordless Sudo rsync

In order to copy other users' files we need to be able to operate as root on both the new machine and the old machine. We will run the sync command from the new machine with sudo, where we can type our password as usual. rsync will then SSH to the old machine and attempt to run sudo there, where there is typically no terminal to prompt for a password, so it will likely fail if we don't do this next step.

We don't want to give blanket permission for the user to sudo without password auth. That's a pretty big security risk and also an accident waiting to happen (sudo rm -rf / anyone?). Instead, on the old machine, we will edit the /etc/sudoers file and add special permission for our current user (let's say james) to run the rsync command without asking for a password. It is safest to open the file with sudo visudo, which checks the syntax before saving.

We add a new entry to the bottom of the file like so:

# User privilege specification
root    ALL=(ALL:ALL) ALL

# Members of the admin group may gain root privileges
%admin ALL=(ALL) ALL

# Allow members of group sudo to execute any command
%sudo   ALL=(ALL:ALL) ALL

# Custom user privileges
james   ALL=(ALL) NOPASSWD: /usr/bin/rsync

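The key thing here is the NOPASSWD: directive, after which we put the path to the rsync binary. The entry goes at the bottom because sudo applies the last matching rule, so it takes precedence over the %sudo line above it. We should also remove this again once the sync is successful.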
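If you would rather not touch the main file, a drop-in under /etc/sudoers.d is an alternative that a stock Ubuntu install should pick up automatically. The sketch below only prints the rule; the function name and the rsync-migration file name in the comments are assumptions for illustration, not part of the original procedure.

```shell
# Hypothetical helper: print a sudoers rule that lets one user run rsync as
# root without a password. $1: username, $2: rsync path (default /usr/bin/rsync)
rsync_sudoers_rule() {
    user="$1"
    rsync_bin="${2:-/usr/bin/rsync}"
    printf '%s ALL=(ALL) NOPASSWD: %s\n' "$user" "$rsync_bin"
}

# Possible use on the old machine (file names are placeholders):
#   rsync_sudoers_rule james > /tmp/rsync-migration
#   sudo visudo -cf /tmp/rsync-migration      # syntax check only, changes nothing
#   sudo install -m 0440 -o root -g root /tmp/rsync-migration /etc/sudoers.d/rsync-migration
```

Checking the fragment with visudo -cf before installing it matters because a syntax error in any sudoers file can lock you out of sudo entirely.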

Set Up and Test the Connection

From the new machine we can test our ability to rsync with the old machine. We can use the --dry-run flag to avoid actually copying any data.

The test command might look something like this:

sudo rsync -Porgzl --dry-run --rsync-path="sudo /usr/bin/rsync" james@target.domain.com:/home/james /home

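You will want to run rsync via sudo on the new machine AND the old machine (via the config above) to ensure that we have permission to read and write other users' data. Note that the host in this command is the machine we are pulling from, so despite its name target.domain.com stands for the old machine here.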

--dry-run prevents rsync from actually copying any data. It just lists the files it would copy if it were run without this option.

--rsync-path is the command that gets run on the old server to find the files and send them to the new machine. We prepended the command with sudo, and because the user that we're SSHing as has NOPASSWD configured for /usr/bin/rsync, the old server should run it as root without asking for a password.

-P displays progress of the transfer and keeps partially transferred files (it is shorthand for --partial --progress). If the command is interrupted, a re-run can pick up a large file from where it left off instead of starting it again.

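-o and -g preserve ownership of the files: -o for the owning user and -g for the group. This ensures that the old user and group are correctly set in the new location. By default rsync matches owners and groups by name, which is why it is important that we created the affected users and groups before we initiated the transfer. Note that these two flags do not cover the permission bits themselves; add -p (or use -a, which is shorthand for -rlptgoD) if you need file modes copied exactly.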

-r for recursive: copy folders and their contents. Without this, rsync skips directories and ends up copying nothing.

-z for compression - this compresses data on the old machine before it is sent and decompresses it when it is received on the new machine. This saves bandwidth and these days with fancy, powerful CPUs, the chances are that you'll be able to compress/decompress the data faster than it can be transferred over the net so this is likely to be helpful.

-l copies symlinks as symlinks rather than skipping them. This may be important if you are copying things like conda environments, since library directories often contain chains of links (libexample.so -> libexample.so.1 -> libexample.so.1.2.3).
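Bundled short flags are easy to misread, so here is the same pull command with every letter of -Porgzl spelled out as its long option. This is only a sketch that prints the command: print_rsync_command is a hypothetical name, and the source and destination are whatever you pass in.

```shell
# Hypothetical helper: print the pull command with each short flag of
# -Porgzl written as its long option.
# $1: remote source (user@host:path), $2: local destination,
# $3: optional extra flag such as --dry-run
print_rsync_command() {
    src="$1"
    dest="$2"
    extra="${3:-}"
    # -P = --partial --progress, -o = --owner, -r = --recursive,
    # -g = --group, -z = --compress, -l = --links
    printf 'sudo rsync --partial --progress --owner --recursive --group --compress --links'
    if [ -n "$extra" ]; then
        printf ' %s' "$extra"
    fi
    printf ' --rsync-path="sudo /usr/bin/rsync" %s %s\n' "$src" "$dest"
}
```

Calling print_rsync_command james@old-server.example.com:/home/james /home --dry-run (the hostname is a placeholder) prints a line you can paste into the terminal. The long form is more typing but much easier to review later in shell history.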

If all goes well you should see a list of files flash up the screen - this is the list of files that would be copied from the old machine to the new machine if --dry-run wasn't enabled.

If you get any permission errors, double check the sudoers entry on the old machine and make sure that the user you are SSHing as is the same as the user named in the sudoers file.
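One way to narrow down such errors is to test the passwordless sudo rule on its own, before rsync is involved. The function below is a minimal sketch with a made-up name; it asks the old machine to run rsync --version through sudo -n, which refuses to prompt and so fails straight away if a password would be needed.

```shell
# Hypothetical pre-flight check, run from the new machine.
# $1: SSH login for the old machine, e.g. james@old-server.example.com
# SSH can be overridden (e.g. SSH=echo) to see the command without connecting.
check_remote_sudo_rsync() {
    remote="$1"
    # sudo -n never prompts: it fails at once if a password would be required
    ${SSH:-ssh} "$remote" sudo -n /usr/bin/rsync --version
}
```

If this prints the rsync version banner, the sudoers entry is working for that user. If it prints a message about a password being required, the rule is missing, misspelled, or overridden by a later entry.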

Run the Sync

If your dry run succeeded, you can now execute the full copy and transfer. I'm going to add a couple of additional exclusions with the --exclude option. We can use wildcards to apply these exclusions to all of our users:

sudo rsync \
  -Porgzl --rsync-path="sudo /usr/bin/rsync" \
  --exclude "*/.local/lib" \
  --exclude "*/.cache" \
  --exclude '*/.vscode-server' \
  --exclude "*/miniconda/pkgs" \
  james@target.domain.com:/home/* \
  /home

The command is pretty much the same as the previous one with the --dry-run flag removed, with an --exclude for each of the directories we don't care about and with a wildcard in the home directory so that we copy all users rather than just james. The /home/* wildcard is expanded on the old machine, not by your local shell.
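Once the list of exclusions grows, it may be tidier to keep the patterns in a file and pass it with rsync's --exclude-from option instead of repeating --exclude. The sketch below writes the four patterns from the command above to a file; the function name and the file path you give it are placeholders of mine.

```shell
# Hypothetical helper: write the exclusion patterns to a file, one per line.
# $1: path of the file to create, e.g. /tmp/migration-excludes
write_exclude_file() {
    cat > "$1" <<'EOF'
# caches and re-creatable tool installs we do not want to copy
*/.local/lib
*/.cache
*/.vscode-server
*/miniconda/pkgs
EOF
}
```

After write_exclude_file /tmp/migration-excludes, the four --exclude lines in the sync command can be replaced with a single --exclude-from=/tmp/migration-excludes. Lines starting with # in that file are treated as comments by rsync.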

I highly recommend executing long-running commands like this inside tmux so that if your connection from your workstation to the new machine goes down, the process continues.
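For anyone who has not used tmux this way, the sketch below starts a command in a detached session. The function name and the session name in the usage note are my own choices for illustration.

```shell
# Hypothetical helper: start a command in a detached tmux session so it
# survives a dropped SSH connection.
# $1: session name, $2: command line to run inside the session
# TMUX_BIN can be overridden (e.g. TMUX_BIN=echo) to preview the invocation.
start_in_tmux() {
    ${TMUX_BIN:-tmux} new-session -d -s "$1" "$2"
}
```

For example, start_in_tmux migrate 'sudo rsync ...' followed by tmux attach -t migrate. Attach straight away, because the sudo password prompt will be waiting inside the session. Simply running tmux new -s migrate and typing the rsync command interactively works just as well.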

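If your connection does get interrupted and you need to restart, you can run this command with --ignore-existing to have rsync skip any files that already exist on the new machine. Be careful with the file that was in flight when the connection dropped: because -P keeps partial files, a truncated copy of it is likely to exist on the new machine and would be skipped too, so delete that copy before restarting.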

Finally: Undo the Sudoers Change

Remove the line that you added to the sudoers file that allowed you to run rsync without a password on the old server!
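To confirm the cleanup, you can search the file for any NOPASSWD rules that are still present. The helper below is a small sketch with a hypothetical name; it takes the file to inspect as an argument so it can be tried on a copy first.

```shell
# Hypothetical helper: list any remaining NOPASSWD rules in a sudoers-style
# file, with line numbers.
# $1: file to inspect (defaults to /etc/sudoers, which needs root to read)
# Exit status is 0 if a rule was found and 1 if none remain.
list_nopasswd_rules() {
    grep -n 'NOPASSWD' "${1:-/etc/sudoers}"
}
```

On the old server the equivalent one-liner is sudo grep -n NOPASSWD /etc/sudoers. No output means the rsync rule is gone, although commented-out lines that mention NOPASSWD will also show up in the results.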