---
categories:
- Software Development
date: '2024-01-03 14:33:37'
draft: false
tags:
- linux
- rsync
title: Migrating Users Across Servers With RSync
type: posts
---
<!-- wp:paragraph -->
<p>I recently needed to migrate some user data from one Ubuntu server to another. It was not possible for me to clone the full disk so I opted to copy the user data and re-create the user accounts on the other machine.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>I used rsync to copy all the user data and preserve the ownership of the files. I needed sudo access on both sides.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>In this article I refer to the <strong>new machine</strong> as the target onto which we want to copy our data and the <strong>old machine</strong> as the source of the data we want to copy.</p>
<!-- /wp:paragraph -->
<!-- wp:heading -->
<h2 class="wp-block-heading">Preparing The Data</h2>
<!-- /wp:heading -->
<!-- wp:paragraph -->
<p>First, check what data you want to clone. I made liberal use of <code>du -sh /home/*</code> to see how much space each of the affected user directories was taking up, and worked with the users to tidy up their home directories where necessary (there was lots of junk in hidden places like <code>.local</code> and <code>.cache</code>). A couple of the users had large projects that they were able to purge before we did the copy, so I was able to significantly reduce the amount of data I needed to transfer.</p>
<!-- /wp:paragraph -->
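<!-- wp:paragraph -->
<p>As a minimal sketch of that survey step, the helper below ranks the directories under a given path by size. The function name <code>largest_dirs</code> is hypothetical, and it assumes a POSIX shell with <code>du</code>, <code>sort</code> and <code>head</code> available.</p>
<!-- /wp:paragraph -->

```shell
# Hypothetical helper: list the N largest directories directly under a path.
# Usage: largest_dirs /home 5
largest_dirs() {
    base=${1:-/home}
    count=${2:-10}
    # -s: one total per directory, -k: sizes in KiB so sort -n can rank them
    du -sk "$base"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

<!-- wp:paragraph -->
<p>Run it as root (for example from a root shell) if you cannot read other users' files, otherwise the totals may be incomplete.</p>
<!-- /wp:paragraph -->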
<!-- wp:heading -->
<h2 class="wp-block-heading">Create the Users</h2>
<!-- /wp:heading -->
<!-- wp:paragraph -->
<p>For each of the users on the old machine, I created an account on the new machine using <code>sudo useradd -m &lt;username&gt;</code>. If the user belongs to any special groups like <code>sudo</code> or <code>docker</code>, you can add them at this point, e.g. <code>sudo useradd -m james -G sudo,docker</code>.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>The <code>-m</code> flag creates the user home directory so if you do an <code>ls /home</code> you should see one directory per user in there.</p>
<!-- /wp:paragraph -->
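<!-- wp:paragraph -->
<p>With more than a handful of users, the commands could be generated rather than typed. The sketch below is one way to do that, not something I ran as part of the migration: the function name <code>print_useradd_commands</code> is hypothetical, and the UID range of 1000 to 60000 for regular accounts is an assumption based on Ubuntu's usual defaults.</p>
<!-- /wp:paragraph -->

```shell
# Hypothetical helper: read passwd-format lines on stdin and print one
# useradd command per regular account, for review before running anything.
# Assumes regular users have UIDs in 1000..60000 (Ubuntu's usual defaults).
print_useradd_commands() {
    awk -F: '$3 >= 1000 && $3 <= 60000 { print "sudo useradd -m " $1 }'
}
```

<!-- wp:paragraph -->
<p>On the old machine, <code>getent passwd | print_useradd_commands</code> prints the list. Review it, add any <code>-G</code> groups by hand, and then run the commands on the new machine.</p>
<!-- /wp:paragraph -->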
<!-- wp:heading -->
<h2 class="wp-block-heading">Set Up Passwordless Sudo rsync</h2>
<!-- /wp:heading -->
<!-- wp:paragraph -->
<p>To copy other users' files we need to operate as root on both the new machine and the old machine. We will run the sync command from the new machine with <code>sudo</code>, where we can type our password as usual. However, rsync then connects to the old machine over SSH and tries to run <code>sudo</code> there, with no terminal on which to prompt for a password, so it will likely fail unless we do this next step.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>We don't want to give blanket permission for the user to sudo without password auth - that's a pretty big security risk and also an accident waiting to happen (<code>sudo rm -rf /</code> anyone?). Instead, on the old machine, we will edit the <code>/etc/sudoers</code> file (ideally with <code>sudo visudo</code>, which checks the syntax before saving) and add special permission for our current user (let's say james) to run the rsync command without asking for a password.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>We add a couple of new lines to the bottom of the file like so:</p>
<!-- /wp:paragraph -->
<!-- wp:enlighter/codeblock -->
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># User privilege specification
root ALL=(ALL:ALL) ALL
# Members of the admin group may gain root privileges
%admin ALL=(ALL) ALL
# Allow members of group sudo to execute any command
%sudo ALL=(ALL:ALL) ALL
# Custom user privileges
james ALL=(ALL) NOPASSWD: /usr/bin/rsync
</pre>
<!-- /wp:enlighter/codeblock -->
<!-- wp:paragraph -->
<p>The key thing here is the NOPASSWD: directive, after which we put the path to the rsync binary. We should also remove this again once the sync is successful.</p>
<!-- /wp:paragraph -->
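<!-- wp:paragraph -->
<p>An alternative to editing <code>/etc/sudoers</code> itself, which is not what I did above, is to put the rule in its own file under <code>/etc/sudoers.d</code> so that undoing it later is a single delete. The sketch below assumes your sudoers configuration includes that directory, as Ubuntu's does by default; the function name <code>write_rsync_rule</code> is hypothetical.</p>
<!-- /wp:paragraph -->

```shell
# Hypothetical helper: write a NOPASSWD rule for rsync to a sudoers drop-in.
# $1 = user name, $2 = path of the file to create
write_rsync_rule() {
    user=$1
    file=$2
    printf '%s ALL=(ALL) NOPASSWD: /usr/bin/rsync\n' "$user" > "$file"
    # 0440 is the conventional mode for sudoers files
    chmod 0440 "$file"
}
```

<!-- wp:paragraph -->
<p>One cautious way to use it is to write to a temporary file first, check it with <code>visudo -cf &lt;file&gt;</code>, and only then install it as root under a name without dots, such as the made-up <code>/etc/sudoers.d/rsync-migration</code>.</p>
<!-- /wp:paragraph -->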
<!-- wp:heading -->
<h2 class="wp-block-heading">Set Up and Test the Connection</h2>
<!-- /wp:heading -->
<!-- wp:paragraph -->
<p>From the new machine we can test our ability to rsync with the old machine. We can use the <code>--dry-run</code> flag to avoid actually copying any data.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>The test command might look something like this (here <code>target.domain.com</code> stands in for the hostname of the old machine):</p>
<!-- /wp:paragraph -->
<!-- wp:enlighter/codeblock {"language":"bash"} -->
<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sudo rsync -Porgzl --dry-run --rsync-path="sudo /usr/bin/rsync" james@target.domain.com:/home/james /home</pre>
<!-- /wp:enlighter/codeblock -->
<!-- wp:paragraph -->
<p>You will want to run rsync via <code>sudo</code> on the new machine AND the old machine (via the config above) to ensure that we have permission to read and write other users' data. </p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><code>--dry-run</code> prevents rsync from actually copying any data; it just lists the files it would copy if it were run without this option.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><code>--rsync-path</code> sets the command that is run on the old machine to start rsync there, which then finds the files and sends them to the new machine. We prefixed the command with <code>sudo</code>, and because the user that we're SSHing as has <code>NOPASSWD</code> configured for <code>/usr/bin/rsync</code>, it should run as root without prompting for a password.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><code>-P</code> displays the progress of the transfer and keeps partially transferred files, so if the command is interrupted a later run can pick those files up again rather than starting them from scratch.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
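<p><code>-o</code> and <code>-g</code> preserve the ownership of the files: <code>-o</code> for owner and <code>-g</code> for group. This ensures that the old user and group are correctly set in the new location. rsync matches owners and groups by name by default, which is why it is important that we created the affected users and groups before we initiated the transfer. Note that these two flags cover ownership only; the permission bits themselves are preserved by <code>-p</code>, which is not part of this command.</p>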
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><code>-r</code> for recursive - copy folders and their contents recursively. Without this, rsync skips directories, so in our case it would stop without copying anything.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><code>-z</code> for compression - this compresses data on the old machine before it is sent and decompresses it when it is received on the new machine. This saves bandwidth and these days with fancy, powerful CPUs, the chances are that you'll be able to compress/decompress the data faster than it can be transferred over the net so this is likely to be helpful.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><code>-l</code> tells rsync to copy symlinks as symlinks rather than skipping them - this may be important if you are copying things like conda environments, since library structures often contain multiple links to each other (<code>libexample.so.1.2.3 -&gt; libexample.so.1 -&gt; libexample.so</code>).</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>If all goes well you should see a list of files flash up the screen - this is the list of files that would be copied from the old machine to the new machine if <code>--dry-run</code> wasn't enabled.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p>If you get any permission errors, double check that you have the right permissions set up on the old machine and make sure that the user you are SSHing as is the same as the user in the sudoers file.</p>
<!-- /wp:paragraph -->
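<!-- wp:paragraph -->
<p>To make it harder to forget <code>--dry-run</code>, the command above could be wrapped in a small function that only copies for real when asked. This is a sketch under assumptions rather than what I actually ran: <code>pull_homes</code> and the <code>RUNNER</code> variable are hypothetical names, and the rsync options are the ones explained above.</p>
<!-- /wp:paragraph -->

```shell
# Hypothetical wrapper around the pull described above.
# $1 = remote source, $2 = local destination, $3 = "run" to copy for real
# RUNNER can be overridden (e.g. RUNNER=echo) to print the arguments instead.
pull_homes() (
    remote=$1
    dest=$2
    mode=${3:-dry}
    set -- -Porgzl --rsync-path="sudo /usr/bin/rsync"
    if [ "$mode" != run ]; then
        set -- "$@" --dry-run   # default to a dry run unless told otherwise
    fi
    # Left unquoted on purpose so "sudo rsync" splits into two words
    ${RUNNER:-sudo rsync} "$@" "$remote" "$dest"
)
```

<!-- wp:paragraph -->
<p>Calling <code>pull_homes james@target.domain.com:/home/james /home</code> performs the dry run, and adding <code>run</code> as a third argument does the real copy.</p>
<!-- /wp:paragraph -->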
<!-- wp:heading -->
<h2 class="wp-block-heading">Run the Sync</h2>
<!-- /wp:heading -->
<!-- wp:paragraph -->
<p>If your dry run succeeded, you can now execute the full copy and transfer. I'm going to add a few exclusions with the <code>--exclude</code> option. We can use wildcards to apply these exclusions to all of our users:</p>
<!-- /wp:paragraph -->
<!-- wp:enlighter/codeblock {"language":"bash"} -->
<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sudo rsync \
-Porgzl --rsync-path="sudo /usr/bin/rsync" \
--exclude "*/.local/lib" \
--exclude "*/.cache" \
--exclude '*/.vscode-server' \
--exclude "*/miniconda/pkgs" \
james@target.domain.com:/home/* \
/home</pre>
<!-- /wp:enlighter/codeblock -->
<!-- wp:paragraph -->
<p>The command is pretty much the same as the previous one with the <code>--dry-run</code> flag turned off, with an <code>--exclude</code> for each of the directories we don't care about and with a wildcard in the home directory so that we copy all users rather than just <code>james</code>. </p>
<!-- /wp:paragraph -->
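<!-- wp:paragraph -->
<p>If the list of exclusions grows, it may be tidier to keep the patterns in a file and pass it with rsync's <code>--exclude-from</code> option. The sketch below writes the same four patterns to a file; the function name <code>write_excludes</code> and whatever path you choose for the file are hypothetical.</p>
<!-- /wp:paragraph -->

```shell
# Hypothetical helper: write the exclusion patterns, one per line, to a file
# that rsync can read with --exclude-from=FILE. Lines starting with # are
# treated as comments by rsync.
write_excludes() {
    cat > "$1" <<'EOF'
# Caches and packages that can be rebuilt on the new machine
*/.local/lib
*/.cache
*/.vscode-server
*/miniconda/pkgs
EOF
}
```

<!-- wp:paragraph -->
<p>After something like <code>write_excludes /tmp/migration-excludes</code>, the four <code>--exclude</code> options in the command could be replaced with a single <code>--exclude-from=/tmp/migration-excludes</code>.</p>
<!-- /wp:paragraph -->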
<!-- wp:paragraph -->
<p>I highly recommend executing long-running commands like this inside <a href="https://github.com/tmux/tmux/wiki">tmux</a> so that if your connection from your workstation to the new machine goes down, the process continues.</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
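<p>If your connection does get interrupted and you need to restart, you can run this command with <code>--ignore-existing</code> to have rsync skip any files that already exist on the new machine from the failed run. Be aware that this is likely to also skip a file that was only partly transferred when the connection dropped, so it is worth checking the last file that was in flight.</p>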
<!-- /wp:paragraph -->
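<!-- wp:paragraph -->
<p>Once the sync has finished, a quick sanity check is to confirm that each home directory ended up owned by the matching account. The sketch below assumes the common convention that a home directory is named after its user, and uses GNU <code>stat</code> as shipped with Ubuntu; <code>check_home_owners</code> is a hypothetical name.</p>
<!-- /wp:paragraph -->

```shell
# Hypothetical check: report home directories whose owner does not match
# the directory name, which suggests an account is missing on this machine.
check_home_owners() {
    base=${1:-/home}
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        owner=$(stat -c '%U' "$dir")   # GNU stat: %U prints the owner's name
        [ "$owner" = "$name" ] || echo "$name is owned by $owner"
    done
}
```

<!-- wp:paragraph -->
<p>No output from <code>check_home_owners /home</code> means every directory matched. Any line it prints is worth investigating before users log in.</p>
<!-- /wp:paragraph -->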
<!-- wp:heading -->
<h2 class="wp-block-heading">Finally: Undo the Sudoers Change</h2>
<!-- /wp:heading -->
<!-- wp:paragraph -->
<p>Remove the line that you added to the sudoers file that allowed you to run rsync without a password on the old server!</p>
<!-- /wp:paragraph -->