notes-computer-backups

Commands to back up to another hard drive

sudo rsync -axW / /l/backup/sda/date

while excluding some directories (for example, those that themselves contain backups):

sudo rsync -ax --progress --exclude=backup /media/t1/ /media/t3
sudo rsync -ax --exclude=/l2/backup /media/l2 /media/l3/backup/l2/date
sudo rsync -ax --exclude=/l/backup --exclude=/l/autobackup /media/l /media/l3/backup/l/date
sudo rsync -axW / /l/backup/sda/date --progress --exclude=backup --exclude=big
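The trailing "date" in the destination paths above looks like a placeholder for the day the backup was taken. Assuming that is what is meant, here is a minimal sketch of filling it in automatically; the function name backup_dest is made up for illustration, and the rsync call is only shown as a comment.

```shell
# Sketch: build a dated destination directory for the rsync commands above.
# Assumption: "date" in the paths above stands for the backup date.

# backup_dest BASE -> prints BASE/YYYY-MM-DD using today's date
backup_dest() {
    printf '%s/%s\n' "${1%/}" "$(date +%F)"
}

# Example use (not run here):
#   dest=$(backup_dest /l/backup/sda)
#   sudo mkdir -p "$dest"
#   sudo rsync -axW --progress --exclude=backup --exclude=big / "$dest"
```

Using an ISO-style date keeps the backup directories in chronological order when listed alphabetically.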


sudo du -schx /home/bshanks/* --exclude=/proc --exclude=/sys | sort -h

sudo du -schx /home/bshanks/.[^.]* --exclude=/proc --exclude=/sys | sort -h

while excluding some directories (for example, those that themselves contain backups):

sudo rsync -ax --exclude=/home/bshanks/aba --no-specials --no-devices --no-links /home/bshanks /media/bshanks/backup/bshanks

from EXT to VFAT:

sudo mkdir /media/Eposix
sudo mount.posixovl -S /media/E /media/Eposix/
sudo rsync -vv --modify-window=2 -axui --progress --timeout=240 --outbuf=N --exclude=/home/bshanks/aba --no-devices --no-specials --no-links /home/bshanks /media/Eposix/backup/

if you are backing up this way you lose information about the creator of the old files. You can do:

(find / -xdev -type f -exec ls -l {} \;) > /tmp/dirlist.txt

to at least have a list of these.
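A possibly faster variant of that listing, as a sketch: it assumes GNU find (for -printf) and records owner, group and octal mode for each file without starting one ls process per file. The function name list_owners is made up for illustration.

```shell
# list_owners DIR: print one line per regular file under DIR,
# in the form "user:group mode path". Stays on one filesystem (-xdev).
# Assumption: GNU find, since -printf is not available in every find.
list_owners() {
    find "$1" -xdev -type f -printf '%u:%g %m %p\n'
}

# Example use (not run here; run as root to see every file):
#   list_owners / > /tmp/dirlist.txt
```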


for f in *; do (sudo tar --create --file=- "$f" > "$f.tar"); done

for f in *.tar; do xz "$f"; done


DVDs

You need to have some OFFLINE backups (e.g. to a DVD) in addition to backups to hard drives which are always connected (like external USB drives including Time Machine).

I'm going to try the 25GB M-Disc Blu-ray. Unlike pressed DVDs that you buy in stores, most consumer-writable DVDs (i assume this includes blu-rays) can decay over short periods of time (a few years). M-Disc is supposedly a consumer-writable DVD that won't decay so easily. It has both DVD and Blu-ray variants.

note: DO NOT WRITE OVER THE DATA STORAGE SPACE OF CDS OR DVDS, EVEN WITH A CD/DVD-SAFE MARKER. Write only in the non-data-storing centre


Checking hard drives for errors

Timing out

When you try to do something on the hard drive and it times out, often it's because there's a problem. Look at dmesg.

Looking thru dmesg

'dmesg' is a log from the Linux kernel. It only lasts until you reboot.

To check dmesg, do:

sudo dmesg | less

When i had errors, they looked like this:

[  150.775878] sd 5:0:0:0: [sdb] Unhandled sense code
[  150.775891] sd 5:0:0:0: [sdb]  
[  150.775896] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  150.775901] sd 5:0:0:0: [sdb]  
[  150.775906] Sense Key : Medium Error [current] 
[  150.775914] sd 5:0:0:0: [sdb]  
[  150.775921] Add. Sense: Unrecovered read error
[  150.775926] sd 5:0:0:0: [sdb] CDB: 
[  150.775929] Read(10): (omitted)
[  150.775945] end_request: critical medium error, dev sdb, sector 2621445056

I think the crucial part is probably 'critical medium error'; so you may want to do just:

sudo dmesg | grep 'critical medium error'

To save the current dmesg to a file:

sudo dmesg > file_name.txt
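To see at a glance which device and sectors are affected, the 'critical medium error' lines can be boiled down to device and sector pairs. This is only a sketch based on the log format shown above; other kernel versions may word the line differently, and the function name medium_errors is made up.

```shell
# medium_errors: read kernel log text on stdin and print "device sector"
# for every line that mentions a critical medium error.
# Assumption: the line contains "dev NAME," and "sector NUMBER" as in the
# sample above.
medium_errors() {
    awk '/critical medium error/ {
        dev = ""; sector = ""
        for (i = 1; i < NF; i++) {
            if ($i == "dev")    dev = $(i + 1)
            if ($i == "sector") sector = $(i + 1)
        }
        sub(/,$/, "", dev)   # drop the comma after the device name
        if (dev != "" && sector != "") print dev, sector
    }'
}

# Example use (not run here):
#   sudo dmesg | medium_errors
#   medium_errors < file_name.txt
```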

Checking a partition with fsck

See the fsck section below, because fsck can also do some repairs.

Some Seagate enclosures don't work properly with uas, and hence SMART doesn't work on them, so blacklist uas for those enclosures

look in /var/log/syslog (or maybe lsusb) to get the "ID", then do:

cat /sys/module/usb_storage/parameters/quirks

# make sure the result is empty b/c the next command will overwrite it

echo "0x0bc2:0x2344:u" > /sys/module/usb_storage/parameters/quirks

(from https://www.smartmontools.org/wiki/SAT-with-UAS-Linux )

(note: after unplugging such a drive, before doing the quirks thingee, i still see it in lsusb, i couldn't easily find a way short of rebooting to make it disappear) (via https://forums.linuxmint.com/viewtopic.php?t=322252 )
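If the quirks file is not empty, one option is to append to it instead of overwriting it, since the kernel accepts a comma-separated list of quirks entries. A minimal sketch, with the made-up function name merge_quirks; it compares entries as plain strings, so "0x0bc2:0x2344:u" and "0bc2:2344:u" would count as different.

```shell
# merge_quirks CURRENT ENTRY: print the quirks list that results from
# adding ENTRY to CURRENT (a comma-separated list, possibly empty).
# An entry that is already present is not added a second time.
merge_quirks() {
    current=$1
    entry=$2
    case ",$current," in
        *",$entry,"*) printf '%s\n' "$current" ;;        # already listed
        ",,")         printf '%s\n' "$entry" ;;          # list was empty
        *)            printf '%s\n' "$current,$entry" ;; # append
    esac
}

# Example use (not run here):
#   new=$(merge_quirks "$(cat /sys/module/usb_storage/parameters/quirks)" "0bc2:2344:u")
#   echo "$new" | sudo tee /sys/module/usb_storage/parameters/quirks
```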

to make it permanent,

sudo vi /etc/modprobe.d/disable_uas.conf

and add line like:

options usb-storage quirks=0bc2:2344:u

then rebuild your initramfs. On Pop OS:

sudo update-initramfs -u

sudo kernelstub -k /boot/vmlinuz -i /boot/initrd.img

Checking a drive with SMART

Print all SMART information about a drive:

sudo smartctl -a /dev/sda | less

(for even more, do -xa instead of -a)

NOTE: If you get "Unknown USB bridge" and "Please specify device type with the -d option.", try '-d sat' and '-d scsi'.

NOTE: there is a line in this like "SMART overall-health self-assessment test result: PASSED"; i have seen "PASSED" even on a drive that seemed to me to be pretty obviously failing (a ton of bad blocks), so i would take PASSED with a big grain of salt, and look at the most important SMART attributes, too:

Print just the SMART attributes, highlighting the most important ones for spinning disks (see below) (thanks Benjamin Schweizer):

sudo smartctl -A /dev/sda | grep -E --color "^( *5| *10|184|187|188|197|198|232|233).*|"

Doing SMART tests (warning, on some drives sometimes this seems to reset some of the SMART attributes). Only one test can be run at a time. Tests can usually be run simultaneously with using the drive (although sometimes using the drive may abort the test?). The smartctl -a listing says how long each test will take.

sudo smartctl -t short /dev/sdb
sudo smartctl -t conveyance /dev/sdb
sudo smartctl -t offline /dev/sdb
sudo smartctl -t long /dev/sdb

Backblaze's suggestions about which SMART attributes to watch

https://www.backblaze.com/blog/hard-drive-smart-stats/ suggests paying attention to these 5 stats (the above command highlights them for you):

SMART 5 – Reallocated_Sector_Count
SMART 187 – Reported_Uncorrectable_Errors
SMART 188 – Command_Timeout
SMART 197 – Current_Pending_Sector_Count
SMART 198 – Offline_Uncorrectable

Backblaze shares some data on the statistics of these:

https://www.backblaze.com/blog-smart-stats-2014-8.html#S5R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S5N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S187R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S187N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S188R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S188N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S197R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S197N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S198R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S198N

https://kb.acronis.com/ and some other sites explain what these mean:

in summary:

they also mention this one, but it's harder to interpret imo, i just see '1 1 1' for all of my drives which report it at all:

that being said, i have 4 hard drives, only one of which appears to be failing, but ALL of them have at least one non-zero value in one or more of the four attributes listed above that should be 0 (5, 187, 197, 198).
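To list only the attributes among those four (5, 187, 197, 198) that have a non-zero raw value, the smartctl -A output can be filtered. This is a sketch that assumes the usual ATA attribute table, where the raw value is the 10th column; the function name smart_nonzero is made up.

```shell
# smart_nonzero: read `smartctl -A` output on stdin and print
# "ID NAME RAW_VALUE" for attributes 5, 187, 197 and 198 whose raw
# value is greater than zero.
# Assumption: RAW_VALUE is the 10th whitespace-separated column.
smart_nonzero() {
    awk '$1 ~ /^(5|187|197|198)$/ && $10 + 0 > 0 { print $1, $2, $10 }'
}

# Example use (not run here):
#   sudo smartctl -A /dev/sda | smart_nonzero
```

No output means all four are zero (or not reported by the drive). As noted above, a non-zero value does not by itself mean the drive is failing.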

for charts involving other SMART attributes, see https://www.backblaze.com/blog-smart-stats-2014-8.html , and for a walkthrough of these sorts of charts, see https://www.backblaze.com/blog/hard-drive-smart-stats/

I don't know which attributes are most important for SSDs. Anyone know of an analysis like this for SSDs? My best guess is that raw "5 Reallocated_Sector_Ct" and normalized "233 Media_Wearout_Indicator" are the ones to watch. http://serverfault.com/a/283272/86461 also mentions 184 End-to-End_Error and 232 Available_Reservd_Space.

defns of these:

In addition to the ones mentioned by Backblaze, https://kb.acronis.com/content/9264 thinks the following are 'critical': 10 - Spin Retry Count, 184 - End-to-End Error, 196 - Reallocation Event Count (and 201 - Soft Read Error Rate, but my drives dont report that and it's not in Backblaze's dataset). Backblaze did not observe nonzero Spin Retry Counts, did not observe non-zero raw End-to-End errors and found an unclear connection between normalized End-to-End errors and failure, found that Reallocation Event Count has a relatively moderate correlation to failure, and doesn't include 232 or 233 in their dataset (presumably these are SSD only).

more information about the various attributes is at: https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

If errors are found

if a drive has media errors, use ddrescue on it, don't wear it further by trying to repair it:

https://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html

If it's just filesystem errors (eg not hardware errors), then you can try to use fsck to repair it:

Repairing with fsck

As noted above, if the drive is failing, don't repair with fsck! The more you use a failing drive, the worse it gets. Use ddrescue to copy the partition to another drive, and then repair the copy.
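A rough sketch of that copy-then-repair sequence. It only prints the commands rather than running them; the two-pass ddrescue pattern (a quick pass with -n, then retries with -d -r3) follows the ddrescue manual, while copying into an image file on a healthy drive and the function name rescue_plan are assumptions for illustration.

```shell
# rescue_plan SRC IMG: print (do not run) the commands to copy the failing
# partition SRC into the image file IMG and then repair the copy.
# The map file lets ddrescue resume and remember which areas were bad.
rescue_plan() {
    src=$1
    img=$2
    map=$2.map
    # pass 1: grab everything that reads easily, skip the slow scraping phase
    printf 'sudo ddrescue -n %s %s %s\n' "$src" "$img" "$map"
    # pass 2: go back to the bad areas with direct access and 3 retry passes
    printf 'sudo ddrescue -d -r3 %s %s %s\n' "$src" "$img" "$map"
    # repair the copy, never the failing original
    printf 'sudo e2fsck -f -y -v %s\n' "$img"
}

# Example use (paths are placeholders):
#   rescue_plan /dev/sdX1 /mnt/good/sdX1.img
```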

On ext2/ext3/ext4:

sudo e2fsck -f -y -v /dev/sda1

(the -f option just means to go ahead and fsck even if the drive is marked 'clean'; the -y option means to try and repair things without asking further; the -v is just for verbose)

(of course, replace /dev/sda1 with the path to the partition you want to check)

You might want to ask fsck to scan for bad blocks, and to tell the filesystem where they are if it finds any:

sudo e2fsck -f -kc -y -v /dev/sda1

(this only does reads while looking for bad blocks; the 'k' means to also keep any bad blocks that were previously in the bad blocks list. If you want to do 'non-destructive' writes, use -cc instead of -c; i think that 'non-destructive' writes may destroy data if bad blocks are found, though [6])

warning, this takes forever! You can quit in the middle with Ctrl-C, though. I'm not sure if this is worth the extra wear it puts on the drive ( http://superuser.com/a/525868/33599 ).


Compression

use 'lzip' for long-term archiving


more notes on smartctl:

if a drive isn't recognized, try -d sat or -d scsi

if -d scsi works but -t short gives 'Short offline self test failed [unsupported field in scsi command]', then try disabling uas. First, to get the vendor and product ID, use -d scsi and write down the serial number. Then, use 'usb-devices' and find the device with that serial number, and write down the listed vendor and product ID. Now, umount all usb storage drives, then do:

sudo modprobe -r uas
sudo modprobe -r usb-storage

sudo modprobe usb-storage quirks=0bc2:3312:u

but where 0bc2 and 3312 are replaced by your vendor and product id.

(thanks [7])

---

sudo diff -qr --no-dereference FILEPATH1 FILEPATH2 | grep -v 'is a socket' | grep -v 'is a fifo' | grep -v 'Only in .*: backup'
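The three grep filters can also be folded into one reusable function; a minimal sketch with the made-up name filter_diff, using the same three patterns.

```shell
# filter_diff: read `diff -qr` output on stdin and drop the noise lines
# about sockets, fifos and the backup directories themselves.
filter_diff() {
    grep -Ev 'is a socket|is a fifo|Only in .*: backup'
}

# Example use (not run here):
#   sudo diff -qr --no-dereference FILEPATH1 FILEPATH2 | filter_diff
```

Whatever is left after filtering is a real difference between the source and the backup.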

---

to create a (potentially future-bootable) drive for backup with an encrypted main partition (WORK IN PROGRESS)

GParted

Create:

Click the checkbox and apply. Take note of the "/dev" name of the big partition. Exit Gparted.

To encrypt your new partition and choose your initial encryption password, enter the following commands, but if your big partition "/dev" name is other than /dev/sdc4, replace that with your "/dev" name. And replace "cryptdata1" and "vgrp1" with unique names:

sudo -i
cryptsetup luksFormat /dev/sdc4
cryptsetup open /dev/sdc4 cryptdata1
pvcreate /dev/mapper/cryptdata1
vgcreate vgrp1 /dev/mapper/cryptdata1
lvcreate -L 32G vgrp1 -n swap
lvcreate -l 100%FREE vgrp1 -n root
mkfs.ext4 /dev/vgrp1/root
mkswap /dev/vgrp1/swap

to mount on future boots, (assuming that on future boots the OS is identifying this drive as /dev/sdb instead of /dev/sdc), you do:

sudo mkdir /media/t2 # only have to do this the first time

# sudo cryptsetup open /dev/sdb4 cryptdata1
sudo cryptsetup open /dev/sdc4 cryptdata1
sudo mount /dev/vgrp1/root /media/t2

# to unmount:
sudo umount /media/t2
sudo vgchange -a n vgrp1
sudo cryptsetup close cryptdata1

--

to restore from backup (todo process these notes)

install pop_os onto replacement drive ("clean install") using an updated official installer on a flash drive

install OS updates using the pop shop app. Restart.

Alt-F2; gnome-terminal

sudo apt install gparted
sudo gparted # figure out the device holding the encrypted volume with your home folder backup, for me this was /dev/sda3
sudo mkdir /media/b0
sudo cryptsetup open /dev/sda3 cryptdata1

(You may need to do sudo vgs; sudo vgrename ; sudo vgscan; sudo lvscan; sudo lvchange -ay /dev/datab0/root in case the volume group on the volume that we're calling b0 has the same name as the root install, that is, data. You can still boot off the other drive that way; you don't have to do anything to get it to recognize the name change.)

sudo mount /dev/datab0/root /media/b0
sudo cp -ra /media/b0/home/bshanks /home/bshanks-real

Now reboot onto the backup drive, then:

sudo mkdir /media/a0
sudo cryptsetup open /dev/nvme0n1p3
sudo mount /dev/data/root /media/a0
sudo mv /media/a0/home/bshanks /media/a0/home/bshanks-empty
sudo mv /media/a0/home/bshanks-real /media/a0/home/bshanks

Now restart into the new drive again.

sudo apt install fasd emacs xterm neomutt wmctrl apt-file python3 ipython3 notmuch python3-notmuch python3-googleapi python3-oauth2client python3-tqdm plocate python3-pip gimp imagemagick urlview postfix qiv python-is-python3 pwgen vlc hugo zathura default-jre screen tmux gparted ncdu

cp -ra ~/.local/lib/python3.9/site-packages/lieer NEW

resync email (cd ~/Mail/bsgmail; gmi sync; cd ~/Mail/bscovgmail; gmi sync;)

cp -ra old /etc/postfix dir

accept the defaults for the configuration options when installing postfix. mount /media/b0 as before, then sudo mv /etc/postfix /etc/postfix-empty; sudo cp -ra /media/b0/etc/postfix /etc/

sudo apt install libpango1.0-0 # libpango1.0-0 needed for dropbox
sudo dpkg -i dropbox_2020.03.04_amd64.deb

# note: use that, or download a newer version from the dropbox website, rather than installing nautilus-dropbox

mv Dropbox Dropbox-old. Now install Dropbox as above (do the above steps only after doing this mv) and then when the web page pops up asking you if you want to connect dropbox, do.

# upon upgrading, maybe something like:
pip install google-api-python-client notmuch oauth2client tqdm
cp -ar ~/.local/lib/python3.9/site-packages/lieer ~/.local/lib/python3.10/site-packages/

Reboot

test if email sending works

---