sudo rsync -axW / /l/backup/sda/date
while excluding some directories (for example, those that themselves contain backups):
sudo rsync -ax --progress --exclude=backup /media/t1/ /media/t3
sudo rsync -ax --exclude=/l2/backup /media/l2 /media/l3/backup/l2/date
sudo rsync -ax --exclude=/l/backup --exclude=/l/autobackup /media/l /media/l3/backup/l/date
sudo rsync -axW / /l/backup/sda/date --progress --exclude=backup --exclude=big
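The trailing 'date' in the destination paths above looks like a placeholder for the date of the backup (that is an assumption; the notes don't say). A minimal sketch of filling it in automatically, using a hypothetical helper named backup_dest:

```shell
# Hypothetical helper (not part of the original notes): append today's
# date as YYYY-MM-DD to a backup root, dropping one trailing slash if present.
backup_dest() {
    printf '%s/%s\n' "${1%/}" "$(date +%Y-%m-%d)"
}

# Example, as a dry run first (-n makes rsync only report what it would do):
# sudo rsync -axn --exclude=/l/backup / "$(backup_dest /l/backup/sda)"
```

Dropping the -n turns the dry run into a real copy.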
sudo du -schx /home/bshanks/* --exclude=/proc --exclude=/sys | sort -h
while excluding some directories (for example, those that themselves contain backups):
sudo rsync -ax --exclude=/home/bshanks/aba --no-specials --no-devices --no-links /home/bshanks /media/bshanks/backup/bshanks
from EXT to VFAT:
sudo mkdir /media/Eposix
sudo mount.posixovl -S /media/E /media/Eposix/
sudo rsync -vv --modify-window=2 -axui --progress --timeout=240 --outbuf=N --exclude=/home/bshanks/aba --no-devices --no-specials --no-links /home/bshanks /media/Eposix/backup/
if you are backing up this way you lose information about the creator of the old files. You can do:
(find / -xdev -type f -exec ls -l {} \;) > /tmp/dirlist.txt
to at least have a list of these.
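A variant of the listing above, as a sketch that assumes GNU find (for -printf): it records owner, group and mode per file in a single find process instead of running ls once per file. The name list_owners is made up for illustration.

```shell
# Hypothetical helper: print "owner:group mode path" for every regular file
# under a directory, without crossing filesystem boundaries (-xdev).
# Assumes GNU find, which provides -printf.
list_owners() {
    find "$1" -xdev -type f -printf '%u:%g %m %p\n'
}

# Usage, from a root shell (for example after 'sudo -i'):
# list_owners / > /tmp/ownerlist.txt
```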
for f in *; do sudo tar --create --file=- "$f" > "$f.tar"; done
for f in *.tar; do xz "$f"; done
You need to have some OFFLINE backups (e.g. to a DVD) in addition to backups to hard drives which are always connected (like external USB drives including Time Machine).
I'm going to try the 25GB M-Disc Blu-ray. Unlike pressed DVDs that you buy in stores, most consumer-writable DVDs (i assume this includes blu-rays) can decay over short periods of time (a few years). M-Disc is supposedly a consumer-writable DVD that won't decay so easily. It has both DVD and Blu-ray variants.
note: DO NOT WRITE OVER THE DATA STORAGE SPACE OF CDS OR DVDS, EVEN WITH A CD/DVD-SAFE MARKER. Write only in the non-data-storing centre
When you try to do something on the hard drive and it times out, often it's because there's a problem. Look at dmesg.
'dmesg' is a log from the Linux kernel. It only lasts until you reboot.
To check dmesg, do:
sudo dmesg | less
When i had errors, they looked like this:
[ 150.775878] sd 5:0:0:0: [sdb] Unhandled sense code
[ 150.775891] sd 5:0:0:0: [sdb]
[ 150.775896] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 150.775901] sd 5:0:0:0: [sdb]
[ 150.775906] Sense Key : Medium Error [current]
[ 150.775914] sd 5:0:0:0: [sdb]
[ 150.775921] Add. Sense: Unrecovered read error
[ 150.775926] sd 5:0:0:0: [sdb] CDB:
[ 150.775929] Read(10): (omitted)
[ 150.775945] end_request: critical medium error, dev sdb, sector 2621445056
I think the crucial part is probably 'critical medium error'; so you may want to do just:
sudo dmesg | grep 'critical medium error'
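To see at a glance which devices and sectors are affected, the matching lines can be boiled down further. This is only a sketch: it assumes the message format shown in the sample log above ("critical medium error, dev NAME, sector NUMBER"), and medium_errors is a made-up name.

```shell
# Hypothetical helper: read dmesg text on stdin and print "device sector"
# for every 'critical medium error' line. Lines in any other format are
# silently ignored.
medium_errors() {
    sed -n 's/.*critical medium error, dev \([^,]*\), sector \([0-9][0-9]*\).*/\1 \2/p'
}

# Usage:
# sudo dmesg | medium_errors | sort -u
```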
To save the current dmesg to a file:
sudo dmesg > file_name.txt
See the fsck section below, because fsck can also do some repairs.
look in /var/log/syslog (or maybe lsusb) to get the "ID" (vendor:product), then do:
cat /sys/module/usb_storage/parameters/quirks
echo "0x0bc2:0x2344:u" | sudo tee /sys/module/usb_storage/parameters/quirks
(from https://www.smartmontools.org/wiki/SAT-with-UAS-Linux )
(note: after unplugging such a drive, before doing the quirks thingee, i still see it in lsusb, i couldn't easily find a way short of rebooting to make it disappear) (via https://forums.linuxmint.com/viewtopic.php?t=322252 )
to make it permanent,
sudo vi /etc/modprobe.d/disable_uas.conf
and add a line like:
options usb-storage quirks=0bc2:2344:u
then rebuild your initramfs. On Pop!_OS:
sudo update-initramfs -u
sudo kernelstub -k /boot/vmlinuz -i /boot/initrd.img
Print all SMART information about a drive:
sudo smartctl -a /dev/sda | less
(for even more, do -xa instead of -a)
NOTE: If you get 'Unknown USB bridge' and 'Please specify device type with the -d option.', try '-d sat' and '-d scsi'.
NOTE: there is a line in this like "SMART overall-health self-assessment test result: PASSED"; i have seen "PASSED" even on a drive that seemed to me to be pretty obviously failing (a ton of bad blocks), so i would take PASSED with a big grain of salt, and look at the most important SMART attributes, too:
Print just the SMART attributes, highlighting the most important ones for spinning disks (see below) (thanks Benjamin Schweizer):
sudo smartctl -A /dev/sda | grep -E --color "^( *5| *10|184|187|188|197|198|232|233).*|"
Doing SMART tests (warning, on some drives sometimes this seems to reset some of the SMART attributes). Only one test can be run at a time. Tests can usually be run simultaneously with using the drive (although sometimes using the drive may abort the test?). The smartctl -a listing says how long each test will take.
sudo smartctl -t short /dev/sdb
sudo smartctl -t conveyance /dev/sdb
sudo smartctl -t offline /dev/sdb
sudo smartctl -t long /dev/sdb
https://www.backblaze.com/blog/hard-drive-smart-stats/ suggests paying attention to these 5 stats (the above command highlights them for you):
SMART 5 – Reallocated_Sector_Count
SMART 187 – Reported_Uncorrectable_Errors
SMART 188 – Command_Timeout
SMART 197 – Current_Pending_Sector_Count
SMART 198 – Offline_Uncorrectable
Backblaze shares some data on the statistics of these:
https://www.backblaze.com/blog-smart-stats-2014-8.html#S5R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S5N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S187R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S187N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S188R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S188N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S197R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S197N
https://www.backblaze.com/blog-smart-stats-2014-8.html#S198R
https://www.backblaze.com/blog-smart-stats-2014-8.html#S198N
https://kb.acronis.com/ and some other sites explain what these mean:
in summary:
they also mention this one, but it's harder to interpret imo, i just see '1 1 1' for all of my drives which report it at all:
that being said, i have 4 hard drives, only one of which appears to be failing, but ALL of them have at least one non-zero value in one or more of the four attributes listed above that should be 0 (5, 187, 197, 198).
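To check quickly whether any of the Backblaze attributes (5, 187, 188, 197, 198) has a non-zero raw value, the smartctl -A table can be filtered with awk. This is a sketch, not a diagnosis tool: it assumes the usual 10-column attribute table with RAW_VALUE in the last column position, and smart_nonzero is a made-up name.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print the
# ID, name and raw value of any watched attribute whose raw value is not 0.
# Assumes the usual attribute table layout, with RAW_VALUE in column 10.
smart_nonzero() {
    awk '$1 ~ /^(5|187|188|197|198)$/ && $10 + 0 != 0 { print $1, $2, $10 }'
}

# Usage:
# sudo smartctl -A /dev/sda | smart_nonzero
```

No output means all of the watched attributes that the drive reports are at 0; as noted above, a non-zero value does not by itself mean the drive is failing.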
for charts involving other SMART attributes, see https://www.backblaze.com/blog-smart-stats-2014-8.html , and for a walkthrough of these sorts of charts, see https://www.backblaze.com/blog/hard-drive-smart-stats/
I don't know which attributes are most important for SSDs. Anyone know of an analysis like this for SSDs? My best guess is that raw "5 Reallocated_Sector_Ct" and normalized "233 Media_Wearout_Indicator" are the ones to watch. http://serverfault.com/a/283272/86461 also mentions 184 End-to-End_Error and 232 Available_Reservd_Space.
defns of these:
In addition to the ones mentioned by Backblaze, https://kb.acronis.com/content/9264 thinks the following are 'critical': 10 - Spin Retry Count, 184 - End-to-End Error, 196 - Reallocation Event Count (and 201 - Soft Read Error Rate, but my drives don't report that and it's not in Backblaze's dataset). Backblaze did not observe non-zero Spin Retry Counts or non-zero raw End-to-End errors, found an unclear connection between normalized End-to-End errors and failure, found that Reallocation Event Count has a relatively moderate correlation to failure, and doesn't include 232 or 233 in their dataset (presumably these are SSD only).
more information about the various attributes is at: https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes
if a drive has media errors, use ddrescue on it, don't wear it further by trying to repair it:
https://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html
If it's just filesystem errors (eg not hardware errors), then you can try to use fsck to repair it:
As noted above, if the drive is failing, don't repair with fsck! The more you use a failing drive, the worse it gets. Use ddrescue to copy the partition to another drive, and then repair the copy.
On ext2/ext3/ext4:
sudo e2fsck -f -y -v /dev/sda1
(the -f option just means to go ahead and fsck even if the drive is marked 'clean'; the -y option means to try and repair things without asking further; the -v is just for verbose)
(of course, replace /dev/sda1 with the path to the partition you want to check)
You might want to ask fsck to scan for bad blocks, and to tell the filesystem where they are if it finds any:
sudo e2fsck -f -kc -y -v /dev/sda1
(this does only reads while looking for bad blocks; the 'k' means to also keep any bad blocks that were previously in the bad blocks list. If you want to do 'non-destructive' writes, use -cc instead of -c; i think that 'non-destructive' writes may destroy data if bad blocks are found, though [6])
warning, this takes forever! You can quit in the middle with Ctrl-C, though. I'm not sure if this is worth the extra wear it puts on the drive ( http://superuser.com/a/525868/33599 ).
use 'lzip' for long-term archiving
more notes on smartctl:
if a drive isn't recognized, try -d sat or -d scsi
if -d scsi works but -t short gives 'Short offline self test failed [unsupported field in scsi command]', then try disabling uas. First, to get the vendor and product ID, use -d scsi and write down the serial number. Then, use 'usb-devices' and find the device with that serial number, and write down the listed vendor and product ID. Now, umount all usb storage drives, then do:
sudo modprobe -r uas
sudo modprobe -r usb-storage
sudo modprobe usb-storage quirks=0bc2:3312:u
but where 0bc2 and 3312 are replaced by your vendor and product id.
(thanks [7])
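The lookup described above (serial number, then vendor and product ID, then quirks string) can be sketched with awk. This assumes the usual usb-devices layout, where each device has a "P:  Vendor=... ProdID=..." line followed by "S:  SerialNumber=..." lines, and that the serial number contains no spaces; quirk_for_serial is a made-up name.

```shell
# Hypothetical helper: read `usb-devices` output on stdin and print the
# "vendor:product:u" quirks string for the device with the given serial.
# Assumes the P: line of a device comes before its S: lines.
quirk_for_serial() {
    awk -v serial="$1" '
        /^P:/ {
            vendor = ""; product = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^Vendor=/) vendor = substr($i, 8)
                if ($i ~ /^ProdID=/) product = substr($i, 8)
            }
        }
        /^S:/ && $2 == ("SerialNumber=" serial) {
            print vendor ":" product ":u"
        }
    '
}

# Usage (SERIAL is whatever smartctl -d scsi reported):
# usb-devices | quirk_for_serial SERIAL
```

The printed string is what goes after quirks= in the modprobe command above.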
---
sudo diff -qr --no-dereference FILEPATH1 FILEPATH2 | grep -v 'is a socket' | grep -v 'is a fifo' | grep -v 'Only in .*: backup'
---
to create a drive for backup (potentially bootable in the future) with an encrypted main partition (WORK IN PROGRESS)
GParted
Create:
Click the checkbox and apply. Take note of the "/dev" name of the big partition. Exit Gparted.
To encrypt your new partition and choose your initial encryption password, enter the following commands, but if your big partition "/dev" name is other than /dev/sdc4, replace that with your "/dev" name. And replace "cryptdata1" and "vgrp1" with unique names:
sudo -i
cryptsetup luksFormat /dev/sdc4
cryptsetup open /dev/sdc4 cryptdata1
pvcreate /dev/mapper/cryptdata1
vgcreate vgrp1 /dev/mapper/cryptdata1
lvcreate -L 32G vgrp1 -n swap
lvcreate -l 100%FREE vgrp1 -n root
mkfs.ext4 /dev/vgrp1/root
mkswap /dev/vgrp1/swap
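Because the device name and the two unique names have to be substituted in several places, it may help to print the whole command list for review before typing anything destructive. A minimal sketch; luks_lvm_plan is a made-up name, and it only prints the commands, it does not run them:

```shell
# Hypothetical helper: print (not run) the encryption and LVM setup commands
# for a given partition ($1), mapper name ($2) and volume group name ($3).
luks_lvm_plan() {
    cat <<EOF
cryptsetup luksFormat $1
cryptsetup open $1 $2
pvcreate /dev/mapper/$2
vgcreate $3 /dev/mapper/$2
lvcreate -L 32G $3 -n swap
lvcreate -l 100%FREE $3 -n root
mkfs.ext4 /dev/$3/root
mkswap /dev/$3/swap
EOF
}

# Usage: review the output, then run the commands by hand in a root shell.
# luks_lvm_plan /dev/sdc4 cryptdata1 vgrp1
```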
to mount on future boots, (assuming that on future boots the OS is identifying this drive as /dev/sdb instead of /dev/sdc), you do:
mkdir /media/t2 # only have to do this the first time
--
to restore from backup (todo process these notes)
install pop_os onto replacement drive ("clean install") using an updated official installer on a flash drive
install OS updates using the pop shop app. Restart.
Alt-F2; gnome-terminal
sudo apt install gparted
sudo gparted # figure out the device holding the encrypted volume with your home folder backup; for me this was /dev/sda3
sudo mkdir /media/b0
sudo cryptsetup open /dev/sda3 cryptdata1
(You may need to do sudo vgs; sudo vgrename ; sudo vgscan; sudo lvscan; sudo lvchange -ay /dev/datab0/root in case the volume group on the volume that we're calling b0 has the same name as the root install, that is, data. You can still boot off the other drive that way; you don't have to do anything to get it to recognize the name change.)
sudo mount /dev/datab0/root /media/b0
sudo cp -ra /media/b0/home/bshanks /home/bshanks-real
Now reboot onto the backup drive, then:
sudo mkdir /media/a0
sudo cryptsetup open /dev/nvme0n1p3
sudo mount /dev/data/root /media/a0
sudo mv /media/a0/home/bshanks /media/a0/home/bshanks-empty
sudo mv /media/a0/home/bshanks-real /media/a0/home/bshanks
Now restart into the new drive again.
sudo apt install fasd emacs xterm neomutt wmctrl apt-file python3 ipython3 notmuch python3-notmuch python3-googleapi python3-oauth2client python3-tqdm plocate python3-pip gimp imagemagick urlview postfix qiv python-is-python3 pwgen vlc hugo zathura default-jre screen tmux gparted ncdu
cp -ra ~/.local/lib/python3.9/site-packages/lieer NEW
resync email (cd ~/Mail/bsgmail; gmi sync; cd ~/Mail/bscovgmail; gmi sync;)
cp -ra old /etc/postfix dir
accept the defaults for the configuration options when installing postfix. mount /media/b0 as before, then:
sudo mv /etc/postfix /etc/postfix-empty
sudo cp -ra /media/b0/etc/postfix /etc/
sudo apt install libpango1.0-0 # libpango1.0-0 needed for dropbox
sudo dpkg -i dropbox_2020.03.04_amd64.deb
mv Dropbox Dropbox-old. Now install Dropbox as above (do the above steps only after doing this mv) and then when the web page pops up asking you if you want to connect dropbox, do.
Reboot
test if email sending works
---