Posted on Leave a comment

Incremental backup with Butterfly Backup

Introduction

This article explains how to make incremental or differential backups, with a catalog available to restore (or export) at the point you want, with Butterfly Backup.

Requirements

Butterfly Backup is a simple wrapper of rsync written in python; the first requirement is python3.3 or higher (plus module cryptography for init action). Other requirements are openssh and rsync (version 2.5 or higher). Ok, let’s go!

[Editors note: rsync version 3.2.3 is already installed on Fedora 33 systems]

$ sudo dnf install python3 openssh rsync git
$ sudo pip3 install cryptography

Installation

After that, installing Butterfly Backup is very simple by using the following commands to clone the repository locally, and set up Butterfly Backup for use:

$ git clone https://github.com/MatteoGuadrini/Butterfly-Backup.git
$ cd Butterfly-Backup
$ sudo python3 setup.py
$ bb --help
$ man bb

To upgrade, you would use the same commands too.

Example

Butterfly Backup is a server to client tool and is installed on a server (or workstation). The restore process restores the files into the specified client. This process shares some of the options available to the backup process.

Backups are organized accord to precise catalog; this is an example:

$ tree destination/of/backup
.
├── destination
│ ├── hostname or ip of the PC under backup
│ │ ├── timestamp folder
│ │ │ ├── backup folders
│ │ │ ├── backup.log
│ │ │ └── restore.log
│ │ ├─── general.log
│ │ └─── symlink of last backup
│
├── export.log
├── backup.list
└── .catalog.cfg

Butterfly Backup has six main operations, referred to as actions, you can get information about them with the –help command.

$ bb --help
usage: bb [-h] [--verbose] [--log] [--dry-run] [--version] {config,backup,restore,archive,list,export} ... Butterfly Backup optional arguments: -h, --help show this help message and exit --verbose, -v Enable verbosity --log, -l Create a log --dry-run, -N Dry run mode --version, -V Print version action: Valid action {config,backup,restore,archive,list,export} Available actions config Configuration options backup Backup options restore Restore options archive Archive options list List options export Export options

Configuration

Configuration mode is straight forward; If you’re already familiar with the exchange keys and OpenSSH, you probably won’t need it. First, you must create a configuration (rsa keys), for instance:

$ bb config --new
SUCCESS: New configuration successfully created!

After creating the configuration, the keys will be installed (copied) on the hosts you want to backup:

$ bb config --deploy host1
Copying configuration to host1; write the password:
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/arthur/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
arthur@host1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'arthur@host1'"
and check to make sure that only the key(s) you wanted were added. SUCCESS: Configuration copied successfully on host1!

Backup

There are two backup modes: single and bulk.
The most relevant features of the two backup modes are the parallelism and retention of old backups. See the two parameters –parallel and –retention in the documentation.

Single backup

The backup of a single machine consists in taking the files and folders indicated in the command line, and putting them into the cataloging structure indicated above. In other words, copy all file and folders of a machine into a path.

This is an examples:

$ bb backup --computer host1 --destination /mnt/backup --data User Config --type Unix
Start backup on host1
SUCCESS: Command rsync -ah --no-links arthur@host1:/home :/etc /mnt/backup/host1/2020_09_19__10_28

Bulk backup

Above all, bulk mode backups share the same options as single mode, with the difference that they accept a file containing a list of hostnames or ips. In this mode backups will performed in parallel (by default 5 machines at a time). Above all, if you want to run fewer or more machines in parallel, specify the –parallel parameter.

Incremental of the previous backup, for instance:

$ cat /home/arthur/pclist.txt
host1
host2
host3
$ bb backup --list /home/arthur/pclist.txt --destination /mnt/backup --data User Config --type Unix
ERROR: The port 22 on host2 is closed!
ERROR: The port 22 on host3 is closed!
Start backup on host1
SUCCESS: Command rsync -ahu --no-links --link-dest=/mnt/backup/host1/2020_09_19__10_28 arthur@host1:/home :/etc /mnt/backup/host1/2020_09_19__10_50

There are four backup modes, which you specify with the –mode flag: Full (backup all files) , Mirror (backup all files in mirror mode), Differential (is based on the latest Full backup) and Incremental (is based on the latest backup).
The default mode is Incremental; Full mode is set by default when the flag is not specified.

Listing catalog

The first time you run backup commands, the catalog is created. The catalog is used for future backups and all the restores that are made through Butterfly Backup. To query this catalog use the list command.
First, let’s query the catalog in our example:

$ bb list --catalog /mnt/backup BUTTERFLY BACKUP CATALOG Backup id: aba860b0-9944-11e8-a93f-005056a664e0
Hostname or ip: host1
Timestamp: 2020-09-19 10:28:12 Backup id: dd6de2f2-9a1e-11e8-82b0-005056a664e0
Hostname or ip: host1
Timestamp: 2020-09-19 10:50:59

Press q for exit and select a backup-id:

$ bb list --catalog /mnt/backup --backup-id dd6de2f2-9a1e-11e8-82b0-005056a664e0
Backup id: dd6de2f2-9a1e-11e8-82b0-005056a664e0
Hostname or ip: host1
Type: Incremental
Timestamp: 2020-09-19 10:50:59
Start: 2020-09-19 10:50:59
Finish: 2020-09-19 11:43:51
OS: Unix
ExitCode: 0
Path: /mnt/backup/host1/2020_09_19__10_50
List: backup.log
etc
home

To export the catalog list use it with an external tool like cat, include the log flag:

$ bb list --catalog /mnt/backup --log
$ cat /mnt/backup/backup.list

Restore

The restore process is the exact opposite of the backup process. It takes the files from a specific backup and push it to the destination computer.
This command perform a restore on the same machine of the backup, for instance:

$ bb restore --catalog /mnt/backup --backup-id dd6de2f2-9a1e-11e8-82b0-005056a664e0 --computer host1 --log
Want to do restore path /mnt/backup/host1/2020_09_19__10_50/etc? To continue [Y/N]? y
Want to do restore path /mnt/backup/host1/2020_09_19__10_50/home? To continue [Y/N]? y
SUCCESS: Command rsync -ahu -vP --log-file=/mnt/backup/host1/2020_09_19__10_50/restore.log /mnt/backup/host1/2020_09_19__10_50/etc arthur@host1:/restore_2020_09_19__10_50
SUCCESS: Command rsync -ahu -vP --log-file=/mnt/backup/host1/2020_09_19__10_50/restore.log /mnt/backup/host1/2020_09_19__10_50/home/* arthur@host1:/home

Without specifying the “type” flag that indicates the operating system on which the data are being retrieved, Butterfly Backup will select it directly from the catalog via the backup-id.

Archive old backup

Archive operations are used to store backups by saving disk space.

$ bb archive --catalog /mnt/backup/ --days 1 --destination /mnt/archive/ --verbose --log
INFO: Check archive this backup f65e5afe-9734-11e8-b0bb-005056a664e0. Folder /mnt/backup/host1/2020_09_18__17_50
INFO: Check archive this backup 4f2b5f6e-9939-11e8-9ab6-005056a664e0. Folder /mnt/backup/host1/2020_09_15__07_26
SUCCESS: Delete /mnt/backup/host1/2020_09_15__07_26 successfully.
SUCCESS: Archive /mnt/backup/host1/2020_09_15__07_26 successfully.
$ ls /mnt/archive
host1
$ ls /mnt/archive/host1
2020_09_15__07_26.zip

After that, look in the catalog and see that the backup was actually archived:

$ bb list --catalog /mnt/backup/ -i 4f2b5f6e-9939-11e8-9ab6-005056a664e0
Backup id: 4f2b5f6e-9939-11e8-9ab6-005056a664e0
Hostname or ip: host1
Type: Incremental
Timestamp: 2020-09-15 07:26:46
Start: 2020-09-15 07:26:46
Finish: 2020-09-15 08:43:45
OS: Unix
ExitCode: 0
Path: /mnt/backup/host1/2020_09_15__07_26
Archived: True

Conclusion

Butterfly Backup was born from a very complex need; this tool provides superpowers to rsync, automates the backup and restore process. In addition, the catalog allows you to have a system similar to a “time machine”.

In conclusion, Butterfly Backup is a lightweight, versatile, simple and scriptable backup tool.

One more thing; Easter egg: bb -Vv

Thank you for reading my post.

Full documentation: https://butterfly-backup.readthedocs.io/
Github: https://github.com/MatteoGuadrini/Butterfly-Backup


Photo by Manu M on Unsplash.

Posted on Leave a comment

Web of Trust, Part 2: Tutorial

The previous article looked at how the Web of Trust works in concept, and how the Web of Trust is implemented at Fedora. In this article, you’ll learn how to do it yourself. The power of this system lies in everybody being able to validate the actions of others—if you know how to validate somebody’s work, you’re contributing to the strength of our shared security.

Choosing a project

Remmina is a remote desktop client written in GTK+. It aims to be useful for system administrators and travelers who need to work with lots of remote computers in front of either large monitors or tiny netbooks. In the current age, where many people must work remotely or at least manage remote servers, the security of a program like Remmina is critical. Even if you do not use it yourself, you can contribute to the Web of Trust by checking it for others.

The question is: how do you know that a given version of Remmina is good, and that the original developer—or distribution server—has not been compromised?

For this tutorial, you’ll use Flatpak and the Flathub repository. Flatpak is intentionally well-suited for making verifiable rebuilds, which is one of the tenets of the Web of Trust. It’s easier to work with since it doesn’t require users to download independent development packages. Flatpak also uses techniques to prevent in‑flight tampering, using hashes to validate its read‑only state. As far as the Web of Trust is concerned, Flatpak is the future.

For this guide, you use Remmina, but this guide generally applies to every application you use. It’s also not exclusive to Flatpak, and the general steps also apply to Fedora’s repositories. In fact, if you’re currently reading this article on Debian or Arch, you can still follow the instructions. If you want to follow along using traditional RPM repositories, make sure to check out this article.

Installing and checking

To install Remmina, use the Software Center or run the following from a terminal:

flatpak install flathub org.remmina.Remmina -y

After installation, you’ll find the files in:

/var/lib/flatpak/app/org.remmina.Remmina/current/active/files/ 

Open a terminal here and find the following directories using ls -la:

total 44
drwxr-xr-x. 2 root root 4096 Jan 1 1970 bin
drwxr-xr-x. 3 root root 4096 Jan 1 1970 etc
drwxr-xr-x. 8 root root 4096 Jan 1 1970 lib
drwxr-xr-x. 2 root root 4096 Jan 1 1970 libexec
-rw-r--r--. 2 root root 18644 Aug 25 14:37 manifest.json
drwxr-xr-x. 2 root root 4096 Jan 1 1970 sbin
drwxr-xr-x. 15 root root 4096 Jan 1 1970 share

Getting the hashes

In the bin directory you will find the main binaries of the application, and in lib you find all dependencies that Remmina uses. Now calculate a hash for ./bin/remmina:

sha256sum ./bin/*

This will give you a list of numbers: checksums. Copy them to a temporary file, as this is the current version of Remmina that Flathub is distributing. These numbers have something special: only an exact copy of Remmina can give you the same numbers. Any change in the code—no matter how minor—will produce different numbers.

Like Fedora’s Koji and Bodhi build and update services, Flathub has all its build servers in plain view. In the case of Flathub, look at Buildbot to see who is responsible for the official binaries of a package. Here you will find all of the logs, including all the failed builds and their paper trail.

Illustration image, which shows the process-graph of Buildbot on Remmina.

Getting the source

The main Flathub project is hosted on GitHub, where the exact compile instructions (“manifest” in Flatpak terms) are visible for all to see. Open a new terminal in your Home folder. Clone the instructions, and possible submodules, using one command:

git clone --recurse-submodules https://github.com/flathub/org.remmina.Remmina

Developer tools

Start off by installing the Flatpak Builder:

sudo dnf install flatpak-builder

After that, you’ll need to get the right SDK to rebuild Remmina. In the manifest, you’ll find the current SDK is.

 "runtime": "org.gnome.Platform", "runtime-version": "3.38", "sdk": "org.gnome.Sdk", "command": "remmina",

This indicates that you need the GNOME SDK, which you can install with:

flatpak install org.gnome.Sdk//3.38

This provides the latest versions of the Free Desktop and GNOME SDK. There are also additional SDK’s for additional options, but those are beyond the scope of this tutorial.

Generating your own hashes

Now that everything is set up, compile your version of Remmina by running:

flatpak-builder build-dir org.remmina.Remmina.json --force-clean

After this, your terminal will print a lot of text, your fans will start spinning, and you’re compiling Remmina. If things do not go so smoothly, refer to the Flatpak Documentation; troubleshooting is beyond the scope of this tutorial.

Once complete, you should have the directory ./build-dir/files/, which should contain the same layout as above. Now the moment of truth: it’s time to generate the hashes for the built project:

sha256sum ./bin/*
Illustrative image, showing the output of sha256sum. To discourage copy-pasting old hashes, they are not provided as in-text.

You should get exactly the same numbers. This proves that the version on Flathub is indeed the version that the Remmina developers and maintainers intended for you to run. This is great, because this shows that Flathub has not been compromised. The web of trust is strong, and you just made it a bit better.

Going deeper

But what about the ./lib/ directory? And what version of Remmina did you actually compile? This is where the Web of Trust starts to branch. First, you can also double-check the hashes of the ./lib/ directory. Repeat the sha256sum command using a different directory.

But what version of Remmina did you compile? Well, that’s in the Manifest. In the text file you’ll find (usually at the bottom) the git repository and branch that you just used. At the time of this writing, that is:

 "type": "git", "url": "https://gitlab.com/Remmina/Remmina.git", "tag": "v1.4.8", "commit": "7ebc497062de66881b71bbe7f54dabfda0129ac2"

Here, you can decide to look at the Remmina code itself:

git clone --recurse-submodules https://gitlab.com/Remmina/Remmina.git cd ./Remmina git checkout tags/v1.4.8

The last two commands are important, since they ensure that you are looking at the right version of Remmina. Make sure you use the corresponding tag of the Manifest file. you can see everything that you just built.

What if…?

The question on some minds is: what if the hashes don’t match? Quoting a famous novel: “Don’t Panic.” There are multiple legitimate reasons as to why the hashes do not match.

It might be that you are not looking at the same version. If you followed this guide to a T, it should give matching results, but minor errors will cause vastly different results. Repeat the process, and ask for help if you’re unsure if you’re making errors. Perhaps Remmina is in the process of updating.

But if that still doesn’t justify the mismatch in hashes, go to the maintainers of Remmina on Flathub and open an issue. Assume good intentions, but you might be onto something that isn’t totally right.

The most obvious upstream issue is that Remmina does not properly support reproducible builds yet. The code of Remmina needs to be written in such a way that repeating the same action twice, gives the same result. For developers, there is an entire guide on how to do that. If this is the case, there should be an issue on the upstream bug-tracker, and if it is not there, make sure that you create one by explaining your steps and the impact.

If all else fails, and you’ve informed upstream about the discrepancies and they to don’t know what is happening, then it’s time to send an email to the Administrators of Flathub and the developer in question.

Conclusion

At this point, you’ve gone through the entire process of validating a single piece of a bigger picture. Here, you can branch off in different directions:

  • Try another Flatpak application you like or use regularly
  • Try the RPM version of Remmina
  • Do a deep dive into the C code of Remmina
  • Relax for a day, knowing that the Web of Trust is a collective effort

In the grand scheme of things, we can all carry a small part of responsibility in the Web of Trust. By taking free/libre open source software (FLOSS) concepts and applying them in the real world, you can protect yourself and others. Last but not least, by understanding how the Web of Trust works you can see how FLOSS software provides unique protections.

Posted on Leave a comment

systemd-resolved: introduction to split DNS

Fedora 33 switches the default DNS resolver to systemd-resolved. In simple terms, this means that systemd-resolved will run as a daemon. All programs wanting to translate domain names to network addresses will talk to it. This replaces the current default lookup mechanism where each program individually talks to remote servers and there is no shared cache.

If necessary, systemd-resolved will contact remote DNS servers. systemd-resolved is a “stub resolver”—it doesn’t resolve all names itself (by starting at the root of the DNS hierarchy and going down label by label), but forwards the queries to a remote server.

A single daemon handling name lookups provides significant benefits. The daemon caches answers, which speeds answers for frequently used names. The daemon remembers which servers are non-responsive, while previously each program would have to figure this out on its own after a timeout. Individual programs only talk to the daemon over a local transport and are more isolated from the network. The daemon supports fancy rules which specify which name servers should be used for which domain names—in fact, the rest of this article is about those rules.

Split DNS

Consider the scenario of a machine that is connected to two semi-trusted networks (wifi and ethernet), and also has a VPN connection to your employer. Each of those three connections has its own network interface in the kernel. And there are multiple name servers: one from a DHCP lease from the wifi hotspot, two specified by the VPN and controlled by your employer, plus some additional manually-configured name servers. Routing is the process of deciding which servers to ask for a given domain name. Do not mistake this with the process of deciding where to send network packets, which is called routing too.

The network interface is king in systemd-resolved. systemd-resolved first picks one or more interfaces which are appropriate for a given name, and then queries one of the name servers attached to that interface. This is known as “split DNS”.

There are two flavors of domains attached to a network interface: routing domains and search domains. They both specify that the given domain and any subdomains are appropriate for that interface. Search domains have the additional function that single-label names are suffixed with that search domain before being resolved. For example, a lookup for “server” is treated as a lookup for “server.example.com” if the search domain is “example.com.” In systemd-resolved config files, routing domains are prefixed with the tilde (~) character.

Specific example

Now consider a specific example: your VPN interface tun0 has a search domain private.company.com and a routing domain ~company.com. If you ask for mail.private.company.com, it is matched by both domains, so this name would be routed to tun0.

A request for www.company.com is matched by the second domain and would also go to tun0. If you ask for www, (in other words, if you specify a single-label name without any dots), the difference between routing and search domains comes into play. systemd-resolved attempts to combine the single-label name with the search domain and tries to resolve www.private.company.com on tun0.

If you have multiple interfaces with search domains, single-label names are suffixed with all search domains and resolved in parallel. For multi-label names, no suffixing is done; search and routing domains are are used to route the name to the appropriate interface. The longest match wins. When there are multiple matches of the same length on different interfaces, they are resolved in parallel.

A special case is when an interface has a routing domain ~. (a tilde for a routing domain and a dot for the root DNS label). Such an interface always matches any names, but with the shortest possible length. Any interface with a matching search or routing domain has higher priority, but the interface with ~. is used for all other names. Finally, if no routing or search domains matched, the name is routed to all interfaces that have at least one name server attached.

Lookup routing in systemd-resolved

Domain routing

This seems fairly complex, partially because of the historic names which are confusing. In actual practice it’s not as complicated as it seems.

To introspect a running system, use the resolvectl domain command. For example:

$ resolvectl domain
Global:
Link 4 (wlp4s0): ~.
Link 18 (hub0):
Link 26 (tun0): redhat.com

You can see that www would resolve as www.redhat.com. over tun0. Anything ending with redhat.com resolves over tun0. Everything else would resolve over wlp4s0 (the wireless interface). In particular, a multi-label name like www.foobar would resolve over wlp4s0, and most likely fail because there is no foobar top-level domain (yet).

Server routing

Now that you know which interface or interfaces should be queried, the server or servers to query are easy to determine. Each interface has one or more name servers configured. systemd-resolved will send queries to the first of those. If the server is offline and the request times out or if the server sends a syntactically-invalid answer (which shouldn’t happen with “normal” queries, but often becomes an issue when DNSSEC is enabled), systemd-resolved switches to the next server on the list. It will use that second server as long as it keeps responding. All servers are used in a round-robin rotation.

To introspect a running system, use the resolvectl dns command:

$ resolvectl dns
Global:
Link 4 (wlp4s0): 192.168.1.1 8.8.4.4 8.8.8.8
Link 18 (hub0):
Link 26 (tun0): 10.45.248.15 10.38.5.26

When combined with the previous listing, you know that for www.redhat.com, systemd-resolved will query 10.45.248.15, and—if it doesn’t respond—10.38.5.26. For www.google.com, systemd-resolved will query 192.168.1.1 or the two Google servers 8.8.4.4 and 8.8.8.8.

Differences from nss-dns

Before going further detail, you may ask how this differs from the previous default implementation (nss-dns). With nss-dns there is just one global list of up to three name servers and a global list of search domains (specified as nameserver and search in /etc/resolv.conf).

Each name to query is sent to the first name server. If it doesn’t respond, the same query is sent to the second name server, and so on. systemd-resolved implements split-DNS and remembers which servers are currently considered active.

For single-label names, the query is performed with each of the the search domains suffixed. This is the same with systemd-resolved. For multi-label names, a query for the unsuffixed name is performed first, and if that fails, a query for the name suffixed by each of the search domains in turn is performed. systemd-resolved doesn’t do that last step; it only suffixes single-label names.

A second difference is that with nss-dns, this module is loaded into each process. The process itself communicates with remote servers and implements the full DNS stack internally. With systemd-resolved, the nss-resolve module is loaded into the process, but it only forwards the query to systemd-resolved over a local transport (D-Bus) and doesn’t do any work itself. The systemd-resolved process is heavily sandboxed using systemd service features.

The third difference is that with systemd-resolved all state is dynamic and can be queried and updated using D-Bus calls. This allows very strong integration with other daemons or graphical interfaces.

Configuring systemd-resolved

So far, this article talked about servers and the routing of domains without explaining how to configure them. systemd-resolved has a configuration file (/etc/systemd/resolv.conf) where you specify name servers with DNS= and routing or search domains with Domains= (routing domains with ~, search domains without). This corresponds to the Global: lists in the two listings above.

In this article’s examples, both lists are empty. Most of the time configuration is attached to specific interfaces, and “global” configuration is not very useful. Interfaces come and go and it isn’t terribly smart to contact servers on an interface which is down. As soon as you create a VPN connection, you want to use the servers configured for that connection to resolve names, and as soon as the connection goes down, you want to stop.

How does then systemd-resolved acquire the configuration for each interface? This happens dynamically, with the network management service pushing this configuration over D-Bus into systemd-resolved. The default in Fedora is NetworkManager and it has very good integration with systemd-resolved. Alternatives like systemd’s own systemd-networkd implement similar functionality. But the interface is open and other programs can do the appropriate D-Bus calls.

Alternatively, resolvectl can be used for this (it is just a wrapper around the D-Bus API). Finally, resolvconf provides similar functionality in a form compatible with a tool in Debian with the same name.

Scenario: Local connection more trusted than VPN

The important thing is that in the common scenario, systemd-resolved follows the configuration specified by other tools, in particular NetworkManager. So to understand how systemd-resolved names, you need to see what NetworkManager tells it to do. Normally NM will tell systemd-resolved to use the name servers and search domains received in a DHCP lease on some interface. For example, look at the source of configuration for the two listings shown above:

There are two connections: “Parkinson” wifi and “Brno (BRQ)” VPN. In the first panel DNS:Automatic is enabled, which means that the DNS server received as part of the DHCP lease (192.168.1.1) is passed to systemd-resolved. Additionally. 8.8.4.4 and 8.8.8.8 are listed as alternative name servers. This configuration is useful if you want to resolve the names of other machines in the local network, which 192.168.1.1 provides. Unfortunately the hotspot DNS server occasionally gets stuck, and the other two servers provide backup when that happens.

The second panel is similar, but doesn’t provide any special configuration. NetworkManager combines routing domains for a given connection from DHCP, SLAAC RDNSS, and VPN, and finally manual configuration and forward this to systemd-resolved. This is the source of the search domain redhat.com in the listing above.

There is an important difference between the two interfaces though: in the second panel, “Use this connection only for resources on its network” is checked. This tells NetworkManager to tell systemd-resolved to only use this interface for names under the search domain received as part of the lease (Link 26 (tun0): redhat.com in the first listing above). In the first panel, this checkbox is unchecked, and NetworkManager tells systemd-resolved to use this interface for all other names (Link 4 (wlp4s0): ~.). This effectively means that the wireless connection is more trusted.

Scenario: VPN more trusted than local network

In a different scenario, a VPN would be more trusted than the local network and the domain routing configuration reversed. If a VPN without “Use this connection only for resources on its network” is active, NetworkManager tells systemd-resolved to attach the default routing domain to this interface. After unchecking the checkbox and restarting the VPN connection:

$ resolvectl domain
Global:
Link 4 (wlp4s0):
Link 18 (hub0):
Link 28 (tun0): ~. redhat.com
$ resolvectl dns
Global:
Link 4 (wlp4s0):
Link 18 (hub0):
Link 28 (tun0): 10.45.248.15 10.38.5.26

Now all domain names are routed to the VPN. The network management daemon controls systemd-resolved and the user controls the network management daemon.

Additional systemd-resolved functionality

As mentioned before, systemd-resolved provides a common name lookup mechanism for all programs running on the machine. Right now the effect is limited: shared resolver and cache and split DNS (the lookup routing logic described above). systemd-resolved provides additional resolution mechanisms beyond the traditional unicast DNS. These are the local resolution protocols MulticastDNS and LLMNR, and an additional remote transport DNS-over-TLS.

Fedora 33 does not enable MulticastDNS and DNS-over-TLS in systemd-resolved. MulticastDNS is implemented by nss-mdns4_minimal and Avahi. Future Fedora releases may enable these as the upstream project improves support.

Implementing this all in a single daemon which has runtime state allows smart behaviour: DNS-over-TLS may be enabled in opportunistic mode, with automatic fallback to classic DNS if the remote server does not support it. Without the daemon which can contain complex logic and runtime state this would be much harder. When enabled, those additional features will apply to all programs on the system.

There is more to systemd-resolved: in particular LLMNR and DNSSEC, which only received brief mention here. A future article will explore those subjects.

Posted on Leave a comment

Web of Trust, Part 1: Concept

Every day we rely on technologies who nobody can fully understand. Since well before the industrial revolution, complex and challenging tasks required an approach that broke out the different parts into smaller scale tasks. Each resulting in specialized knowledge used in some parts of our lives, leaving other parts to trust in skills that others had learned. This shared knowledge approach also applies to software. Even the most avid readers of this magazine, will likely not compile and validate every piece of code they run. This is simply because the world of computers is itself also too big for one person to grasp.

Still, even though it is nearly impossible to understand everything that happens within your PC when you are using it, that does not leave you blind and unprotected. FLOSS software shares trust, giving protection to all users, even if individual users can’t grasp all parts in the system. This multi-part article will discuss how this ‘Web of Trust’ works and how you can get involved.

But first we’ll have to take a step back and discuss the basic concepts, before we can delve into the details and the web. Also, a note before we start, security is not just about viruses and malware. Security also includes your privacy, your economic stability and your technological independence.

One-Way System

By their design, computers can only work and function in the most rudimentary ways of logic: True or false. And or Or. This (boolean logic) is not readily accessible to humans, therefore we must do something special. We write applications in a code that we can (reasonably) comprehend (human readable). Once completed, we turn this human readable code into a code that the computer can comprehend (machine code).

The step of conversion is called compilation and/or building, and it’s a one-way process. Compiled code (machine code) is not really understandable by humans, and it takes special tools to study in detail. You can understand small chunks, but on the whole, an entire application becomes a black box.

This subtle difference shifts power. Power, in this case being the influence of one person over another person. The person who has written the human-readable version of the application and then releases it as compiled code to use by others, knows all about what the code does, while the end user knows a very limited scope. When using (software) in compiled form, it is impossible to know for certain what an application is intended to do, unless the original human readable code can be viewed.

The Nature of Power

Spearheaded by Richard Stallman, this shift of power became a point of concern. This discussion started in the 1980s, for this was the time that computers left the world of academia and research, and entered the world of commerce and consumers. Suddenly, that power became a source of control and exploitation.

One way to combat this imbalance of power, was with the concept of FLOSS software. FLOSS Software is built on 4-Freedoms, which gives you a wide array of other ‘affiliated’ rights and guarantees. In essence, FLOSS software uses copyright-licensing as a form of moral contract, that forces software developers not to leverage the one-way power against their users. The principle way of doing this, is with the the GNU General Public Licenses, which Richard Stallman created and has since been promoting.

One of those guarantees, is that you can see the code that should be running on your device. When you get a device using FLOSS software, then the manufacturer should provide you the code that the device is using, as well as all instructions that you need to compile that code yourself. Then you can replace the code on the device with the version you can compile yourself. Even better, if you compare the version you have with the version on the device, you can see if the device manufacturer tried to cheat you or other customers.

This is where the web of Trust comes back into the picture. The Web of Trust implies that even if the vast majority of people can’t validate the workings of a device, that others can do so on their behalf. Journalists, security analysts and hobbyists, can do the work that others might be unable to do. And if they find something, they have the power to share their findings.

Security by Blind Trust

This is of course, if the application and all components underneath it, are FLOSS. Proprietary software, or even software which is merely Open Source, has compiled versions that nobody can recreate and validate. Thus, you can never truly know if that software is secure. It might have a backdoor, it might sell your personal data, or it might be pushing a closed ecosystem to create a vendor-lock. With closed-source software, your security is as good as the company making the software is trustworthy.

For companies and developers, this actually creates another snare. While you might still care about your users and their security, you’re a liability: If a criminal can get to your official builds or supply-chain, then there is no way for anybody to discover that afterwards. An increasing number of attacks do not target users directly, but instead try to get in, by exploiting the trust the companies/developers have carefully grown.

You should also not underestimate pressure from outside: Governments can ask you to ignore a vulnerability, or they might even demand cooperation. Investment firms or shareholders, may also insist that you create a vendor-lock for future use. The blind trust that you demand of your users, can be used against you.

Security by a Web of Trust

If you are a user, FLOSS software is good because others can warn you when they find suspicious elements. You can use any FLOSS device with minimal economic risk, and there are many FLOSS developers who care for your privacy. Even if the details are beyond you, there are rules in place to facilitate trust.

If you are a tinkerer, FLOSS is good because with a little extra work, you can check the promises of others. You can warn people when something goes wrong, and you can validate the warnings of others. You’re also able to check individual parts in a larger picture. The libraries used by FLOSS applications, are also open for review: It’s “Trust all the way down”.

For companies and developers, FLOSS is also a great reassurance that your trust can’t be easily subverted. If malicious actors wish to attack your users, then any irregularity can quickly be spotted. Last but not least, since you also stand to defend your customers economic well-being and privacy, you can use that as an important selling point to customers who care about their own security.

Fedora’s case

Fedora embraces the concept of FLOSS and it stands strong to defend it. There are comprehensive legal guidelines, and Fedora’s principles are directly referencing the 4-Freedoms: Freedom, Friends, Features, and First

Fedora's Foundation logo, with Freedom highlighted. Illustrative.

To this end, entire systems have been set up to facilitate this kind of security. Fedora works completely in the open, and any user can check the official servers. Koji is the name of the Fedora Buildsystem, and you can see every application and it’s build logs there. For added security, there is also Bohdi, which orchestrates the deployment of an application. Multiple people must approve it, before the application can become available.

This creates the Web of Trust on which you can rely. Every package in the repository goes through the same process, and at every point somebody can intervene. There are also escalation systems in place to report issues, so that issues can quickly be tackled when they occur. Individual contributors also know that they can be reviewed at every time, which itself is already enough of a precaution to dissuade mischievous thoughts.

You don’t have to trust Fedora (implicitly), you can get something better; trust in users like you.

Posted on Leave a comment

Recover your files from Btrfs snapshots

As you have seen in a previous article, Btrfs snapshots are a convenient and fast way to make backups. Please note that these articles do not suggest that you avoid backup software or well-tested backup plans. Their goals are to show a great feature of this file system, snapshots, and to inspire curiosity and invite you to explore, experiment and deepen the subject. Read on for more about how to recover your files from Btrfs snapshots.

A subvolume for your project

Let’s assume that you want to save the documents related to a project inside the directory $HOME/Documents/myproject.

As you have seen, a Btrfs subvolume, as well as a snapshot, looks like a normal directory. Why not use a Btrfs subvolume for your project, in order to take advantage of snapshots? To create the subvolume, use this command:

btrfs subvolume create $HOME/Documents/myproject

You can create an hidden directory where to arrange your snapshots:

mkdir $HOME/.snapshots

As you can see, in this case there’s no need to use sudo. However, sudo is still needed to list the subvolumes, and to use the send and receive commands.

Now you can start writing your documents. Each day (or each hour, or even minute) you can take a snapshot just before you start to work:

btrfs subvolume snapshot -r $HOME/Documents/myproject $HOME/.snapshots/myproject-day1

For better security and consistency, and if you need to send the snapshot to an external drive as shown in the previous article, remember that the snapshot must be read only, using the -r flag.

Note that in this case, a snapshot of the /home subvolume will not snapshot the $HOME/Documents/myproject subvolume.

How to recover a file or a directory

In this example let’s assume a classic error: you deleted a file by mistake. You can recover it from the most recent snapshot, or recover an older version of the file from an older snapshot. Do you remember that a snapshot appears like a regular directory? You can simply use the cp command to restore the deleted file:

cp $HOME/.snapshots/myproject-day1/filename.odt $HOME/Documents/myproject

Or restore an entire directory:

cp -r $HOME/.snapshots/myproject-day1/directory $HOME/Documents/myproject

What if you delete the entire $HOME/Documents/myproject directory (actually, the subvolume)? You can recreate the subvolume as seen before, and again, you can simply use the cp command to restore the entire content from the snapshot:

btrfs subvolume create $HOME/Documents/myproject
cp -rT $HOME/.snapshots/myproject-day1 $HOME/Documents/myproject

Or you could restore the subvolume by using the btrfs snapshot command (yes, a snapshot of a snapshot):

btrfs subvolume snapshot $HOME/.snapshots/myproject-day1 $HOME/Documents/myproject

How to recover btrfs snapshots from an external drive

You can use the cp command even if the snapshot resides on an external drive. For instance:

cp /run/media/user/mydisk/bk/myproject-day1/filename.odt $HOME/Documents/myproject

You can restore an entire snapshot as well. In this case, since you will use the send and receive commands, you must use sudo. In addition, consider that the restored subvolume will be created as read only. Therefore you need to also set the read only property to false:

sudo btrfs send /run/media/user/mydisk/bk/myproject-day1 | sudo btrfs receive $HOME/Documents/
mv Documents/myproject-day1 Documents/myproject
btrfs property set Documents/myproject ro false

Here’s an extra explanation. The command btrfs subvolume snapshot will create an exact copy of a subvolume in the same device. The destination has to reside in the same btrfs device. You can’t use another device as the destination of the snapshot. In that case you need to take a snapshot and use the send and receive commands.

For more information, refer to some of the online documentation:

man btrfs-subvolume
man btrfs-send
man btrfs-receive
Posted on Leave a comment

Use dnsmasq to provide DNS & DHCP services

Many tech enthusiasts find the ability to control their host name resolution important. Setting up servers and services usually requires some form of fixed address, and sometimes also requires special forms of resolution such as defining Kerberos or LDAP servers, mail servers, etc. All of this can be achieved with dnsmasq.

dnsmasq is a lightweight and simple program which enables issuing DHCP addresses on your network and registering the hostname & IP address in DNS. This configuration also allows external resolution, so your whole network will be able to speak to itself and find external sites too.

This article covers installing and configuring dnsmasq on either a virtual machine or small physical machine like a Raspberry Pi so it can provide these services in your home network or lab. If you have an existing setup and just need to adjust the settings for your local workstation, read the previous article which covers configuring the dnsmasq plugin in NetworkManager.

Install dnsmasq

First, install the dnsmasq package:

sudo dnf install dnsmasq

Next, enable and start the dnsmasq service:

sudo systemctl enable --now dnsmasq

Configure dnsmasq

First, make a backup copy of the dnsmasq.conf file:

sudo cp /etc/dnsmasq.conf /etc/dnsmasq.conf.orig

Next, edit the file and make changes to the following to reflect your network. In this example, mydomain.org is the domain name, 192.168.1.10 is the IP address of the dnsmasq server and 192.168.1.1 is the default gateway.

sudo vi /etc/dnsmasq.conf

Insert the following contents:

domain-needed
bogus-priv
no-resolv
server=8.8.8.8
server=8.8.4.4
local=/mydomain.org/
listen-address=::1,127.0.0.1,192.168.1.10
expand-hosts
domain=mydomain.org
dhcp-range=192.168.1.100,192.168.1.200,24h
dhcp-option=option:router,192.168.1.1
dhcp-authoritative
dhcp-leasefile=/var/lib/dnsmasq/dnsmasq.leases

Test the config to check for typos and syntax errors:

$ sudo dnsmasq --test
dnsmasq: syntax check OK.

Now edit the hosts file, which can contain both statically- and dynamically-allocated hosts. Static addresses should lie outside the DHCP range you specified earlier. Hosts using DHCP but which need a fixed address should be entered here with an address within the DHCP range.

sudo vi /etc/hosts

The first two lines should be there already. Add the remaining lines to configure the router, the dnsmasq server, and two additional servers.

127.0.0.1   localhost localhost.localdomain
::1         localhost localhost.localdomain
192.168.1.1    router
192.168.1.10   dnsmasq
192.168.1.20   server1
192.168.1.30   server2

Restart the dnsmasq service:

sudo systemctl restart dnsmasq

Next add the services to the firewall to allow the clients to connect:

sudo firewall-cmd --add-service={dns,dhcp}
sudo firewall-cmd --runtime-to-permanent

Test name resolution

First, install bind-utils to get the nslookup and dig packages. These allow you to perform both forward and reverse lookups. You could use ping if you’d rather not install extra packages. but these tools are worth installing for the additional troubleshooting functionality they can provide.

sudo dnf install bind-utils

Now test the resolution. First, test the forward (hostname to IP address) resolution:

$ nslookup server1
Server:       127.0.0.1
Address:    127.0.0.1#53
Name:    server1.mydomain.org
Address: 192.168.1.20

Next, test the reverse (IP address to hostname) resolution:

$ nslookup 192.168.1.20
20.1.168.192.in-addr.arpa    name = server1.mydomain.org.

Finally, test resolving hostnames outside of your network:

$ nslookup fedoramagazine.org
Server:       127.0.0.1
Address:    127.0.0.1#53
Non-authoritative answer:
Name:    fedoramagazine.org
Address: 35.196.109.67

Test DHCP leases

To test DHCP leases, you need to boot a machine which uses DHCP to obtain an IP address. Any Fedora variant will do that by default. Once you have booted the client machine, check that it has an address and that it corresponds to the lease file for dnsmasq.

From the machine running dnsmasq:

$ sudo cat /var/lib/dnsmasq/dnsmasq.leases
1598023942 52:54:00:8e:d5:db 192.168.1.100 server3 01:52:54:00:8e:d5:db
1598019169 52:54:00:9c:5a:bb 192.168.1.101 server4 01:52:54:00:9c:5a:bb

Extending functionality

You can assign hosts a fixed IP address via DHCP by adding it to your hosts file with the address you want (within your DHCP range). Do this by adding into the dnsmasq.conf file the following line, which assigns the IP listed to any host that has that name:

dhcp-host=myhost

Alternatively, you can specify a MAC address which should always be given a fixed IP address:

dhcp-host=11:22:33:44:55:66,192.168.1.123

You can specify a PXE boot server if you need to automate machine builds

tftp-root=/tftpboot
dhcp-boot=/tftpboot/pxelinux.0,boothost,192.168.1.240

This should point to the actual URL of your TFTP server.

If you need to specify SRV or TXT records, for example for LDAP, Kerberos or similar, you can add these:

srv-host=_ldap._tcp.mydomain.org,ldap-server.mydomain.org,389
srv-host=_kerberos._udp.mydomain.org,krb-server.mydomain.org,88
srv-host=_kerberos._tcp.mydomain.org,krb-server.mydomain.org,88
srv-host=_kerberos-master._udp.mydomain.org,krb-server.mydomain.org,88
srv-host=_kerberos-adm._tcp.mydomain.org,krb-server.mydomain.org,749
srv-host=_kpasswd._udp.mydomain.org,krb-server.mydomain.org,464
txt-record=_kerberos.mydomain.org,KRB-SERVER.MYDOMAIN.ORG

There are many other options in dnsmasq. The comments in the original config file describe most of them. For full details, read the man page, either locally or online.

Posted on Leave a comment

Announcing the release of Fedora 33 Beta

The Fedora Project is pleased to announce the immediate availability of Fedora 33 Beta, the next step towards our planned Fedora 33 release at the end of October.

Download the prerelease from our Get Fedora site:

Or, check out one of our popular variants, including KDE Plasma, Xfce, and other desktop environments, as well as images for ARM devices like the Raspberry Pi 2 and 3:

Beta Release Highlights

BTRFS by default

All of the desktop variants of Fedora 33 Beta – including Fedora Workstation, Fedora KDE, and others – will use BTRFS as the default filesystem. This is a big shift: we’ve been using ext filesystems since Fedora Core 1. BTRFS offers some really compelling features for users, including transparent compression and copy-on-write. For Fedora 33, we’re only defaulting to the basic features of BTRFS, but we’ll build out the default feature set to include more goodies in future releases.

Fedora Workstation

Fedora 33 Workstation Beta includes GNOME 3.38, the newest release of the GNOME desktop environment. It is full of performance enhancements and improvements. GNOME 3.38 now includes a welcome tour after installation to help users learn about all of the great features this desktop environment offers. It also improves screen recording and multi-monitor support. For a full list of GNOME 3.38 highlights, see the release notes.

Fedora 33 Workstation Beta also provides better thermal management and peak performance on Intel CPUs by including thermald in the default install. And because your desktop should be fun to look at as well as easy to use, Fedora 33 Workstation Beta includes animated backgrounds (a time-of-day slideshow with hue changes) by default.

Fedora IoT

With Fedora 33 Beta, Fedora IoT is now an official Fedora Edition. Fedora IoT is geared toward edge devices on a wide variety of hardware platforms. It is based on ostree technology for safe update and rollback. It includes the Platform AbstRaction for SECurity (PARSEC), an open-source initiative to provide a common API to hardware security and cryptographic services in a platform-agnostic way.

Other updates

Fedora 33 Beta defaults to using nano as the editor. nano is a more approachable editor that is more welcoming to new users. Of course, those who want to use vim, emacs, or any other editor are still able to.

Fedora 33 KDE Beta enables earlyOOM by default, as Fedora Workstation did in the previous release. This helps improve system responsiveness on systems that are running out of memory. 

Fedora 33 Beta includes updated versions of many popular packages like Ruby, Python, and Perl. .NET Core will now be available on Fedora on aarch64, in addition to x86_64. We’re also dropping a few older versions: Python 2.6 and Python 3.4 are retired. The httpd module mod_php is also dropped, as php-fpm is a more performant and more secure PHP module.

Testing needed

Since this is a Beta release, we expect that you may encounter bugs or missing features. To report issues encountered during testing, contact the Fedora QA team via the mailing list or in the #fedora-qa channel on Freenode IRC. As testing progresses, common issues are tracked on the Common F33 Bugs page.

For tips on reporting a bug effectively, read how to file a bug.

What is the Beta Release?

A Beta release is code-complete and bears a very strong resemblance to the final release. If you take the time to download and try out the Beta, you can check and make sure the things that are important to you are working. Every bug you find and report doesn’t just help you, it improves the experience of millions of Fedora users worldwide! Together, we can make Fedora rock-solid. We have a culture of coordinating new features and pushing fixes upstream as much as we can. Your feedback improves not only Fedora, but Linux and free software as a whole.

More information

For more detailed information about what’s new on Fedora 33 Beta release, you can consult the Fedora 33 Change set. It contains more technical information about the new packages and improvements shipped with this release.

Posted on Leave a comment

Now available: Fedora on Lenovo laptops!

We’ve been teasing this for a while, but today it’s finally true—Fedora Workstation is now available preinstalled on the Lenovo ThinkPad X1 Carbon Gen 8, ThinkPad P53, and ThinkPad P1 Gen 2 laptops. The ThinkPad X1 Carbon is available today for direct consumer purchase from Lenovo’s online store. The Lenovo ThinkPad P1 Gen 2 and ThinkPad P53 will be available next week via the “Contact Us” icon on Lenovo.com. What’s more, the successor models are in the works for pre-load and online ordering as well!

Lenovo has been a great partner in bringing this to market. Like the Fedora community, they are operating on an “upstream first” model. That’s part of why the only thing you’ll see on the laptop that doesn’t come from an official Fedora repository is a set of PDFs providing documentation and legal notices. Lenovo engineers have been contributing to the Linux kernel, including a patch to enable the “lap mode” sensor, which is already accepted. They have also worked with their vendors to improve Linux support in devices like the fingerprint scanner.

[youtube https://www.youtube.com/watch?v=7BAPQRmElfs?feature=oembed&w=616&h=347]

Of course, you already know that open source is about more than just the technology; the community is what makes it great. Lenovo is a member of Fedora and other communities. In addition to participating in the usual Fedora places (like the devel mailing list), they also were a gold-level sponsor of our Nest With Fedora conference. And they have a dedicated Fedora section on their community forum. Mark Pearson, Senior Linux Developer said “doing open source the right way is important to us” at his Nest With Fedora Q&A session.

This will be a global program, but it will take some time to roll out country-by-country. If it doesn’t appear on the website in your country, call the local sales number for your country to place a phone order. I’m excited to have Lenovo offer Fedora Workstation as a supported choice on their laptops. This is a great opportunity to grow our community.

Posted on Leave a comment

Installing and running Vagrant using qemu-kvm

Vagrant is a brilliant tool, used by DevOps professionals, coders, sysadmins and regular geeks to stand up repeatable infrastructure for development and testing. From their website:

Vagrant is a tool for building and managing virtual machine environments in a single workflow. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases production parity, and makes the “works on my machine” excuse a relic of the past.

If you are already familiar with the basics of Vagrant, the documentation provides a better reference build for all available features and internals.

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

https://www.vagrantup.com/intro

This guide will walk through the steps necessary to get Vagrant working on a Fedora-based machine.

I started with a minimal install of Fedora Server as this reduces the memory footprint of the host OS, but if you already have a working Fedora machine, either Server or Workstation, then this should still work.

Check the machine supports virtualisation:

$ sudo lscpu | grep Virtualization Virtualization:                  VT-x Virtualization type:             full

Install qemu-kvm:

sudo dnf install qemu-kvm libvirt libguestfs-tools virt-install rsync

Enable and start the libvirt daemon:

sudo systemctl enable --now libvirtd

Install Vagrant:

sudo dnf install vagrant

Install the Vagrant libvirtd plugin:

sudo vagrant plugin install vagrant-libvirt

Add a box

vagrant box add fedora/32-cloud-base --provider=libvirt

Create a minimal Vagrantfile to test

$ mkdir vagrant-test $ cd vagrant-test $ vi Vagrantfile

Vagrant.configure("2") do |config| config.vm.box = "fedora/32-cloud-base" end

Note the capitalisation of the file name and in the file itself.

Check the file:

vagrant status

Current machine states: default not created (libvirt) The Libvirt domain is not created. Run 'vagrant up' to create it.

Start the box:

vagrant up

Connect to your new machine:

vagrant ssh

That’s it – you now have Vagrant working on your Fedora machine.

To stop the machine, use vagrant halt. This simply halts the machine but leaves the VM and disk in place.
To shut it down and delete it use vagrant destroy. This will remove the whole machine and any changes you’ve made in it.

Next steps

You don’t need to download boxes before issuing the vagrant up command – you can specify the box and the provider in the Vagrantfile directly and Vagrant will download it if it’s not already there. Below is an example which also sets the amount memory and number of CPUs:

# -*- mode: ruby -*-
# vi: set ft=ruby : Vagrant.configure("2") do |config| config.vm.box = "fedora/32-cloud-base" config.vm.provider :libvirt do |libvirt| libvirt.cpus = 1 libvirt.memory = 1024 end
end

For more information on using Vagrant, creating your own machines and using different boxes, see the official documentation at https://www.vagrantup.com/docs

There is a huge repository of boxes ready to download and use, and the official location for these is Vagrant Cloud – https://app.vagrantup.com/boxes/search. Some are basic operating systems and some offer complete functionality such as databases, web servers etc.

Posted on Leave a comment

Incremental backups with Btrfs snapshots

Snapshots are an interesting feature of Btrfs. A snapshot is a copy of a subvolume. Taking a snapshot is immediate. However, taking a snapshot is not like performing a rsync or a cp, and a snapshot doesn’t occupy space as soon as it is created.

Editors note: From the BTRFS Wiki – A snapshot is simply a subvolume that shares its data (and metadata) with some other subvolume, using Btrfs’s COW capabilities.

Occupied space will increase alongside the data changes in the original subvolume or in the snapshot itself, if it is writeable. Added/modified files, and deleted files in the subvolume still reside in the snapshots. This is a convenient way to perform backups.

Using snapshots for backups

A snapshot resides on the same disk where the subvolume is located. You can browse it like a regular directory and recover a copy of a file as it was when the snapshot was performed. By the way, a snapshot on the same disk of the snapshotted subvolume is not an ideal backup strategy: if the hard disk broke, snapshots will be lost as well. An interesting feature of snapshots is the ability to send them to another location. The snapshot can be sent to an external hard drive or to a remote system via SSH (the destination filesystems need to be formatted as Btrfs as well). To do this, the commands btrfs send and btrfs receive are used.

Taking a snapshot

In order to use the send and the receive commands, it is important to create the snapshot as read-only, and snapshots are writeable by default.

The following command will take a snapshot of the /home subvolume. Note the -r flag for readonly.

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day1

Instead of day1, the snapshot name can be the current date, like home-$(date +%Y%m%d). Snapshots look like regular subdirectories. You can place them wherever you like. The directory /.snapshots could be a good choice to keep them neat and to avoid confusion.

Editors note: Snapshots will not take recursive snapshots of themselves. If you create a snapshot of a subvolume, every subvolume or snapshot that the subvolume contains is mapped to an empty directory of the same name inside the snapshot.

Backup using btrfs send

In this example the destination Btrfs volume in the USB drive is mounted as /run/media/user/mydisk/bk . The command to send the snapshot to the destination is:

sudo btrfs send /.snapshots/home-day1 | sudo btrfs receive /run/media/user/mydisk/bk

This is called initial bootstrapping, and it corresponds to a full backup. This task will take some time, depending on the size of the /home directory. Obviously, subsequent incremental sends will take a shorter time.

Incremental backup

Another useful feature of snapshots is the ability to perform the send task in an incremental way. Let’s take another snapshot.

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day2

In order to perform the send task incrementally, you need to specify the previous snapshot as a base and this snapshot has to exist in the source and in the destination. Please note the -p option.

sudo btrfs send -p /.snapshot/home-day1 /.snapshot/home-day2 | sudo btrfs receive /run/media/user/mydisk/bk

And again (the day after):

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day3
sudo btrfs send -p /.snapshot/home-day2 /.snapshot/home-day3 | sudo btrfs receive /run/media/user/mydisk/bk

Cleanup

Once the operation is complete, you can keep the snapshot. But if you perform these operations on a daily basis, you could end up with a lot of them. This could lead to confusion and potentially a lot of used space on your disks. So it is a good advice to delete some snapshots if you think you don’t need them anymore.

Keep in mind that in order to perform an incremental send you need at least the last snapshot. This snapshot must be present in the source and in the destination.

sudo btrfs subvolume delete /.snapshot/home-day1
sudo btrfs subvolume delete /.snapshot/home-day2
sudo btrfs subvolume delete /run/media/user/mydisk/bk/home-day1
sudo btrfs subvolume delete /run/media/user/mydisk/bk/home-day2

Note: the day 3 snapshot was preserved in the source and in the destination. In this way, tomorrow (day 4), you can perform a new incremental btrfs send.

As some final advice, if the USB drive has a bunch of space, you could consider maintaining multiple snapshots in the destination, while in the source disk you would keep only the last one.