Posted on Leave a comment

Vagrant beyond the basics

There are, like most things in the Unix/Linux world, many ways of doing things with Vagrant, but here are some examples of ways to grow your Vagrantfile portfolio and increase your knowledge and use.

If you have not yet installed vagrant you can follow the first part of this series.

Some Vagrantfile basics

All Vagrantfiles start with “Vagrant.configure(“2”) do |config|” and finish with a corresponding “end”:

 
Vagrant.configure("2") do |config|
  ...
  ...
end

The “2” represents the version of Vagrant, and is currently either 1 or 2. Unless you need to use the older version simply stick with the latest.

The config structure is broken down into namespaces:

config.vm – modify the configuration of the machine(s) that Vagrant manages.

config.ssh – for configuring how Vagrant will access your machine over SSH.

config.winrm – configuring how Vagrant will access your Windows guest over WinRM.

config.winssh – the WinSSH communicator is built specifically for the Windows native port of OpenSSH.

config.vagrant – modify the behavior of Vagrant itself.

Each line in a namespace begins with the word ‘config’:

config.vm.box = “fedora/32-cloud-base”
config.vm.network “private_network”

There are many options here, and a read of the documentation pages is strongly recommended. They can be found at https://www.vagrantup.com/docs/vagrantfile

Also in this section you can configure provider-specific options. In this case the provider is libvirt, and the specific config looks like this:

 
config.vm.provider :libvirt do |libvirt|
  libvirt.cpus = 1
  libvirt.memory = 512

In the example above, all libvirt VMs will be created with a single CPU and 512Mb of memory unless specifically overridden.

The VM namespace is where you define all machines you want this Vagrantfile to build. Notice that this is still a part of the config section, and lines should therefore begin with ‘config’. All sections or parts of sections have an ‘end’ statement to close them off.

Creating multiple machines at once

Depending on what you need to achieve, this can be a simple loop or multiple machine definitions. To create any number of machines in a series, with the same settings but perhaps different names and/or IP addresses, you can just provide a range as shown here:

 
(1..5).each do |i|
  config.vm.define "server#{i}" do |server|
    server.vm.hostname = "server#{i}.example.com"
  end
end

This will create 5 servers, named server1, server2, server3 etc.

Of note, using Ruby style “for i in 1..3 do” doesn’t work despite Vagrantfile syntax actually being Ruby, so use the method from the example above.

If you need servers with different hostnames, different hardware etc then you’ll need to specify them individually, or at least in groups if the situation lends itself to that. Let’s say you need to create a typical web/db/load balancer infrastructure, with 2 web servers, a single database server and a load balancer for the web traffic. Ignoring the specific software setup for this, to simply create the virtual machines ready for provisioning you could use something like this:

 
# Load Balancer
config.vm.define "loadbal", primary: true do |loadbal|
  loadbal.vm.hostname = "loadbal"
end

# Database
config.vm.define "db", primary: true do |db|
  db.vm.hostname = "db"
end

# Web Servers x2
(1..2).each do |i|
  config.vm.define "web#{i}" do |web|
    web.vm.hostname = "web#{i}"
  end
end

This uses a combination of multiple machine calls and a small loop to build 4 VMs with a single ‘vagrant up’ command.

Networking

Vagrant generally creates its own network for VM access, and you use this with ‘vagrant ssh’. If you create more than one VM then you must use the VM name to identify which one you wish to connect to – vagrant ssh vmname.

There are a number of configuration options available which allow you to interact with your VMs in various ways.

The vagrant-libvirt plugin creates a network for the guests to use. This is automated and will always be present even if you define your own networks. The network is named “vagrant-libvirt” and can be seen either in the Virtual Networks tab of virt-manager’s connection details or by issuing a sudo virsh net-list command.

If you use dhcp for your guests, you can find the individual IP addresses with the virsh net-dhcp-list command: sudo virsh net-dhcp-leases vagrant-libvirt

Port Forwarding

The simplest change to default networking is port forwarding. This uses a simple format like most Vagrant config: config.vm.network “forwarded_port”, guest: 80, host: 8080

This listens to port 8080 on your local machine and forwards connections to port 80 on the Vagrant machine. If you need to use a UDP port, simply add , protocol: “udp” to the end of that line (notice that comma which should come immediately after the second port number).

Obviously for more complex configurations this might not be ideal, as you need to specify every single port you want to forward. If you then add multiple machines the complexity can really become too much.

In addition to this, anyone on your network can access these ports if they know your IP address, so that’s something you should be aware of.

Public Network

This creates a network card for the Vagrant VM which connects to your host network, and will therefore be visible to all machines on that network. As Vagrant is not designed to be secure, you should be aware of any vulnerabilities and take steps to protect against them.

To configure a public network, add config.vm.network “public_network” to your Vagrantfile. This will use DHCP to obtain a network address.

If you wish to assign a static IP address, you can add one to the end of the network declaration: config.vm.network “public_network”, ip: “192.168.0.1”

If you’re creating multiple guests you can put the network configuration in the vm namespace, and even allocate IPs based on iteration too:

 
Vagrant.configure("2") do |config|
  config.vm.box = "centos/8"
  config.vm.provider :libvirt do |libvirt|
    libvirt.qemu_use_session = false
  end

  # Servers x2
  (1..2).each do |i|
    config.vm.define "server#{i}" do |server|
      server.vm.hostname = "server#{i}"
      server.vm.network "public_network", ip: "192.168.122.20#{i}"
    end
  end
end

Private Network

This works very much like the Public Network option, only the network is only available to the host machine and the Vagrant guests. The syntax is almost identical too: config.vm.network “private_network”, type: “dhcp”

 To use a static IP address, simply add it:

 
config.vm.network "private_network", ip: "192.168.50.4"

This will create a new network in libvirt, usually named something like “vagrant-private-dhcp” – you can see this with the command sudo virsh net-list while the VM is running. This network is created and destroyed along with the vagrant guests.

Again, the network config can be specified for all guests, or per guest as shown in the public network example above.

Provisioning

Once you have your VMs defined, you can obviously then do whatever you want with them, but as soon as you issue a ‘vagrant destroy’ command any changes will be lost. This is where automated provisioning comes in.

You can use several methods to provision your machines, from simple file copies to shell scripts, Ansible, Chef and Puppet. Many of the main methods can be used, but I’ll cover the simple ones here – if you need to use something else please read the documentation as it’s all covered.

File uploads

To copy a file to the Vagrant guest, add a line to the Vagrantfile like this:

 
config.vm.provision "file", source: "~/myfile", destination: "myfile"

You can copy directories too:

 
config.vm.provision "file", source: "~/path/to/host/folder", destination: "$HOME/remote/newfolder"

The directory structure should already exist on the Vagrant host, and will be copied in its entirety, including subdirectories and files.

Note: If you add a trailing slash to the destination path, the source path will be placed under this so make sure you only do this if you want that outcome. For example, if the above destination was “$HOME/remote/newfolder/”, then the result would see “$HOME/remote/newfolder/folder” created with the contents of the source placed here.

Shell commands

You can include individual commands, inline scripts or external scripts to perform provisioning tasks.

A single command would take this form, and any valid command line command can be used here: config.vm.provision “shell”, inline: “sudo dnf update -y”

An inline script is less common, and declared at the top of the Vagrantfile then called during provisioning:

 
$script = <<-SCRIPT
echo I am provisioning...
date > /etc/vagrant_provisioned_at
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.provision "shell", inline: $script
end

More common is the external shell script, which gives more flexibility and makes code more modular. Vagrant uploads the file to the guest then executes it. Simply call the script in the provisioning line:

config.vm.provision “shell”, path: “script.sh”

The file need not be local to the Vagrant host either:

config.vm.provision “shell”, path: “https://example.com/provisioner.sh”

Ansible

To use Ansible to provision your VMs you must have it installed on the Vagrant host; see https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html#installing-ansible-on-rhel-centos-or-fedora.

You specify an Ansible playbook to provision your VM in the following way:

 
config.vm.provision "ansible" do |ansible|
  ansible.playbook = "playbook.yml"
end

This then calls the playbook, which will run as any externally-run ansible playbook would.

If you’re building multiple VMs with your Vagrantfile then it’s likely you want different configurations for some of them, and in this case you should provision within the definition of each VM, as shown here:

 
# Web Servers x2
(1..2).each do |i|
  config.vm.define "web#{i}" do |web|
    web.vm.hostname = "web#{i}"
    web.vm.provision "ansible" do |ansible|
      ansible.playbook = "web.yml"
    end
  end
end

Ansible provisioners come in two formats – ansible and ansible_local. The ansible provisioner requires that Ansible is installed on the Vagrant host, and will connect remotely to your guest VMs to provision them. This means all necessary ssh authentication must be in place for it to work. The ansible_local provisioner executes playbooks directly on the guest VMs, which therefore requires Ansible be installed on each of the guests you want to provision. Vagrant will try to install Ansible on the guests in order to do this, (This can be controlled with the install option, but is enabled by default). On RHEL-style systems like Fedora, Ansible is installed from the EPEL repository. Simply use either ansible or ansible_local in the config_vm_provision command to choose the style you need.

Synced Folders

Vagrant allows you to sync folders between your Vagrant host and your guests, allowing access to configuration files, data etc. By default, the folder containing the Vagrant file is shared and mounted under /vagrant on each guest.

To configure additional synced folders, use the config.vm.synced.folder command:

 
config.vm.synced_folder "src/", "/srv/website"

The two parameters are the source folder on the Vagrant host and the mount directory on the guest. The destination folder will be created if it does not exist, recursively if necessary.

Options for synced folders allow you to configure them better, including the option to disable them completely. Other options allow you to specify a group owner of the folder (group), the folder owner (owner), plus mount options. There are others but these are the main ones.

You can disable the default share with the following command:

 
config.vm.synced_folder ".", "/vagrant", disabled: true

Other options are configured as follows:

 
config.vm.synced_folder "src/", "/srv/website",
  owner: "apache", group: "apache"

NFS synced folders

When using Vagrant on a Linux host, synced folders use NFS (with the exception of the default share which uses rsync; see below) so you must have NFS installed on the Vagrant host, and the guests also need NFS support installation. To use NFS with non-Linux hosts, simply specify the folder type as ‘nfs’:

 
config.vm.synced_folder ".", "/vagrant", type: "nfs"

RSync synced folders

These are the easiest to use as they usually work without any intervention on a Linux host. This is a one-way sync from host to guest performed at startup (vagrant up) or after a vagrant reload command is issued. The default share of the Vagrant project directory is done with rsync. To configure a synced folder with rsync, specify the type as ‘rsync’:

 
config.vm.synced_folder ".", "/vagrant", type: "rsync"

Posted on Leave a comment

Getting started with Fedora CoreOS

This has been called the age of DevOps, and operating systems seem to be getting a little bit less attention than tools are. However, this doesn’t mean that there has been no innovation in operating systems. [Edit: The diversity of offerings from the plethora of distributions based on the Linux kernel is a fine example of this.] Fedora CoreOS has a specific philosophy of what an operating system should be in this age of DevOps.

Fedora CoreOS’ philosophy

Fedora CoreOS (FCOS) came from the merging of CoreOS Container Linux and Fedora Atomic Host. It is a minimal and monolithic OS focused on running containerized applications. Security being a first class citizen, FCOS provides automatic updates and comes with SELinux hardening.

For automatic updates to work well they need to be very robust. The goal being that servers running FCOS won’t break after an update. This is achieved by using different release streams (stable, testing and next). Each stream is released every 2 weeks and content is promoted from one stream to the other (next -> testing -> stable). That way updates landing in the stable stream have had the opportunity to be tested over a long period of time.

Getting Started

For this example let’s use the stable stream and a QEMU base image that we can run as a virtual machine. You can use coreos-installer to download that image.

From your (Workstation) terminal, run the following commands after updating the link to the image. [Edit: On Silverblue the container based coreos tools are the simplest method to try. Instructions can be found at https://docs.fedoraproject.org/en-US/fedora-coreos/tutorial-setup/ , in particular “Setup with Podman or Docker”.]

$ sudo dnf install coreos-installer
$ coreos-installer download --image-url https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/32.20200907.3.0/x86_64/fedora-coreos-32.20200907.3.0-qemu.x86_64.qcow2.xz
$ xz -d fedora-coreos-32.20200907.3.0-qemu.x86_64.qcow2.xz
$ ls
fedora-coreos-32.20200907.3.0-qemu.x86_64.qcow2

Create a configuration

To customize a FCOS system, you need to provide a configuration file that will be used by Ignition to provision the system. You may use this file to configure things like creating a user, adding a trusted SSH key, enabling systemd services, and more.

The following configuration creates a ‘core’ user and adds an SSH key to the authorized_keys file. It is also creating a systemd service that uses podman to run a simple hello world container.

version: "1.0.0"
variant: fcos
passwd: users: - name: core ssh_authorized_keys: - ssh-ed25519 my_public_ssh_key_hash fcos_key
systemd: units: - contents: | [Unit] Description=Run a hello world web service After=network-online.target Wants=network-online.target [Service] ExecStart=/bin/podman run --pull=always --name=hello --net=host -p 8080:8080 quay.io/cverna/hello ExecStop=/bin/podman rm -f hello [Install] WantedBy=multi-user.target enabled: true name: hello.service

After adding your SSH key in the configuration save it as config.yaml. Next use the Fedora CoreOS Config Transpiler (fcct) tool to convert this YAML configuration into a valid Ignition configuration (JSON format).

Install fcct directly from Fedora’s repositories or get the binary from GitHub.

$ sudo dnf install fcct
$ fcct -output config.ign config.yaml

Install and run Fedora CoreOS

To run the image, you can use the libvirt stack. To install it on a Fedora system using the dnf package manager

$ sudo dnf install @virtualization

Now let’s create and run a Fedora CoreOS virtual machine

$ chcon --verbose unconfined_u:object_r:svirt_home_t:s0 config.ign
$ virt-install --name=fcos \
--vcpus=2 \
--ram=2048 \
--import \
--network=bridge=virbr0 \
--graphics=none \
--qemu-commandline="-fw_cfg name=opt/com.coreos/config,file=${PWD}/config.ign" \
--disk=size=20,backing_store=${PWD}/fedora-coreos-32.20200907.3.0-qemu.x86_64.qcow2

Once the installation is successful, some information is displayed and a login prompt is provided.

Fedora CoreOS 32.20200907.3.0
Kernel 5.8.10-200.fc32.x86_64 on an x86_64 (ttyS0)
SSH host key: SHA256:BJYN7AQZrwKZ7ZF8fWSI9YRhI++KMyeJeDVOE6rQ27U (ED25519)
SSH host key: SHA256:W3wfZp7EGkLuM3z4cy1ZJSMFLntYyW1kqAqKkxyuZrE (ECDSA)
SSH host key: SHA256:gb7/4Qo5aYhEjgoDZbrm8t1D0msgGYsQ0xhW5BAuZz0 (RSA)
ens2: 192.168.122.237 fe80::5054:ff:fef7:1a73
Ignition: user provided config was applied
Ignition: wrote ssh authorized keys file for user: core

The Ignition configuration file did not provide any password for the core user, therefore it is not possible to login directly via the console. (Though, it is possible to configure a password for users via Ignition configuration.)

Use Ctrl + ] key combination to exit the virtual machine’s console. Then check if the hello.service is running.

$ curl http://192.168.122.237:8080
Hello from Fedora CoreOS!

Using the preconfigured SSH key, you can also access the VM and inspect the services running on it.

$ ssh core@192.168.122.237
$ systemctl status hello
● hello.service - Run a hello world web service
Loaded: loaded (/etc/systemd/system/hello.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2020-10-28 10:10:26 UTC; 42s ago

zincati, rpm-ostree and automatic updates

The zincati service drives rpm-ostreed with automatic updates.
Check which version of Fedora CoreOS is currently running on the VM, and check if Zincati has found an update.

$ ssh core@192.168.122.237
$ rpm-ostree status
State: idle
Deployments:
● ostree://fedora:fedora/x86_64/coreos/stable
Version: 32.20200907.3.0 (2020-09-23T08:16:31Z)
Commit: b53de8b03134c5e6b683b5ea471888e9e1b193781794f01b9ed5865b57f35d57
GPGSignature: Valid signature by 97A1AE57C3A2372CCA3A4ABA6C13026D12C944D0
$ systemctl status zincati
● zincati.service - Zincati Update Agent
Loaded: loaded (/usr/lib/systemd/system/zincati.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2020-10-28 13:36:23 UTC; 7s ago
…
Oct 28 13:36:24 cosa-devsh zincati[1013]: [INFO ] initialization complete, auto-updates logic enabled
Oct 28 13:36:25 cosa-devsh zincati[1013]: [INFO ] target release '32.20201004.3.0' selected, proceeding to stage it ... zincati reboot ...

After the restart, let’s remote login once more to check the new version of Fedora CoreOS.

$ ssh core@192.168.122.237
$ rpm-ostree status
State: idle
Deployments:
● ostree://fedora:fedora/x86_64/coreos/stable
Version: 32.20201004.3.0 (2020-10-19T17:12:33Z)
Commit: 64bb377ae7e6949c26cfe819f3f0bd517596d461e437f2f6e9f1f3c24376fd30
GPGSignature: Valid signature by 97A1AE57C3A2372CCA3A4ABA6C13026D12C944D0
ostree://fedora:fedora/x86_64/coreos/stable
Version: 32.20200907.3.0 (2020-09-23T08:16:31Z)
Commit: b53de8b03134c5e6b683b5ea471888e9e1b193781794f01b9ed5865b57f35d57
GPGSignature: Valid signature by 97A1AE57C3A2372CCA3A4ABA6C13026D12C944D0

rpm-ostree status now shows 2 versions of Fedora CoreOS, the one that came in the QEMU image, and the latest one received from the update. By having these 2 versions available, it is possible to rollback to the previous version using the rpm-ostree rollback command.

Finally, you can make sure that the hello service is still running and serving content.

$ curl http://192.168.122.237:8080
Hello from Fedora CoreOS!

More information: Fedora CoreOS updates

Deleting the Virtual Machine

To clean up afterwards, the following commands will delete the VM and associated storage.

$ virsh destroy fcos
$ virsh undefine --remove-all-storage fcos

Conclusion

Fedora CoreOS provides a solid and secure operating system tailored to run applications in containers. It excels in a DevOps environment which encourages the hosts to be provisioned using declarative configuration files. Automatic updates and the ability to rollback to a previous version of the OS, bring a peace of mind during the operation of a service.

Learn more about Fedora CoreOS by following the tutorials available in the project’s documentation.

Posted on Leave a comment

What’s new in Fedora 33 Workstation

Fedora 33 Workstation is the latest release of our free, leading-edge operating system. You can download it from the official website here right now. There are several new and noteworthy changes in Fedora 33 Workstation. Read more details below.

GNOME 3.38

Fedora 33 Workstation includes the latest release of GNOME Desktop Environment for users of all types. GNOME 3.38 in Fedora 33 Workstation includes many updates and improvements, including:

A new GNOME Tour app

New users are now greeted by “a new Tour application, highlighting the main functionality of the desktop and providing first time users a nice welcome to GNOME.”

The new GNOME Tour application in Fedora 33

Drag to reorder apps

GNOME 3.38 replaces the previously split Frequent and All apps views with a single customizable and consistent view that allows you to reorder apps and organize them into custom folders. Simply click and drag to move apps around.

GNOME 3.38 Drag to Reorder

Improved screen recording

The screen recording infrastructure in GNOME Shell has been improved to take advantage of PipeWire and kernel APIs. This will help reduce resource consumption and improve responsiveness.

GNOME 3.38 also provides many additional features and enhancements. Check out the GNOME 3.38 Release Notes for further information.


B-tree file system

As announced previously, new installations of Fedora 33 will default to using Btrfs. Features and enhancements are added to Btrfs with each new kernel release. The change log has a complete summary of the features that each new kernel version brings to Btrfs.


Swap on ZRAM

Anaconda and Fedora IoT have been using swap-on-zram by default for years. With Fedora 33, swap-on-zram will be enabled by default instead of a swap partition. Check out the Fedora wiki page for more details about swap-on-zram.


Nano by default

Fresh Fedora 33 installations will set the EDITOR environment variable to nano by default. This change affects several command line tools that spawn a text editor when they require user input. With earlier releases, this environment variable default was unspecified, leaving it up to the individual application to pick a default editor. Typically, applications would use vi as their default editor due to it being a small application that is traditionally available on the base installation of most Unix/Linux operating systems. Since Fedora 33 includes nano in its base installation, and since nano is more intuitive for a beginning user to use, Fedora 33 will use nano by default. Users who want vi can, of course, override the value of the EDITOR variable in their own environment. See the Fedora change request for more details.

Posted on Leave a comment

Announcing the release of Fedora 33 Beta

The Fedora Project is pleased to announce the immediate availability of Fedora 33 Beta, the next step towards our planned Fedora 33 release at the end of October.

Download the prerelease from our Get Fedora site:

Or, check out one of our popular variants, including KDE Plasma, Xfce, and other desktop environments, as well as images for ARM devices like the Raspberry Pi 2 and 3:

Beta Release Highlights

BTRFS by default

All of the desktop variants of Fedora 33 Beta – including Fedora Workstation, Fedora KDE, and others – will use BTRFS as the default filesystem. This is a big shift: we’ve been using ext filesystems since Fedora Core 1. BTRFS offers some really compelling features for users, including transparent compression and copy-on-write. For Fedora 33, we’re only defaulting to the basic features of BTRFS, but we’ll build out the default feature set to include more goodies in future releases.

Fedora Workstation

Fedora 33 Workstation Beta includes GNOME 3.38, the newest release of the GNOME desktop environment. It is full of performance enhancements and improvements. GNOME 3.38 now includes a welcome tour after installation to help users learn about all of the great features this desktop environment offers. It also improves screen recording and multi-monitor support. For a full list of GNOME 3.38 highlights, see the release notes.

Fedora 33 Workstation Beta also provides better thermal management and peak performance on Intel CPUs by including thermald in the default install. And because your desktop should be fun to look at as well as easy to use, Fedora 33 Workstation Beta includes animated backgrounds (a time-of-day slideshow with hue changes) by default.

Fedora IoT

With Fedora 33 Beta, Fedora IoT is now an official Fedora Edition. Fedora IoT is geared toward edge devices on a wide variety of hardware platforms. It is based on ostree technology for safe update and rollback. It includes the Platform AbstRaction for SECurity (PARSEC), an open-source initiative to provide a common API to hardware security and cryptographic services in a platform-agnostic way.

Other updates

Fedora 33 Beta defaults to using nano as the editor. nano is a more approachable editor that is more welcoming to new users. Of course, those who want to use vim, emacs, or any other editor are still able to.

Fedora 33 KDE Beta enables earlyOOM by default, as Fedora Workstation did in the previous release. This helps improve system responsiveness on systems that are running out of memory. 

Fedora 33 Beta includes updated versions of many popular packages like Ruby, Python, and Perl. .NET Core will now be available on Fedora on aarch64, in addition to x86_64. We’re also dropping a few older versions: Python 2.6 and Python 3.4 are retired. The httpd module mod_php is also dropped, as php-fpm is a more performant and more secure PHP module.

Testing needed

Since this is a Beta release, we expect that you may encounter bugs or missing features. To report issues encountered during testing, contact the Fedora QA team via the mailing list or in the #fedora-qa channel on Freenode IRC. As testing progresses, common issues are tracked on the Common F33 Bugs page.

For tips on reporting a bug effectively, read how to file a bug.

What is the Beta Release?

A Beta release is code-complete and bears a very strong resemblance to the final release. If you take the time to download and try out the Beta, you can check and make sure the things that are important to you are working. Every bug you find and report doesn’t just help you, it improves the experience of millions of Fedora users worldwide! Together, we can make Fedora rock-solid. We have a culture of coordinating new features and pushing fixes upstream as much as we can. Your feedback improves not only Fedora, but Linux and free software as a whole.

More information

For more detailed information about what’s new on Fedora 33 Beta release, you can consult the Fedora 33 Change set. It contains more technical information about the new packages and improvements shipped with this release.

Posted on Leave a comment

Installing and running Vagrant using qemu-kvm

Vagrant is a brilliant tool, used by DevOps professionals, coders, sysadmins and regular geeks to stand up repeatable infrastructure for development and testing. From their website:

Vagrant is a tool for building and managing virtual machine environments in a single workflow. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases production parity, and makes the “works on my machine” excuse a relic of the past.

If you are already familiar with the basics of Vagrant, the documentation provides a better reference build for all available features and internals.

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

https://www.vagrantup.com/intro

This guide will walk through the steps necessary to get Vagrant working on a Fedora-based machine.

I started with a minimal install of Fedora Server as this reduces the memory footprint of the host OS, but if you already have a working Fedora machine, either Server or Workstation, then this should still work.

Check the machine supports virtualisation:

$ sudo lscpu | grep Virtualization Virtualization:                  VT-x Virtualization type:             full

Install qemu-kvm:

sudo dnf install qemu-kvm libvirt libguestfs-tools virt-install rsync

Enable and start the libvirt daemon:

sudo systemctl enable --now libvirtd

Install Vagrant:

sudo dnf install vagrant

Install the Vagrant libvirtd plugin:

sudo vagrant plugin install vagrant-libvirt

Add a box

vagrant box add fedora/32-cloud-base --provider=libvirt

Create a minimal Vagrantfile to test

$ mkdir vagrant-test $ cd vagrant-test $ vi Vagrantfile

Vagrant.configure("2") do |config| config.vm.box = "fedora/32-cloud-base" end

Note the capitalisation of the file name and in the file itself.

Check the file:

vagrant status

Current machine states: default not created (libvirt) The Libvirt domain is not created. Run 'vagrant up' to create it.

Start the box:

vagrant up

Connect to your new machine:

vagrant ssh

That’s it – you now have Vagrant working on your Fedora machine.

To stop the machine, use vagrant halt. This simply halts the machine but leaves the VM and disk in place.
To shut it down and delete it use vagrant destroy. This will remove the whole machine and any changes you’ve made in it.

Next steps

You don’t need to download boxes before issuing the vagrant up command – you can specify the box and the provider in the Vagrantfile directly and Vagrant will download it if it’s not already there. Below is an example which also sets the amount memory and number of CPUs:

# -*- mode: ruby -*-
# vi: set ft=ruby : Vagrant.configure("2") do |config| config.vm.box = "fedora/32-cloud-base" config.vm.provider :libvirt do |libvirt| libvirt.cpus = 1 libvirt.memory = 1024 end
end

For more information on using Vagrant, creating your own machines and using different boxes, see the official documentation at https://www.vagrantup.com/docs

There is a huge repository of boxes ready to download and use, and the official location for these is Vagrant Cloud – https://app.vagrantup.com/boxes/search. Some are basic operating systems and some offer complete functionality such as databases, web servers etc.

Posted on Leave a comment

Incremental backups with Btrfs snapshots

Snapshots are an interesting feature of Btrfs. A snapshot is a copy of a subvolume. Taking a snapshot is immediate. However, taking a snapshot is not like performing a rsync or a cp, and a snapshot doesn’t occupy space as soon as it is created.

Editors note: From the BTRFS Wiki – A snapshot is simply a subvolume that shares its data (and metadata) with some other subvolume, using Btrfs’s COW capabilities.

Occupied space will increase alongside the data changes in the original subvolume or in the snapshot itself, if it is writeable. Added/modified files, and deleted files in the subvolume still reside in the snapshots. This is a convenient way to perform backups.

Using snapshots for backups

A snapshot resides on the same disk where the subvolume is located. You can browse it like a regular directory and recover a copy of a file as it was when the snapshot was performed. By the way, a snapshot on the same disk of the snapshotted subvolume is not an ideal backup strategy: if the hard disk broke, snapshots will be lost as well. An interesting feature of snapshots is the ability to send them to another location. The snapshot can be sent to an external hard drive or to a remote system via SSH (the destination filesystems need to be formatted as Btrfs as well). To do this, the commands btrfs send and btrfs receive are used.

Taking a snapshot

In order to use the send and the receive commands, it is important to create the snapshot as read-only, and snapshots are writeable by default.

The following command will take a snapshot of the /home subvolume. Note the -r flag for readonly.

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day1

Instead of day1, the snapshot name can be the current date, like home-$(date +%Y%m%d). Snapshots look like regular subdirectories. You can place them wherever you like. The directory /.snapshots could be a good choice to keep them neat and to avoid confusion.

Editors note: Snapshots will not take recursive snapshots of themselves. If you create a snapshot of a subvolume, every subvolume or snapshot that the subvolume contains is mapped to an empty directory of the same name inside the snapshot.

Backup using btrfs send

In this example the destination Btrfs volume in the USB drive is mounted as /run/media/user/mydisk/bk . The command to send the snapshot to the destination is:

sudo btrfs send /.snapshots/home-day1 | sudo btrfs receive /run/media/user/mydisk/bk

This is called initial bootstrapping, and it corresponds to a full backup. This task will take some time, depending on the size of the /home directory. Obviously, subsequent incremental sends will take a shorter time.

Incremental backup

Another useful feature of snapshots is the ability to perform the send task in an incremental way. Let’s take another snapshot.

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day2

In order to perform the send task incrementally, you need to specify the previous snapshot as a base and this snapshot has to exist in the source and in the destination. Please note the -p option.

sudo btrfs send -p /.snapshot/home-day1 /.snapshot/home-day2 | sudo btrfs receive /run/media/user/mydisk/bk

And again (the day after):

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day3
sudo btrfs send -p /.snapshot/home-day2 /.snapshot/home-day3 | sudo btrfs receive /run/media/user/mydisk/bk

Cleanup

Once the operation is complete, you can keep the snapshot. But if you perform these operations on a daily basis, you could end up with a lot of them. This could lead to confusion and potentially a lot of used space on your disks. So it is a good advice to delete some snapshots if you think you don’t need them anymore.

Keep in mind that in order to perform an incremental send you need at least the last snapshot. This snapshot must be present in the source and in the destination.

sudo btrfs subvolume delete /.snapshot/home-day1
sudo btrfs subvolume delete /.snapshot/home-day2
sudo btrfs subvolume delete /run/media/user/mydisk/bk/home-day1
sudo btrfs subvolume delete /run/media/user/mydisk/bk/home-day2

Note: the day 3 snapshot was preserved in the source and in the destination. In this way, tomorrow (day 4), you can perform a new incremental btrfs send.

As some final advice, if the USB drive has a bunch of space, you could consider maintaining multiple snapshots in the destination, while in the source disk you would keep only the last one.

Posted on Leave a comment

Btrfs Coming to Fedora 33

by Chris Murphy and Langdon White


User data is the most important thing on a computer. Whether it’s source code for the next big release, family pictures, a music library, or anything else, you want it to be safe. Changing the default file system is not a change to make casually. The Fedora Project is changing the default file system for desktop variants (Fedora Workstation, Fedora KDE, etc), for the first time since Fedora 11. Btrfs will replace ext4 as the default filesystem in Fedora 33.

What does this mean for me?

Btrfs is a stable and mature file system with modern features: data integrity, optimizations for SSDs, compression, cheap writable snapshots, multiple device support, and more.

The switch to Btrfs will use a single-partition disk layout, and Btrfs’ built-in volume management. The previous default layout placed constraints on disk usage that can be a difficult adjustment for novice users. Btrfs solves this problem by avoiding it.

As a techie, you may have heard of bit rot, and memory bit flips. Data can be corrupted by a multitude of physical factors, even cosmic rays from the sun! Before an SSD fails outright, often it will return either zeros or garbage, instead of your data. Btrfs safeguards your data with checksums, and performs verification on every read. Corrupt data is never given to your programs, and it won’t replicate into your backups to be discovered another day (or year).

Btrfs uses a “copy-on-write” model: your data and the file system itself are never overwritten. This enhances crash-safeness. When copying a file, Btrfs does not write new data until you actually change the old data, saving space.

In fact, users will save more space when using Btrfs’ transparent compression. Compressing data reduces total writes, saves space, and extends flash drive life. In many cases, it can also improve performance. Compression can be enabled on an entire file system, or per subvolume, directory, and even per file. You will be able to opt-in to using compression in Fedora 33. And it’s one of the features we’re looking forward to taking advantage of by default in future Fedora releases.

Trusted

Facebook uses Btrfs on millions of machines in production. They compare its stability to ext4 and XFS (another file system available in Fedora). In fact, they use Btrfs to “improve” the quality of the consumer storage hardware that they use in production. Btrfs detects problems before the hardware fails.

(open)SUSE have been using Btrfs for many years now, including SUSE Linux Enterprise Server (SLES). You can’t imagine a company that provides support to customers shipping software that they don’t completely trust.

What’s next?

The Change is code complete, and has been testable in Rawhide as the default file system since early July. Btrfs has been explicitly supported in Fedora since 2012. This is expected to be a transparent change for most users, however it is still significant. Fedora will ensure we deliver the dependable and reliable experience Fedora users have come to expect.

Special thanks to: Ben Cotton, Michael Catanzaro, and the Fedora Workstation Working Group for contributing to this article.

Posted on Leave a comment

Come test a new release of pipenv, the Python development tool

Pipenv is a tool that helps Python developers maintain isolated virtual environments with specifacally defined set of dependencies to achieve reproducible development and deployment environments. It is similar to tools for different programming languages, such as bundler, composer, npm, cargo, yarn, etc.

A new version of pipenv, 2020.6.2, has been recently released. It is now available in Fedora 33 and rawhide. For older Fedoras, the maintainers decided to package it in COPR to be tested first. So come try it out, before they push it into stable Fedora versions. The new version doesn’t bring any fancy new features, but after two years of development it fixes a lot of problems and does many things differently under the hood. What worked for you previously should continue to work, but might behave slightly differently.

How to get it

If you are already running Fedora 33 or rawhide, run $ sudo dnf upgrade pipenv or $ sudo dnf install pipenv and you’ll get the new version.

On Fedora 31 or Fedora 32, you’ll need to use a copr repository until such time comes that the tested package will be updated in the official place. To enable the repository, run:

$ sudo dnf copr enable @python/pipenv

Then to upgrade pipenv to the new version, run:

$ sudo dnf upgrade pipenv

Or, if you haven’t installed it yet, install it via:

$ sudo dnf install pipenv

In case you ever need to roll back to the officially maintained version, you can run:

$ sudo dnf copr disable @python/pipenv
$ sudo dnf distro-sync pipenv

COPR is not officially supported by Fedora infrastructure. Use packages at your own risk.

How to use it

If you already have projects managed by the older version of pipenv, you should be able to use the new version in its place without issues. Let us know if something breaks.

If you are not yet familiar with pipenv or want to start a new project, here is a quick guide:

Create a working directory:

$ mkdir new-project && cd new-project

Initialize pipenv with Python 3:

$ pipenv --three

Install the packages you want, e.g.:

$ pipenv install six

Generate a Pipfile.lock file:

$ pipenv lock

Now you can commit the created Pipfile and Pipfile.lock files into your version control system (e.g. git) and others can use this command in the cloned repository to get the same environment:

$ pipenv install

See pipenv’s documentation for more examples.

How to report problems

If you encounter any problems with the new pipenv version, please report any issues in Fedora’s Bugzilla. The maintainers of the pipenv package in official Fedora repositories and in the copr repository are the same. Please indicate in the text that the report is regarding this new version.

Posted on Leave a comment

TCP window scaling, timestamps and SACK

The Linux TCP stack has a myriad of sysctl knobs that allow to change its behavior.  This includes the amount of memory that can be used for receive or transmit operations, the maximum number of sockets and optional features and protocol extensions.

There are  multiple articles that recommend to disable TCP extensions, such as timestamps or selective acknowledgments (SACK) for various “performance tuning” or “security” reasons.

This article provides background on what these extensions do, why they
are enabled by default, how they relate to one another and why it is normally a bad idea to turn them off.

TCP Window scaling

The data transmission rate that TCP can sustain is limited by several factors. Some of these are:

  • Round trip time (RTT).  This is the time it takes for a packet to get to the destination and a reply to come back. Lower is better.
  • lowest link speed of the network paths involved
  • frequency of packet loss
  • the speed at which new data can be made available for transmission
    For example, the CPU needs to be able to pass data to the network adapter fast enough. If the CPU needs to encrypt the data first, the adapter might have to wait for new data. In similar fashion disk storage can be a bottleneck if it can’t read the data fast enough.
  • The maximum possible size of the TCP receive window. The receive window determines how much data (in bytes) TCP can transmit before it has to wait for the receiver to report reception of that data. This is announced by the receiver. The receiver will constantly update this value as it reads and acknowledges reception of the incoming data. The receive windows current value is contained in the TCP header that is part of every segment sent by TCP. The sender is thus aware of the current receive window whenever it receives an acknowledgment from the peer. This means that the higher the round-trip time, the longer it takes for sender to get receive window updates.

TCP is limited to at most 64 kilobytes of unacknowledged (in-flight) data. This is not even close to what is needed to sustain a decent data rate in most networking scenarios. Let us look at some examples.

Theoretical data rate

With a round-trip-time of 100 milliseconds, TCP can transfer at most 640 kilobytes per second. With a 1 second delay, the maximum theoretical data rate drops down to only 64 kilobytes per second.

This is because of the receive window. Once 64kbyte of data have been sent the receive window is already full.  The sender must wait until the peer informs it that at least some of the data has been read by the application. 

The first segment sent reduces the TCP window by the size of that segment. It takes one round-trip before an update of the receive window value will become available. When updates arrive with a 1 second delay, this results in a 64 kilobyte limit even if the link has plenty of bandwidth available.

In order to fully utilize a fast network with several milliseconds of delay, a window size larger than what classic TCP supports is a must. The ’64 kilobyte limit’ is an artifact of the protocols specification: The TCP header reserves only 16bits for the receive window size. This allows receive windows of up to 64KByte. When the TCP protocol was originally designed, this size was not seen as a limit.

Unfortunately, its not possible to just change the TCP header to support a larger maximum window value. Doing so would mean all implementations of TCP would have to be updated simultaneously or they wouldn’t understand one another anymore. To solve this, the interpretation of the receive window value is changed instead.

The ‘window scaling option’ allows to do this while keeping compatibility to existing implementations.

TCP Options: Backwards-compatible protocol extensions

TCP supports optional extensions. This allows to enhance the protocol with new features without the need to update all implementations at once. When a TCP initiator connects to the peer, it also send a list of supported extensions. All extensions follow the same format: an unique option number followed by the length of the option and the option data itself.

The TCP responder checks all the option numbers contained in the connection request. If it does not understand an option number it skips
‘length’ bytes of data and checks the next option number. The responder omits those it did not understand from the reply. This allows both the sender and receiver to learn the common set of supported options.

With window scaling, the option data always consist of a single number.

The window scaling option

 
Window Scale option (WSopt): Kind: 3, Length: 3
    +---------+---------+---------+
    | Kind=3  |Length=3 |shift.cnt|
    +---------+---------+---------+
         1         1         1

The window scaling option tells the peer that the receive window value found in the TCP header should be scaled by the given number to get the real size.

For example, a TCP initiator that announces a window scaling factor of 7 tries to instruct the responder that any future packets that carry a receive window value of 512 really announce a window of 65536 byte. This is an increase by a factor of 128. This would allow a maximum TCP Window of 8 Megabytes.

A TCP responder that does not understand this option ignores it. The TCP packet sent in reply to the connection request (the syn-ack) then does not contain the window scale option. In this case both sides can only use a 64k window size. Fortunately, almost every TCP stack supports and enables this option by default, including Linux.

The responder includes its own desired scaling factor. Both peers can use a different number. Its also legitimate to announce a scaling factor of 0. This means the peer should treat the receive window value it receives verbatim, but it allows scaled values in the reply direction — the recipient can then use a larger receive window.

Unlike SACK or TCP timestamps, the window scaling option only appears in the first two packets of a TCP connection, it cannot be changed afterwards. It is also not possible to determine the scaling factor by looking at a packet capture of a connection that does not contain the initial connection three-way handshake.

The largest supported scaling factor is 14. This allows TCP window sizes
of up to one Gigabyte.

Window scaling downsides

It can cause data corruption in very special cases. Before you disable the option – it is impossible under normal circumstances. There is also a solution in place that prevents this. Unfortunately, some people disable this solution without realizing the relationship with window scaling. First, let’s have a look at the actual problem that needs to be addressed. Imagine the following sequence of events:

  1. The sender transmits segments: s_1, s_2, s_3, … s_n
  2.  The receiver sees: s_1, s_3, .. s_n and sends an acknowledgment for s_1.
  3.  The sender considers s_2 lost and sends it a second time. It also sends new data contained in segment s_n+1.
  4.  The receiver then sees: s_2, s_n+1, s_2: the packet s_2 is received twice.

This can happen for example when a sender triggers re-transmission too early. Such erroneous re-transmits are never a problem in normal cases, even with window scaling. The receiver will just discard the duplicate.

Old data to new data

The TCP sequence number can be at most 4 Gigabyte. If it becomes larger than this, the sequence wraps back to 0 and then increases again. This is not a problem in itself, but if this occur fast enough then the above scenario can create an ambiguity.

If a wrap-around occurs at the right moment, the sequence number s_2 (the re-transmitted packet) can already be larger than s_n+1. Thus, in the last step (4), the receiver may interpret this as: s_2, s_n+1, s_n+m, i.e. it could view the ‘old’ packet s_2 as containing new data.

Normally, this won’t happen because a ‘wrap around’ occurs only every couple of seconds or minutes even on high bandwidth links. The interval between the original and a unneeded re-transmit will be a lot smaller.

For example,with a transmit speed of 50 Megabytes per second, a
duplicate needs to arrive more than one minute late for this to become a problem. The sequence numbers do not wrap fast enough for small delays to induce this problem.

Once TCP approaches ‘Gigabyte per second’ throughput rates, the sequence numbers can wrap so fast that even a delay by only a few milliseconds can create duplicates that TCP cannot detect anymore. By solving the problem of the too small receive window, TCP can now be used for network speeds that were impossible before – and that creates a new, albeit rare problem. To safely use Gigabytes/s speed in environments with very low RTT receivers must be able to detect such old duplicates without relying on the sequence number alone.

TCP time stamps

A best-before date

In the most simple terms, TCP timestamps just add a time stamp to the packets to resolve the ambiguity caused by very fast sequence number wrap around. If a segment appears to contain new data, but its timestamp is older than the last in-window packet, then the sequence number has wrapped and the ”new” packet is actually an older duplicate. This resolves the ambiguity of re-transmits even for extreme corner cases.

But this extension allows for more than just detection of old packets. The other major feature made possible by TCP timestamps are more precise round-trip time measurements (RTTm).

A need for precise round-trip-time estimation

When both peers support timestamps,  every TCP segment carries two additional numbers: a timestamp value and a timestamp echo.

 
TCP Timestamp option (TSopt): Kind: 8, Length: 10
+-------+----+----------------+-----------------+
|Kind=8 | 10 |TS Value (TSval)|EchoReply (TSecr)|
+-------+----+----------------+-----------------+
    1      1         4                4

An accurate RTT estimate is crucial for TCP performance. TCP automatically re-sends data that was not acknowledged. Re-transmission is triggered by a timer: If it expires, TCP considers one or more packets that it has not yet received an acknowledgment for to be lost. They are then sent again.

But “has not been acknowledged” does not mean the segment was lost. It is also possible that the receiver did not send an acknowledgment so far or that the acknowledgment is still in flight. This creates a dilemma: TCP must wait long enough for such slight delays to not matter, but it can’t wait for too long either.

Low versus high network delay

In networks with a high delay, if the timer fires too fast, TCP frequently wastes time and bandwidth with unneeded re-sends.

In networks with a low delay however,  waiting for too long causes reduced throughput when a real packet loss occurs. Therefore, the timer should expire sooner in low-delay networks than in those with a high delay. The tcp retransmit timeout therefore cannot use a fixed constant value as a timeout. It needs to adapt the value based on the delay that it experiences in the network.

Round-trip time measurement

TCP picks a retransmit timeout that is based on the expected round-trip time (RTT). The RTT is not known in advance. RTT is estimated by measuring the delta between the time a segment is sent and the time TCP receives an acknowledgment for the data carried by that segment.

This is complicated by several factors.

  • For performance reasons, TCP does not generate a new acknowledgment for every packet it receives. It waits  for a very small amount of time: If more segments arrive, their reception can be acknowledged with a single ACK packet. This is called “cumulative ACK”.
  •  The round-trip-time is not constant. This is because of a myriad of factors. For example, a client might be a mobile phone switching to different base stations as its moved around. Its also possible that packet switching takes longer when link or CPU utilization increases.
  • a packet that had to be re-sent must be ignored during computation. This is because the sender cannot tell if the ACK for the re-transmitted segment is acknowledging the original transmission (that arrived after all) or the re-transmission.

This last point is significant: When TCP is busy recovering from a loss, it may only receives ACKs for re-transmitted segments. It then can’t measure (update) the RTT during this recovery phase. As a consequence it can’t adjust the re-transmission timeout, which then keeps growing exponentially. That’s a pretty specific case (it assumes that other mechanisms such as fast retransmit or SACK did not help). Nevertheless, with TCP timestamps, RTT evaluation is done even in this case.

If the extension is used, the peer reads the timestamp value from the TCP segments extension space and stores it locally. It then places this value in all the segments it sends back as the “timestamp echo”.

Therefore the option carries two timestamps: Its senders own timestamp and the most recent timestamp it received from the peer. The “echo timestamp” is used by the original sender to compute the RTT. Its the delta between its current timestamp clock and what was reflected in the “timestamp echo”.

Other timestamp uses

TCP timestamps even have other uses beyond PAWS and RTT measurements. For example it becomes possible to detect if a retransmission was unnecessary. If the acknowledgment carries an older timestamp echo, the acknowledgment was for the initial packet, not the re-transmitted one.

Another, more obscure use case for TCP timestamps is related to the TCP syn cookie feature.

TCP connection establishment on server side

When connection requests arrive faster than a server application can accept the new incoming connection, the connection backlog will eventually reach its limit. This can occur because of a mis-configuration of the system or a bug in the application. It also happens when one or more clients send connection requests without reacting to the ‘syn ack’ response. This fills the connection queue with incomplete connections. It takes several seconds for these entries to time out. This is called a “syn flood attack”.

TCP timestamps and TCP syn cookies

Some TCP stacks allow to accept new connections even if the queue is full. When this happens, the Linux kernel will print a prominent message to the system log:

Possible SYN flooding on port P. Sending Cookies. Check SNMP counters.

This mechanism bypasses the connection queue entirely. The information that is normally stored in the connection queue is encoded into the SYN/ACK responses TCP sequence number. When the ACK comes back, the queue entry can be rebuilt from the sequence number.

The sequence number only has limited space to store information. Connections established using the ‘TCP syn cookie’ mechanism can not support TCP options for this reason.

The TCP options that are common to both peers can be stored in the timestamp, however. The ACK packet reflects the value back in the timestamp echo field which allows to recover the agreed-upon TCP options as well. Else, cookie-connections are restricted by the standard 64 kbyte receive window.

Common myths – timestamps are bad for performance

Unfortunately some guides recommend disabling TCP timestamps to reduce the number of times the kernel needs to access the timestamp clock to get the current time. This is not correct. As explained before, RTT estimation is a necessary part of TCP. For this reason, the kernel always takes a microsecond-resolution time stamp when a packet is received/sent.

Linux re-uses the clock timestamp taken for the RTT estimation for the remainder of the packet processing step. This also avoids the extra clock access to add a timestamp to an outgoing TCP packet.

The entire timestamp option only requires 10 bytes of TCP option space in each packet, this is not a significant decrease in space available for packet payload.

common myths – timestamps are a security problem

Some security audit tools and (older) blog posts recommend to disable TCP
timestamps because they allegedly leak system uptime: This would then allow to estimate the patch level of the system/kernel. This was true in the past: The timestamp clock is based on a constantly increasing value that starts at a fixed value on each system boot. A timestamp value would give a estimate as to how long the machine has been running (uptime).

As of Linux 4.12 TCP timestamps do not reveal the uptime anymore. All timestamp values sent use a peer-specific offset. Timestamp values also wrap every 49 days.

In other words, connections from or to address “A” see a different timestamp than connections to the remote address “B”.

Run sysctl net.ipv4.tcp_timestamps=2 to disable the randomization offset. This makes analyzing packet traces recorded by tools like wireshark or tcpdump easier – packets sent from the host then all have the same clock base in their TCP option timestamp.  For normal operation the default setting should be left as-is.

Selective Acknowledgments

TCP has problems if several packets in the same window of data are lost. This is because TCP Acknowledgments are cumulative, but only for packets
that arrived in-sequence. Example:

  • Sender transmits segments s_1, s_2, s_3, … s_n
  • Sender receives ACK for s_2
  • This means that both s_1 and s_2 were received and the
    sender no longer needs to keep these segments around.
  • Should s_3 be re-transmitted? What about s_4? s_n?

The sender waits for a “retransmission timeout” or ‘duplicate ACKs’ for s_2 to arrive. If a retransmit timeout occurs or several duplicate ACKs for s_2 arrive, the sender transmits s_3 again.

If the sender receives an acknowledgment for s_n, s_3 was the only missing packet. This is the ideal case. Only the single lost packet was re-sent.

If the sender receives an acknowledged segment that is smaller than s_n, for example s_4, that means that more than one packet was lost. The
sender needs to re-transmit the next segment as well.

Re-transmit strategies

Its possible to just repeat the same sequence: re-send the next packet until the receiver indicates it has processed all packet up to s_n. The problem with this approach is that it requires one RTT until the sender knows which packet it has to re-send next. While such strategy avoids unnecessary re-transmissions, it can take several seconds and more until TCP has re-sent the entire window of data.

The alternative is to re-send several packets at once. This approach allows TCP to recover more quickly when several packets have been lost. In the above example TCP re-send s_3, s_4, s_5, .. while it can only be sure that s_3 has been lost.

From a latency point of view, neither strategy is optimal. The first strategy is fast if only a single packet has to be re-sent, but takes too long when multiple packets were lost.

The second one is fast even if multiple packet have to be re-sent, but at the cost of wasting bandwidth. In addition, such a TCP sender could have transmitted new data already while it was doing the unneeded re-transmissions.

With the available information TCP cannot know which packets were lost. This is where TCP Selective Acknowledgments (SACK) come in. Just like window scaling and timestamps, it is another optional, yet very useful TCP feature.

The SACK option

 
   TCP Sack-Permitted Option: Kind: 4, Length 2
   +---------+---------+
   | Kind=4  | Length=2|
   +---------+---------+

A sender that supports this extension includes the “Sack Permitted” option in the connection request. If both endpoints support the extension, then a peer that detects a packet is missing in the data stream can inform the sender about this.

 
   TCP SACK Option: Kind: 5, Length: Variable
                     +--------+--------+
                     | Kind=5 | Length |
   +--------+--------+--------+--------+
   |      Left Edge of 1st Block       |
   +--------+--------+--------+--------+
   |      Right Edge of 1st Block      |
   +--------+--------+--------+--------+
   |                                   |
   /            . . .                  /
   |                                   |
   +--------+--------+--------+--------+
   |      Left Edge of nth Block       |
   +--------+--------+--------+--------+
   |      Right Edge of nth Block      |
   +--------+--------+--------+--------+

A receiver that encounters segment_s2 followed by s_5…s_n, it will include a SACK block when it sends the acknowledgment for s_2:

 
                +--------+-------+
                | Kind=5 |   10  |
+--------+------+--------+-------+
| Left edge: s_5                 |
+--------+--------+-------+------+
| Right edge: s_n                |
+--------+-------+-------+-------+

This tells the sender that segments up to s_2 arrived in-sequence, but it also lets the sender know that the segments s_5 to s_n were also received. The sender can then re-transmit these two packets and proceed to send new data.

The mythical lossless network

In theory SACK provides no advantage if the connection cannot experience packet loss. Or the connection has such a low latency that even waiting one full RTT does not matter.

In practice lossless behavior is virtually impossible to ensure.
Even if the network and all its switches and routers have ample bandwidth and buffer space packets can still be lost:

  • The host operating system might be under memory pressure and drop
    packets. Remember that a host might be handling tens of thousands of packet streams simultaneously.
  • The CPU might not be able to drain incoming packets from the network interface fast enough. This causes packet drops in the network adapter itself.
  • If TCP timestamps are not available even a connection with a very small RTT can stall momentarily during loss recovery.

Use of SACK does not increase the size of TCP packets unless a connection experiences packet loss. Because of this, there is hardly a reason to disable this feature. Almost all TCP stacks support SACK – it is typically only absent on low-power IOT-alike devices that are not doing TCP bulk data transfers.

When a Linux system accepts a connection from such a device, TCP automatically disables SACK for the affected connection.

Summary

The three TCP extensions examined in this post are all related to TCP performance and should best be left to the default setting: enabled.

The TCP handshake ensures that only extensions that are understood by both parties are used, so there is never a need to disable an extension globally just because a peer might not support it.

Turning these extensions off results in severe performance penalties, especially in case of TCP Window Scaling and SACK. TCP timestamps can be disabled without an immediate disadvantage, however there is no compelling reason to do so anymore. Keeping them enabled also makes it possible to support TCP options even when SYN cookies come into effect.

Posted on Leave a comment

install Fedora on a Raspberry Pi 3

Fire up a Raspberry Pi with Fedora.

The Raspberry Pi Foundation has produced quite a few models over the years. This procedure was tested on third generation Pis – a Model B v1.2, and a Model B+ (the older 2 and the new 4 weren’t tested). These are the credit-card size Pis that have been around a few years.

get hardware

You do need a few hardware items, including the Pi. You don’t need any HaT (Hardware Attached on Top) boards or USB antennas. If you have used your Pi in the past, you probably have all these items.

  • current network. Perhaps this is your home lab.
  • ethernet cable. This connects the current network to the Raspberry Pi
  • Raspberry Pi 3, model B or B+.
  • power supply
  • micro-SD card, 8GB or larger.
  • keyboard and video monitor.

The keyboard and video monitor together make up the local console. It’s possible – though complicated – to get by without a console, such as setting up an automated install then connecting over the network. A local console makes it easy to answer the configuration questions during Fedora’s first boot. Also, a mistake during AP configuration may break the network, locking out remote users.

download Fedora Minimal

The Fedora Minimal image, one of Fedora’s alt downloads, has all the core packages and network packages required (well, nearly – check out dnsmasq below). The image contains a ready-made file system, with over 400 packages already installed. This minimal image does not include popular packages like a development environment, Internet service or desktop. These types of software aren’t required for this work, and may well use too much memory if you install them.

The Fedora Minimal raw image fits on a small SD card and runs in less than 1 GB of memory (these old Pis have 1GB RAM).

The name of the downloaded file is something like Fedora-Minimal-32-1.6.aarch64.raw.xz. The file is compressed and is about 700MB in size. When the file is uncompressed, it’s 5GB. It’s an ext4 file system that’s mostly empty – about 1GB is used and 4GB is empty. All that empty space is the reason the compressed download is so much smaller than the uncompressed raw image.

copy to the micro-SD card

  • Copy the image to a micro-SD card.

This can be a more complex than it sounds, and a painful experience. Finding a good micro-SD card takes work. Then there’s the challenge of physically attaching the card to your computer.Perhaps your laptop has a full SD card slot and you need a card adapter, or perhaps you need a USB adapter. Then, when it comes to copying, the OS may either help or get in your way. You may have luck with Fedora Media Writer, or with these Linux commands.

unxz ./Fedora-Minimal-32-1.6.aarch64.raw.xz
dd if=./Fedora-Minimal-32-1.6.aarch64.raw of=/dev/mmcblk0 bs=8M status=progress oflag=direct

set up Fedora

  • Connect the Pi, power cable, network cable and micro-SD card.
  • Hit the power.
  • See the colored box as the graphics chip powers up.
  • Wait for the anaconda installer to start.
  • Answer anaconda’s setup questions.

Initial configuration of the OS takes a few minutes – a couple minutes waiting for boot-up, and a couple to fill out the spokes of anaconda’s text-based installer. In the examples below, the user is named nick and is an administrator (a member of the wheel group).

Congratulations! Your Fedora Pi is up and operational.

update software

  • Update packages with `dnf update`.
  • Reboot the machine with `systemctl reboot`.

Over the years, a lot of people have put a lot of work into making the Raspberry Pi devices work. Use the latest software to make sure you get the benefit of their hard work. If you skip this step, you may find some things just don’t work.

The update downloads and installs about a hundred packages. Since the storage is a micro-SD card, writing new software is a slow process. This is what using computing storage felt like in the 1990s.

things to play with

There are a few other things that can be set up at this point, if you want to play around. It’s all optional. Try things like this.

  • Replace the localhost hostname with the command `sudo hostnamectl set-hostname raspi`.
  • Find the IP address with `ip addr`.
  • Try an SSH login, or even set up key-based login with `ssh-copy-id`.
  • Power down with `systemctl poweroff`.