Posted on Leave a comment

Unreal Fellowship Launched

Unreal Engine is increasingly being used for virtual production in TV and Film, perhaps most famously in recent productions such as the Mandalorian.  This move to real-time tools in VFX development is only going to become more common and to encourage growth, Epic Games have launched the Unreal Fellowship program.

The Unreal Fellowship is a 30-day intensive online workshop during which experienced industry professionals can learn Unreal Engine, master state-of-the-art virtual production tools, and develop the skills to build teams proficient in the emerging field of real-time production.

The Fellowship provides $10,000 in financial assistance to each participant to ensure they are able to dedicate ample time to complete the rigorous curriculum.
Top VFX veterans will be joined by guest speakers in hands-on sessions that teach Unreal Engine fundamentals, model ingestion, animation, mocap integration, lookdev, lighting setups, and cinematic storytelling.

With a strong focus on practical work, participants will learn real-time workflows through the production of animated content and other projects. Fellows will have direct access to trainers with weekly team mentor sessions, a weekly four-hour open lab, and a dedicated Slack channel with specific hours for help.

Applications for the Fellowship are open until Monday, July 27 at 11:59 PM EDT. A minimum of five years of experience in commercial film and television production, immersive entertainment or game development is required, along with the ability to commit to the Fellowship full-time from August 21 to September 21. Don’t miss out—there are only 50 spaces available!

The application form and further details are available here.  You can learn more about the fellowship in the video below.

[embedded content]

News


<!–

–>

Posted on Leave a comment

Spam Classification with ML-Pack

Introduction

ML-Pack is a small footprint C++ machine learning library that can be easily integrated into other programs. It is an actively developed open source project and released under a BSD-3 license. Machine learning has gained popularity due to the large amount of electronic data that can be collected. Some other popular machine learning frameworks include TensorFlow, MxNet, PyTorch, Chainer and Paddle Paddle, however these are designed for more complex workflows than ML-Pack. On Fedora, ML-Pack is packaged by its lead developer Ryan Curtin. In addition to a command line interface, ML-Pack has bindings for Python and Julia. Here, we will focus on the command line interface since this may be useful for system administrators to integrate into their workflows.

Installation

You can install ML-Pack on the Fedora command line using

$ sudo dnf -y install mlpack mlpack-bin

You can also install the documentation, development headers and Python bindings by using …

$ sudo dnf -y install mlpack-doc \
mlpack-devel mlpack-python3

though they will not be used in this introduction.

Example

As an example, we will train a machine learning model to classify spam SMS messages. To keep this article brief, linux commands will not be fully explained, but you can find out more about them by using the man command, for example for the command first command used below, wget

$ man wget

will give you information that wget will download files from the web and options you can use for it.

Get a dataset

We will use an example spam dataset in Indonesian provided by Yudi Wibisono

 
$ wget https://drive.google.com/file/d/1-stKadfTgJLtYsHWqXhGO3nTjKVFxm_Q/view
$ unzip dataset_sms_spam_bhs_indonesia_v1.zip

Pre-process dataset

We will try to classify a message as spam or ham by the number of occurrences of a word in a message. We first change the file line endings, remove line 243 which is missing a label and then remove the header from the dataset. Then, we split our data into two files, labels and messages. Since the labels are at the end of the message, the message is reversed and then the label removed and placed in one file. The message is then removed and placed in another file.

$ tr 'r' 'n' < dataset_sms_spam_v1.csv > dataset.txt
$ sed '243d' dataset.txt > dataset1.csv
$ sed '1d' dataset1.csv > dataset.csv
$ rev dataset.csv | cut -c1 | rev > labels.txt
$ rev dataset.csv | cut -c2- | rev > messages.txt
$ rm dataset.csv
$ rm dataset1.csv
$ rm dataset.txt

Machine learning works on numeric data, so we will use labels of 1 for ham and 0 for spam. The dataset contains three labels, 0, normal sms (ham), 1, fraud (spam), and 2 promotion (spam). We will label all spam as 1, so promotions and fraud will be labelled as 1.

$ tr '2' '1' < labels.txt > labels.csv
$ rm labels.txt

The next step is to convert all text in the messages to lower case and for simplicity remove punctuation and any symbols that are not spaces, line endings or in the range a-z (one would need expand this range of symbols for production use)

$ tr '[:upper:]' '[:lower:]' < \
messages.txt > messagesLower.txt
$ tr -Cd 'abcdefghijklmnopqrstuvwxyz n' < \ messagesLower.txt > messagesLetters.txt
$ rm messagesLower.txt

We now obtain a sorted list of unique words used (this step may take a few minutes, so use nice to give it a low priority while you continue with other tasks on your computer).

$ nice -20 xargs -n1 < messagesLetters.txt > temp.txt
$ sort temp.txt > temp2.txt
$ uniq temp2.txt > words.txt
$ rm temp.txt
$ rm temp2.txt

We then create a matrix, where for each message, the frequency of word occurrences is counted (more on this on Wikipedia, here and here). This requires a few lines of code, so the full script, which should be saved as ‘makematrix.sh’ is below

#!/bin/bash
declare -a words=()
declare -a letterstartind=()
declare -a letterstart=()
letter=" "
i=0
lettercount=0
while IFS= read -r line; do labels[$((i))]=$line let "i++"
done < labels.csv
i=0
while IFS= read -r line; do words[$((i))]=$line firstletter="$( echo $line | head -c 1 )" if [ "$firstletter" != "$letter" ] then letterstartind[$((lettercount))]=$((i)) letterstart[$((lettercount))]=$firstletter letter=$firstletter let "lettercount++" fi let "i++"
done < words.txt
letterstartind[$((lettercount))]=$((i))
echo "Created list of letters" touch wordfrequency.txt
rm wordfrequency.txt
touch wordfrequency.txt
messagecount=0
messagenum=0
messages="$( wc -l messages.txt )"
i=0
while IFS= read -r line; do let "messagenum++" declare -a wordcount=() declare -a wordarray=() read -r -a wordarray <<> wordfrequency.txt echo "Processed message ""$messagenum" let "i++"
done < messagesLetters.txt
# Create csv file
tr ' ' ',' data.csv

Since Bash is an interpreted language, this simple implementation can take upto 30 minutes to complete. If using the above Bash script on your primary workstation, run it as a task with low priority so that you can continue with other work while you wait:

$ nice -20 bash makematrix.sh

Once the script has finished running, split the data into testing (30%) and training (70%) sets:

$ mlpack_preprocess_split \ --input_file data.csv \ --input_labels_file labels.csv \ --training_file train.data.csv \ --training_labels_file train.labels.csv \ --test_file test.data.csv \ --test_labels_file test.labels.csv \ --test_ratio 0.3 \ --verbose

Train a model

Now train a Logistic regression model:

$ mlpack_logistic_regression \
--training_file train.data.csv \
--labels_file train.labels.csv --lambda 0.1 \
--output_model_file lr_model.bin

Test the model

Finally we test our model by producing predictions,

$ mlpack_logistic_regression \
--input_model_file lr_model.bin \ --test_file test.data.csv \
--output_file lr_predictions.csv

and comparing the predictions with the exact results,

$ export incorrect=$(diff -U 0 lr_predictions.csv \
test.labels.csv | grep '^@@' | wc -l)
$ export tests=$(wc -l < lr_predictions.csv)
$ echo "scale=2; 100 * ( 1 - $((incorrect)) \
/ $((tests)))" | bc

This gives approximately 90% validation rate, similar to that obtained here.

The dataset is composed of approximately 50% spam messages, so the validation rates are quite good without doing much parameter tuning. In typical cases, datasets are unbalanced with many more entries in some categories than in others. In these cases a good validation rate can be obtained by mispredicting the class with a few entries. Thus to better evaluate these models, one can compare the number of misclassifications of spam, and the number of misclassifications of ham. Of particular importance in applications is the number of false positive spam results as these are typically not transmitted. The script below produces a confusion matrix which gives a better indication of misclassification. Save it as ‘confusion.sh’

#!/bin/bash
declare -a labels
declare -a lr
i=0
while IFS= read -r line; do labels[i]=$line let "i++"
done < test.labels.csv
i=0
while IFS= read -r line; do lr[i]=$line let "i++"
done < lr_predictions.csv
TruePositiveLR=0
FalsePositiveLR=0
TrueZerpLR=0
FalseZeroLR=0
Positive=0
Zero=0
for i in "${!labels[@]}"; do if [ "${labels[$i]}" == "1" ] then let "Positive++" if [ "${lr[$i]}" == "1" ] then let "TruePositiveLR++" else let "FalseZeroLR++" fi fi if [ "${labels[$i]}" == "0" ] then let "Zero++" if [ "${lr[$i]}" == "0" ] then let "TrueZeroLR++" else let "FalsePositiveLR++" fi fi done
echo "Logistic Regression"
echo "Total spam" $Positive
echo "Total ham" $Zero
echo "Confusion matrix"
echo " Predicted class"
echo " Ham | Spam "
echo " ---------------"
echo " Actual| Ham | " $TrueZeroLR "|" $FalseZeroLR
echo " class | Spam | " $FalsePositiveLR " |" $TruePositiveLR
echo ""

then run the script

$ bash confusion.sh

You should get output similar to

Logistic Regression
Total spam 183
Total ham 159
Confusion matrix

    Predicted class
    Ham Spam
Actual class Ham 128 26
Spam 31 157

which indicates a reasonable level of classification. Other methods you can try in ML-Pack for this problem include Naive Bayes, random forest, decision tree, AdaBoost and perceptron.

To improve the error rating, you can try other pre-processing methods on the initial data set. Neural networks can give upto 99.95% validation rates, see for example here, here and here. However, using these techniques with ML-Pack cannot be done on the command line interface at present and is best covered in another post.

For more on ML-Pack, please see the documentation.

Posted on Leave a comment

How to configure an SSH proxy server with Squid

Sometimes you can’t connect to an SSH server from your current location. Other times, you may want to add an extra layer of security to your SSH connection. In these cases connecting to another SSH server via a proxy server is one way to get through.

Squid is a full-featured proxy server application that provides caching and proxy services. It’s normally used to help improve response times and reduce network bandwidth by reusing and caching previously requested web pages during browsing.

However for this setup you’ll configure Squid to be used as an SSH proxy server since it’s a robust trusted proxy server that is easy to configure.

Installation and configuration

Install the squid package using sudo:

$ sudo dnf install squid -y

The squid configuration file is quite extensive but there are only a few things we need to configure. Squid uses access control lists to manage connections.

Edit the /etc/squid/squid.conf file to make sure you have the two lines explained below.

First, specify your local IP network. The default configuration file already has a list of the most common ones but you will need to add yours if it’s not there. For example, if your local IP network range is 192.168.1.X, this is how the line would look:

acl localnet src 192.168.1.0/24

Next, add the SSH port as a safe port by adding the following line:

acl Safe_ports port 22

Save that file. Now enable and restart the squid proxy service:

$ sudo systemctl enable squid
$ sudo systemctl restart squid

4.) By default squid proxy listens on port 3128. Configure firewalld to allow for this:

$ sudo firewall-cmd --add-service=squid --perm
$ sudo firewall-cmd --reload

Testing the ssh proxy connection

To connect to a server via ssh through a proxy server we’ll be using netcat.

Install nmap-ncat if it’s not already installed:

$ sudo dnf install nmap-ncat -y

Here is an example of a standard ssh connection:

$ ssh user@example.com

Here is how you would connect to that same server using the squid proxy server as a gateway.

This example assumes the squid proxy server’s IP address is 192.168.1.63. You can also use the host-name or the FQDN of the squid proxy server:

$ ssh user@example.com -o "ProxyCommand nc --proxy 192.168.1.63:3128 %h %p"

Here are the meanings of the options:

  • ProxyCommand – Tells ssh a proxy command is going to be used.
  • nc – The command used to establish the connection to the proxy server. This is the netcat command.
  • %h – The placeholder for the proxy server’s host-name or IP address.
  • %p – The placeholder for the proxy server’s port number.

There are many ways to configure an SSH proxy server but this is a simple way to get started.

Posted on Leave a comment

Fedora Classroom Session: Git 101 with Pagure

The Fedora Classroom is a project to help people by spreading knowledge on subjects related to Fedora for others, If you would like to propose a session, feel free to open a ticket here with the tag classroom. If you’re interested in taking a proposed session, kindly let us know and once you take it, you will be awarded the Sensei Badge too as a token of appreciation. Recordings from the previous sessions can be found here.

We’re back with another awesome classroom on Git 101 with Pagure led by Akashdeep Dhar (t0xic0der).

About the session

In short, the Git 101 with Pagure session will be a guide for newcomers on how to get started with Git with the git forge Pagure used by the Fedora community. After finishing the session you will have the knowledge to manage Git and Pagure and generate the first contributions on the Fedora Project.

When and where

The Classroom session will be organized on Jul 17th, 17:00 UTC. Here’s a link to see what time it is in your timezone. The session will be streamed on Fedora Project’s YouTube channel.

Topics covered in the session

  • Version Control Systems
  • Why Git?
  • VCS Hosting Sites
  • Fedora Pagure
  • Exploring Pagure
  • Git Fundamentals

About the instructor

Akashdeep Dhar is a cybersecurity enthusiast with keen interests in networking, cloud computing and operating systems. He is currently in the final year of his computer science major with cybersecurity minor bachelor degree. He has over five years of experience in using GNU/Linux systems and is new to the Fedora community with contributions made so far in infrastructure, classroom and documentation.

If you miss the session, the recording will also be uploaded in the Fedora Project‘s YouTube channel.

We hope you can attend and enjoy this experience from some of the awesome people that work in Fedora Project. We look forward to seeing you in the Classroom session.


Photograph used in feature image is San Simeon School House by Anita RitenourCC-BY 2.0.

Posted on Leave a comment

Automating Network Devices with Ansible

Ansible is a great automation tool for system and network engineers, with Ansible we can automate small network to a large scale enterprise network. I have been using Ansible to automate both Aruba, and Cisco switches from my Fedora powered laptops for a couple of years. This article covers the requirements and executing a couple of playbooks.

Configuring Ansible

If Ansible is not installed, it can be installed using the command below

$ sudo dnf -y install ansible

Once installed, create a folder in your home directory or a directory of your preference and copy the ansible configuration file. For this demonstration, I will be using the following.

$ mkdir -pv /home/$USER/network_automation
$ sudo cp -v /etc/ansible.cfg /home/$USER/network_automation
$ cd /home/$USER/network_automation
$ sudo chown $USER.$USER && chmod 0600 ansible.cfg

To prevent lengthy commands from failing, edit the ansible.cfg and append the following lines. We must add the persistent connection and set the desired time in seconds for the command_timeout as demonstrated below. A use case where this is useful is when you are performing backups of a network device that has a lengthy configuration.

$ vim ansible.cfg
[persistent_connection]
command_timeout = 300
connection_timeout = 30

Requirements

If SELinux is enabled, you will need to install SELinux binding, which is required when using the copy module.

# Install SELinux bindings
dnf -y install python3-libselinux python3-libsemanage

Creating the inventory

The inventory holds the names of the network assets, and grouping of the assets are in square brackets [], below is a  sample inventory.

[site_a]
Core_A ansible_host=192.168.122.200
Distro_A ansible_host=192.168.122.201
Distro_B ansible_host=192.168.122.202

Group vars can be used to address the common variables, for example, credentials, network operating system, and so on. Ansible document on inventory provides additional details.

Playbook

Playbooks are Ansible’s configuration, deployment, and orchestration language. They can describe a policy you want your remote systems to enforce, or a set of steps in a general IT process.
Ansible Playbook

Read Operations

Let us create a simple playbook to run a show command to read the configuration on a few switches.

 
  1 ---
  2 - name: Basic Playbook
  3   hosts: site_a
  4   connection: local
  5
  6   tasks:
  7   - name: Get Interface Brief
  8     ios_command:
  9       commands:
 10         - show ip interface brief | e una
 11     register: interfaces
 12
 13   - name: Print results
 14     debug:
 15       msg: "{{ interfaces.stdout[0] }}
Without Debug

With Debug

The above images show the differences without and with the debug module respectively.

Let’s break the playbook into three blocks, starting with lines 1 to 4.

  • The three dashes/hyphens starts the YAML document
  • The hosts defines the hosts or host groups, multiple groups are comma-separated
  • Connection defines the methodology to connect to the network devices. Another option is network_cli (recommended method) and will be used later in this article. See IOS Platform Options for more details.

Lines 6 to 11 starts the tasks, we will be using ios_command and ios_config. This play will execute the show command show ip interface brief | e una and save the output from the command into the interfaces variable, with the register key.

Lines 13 to 15, by default, when you execute a show command you will not see the output, though this is not used during automation. It is very useful for debugging; therefore, the debug module was used.

The below video shows the execution of the playbook. There are a couple of ways you can execute the playbook.

  • Passing arguments to the command line, for example, include -u <username> -k to prompt for the remote user credentials
 
ansible-playbook -i inventory show_demo.yaml -u admin -k
  • Include the credentials in the host or group vars
 
ansible-playbook -i inventory show_demo.yaml

Never store passwords in plain text. We recommend using SSH keys to authenticate SSH connections. Ansible supports ssh-agent to manage your SSH keys. If you must use passwords to authenticate SSH connections, we recommend encrypting them with
Using Vault in Playbooks

Passing arguments to the command line
Credentials in the inventory

If we want to save the output to a file, we will use the copy module as shown in the playbook below. In addition to using the copy module, we will include the backup_dir variable to specify the directory path.

 
---
- name: Get System Infomation
  hosts: site_a
  connection: network_cli
  gather_facts: no
 
  vars:
    backup_dir: /home/eramirez/dev/ansible/fedora_magazine
 
  tasks:
  - name: get system interfaces
    ios_command:
      commands:
        - show ip int br | e una
    register: interface
   
  - name: Save result to disk
    copy:
      content: "{{ interface.stdout[0] }}"
      dest: "{{ backup_dir }}/{{ inventory_hostname }}.txt"

To demonstrate the use of variables in the inventory, we will use plain text. This method Must not be used in production.

 
[site_a]
Core_A ansible_host=192.168.122.200
Distro_A ansible_host=192.168.122.201
Distro_B ansible_host=192.168.122.202
[all:vars]
ansible_connection=network_cli
ansible_network_os=ios
ansible_user=admin
ansible_password=fedora
ansible_become=yes
ansible_become_password=yes
ansible_become_method=enable

Write Operations

In the previous section, we saw that we could get information from the network devices; in this section, we will write (add/modify) the configuration on these network devices. To make changes to the network device, we will be using the ios config module.

Let us create a playbook to configure a couple of interfaces in all of the network devices in site_a. We will first take a backup of the current configuration of all devices in site_a. Lastly, we will save the configuration.

 
---
- name: Get System Infomation
  hosts: site_a
  connection: network_cli
  gather_facts: no
 
  vars:
    backup_dir: /home/eramirez/dev/ansible/fedora_magazine
 
  tasks:
  - name: Backup configs
    ios_config:
      backup: yes
      backup_options:
        filename: "{{ inventory_hostname }}_running_cfg.txt"
        dir_path: "{{ backup_dir }}"
   
  - name: get system interfaces
    ios_config:
      lines:
        - description Raspberry Pi
        - switchport mode access
        - switchport access vlan 100
        - spanning-tree portfast
        - logging event link-status
        - no shutdown
      parents: "{{ item }}"
    with_items:
      - interface FastEthernet1/12
      - interface FastEthernet1/13
     
  - name: Save switch configuration
    ios_config:
      save_when: modified

Before we execute the playbook, we will first validate the interface configuration. We will then run the playbook and confirm the changes as illustrated below.

Conclusion

This article is a basic introduction to whet your appetite that demonstrates how Ansible is used to manage network devices. Ansible is capable of automating a vast network, which includes MPLS routing and performing validation before executing the next task.

Posted on Leave a comment

Use DNS over TLS

The Domain Name System (DNS) that modern computers use to find resources on the internet was designed 35 years ago without consideration for user privacy. It is exposed to security risks and attacks like DNS Hijacking. It also allows ISPs to intercept the queries.

Luckily, DNS over TLS and DNSSEC are available. DNS over TLS and DNSSEC allow safe and encrypted end-to-end tunnels to be created from a computer to its configured DNS servers. On Fedora, the steps to implement these technologies are easy and all the necessary tools are readily available.

This guide will demonstrate how to configure DNS over TLS on Fedora using systemd-resolved. Refer to the documentation for further information about the systemd-resolved service.

Step 1 : Set-up systemd-resolved

Modify /etc/systemd/resolved.conf so that it is similar to what is shown below. Be sure to enable DNS over TLS and to configure the IP addresses of the DNS servers you want to use.

$ cat /etc/systemd/resolved.conf
[Resolve]
DNS=1.1.1.1 9.9.9.9
DNSOverTLS=yes
DNSSEC=yes
FallbackDNS=8.8.8.8 1.0.0.1 8.8.4.4
#Domains=~.
#LLMNR=yes
#MulticastDNS=yes
#Cache=yes
#DNSStubListener=yes
#ReadEtcHosts=yes

A quick note about the options:

  • DNS: A space-separated list of IPv4 and IPv6 addresses to use as system DNS servers
  • FallbackDNS: A space-separated list of IPv4 and IPv6 addresses to use as the fallback DNS servers.
  • Domains: These domains are used as search suffixes when resolving single-label host names, ~. stand for use the system DNS server defined with DNS= preferably for all domains.
  • DNSOverTLS: If true all connections to the server will be encrypted. Note that this mode requires a DNS server that supports DNS-over-TLS and has a valid certificate for it’s IP.

NOTE: The DNS servers listed in the above example are my personal choices. You should decide which DNS servers you want to use; being mindful of whom you are asking IPs for internet navigation.

Step 2 : Tell NetworkManager to push info to systemd-resolved

Create a file in /etc/NetworkManager/conf.d named 10-dns-systemd-resolved.conf.

$ cat /etc/NetworkManager/conf.d/10-dns-systemd-resolved.conf
[main]
dns=systemd-resolved

The setting shown above (dns=systemd-resolved) will cause NetworkManager to push DNS information acquired from DHCP to the systemd-resolved service. This will override the DNS settings configured in Step 1. This is fine on a trusted network, but feel free to set dns=none instead to use the DNS servers configured in /etc/systemd/resolved.conf.

Step 3 : start & restart services

To make the settings configured in the previous steps take effect, start and enable systemd-resolved. Then restart NetworkManager.

CAUTION: This will lead to a loss of connection for a few seconds while NetworkManager is restarting.

$ sudo systemctl start systemd-resolved
$ sudo systemctl enable systemd-resolved
$ sudo systemctl restart NetworkManager

NOTE: Currently, the systemd-resolved service is disabled by default and its use is opt-in. There are plans to enable systemd-resolved by default in Fedora 33.

Step 4 : Check if everything is fine

Now you should be using DNS over TLS. Confirm this by checking DNS resolution status with:

$ resolvectl status
MulticastDNS setting: yes DNSOverTLS setting: yes DNSSEC setting: yes DNSSEC supported: yes Current DNS Server: 1.1.1.1 DNS Servers: 1.1.1.1 9.9.9.9 Fallback DNS Servers: 8.8.8.8 1.0.0.1 8.8.4.4

/etc/resolv.conf should point to 127.0.0.53

$ cat /etc/resolv.conf
# Generated by NetworkManager
search lan
nameserver 127.0.0.53

To see the address and port that systemd-resolved is sending and receiving secure queries on, run:

$ sudo ss -lntp | grep '\(State\|:53 \)'
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process LISTEN 0 4096 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=10410,fd=18))

To make a secure query, run:

$ resolvectl query fedoraproject.org
fedoraproject.org: 8.43.85.67 -- link: wlp58s0 8.43.85.73 -- link: wlp58s0 [..] -- Information acquired via protocol DNS in 36.3ms.
-- Data is authenticated: yes

BONUS Step 5 : Use Wireshark to verify the configuration

First, install and run Wireshark:

$ sudo dnf install wireshark
$ sudo wireshark

It will ask you which link device it have to begin capturing packets on. In my case, because I use a wireless interface, I will go ahead with wlp58s0. Set up a filter in Wireshark like tcp.port == 853 (853 is the DNS over TLS protocol port). You need to flush the local DNS caches before you can capture a DNS query:

$ sudo resolvectl flush-caches

Now run:

$ nslookup fedoramagazine.org

You should see a TLS-encryped exchange between your computer and your configured DNS server:

Poster in Cover Image Approved for Release by NSA on 04-17-2018, FOIA Case # 83661

Posted on Leave a comment

Running Rosetta@home on a Raspberry Pi with Fedora IoT

The Rosetta@home project is a not-for-profit distributed computing project created by the Baker laboratory at the University of Washington. The project uses idle compute capacity from volunteer computers to study protein structure, which is used in research into diseases such as HIV, Malaria, Cancer, and Alzheimer’s.

In common with many other scientific organizations, Rosetta@home is currently expending significant resources on the search for vaccines and treatments for COVID-19.

Rosetta@home uses the open source BOINC platform to manage donated compute resources. BOINC was originally developed to support the SETI@home project searching for Extraterrestrial Intelligence. These days, it is used by a number of projects in many different scientific fields. A single BOINC client can contribute compute resources to many such projects, though not all projects support all architectures.

For the example shown in this article a Raspberry Pi 3 Model B was used, which is one of the tested reference devices for Fedora IoT. This device, with only 1GB of RAM, is only just powerful enough to be able to make a meaningful contribution to Rosetta@home, and there’s certainly no way the Raspberry Pi can be used for anything else – such as running a desktop environment – at the same time.

It’s also worth mentioning at this point that the first rule of Raspberry Pi computing is to get the recommended power supply. It is important to get as close to the specified 2.5A as you can, and use a good quality micro-usb cable.

Getting Fedora IoT

To install Fedora IoT on a Raspberry Pi, the first step is to download the aarch64 Raw Image from the iot.fedoraproject.org download page.

Then use the arm-image-installer utility (sudo dnf install fedora-arm-installer) to write the image to the SD card. As always, be very sure which device name corresponds to your SD Card before continuing. Check the device with the lsblk command like this:

$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 1 59.5G 0 disk
└─sdb1 8:17 1 59.5G 0 part /run/media/gavin/154F-1CEC
nvme0n1 259:0 0 477G 0 disk
├─nvme0n1p1 259:1 0 600M 0 part
...

If you’re still not sure, try running lsblk with the SD card removed, then again with the SD card inserted and comparing the outputs. In this case it lists the SD card as /dev/sdb. If you’re really unsure, there are some more tips described in the Getting Started guide.

We need to tell arm-image-installer which image file to use, what type of device we’re going to be using, and the device name – determined above – to use for writing the image. The arm-image-installer utility is also able to expand the filesystem to use the entire SD card at the point of writing the image.

Since we’re not going to use the zezere provisioning server to deploy SSH keys to the Raspberry Pi, we need to specify the option to remove the root password so that we can log in and set it at first boot.

In my case, the full command was:

sudo arm-image-installer --image ~/Downloads/Fedora-IoT-32-20200603.0.aarch64.raw.xz --target=rpi3 --media=/dev/sdb --resizefs --norootpass

After a final confirmation prompt:

= Selected Image: = /var/home/gavin/Downloads/Fedora-IoT-32-20200603.0.aarc...
= Selected Media : /dev/sdb
= U-Boot Target : rpi3
= Root Password will be removed.
= Root partition will be resized
===================================================== *****************************************************
*****************************************************
******** WARNING! ALL DATA WILL BE DESTROYED ********
*****************************************************
***************************************************** Type 'YES' to proceed, anything else to exit now 

the image is written to the SD Card.

...
= Installation Complete! Insert into the rpi3 and boot.

Booting the Raspberry Pi

For the initial setup, you’ll need to attach a keyboard and mouse to the Raspberry Pi. Alternatively, you can follow the instructions for connecting with a USB-to-Serial cable.

When the Raspberry Pi boots up, just type root at the login prompt and press enter.

localhost login: root
[root@localhost~]#

The first task is to set a password for the root user.

[root@localhost~]# passwd
Changing password for user root.
New password: Retype new password:
passwd: all authentication tokens updated successfully
[root@localhost~]#

Verifying Network Connectivity

To verify the network connectivity, the checklist in the Fedora IoT Getting Started guide was followed. This system is using a wired ethernet connection, which shows as eth0. If you need to set up a wireless connection this can be done with nmcli.

ip addr will allow you to check that you have a valid IP address.

[root@localhost ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether b8:27:eb:9d:6e:13 brd ff:ff:ff:ff:ff:ff
inet 192.168.178.60/24 brd 192.168.178.255 scope global dynamic noprefixroute eth0
valid_lft 863928sec preferred_lft 863928sec
inet6 fe80::ba27:ebff:fe9d:6e13/64 scope link
valid_lft forever preferred_lft forever
3: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
link/ether fe:d3:c9:dc:54:25 brd ff:ff:ff:ff:ff:ff

ip route will check that the network has a default gateway configured.

[root@localhost ~]# ip route
default via 192.168.178.1 dev eth0 proto dhcp metric 100 192.168.178.0/24 dev eth0 proto kernel scope link src 192.168.178.60 metric 100 

To verify internet access and name resolution, use ping

[root@localhost ~]# ping -c3 iot.fedoraproject.org
PING wildcard.fedoraproject.org (8.43.85.67) 56(84) bytes of data.
64 bytes from proxy14.fedoraproject.org (8.43.85.67): icmp_seq=1 ttl=46 time=93.4 ms
64 bytes from proxy14.fedoraproject.org (8.43.85.67): icmp_seq=2 ttl=46 time=90.0 ms
64 bytes from proxy14.fedoraproject.org (8.43.85.67): icmp_seq=3 ttl=46 time=91.3 ms --- wildcard.fedoraproject.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 90.043/91.573/93.377/1.374 ms

Optional: Configuring sshd so we can disconnect the keyboard and monitor

Before disconnecting the keyboard and monitor, we need to ensure that we can connect to the Raspberry Pi over the network.

First we verify that sshd is running

[root@localhost~]# systemctl is-active sshd
active

and that there is a firewall rule present to allow ssh.

[root@localhost ~]# firewall-cmd --list-all
public (active) target: default icmp-block-inversion: no interfaces: eth0 sources: services: dhcpv6-client mdns ssh ports: protocols: masquerade: no forward-ports: source-ports: icmp-blocks: rich rules: 

In the file /etc/ssh/sshd_config, find the section named

# Authentication

and add the line

PermitRootLogin yes

There will already be a line

#PermitRootLogin prohibit-password

which you can edit by removing the # comment character and changing the value to yes.

Restart the sshd service to pick up the change

[root@localhost ~]# systemctl restart sshd

If all this is in place, we should be able to ssh to the Raspberry Pi.

[gavin@desktop ~]$ ssh root@192.168.178.60
The authenticity of host '192.168.178.60 (192.168.178.60)' can't be established.
ECDSA key fingerprint is SHA256:DLdFaYbvKhB6DG2lKmJxqY2mbrbX5HDRptzWMiAUgBM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.178.60' (ECDSA) to the list of known hosts.
root@192.168.178.60's password: Boot Status is GREEN - Health Check SUCCESS
Last login: Wed Apr 1 17:24:50 2020
[root@localhost ~]#

It’s now safe to log out from the console (exit) and disconnect the keyboard and monitor.

Disabling unneeded services

Since we’re right on the lower limit of viable hardware for Rosetta@home, it’s worth disabling any unneeded services. Fedora IoT is much more lightweight than desktop distributions, but there are still a few optimizations we can do.

Like disabling bluetooth, Modem Manager (used for cellular data connections), WPA supplicant (used for Wi-Fi) and the zezere services, which are used to centrally manage a fleet of Fedora IoT devices.

[root@localhost /]# for serviceName in bluetooth ModemManager wpa_supplicant zezere_ignition zezere_ignition.timer zezere_ignition_banner; do sudo systemctl stop $serviceName; sudo systemctl disable $serviceName; sudo systemctl mask $serviceName; done

Getting the BOINC client

Instead of installing the BOINC client directly onto the operating system with rpm-ostree, we’re going to use podman to run the containerized version of the client.

This image uses a volume mount to store its data, so we create the directories it needs in advance.

[root@localhost ~]# mkdir -p /opt/appdata/boinc/slots /opt/appdata/boinc/locale

We also need to add a firewall rule to allow the container to resolve external DNS names.

[root@localhost ~]# firewall-cmd --permanent --zone=trusted --add-interface=cni-podman0 success [root@localhost ~]# systemctl restart firewalld

Finally we are ready to pull and run the BOINC client container.

[root@localhost ~]# podman run --name boinc -dt -p 31416:31416 -v /opt/appdata/boinc:/var/lib/boinc:Z -e BOINC_GUI_RPC_PASSWORD="blah" -e BOINC_CMD_LINE_OPTIONS="--allow_remote_gui_rpc" boinc/client:arm64v8 
Trying to pull...
...
... 787a26c34206e75449a7767c4ad0dd452ec25a501f719c2e63485479f...

We can inspect the container logs to make sure everything is working as expected:

[root@localhost ~]# podman logs boinc
20-Jun-2020 09:02:44 [---] cc_config.xml not found - using defaults
20-Jun-2020 09:02:44 [---] Starting BOINC client version 7.14.12 for aarch64-unknown-linux-gnu
...
...
...
20-Jun-2020 09:02:44 [---] Checking presence of 0 project files
20-Jun-2020 09:02:44 [---] This computer is not attached to any projects
20-Jun-2020 09:02:44 Initialization completed

Configuring the BOINC container to run at startup

We can automatically generate a systemd unit file for the container with podman generate systemd.

[root@localhost ~]# podman generate systemd --files --name boinc
/root/container-boinc.service

This creates a systemd unit file in root’s home directory.

[root@localhost ~]# cat container-boinc.service 
# container-boinc.service
# autogenerated by Podman 1.9.3
# Sat Jun 20 09:13:58 UTC 2020 [Unit]
Description=Podman container-boinc.service
Documentation=man:podman-generate-systemd(1)
Wants=network.target
After=network-online.target [Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
ExecStart=/usr/bin/podman start boinc
ExecStop=/usr/bin/podman stop -t 10 boinc
PIDFile=/var/run/containers/storage/overlay-containers/787a26c34206e75449a7767c4ad0dd452ec25a501f719c2e63485479fbe21631/userdata/conmon.pid
KillMode=none
Type=forking [Install]
WantedBy=multi-user.target default.target

We install the file by moving it to the appropriate directory.

[root@localhost ~]# mv -Z container-boinc.service /etc/systemd/system
[root@localhost ~]# systemctl enable /etc/systemd/system/container-boinc.service
Created symlink /etc/systemd/system/multi-user.target.wants/container-boinc.service → /etc/systemd/system/container-boinc.service.
Created symlink /etc/systemd/system/default.target.wants/container-boinc.service → /etc/systemd/system/container-boinc.service.

Connecting to the Rosetta Stone project

You need to create an account at the Rosetta@home signup page, and retrieve your account key from your account home page. The key to copy is the “Weak Account Key”.

Finally, we execute the boinccmd configuration utility inside the container using podman exec, passing the Rosetta@home url and our account key.

[root@localhost ~]# podman exec boinc boinccmd --project_attach https://boinc.bakerlab.org/rosetta/ 2160739_cadd20314e4ef804f1d95ce2862c8f73

Running podman logs –follow boinc will allow us to see the container connecting to the project. You will probably see errors of the form

20-Jun-2020 10:18:40 [Rosetta@home] Rosetta needs 1716.61 MB RAM but only 845.11 MB is available for use.

This is because most, but not all, of the work units in Rosetta@Home require more memory than we have to offer. However, if you leave the device running for a while, it should eventually get some jobs to process. The polling interval seems to be approximately 10 minutes. We can also tweak the memory settings using BOINC manager to allow BOINC to use slightly more memory. This will increase the probability that Rosetta@home will be able to find tasks for us.

Installing BOINC Manager for remote access

You can use dnf to install the BOINC manager component to remotely manage the BOINC client on the Raspberry Pi.

[gavin@desktop ~]$ sudo dnf install boinc-manager

If you switch to “Advanced View” , you will be able to select “File -> Select Computer” and connect to your Raspberry Pi, using the IP address of the Pi and the value supplied for BOINC_GUI_RPC_PASSWORD in the podman run command, in my case “blah“.

Press Shift+Ctrl+I to connect BOINC manager to a remote computer

Under “Options -> Computing Preferences”, increase the value for “When Computer is not in use, use at most _ %”. I’ve been using 93%; this seems to allow Rosetta@home to schedule work on the pi, whilst still leaving it just about usable. It is possible that further fine tuning of the operating system might allow this percentage to be increased.

Using the Computing Preferences Dialog to set the memory threshhold

These settings can also be changed through the Rosetta@home website settings page, but bear in mind that changes made through the BOINC Manager client override preferences set in the web interface.

Wait

It may take a while, possibly several hours, for Rosetta@home to send work to our newly installed client, particularly as most work units are too big to run on a Raspberry Pi. COVID-19 has resulted in a large number of new computers being joined to the Rosetta@home project, which means that there are times when there isn’t enough work to do.

When we are assigned some work units, BOINC will download several hundred megabytes of data. This will be stored on the SD Card and can be viewed using BOINC manager.

We can also see the tasks running in the Tasks pane:

The client has downloaded four tasks, but only one of them is currently running due to memory constraints. At times, two tasks can run simultaneously, but I haven’t seen more than that. This is OK as long as the tasks are completed by the deadline shown on the right. I’m fairly confident these will be completed as long as the Raspberry Pi is left running. I have found that the additional memory overhead created by the BOINC Manager connection and sshd services can reduce parallelism, so I try to disconnect these when I’m not using them.

Conclusion

Rosetta@home, in common with many other distributed computing projects, is currently experiencing a large spike in participation due to COVID-19. That aside, the project has been doing valuable work for many years to combat a number of other diseases.

Whilst a Raspberry Pi is never going to appear at the top of the contribution chart, I think this is a worthwhile project to undertake with a spare Raspberry Pi. The existence of work units aimed at low-spec ARM devices indicates that the project organizers agree with this sentiment. I’ll certainly be leaving mine running for the foreseeable future.

Posted on Leave a comment

Demonstrating Perl with Tic-Tac-Toe, Part 3

The articles in this series have mainly focused on Perl’s ability to manipulate text. Perl was designed to manipulate and analyze text. But Perl is capable of much more. More complex problems often require working with sets of data objects and indexing and comparing them in elaborate ways to compute some desired result.

For working with sets of data objects, Perl provides arrays and hashes. Hashes are also known as associative arrays or dictionaries. This article will prefer the term hash because it is shorter.

The remainder of this article builds on the previous articles in this series by demonstrating basic use of arrays and hashes in Perl.

An example Perl program

Copy and paste the below code into a plain text file and use the same one-liner that was provided in the the first article of this series to strip the leading numbers. Name the version without the line numbers chip2.pm and move it into the hal subdirectory. Use the version of the game that was provided in the second article so that the below chip will automatically load when placed in the hal subdirectory.

00 # advanced operations chip
01 02 package chip2;
03 require chip1;
04 05 use strict;
06 use warnings;
07 08 use constant SCORE=>'
09 ┌───┬───┬───┐
10 │ 3 │ 2 │ 3 │
11 ├───┼───┼───┤
12 │ 2 │ 4 │ 2 │
13 ├───┼───┼───┤
14 │ 3 │ 2 │ 3 │
15 └───┴───┴───┘
16 ';
17 18 sub get_prob {
19 my $game = shift;
20 my @nums;
21 my %odds;
22 23 while ($game =~ /[1-9]/g) {
24 $odds{$&} = substr(SCORE, $-[0], 1);
25 }
26 27 @nums = sort { $odds{$b} <=> $odds{$a} } keys %odds;
28 29 return $nums[0];
30 }
31 32 sub win_move {
33 my $game = shift;
34 my $mark = shift;
35 my $tkns = shift;
36 my @nums = $game =~ /[1-9]/g;
37 my $move;
38 39 TRY: for (@nums) {
40 my $num = $_;
41 my $try = $game =~ s/$num/$mark/r;
42 my $vic = chip1::get_victor $try, $tkns;
43 44 if (defined $vic) {
45 $move = $num;
46 last TRY;
47 }
48 }
49 50 return $move;
51 }
52 53 sub hal_move {
54 my $game = shift;
55 my $mark = shift;
56 my @mark = @{ shift; };
57 my $move;
58 59 $move = win_move $game, $mark, \@mark;
60 61 if (not defined $move) {
62 $mark = ($mark eq $mark[0]) ? $mark[1] : $mark[0];
63 $move = win_move $game, $mark, \@mark;
64 }
65 66 if (not defined $move) {
67 $move = get_prob $game;
68 }
69 70 return $move;
71 }
72 73 sub complain {
74 print "My mind is going. I can feel it.\n";
75 }
76 77 sub import {
78 no strict;
79 no warnings;
80 81 my $p = __PACKAGE__;
82 my $c = caller;
83 84 *{ $c . '::hal_move' } = \&{ $p . '::hal_move' };
85 *{ $c . '::complain' } = \&{ $p . '::complain' };
86 }
87 88 1;

How it works

In the above example Perl module, each position on the Tic-Tac-Toe board is assigned a score based on the number of winning combinations that intersect it. The center square is crossed by four winning combinations – one horizontal, one vertical, and two diagonal. The corner squares each intersect one horizontal, one vertical, and one diagonal combination. The side squares each intersect one horizontal and one vertical combination.

The get_prob subroutine creates a hash named odds (line 21) and uses it to map the numbers on the current game board to their score (line 24). The keys of the hash are then sorted by their score and the resulting list is copied to the nums array (line 27). The get_prob subroutine then returns the first element of the nums array ($nums[0]) which is the number from the original game board that has the highest score.

The algorithm described above is an example of what is called a heuristic in artificial intelligence programming. With the addition of this module, the Tic-Tac-Toe game can be considered a very rudimentary artificial intelligence program. It is really just playing the odds though and it is quite beatable. The next module (chip3.pm) will provide an algorithm that actually calculates the best possible move based on the opponent’s counter moves.

The win_move subroutine simply tries placing the provided mark in each available position and passing the resulting game board to chip1’s get_victor subroutine to see if it contains a winning combination. Notice that the r flag is being passed to the substitution operation (s/$num/$mark/r) on line 41 so that, rather than modifying the original game board, a new copy of the board containing the substitution is created and returned.

Arrays

It was mentioned in part one that arrays are variables whose names are prefixed with an at symbol (@) when they are created. In Perl, these prefixed symbols are called sigils.

Context

In Perl, many things return a different value depending on the context in which they are accessed. The two contexts to be aware of are called scalar context and list context. In the following example, $value1 and $value2 are different because @nums is accessed first in scalar context and then in list context.

$value1 = @nums;
($value2) = @nums;

In the above example, it might seem like @nums should return the same value each time it is accessed, but it doesn’t because what is accessing it (the context) is different. $value1 is a scalar, so it receives the scalar value of @nums which is its length. ($value2) is a list, so it receives the list value of @nums. In the above example, $value2 will receive the value of the first element of the nums array.

In part one, the below statement from the get_mark subroutine copied the numbers from the current Tic-Tac-Toe board into an array named nums.

@nums = $game =~ /[1-9]/g

Since the nums array in the above statement receives one copy of each board number in each of its elements, the count of the board numbers is equal to the length of the array. In Perl, the length of an array is obtained by accessing it in scalar context.

Next, the following formula was used to compute which mark should be placed on the Tic-Tac-Toe board in the next turn.

$indx = (@nums+1) % 2;

Because the plus operator requires a single value (a scalar) on its left hand side, not a list of values, the nums array evaluates to its length, not the list of its values. The parenthesis, in the above example, are just being used to set the order of operations so that the addition (+) will happen before the modulo (%).

Copying

In Perl you can create a list for immediate use by surrounding the list values with parenthesis and separating them with commas. The following example creates a three-element list and copies its values to an array.

@nums = (4, 5, 6);

As long as the elements of the list are variables and not constants, you can also copy the elements of an array to a list:

($four, $five, $six) = @nums;

If there were more elements in the array than the list in the above example, the extra elements would simply be discarded.

Different from lists in scalar context

Be aware that lists and arrays are different things in Perl. A list accessed in scalar context returns its last value, not its length. In the following example, $value3 receives 3 (the length of @nums) while $value4 receives 6 (the last element of the list).

$value3 = @nums;
$value4 = (4, 5, 6);

Indexing

To access an individual element of an array or list, suffix it with the desired index in square brackets as shown on line 29 of the above example Perl module.

Notice that the nums array on line 29 is prefixed with the dollar sigil ($) rather than the at sigil (@). This is done because the get_prob subroutine is supposed to return a single value, not a list. If @nums[0] were used instead of $nums[0], the subroutine would return a one-element list. Since a list evaluates to its last element in scalar context, this program would probably work if I had used @nums[0], but if you mean to retrieve a single element from an array, be sure to use the dollar sigil ($), not the at sigil (@).

It is possible to retrieve a subset from an array (or a list) rather than just one value in which case you would use the at sigil and you would provide a series of indexes or a range instead of a single index. This is what is known in Perl as a list slice.

Hashes

Hashes are variables whose names are prefixed with the percent sigil (%) when they are created. They are subscripted with curly brackets ({}) when accessing individual elements or subsets of elements (hash slices). Like arrays, hashes are variables that can hold multiple discrete data elements. They differ from arrays in the following ways:

  1. Hashes are indexed by strings (or anything that can be converted to a string), not numbers.
  2. Hashes are unordered. If you retrieve a list of their keys, values or key-value pairs, the order of the listing will be random.
  3. The number of elements in the hash will be equal to the number of keys that have been assigned values. If a value is assigned to index 99 of an array that has only three elements (indexes 0-2), the array will grow to a length of 100 elements (indexes 0-99). If a value is assigned to a new key in a hash that has only three elements, the hash will grow by only one element.

As with arrays, if you mean to access (or assign to) a single element of a hash, you should prefix it with the dollar sigil ($). When accessing a single element, Perl will go by the type of the subscript to determine the type of variable being accessed – curly brackets ({}) for hashes or square brackets ([]) for arrays. The get_prob subroutine in the above Perl module demonstrates assigning to and accessing individual elements of a hash.

Perl has two special built-in functions for working with hashes – keys and values. The keys function, when provided a hash, returns a list of all the hash’s keys (indexes). Similarly, the values function will return a list of all the hash’s values. Remember though that the order in which the list is returned is random. This randomness can be seen when playing the Tic-Tac-Toe game. If there is more than one move available with the highest score, the computer will chose one at random because the keys function returns the available moves from the odds hash in random order.

On line 27 of the above example Perl module, the keys function is being used to retrieve the list of keys from the odds hash. The keys of the odds hash are the numbers that were found on the current game board. The values of the odds hash are the corresponding probabilities that were retrieved from the SCORE constant on line 24.

Admittedly, this example could have used an array instead of a string to store and retrieve the scores. I chose to use a string simply because I think it presents the layout of the board a little nicer. An array would likely perform better, but with such a small data set, the difference is probably too small to measure.

Sort

On line 27, the list of keys from the odds hash is being feed to Perl’s built-in sort function. Beware that Perl’s sort function sorts lexicographically by default, not numerically. For example, provided the list (10, 9, 8, 1), Perl’s sort function will return the list (1, 10, 8, 9).

The behavior of Perl’s sort function can be modified by providing it a code block as its first parameter as demonstrated on line 27. The result of the last statement in the code block should be a number less-than, equal-to, or greater-than zero depending on whether element $a should be placed before, concurrent-with, or after element $b in the resulting list respectively. $a and $b are pairs of elements from the provided list. The code in the block is executed repeatedly with $a and $b set to different pairs of elements from the original list until all the pairs have been compared and sorted.

The <=> operator is a special Perl operator that returns -1, 0, or 1 depending on whether the left argument is numerically less-than, equal-to, or greater-than the right argument respectively. By using the <=> operator in the code block of the sort function, Perl’s sort function can be made to sort numerically rather than lexicographically.

Notice that rather than comparing $a and $b directly, they are first being passed through the odds hash. Since the values of the odds hash are the probabilities that were retrieved from the SCORE constant, what is being compared is actually the score of $a versus the score of $b. Consequently, the numbers from the original game board are being sorted by their score, not their value. Numbers with an equal score are left in the same random order that the keys function returned them.

Notice also that I have reversed the typical order of the parameters to <=> in the code block of the sort function ($b on the left and $a on the right). By switching their order in this way, I have caused the sort function to return the elements in reverse order – from greatest to least – so that the number(s) with the highest score will be first in the list.

References

References provide an indirect means of accessing a variable. They are often used when making copies of the variable is either undesirable or impractical. References are a sort of short cut that allows you to skip performing the copy and instead provide access to the original variable.

Why to use references

There is a cost in time and memory associated with making copies of variables. References are sometimes used as a means of reducing that cost. Be aware, however, that recent versions of Perl implement a technology called copy-on-write that greatly reduces the cost of copying variables. This new optimization should work transparently. You don’t have to do anything special to enable the copy-on-write optimization.

Why not to use references

References violate the action-at-a-distance principle that was mentioned in part one of this series. References are just as bad as global variables in terms of their tendency to trip up programmers by allowing data to be modified outside the local scope. You should generally try to avoid using references. But there are times when they are necessary.

How to create references

An example of passing a reference is provided on line 59 of the above Perl module. Rather than placing the mark array directly in the list of parameters to the win_move subroutine, a reference to the array is provided instead by prefixing the variable’s sigil with a backslash (\).

It is necessary to use a reference (\@mark) on line 59 because if the array were placed directly on the list, it would expand such that the first element of the mark array would become the third parameter to the win_move function, the second element of the mark array would become the fourth parameter to the win_move function, and so on for as many elements as the mark array has. Whereas an array will expand in list context, a reference will not. If the array were passed in expanded form, the receiving subroutine would need to call shift once for each element of the array. Also, the receiving function would not be able to tell how long the original array was.

Three ways to dereference references

In the receiving subroutine, the reference has to be dereferenced to get at the original values. An example of dereferencing an array reference is provided on line 56. On line 56, the shift statement has been enclosed in curly brackets and the opening bracket has been prefixed with the array sigil (@).

There is also a shorter form for dereferencing an array reference that is demonstrated on line 43 of the chip1.pm module. The short form allows you to omit the curly brackets and instead place the array sigil directly in front of the sigil of the scalar that holds the array reference. The short form only works when you have an array reference stored in a scalar. When the array reference is coming from a function, as it is on line 56 of the above Perl module, the long form must be used.

There is yet a third way of dereferencing an array reference that is demonstrated on line 29 of the game script. Line 29 shows the MARKS array reference being dereferenced with the arrow operator (->) and an index enclosed in square brackets. The MARKS array reference is missing its sigil because it is a constant. You can tell that what is being dereferenced is an array reference because the arrow operator is followed by square brackets ([]). Had the MARKS constant been a hash reference, the arrow operator would have been followed by curly brackets ({}).

There are also corresponding long and short forms for dereferencing hash references that use the hash sigil (%) instead of the array sigil. Note also that hashes, just like arrays, need to be passed by reference to subroutines unless you want them to expand into their constituent elements. The latter is sometimes done in Perl as a clever way of emulating named parameters.

A word of caution about references

It was stated earlier that references allow data to be modified outside of their declared scope and, just as with global variables, this non-local manipulation of the data can be confusing to the programmer(s) and thereby lead to unintended bugs. This is an important point to emphasize and explain.

On line 35 of the win_move subroutine, you can see that I did not dereference the provided array reference (\@mark) but rather I chose to store the reference in a scalar named tkns. I did this because I do not need to access the individual elements of the provided array in the win_move subroutine. I only need to pass the reference on to the get_victor subroutine. Not making a local copy of the array is a short cut, but it is dangerous. Because $tkns is only a copy of the reference, not a copy of the original data being referred to, if I or a later program developer were to write something like $tkns->[0] = ‘Y’ in the win_move subroutine, it would actually modify the value of the mark array in the hal_move subroutine. By passing a reference to its mark array (\@mark) to the win_move subroutine, the hal_move subroutine has granted access to modify its local copy of @mark. In this case, it would probably be better to make a local copy of the mark array in the win_move subroutine using syntax similar to what is shown on line 56 rather than preserving the reference as I have done for the purpose of demonstration on line 35.

Aliases

In addition to references, there is another way that a local variable created with the my or state keyword can leak into the scope of a called subroutine. The list of parameters that you provide to a subroutine is directly accessible in the @_ array.

To demonstrate, the following example script prints b, not a, because the inc subroutine accesses the first element of @_ directly rather than first making a local copy of the parameter.

#!/usr/bin/perl sub inc { $_[0]++;
} MAIN: { my $var = 'a'; inc $var; print "$var\n";
}

Aliases are different from references in that you don’t have to dereference them to get at their values. They really are just alternative names for the same variable. Be aware that aliases occur in a few other places as well. One such place is the list returned from the sort function – if you were to modify an element of the returned list directly, without first copying it to another variable, you would actually be modifying the element in the original list that was provided to the sort function. Other places where aliases occur include the code blocks of functions like grep and map. The grep and map functions are not covered in this series of articles. See the provided links if you want to know more about them.

Final notes

Many of Perl’s built-in functions will operate on the default scalar ($_) or default array (@_) if they are not explicitly provided a variable to read from or write to. Line 40 of the above Perl module provides an example. The numbers from the nums array are sequentially aliased to $_ by the for keyword. If you chose to use these variables, in most cases you will probably want to retrieve your data from $_ or @_ fairly quickly to prevent it being accidentally overwritten by a subsequent command.

The substitution command (s/…/…/), for example, will manipulate the data stored in $_ if it is not explicitly bound to another variable by one of the =~ or !~ operators. Likewise, the shift function operates on @_ (or @ARGV if called in the global scope) if it is not explicitly provided an array to operate on. There is no obvious rule to which functions support this shortcut. You will have to consult the documentation for the command you are interested in to see if it will operate on a default variable when not provided one explicitly.

As demonstrated on lines 55 and 56, the same name can be reused for variables of different types. Reusing variable names generally makes the code harder to follow. It is probably better for the sake of readability to avoid variable name reuse.

Beware that making copies of arrays or hashes in Perl (as demonstrated on line 56) is shallow by default. If any of the elements of the array or hash are references, the corresponding elements in the duplicated array or hash will be references to the same original data. To make deep copies of data structures, use one of the Clone or Storable Perl modules. An alternative workaround that may work in the case of multi-dimensional arrays is to emulate them with a one-dimensional hash.

Similar in form to Perl’s syntax for creating lists – (1, 2, 3) – unnamed array references and unnamed hash references can be constructed on the fly by bounding a comma-separated set of elements in square brackets ([]) or curly brackets ({}) respectively. Line 07 of the game script demonstrates an unnamed (anonymous) array reference being constructed and assigned to the MARKS constant.

Notice that the import subroutine at the end of the above Perl module (chip2.pm) is assigning to some of the same names in the calling namespace as the previous module (chip1.pm). This is intentional. The hal_move and complain aliases created by chip1’s import subroutine will simply be overridden by the identically named aliases created by chip2’s import subroutine (assuming chip2.pm is loaded after chip1.pm in the calling namespace). Only the aliases are updated/overridden. The original subroutines from chip1 will still exist and can still be called with their full names – chip1::hal_move and chip1::complain.

Posted on Leave a comment

LaTeX typesetting part 2 (tables)

LaTeX offers a number of tools to create and customise tables, in this series we will be using the tabular and tabularx environment to create and customise tables.

Basic table

To create a table you simply specify the environment \begin{tabular}{columns}

\begin{tabular}{c|c} Release &;\;Codename \\ \hline Fedora Core 1 &;Yarrow \\ Fedora Core 2 &;Tettnang \\ Fedora Core 3 &;Heidelberg \\ Fedora Core 4 &;Stentz \\ \end{tabular}

In the above example “{c|c}” in the curly bracket refers to the position of the text in the column. The below table summarises the positional argument together with the description.

Position Argument
c Position text in the centre
l Position text left-justified
r Position text right-justified
p{width} Align the text at the top of the cell
m{width} Align the text in the middle of the cell
b{width} Align the text at the bottom of the cell

Both m{width} and b{width} requires the array package to be specified in the preamble.

Using the example above, let us breakdown the important points used and describe a few more options that you will see in this series

Option Description
&; Defines each cell, the ampersand is only used from the second column
\ This terminates the row and start a new row
| Specifies the vertical line in the table (optional)
\hline Specifies the horizontal line (optional)
*{num}{form} This is handy when you have many columns and is an efficient way of limiting the repetition
|| Specifies the double vertical line

Customizing a table

Now that some of the options available are known, let us create a table using the options described in the previous section.

\begin{tabular}{*{3}{|l|}}
\hline \textbf{Version} &;\textbf{Code name} &;\textbf{Year released} \\
\hline Fedora 6 &;Zod &;2006 \\ \hline Fedora 7 &;Moonshine &;2007 \\ \hline Fedora 8 &;Werewolf &;2007 \\
\hline
\end{tabular}

Managing long text

With LaTeX, if there are many texts in a column it will not be formatted well and does not look presentable.

The below example shows how long text is formatted, we will use “blindtext” in the preamble so that we can produce sample text.

\begin{tabular}{|l|l|}\hline Summary &;Description \\ \hline Test &;\blindtext \\
\end{tabular}

As you can see the text exceeds the page width; however, there are a couple of options to overcome this challenge.

  • Specify the column width, for example, m{5cm}
  • Utilise the tabularx environment, this requires tabularx package in the preamble.

Managing long text with column width

By specifying the column width the text will be wrapped into the width as shown in the example below.

\begin{tabular}{|l|m{14cm}|} \hline Summary &;Description \\ \hline Test &;\blindtext \\ \hline
\end{tabular}\vspace{3mm}

Managing long text with tabularx

Before we can leverage tabularx we need to add it in the preamble. Tabularx takes the following example

**\begin{tabularx}{width}{columns}** \begin{tabularx}{\textwidth}{|l|X|} \hline
Summary &; Tabularx Description\\ \hline
Text &;\blindtext \\ \hline
\end{tabularx}

Notice that the column that we want the long text to be wrapped has a capital “X” specified.

Multi-row and multi-column

There are times when you will need to merge rows and/or column. This section describes how it is accomplished. To use multi-row and multi-column add multi-row to the preamble.

Multirow

Multirow takes the following argument \multirow{number_of_rows}{width}{text}, let us look at the below example.

\begin{tabular}{|l|l|}\hline Release &;Codename \\ \hline Fedora Core 4 &Stentz \\ \hline \multirow{2}{*}{MultiRow} &;Fedora 8 \\ &;Werewolf \\ \hline
\end{tabular}

In the above example, two rows were specified, the ‘*’ tells LaTeX to automatically manage the size of the cell.

Multicolumn

Multicolumn argument is \multicolumn{number_of_columns}{cell_position}{text}, below example demonstrates multicolumn.

\begin{tabular}{|l|l|l|}\hline Release &;Codename &;Date \\ \hline Fedora Core 4 &;Stentz &;2005 \\ \hline \multicolumn{3}{|c|}{Mulit-Column} \\ \hline
\end{tabular}

Working with colours

Colours can be assigned to the text, an individual cell or the entire row. Additionally, we can configure alternating colours for each row.

Before we can add colour to our tables we need to include \usepackage[table]{xcolor} into the preamble. We can also define colours using the following colour reference LaTeX Colour or by adding an exclamation after the colour prefixed by the shade from 0 to 100. For example, gray!30

\definecolor{darkblue}{rgb}{0.0, 0.0, 0.55}
\definecolor{darkgray}{rgb}{0.66, 0.66, 0.66}

Below example demonstrate this a table with alternate colours, \rowcolors take the following options \rowcolors{row_start_colour}{even_row_colour}{odd_row_colour}.

\rowcolors{2}{darkgray}{gray!20}
\begin{tabular}{c|c} Release &;Codename \\ \hline Fedora Core 1 &;Yarrow \\ Fedora Core 2 &;Tettnang \\ Fedora Core 3 &;Heidelberg \\ Fedora Core 4 &;Stentz \\
\end{tabular}

In addition to the above example, \rowcolor can be used to specify the colour of each row, this method works best when there are multi-rows. The following examples show the impact of using the \rowcolours with multi-row and how to work around it.

As you can see the multi-row is visible in the first row, to fix this we have to do the following.

\begin{tabular}{|l|l|}\hline \rowcolor{darkblue}\textsc{\color{white}Release} &;\textsc{\color{white}Codename} \\ \hline \rowcolor{gray!10}Fedora Core 4 &;Stentz \\ \hline \rowcolor{gray!40}&;Fedora 8 \\ \rowcolor{gray!40}\multirow{-2}{*}{Multi-Row} &;Werewolf \\ \hline
\end{tabular}

Let us discuss the changes that were implemented to resolve the multi-row with the alternate colour issue.

  • The first row started above the multi-row
  • The number of rows was changed from 2 to -2, which means to read from the line above
  • \rowcolor was specified for each row, more importantly, the multi-rows must have the same colour so that you can have the desired results.

One last note on colour, to change the colour of a column you need to create a new column type and define the colour. The example below illustrates how to define the new column colour.

\newcolumntype{g}{>{\columncolor{darkblue}}l} 

Let’s break it down:

  • \newcolumntype{g}: defines the letter g as the new column
  • {>{\columncolor{darkblue}}l}: here we select our desired colour, and l tells the column to be left-justified, this can be subsitued with c or r
\begin{tabular}{g|l} \textsc{Release} &;\textsc{Codename} \\ \hline Fedora Core 4 &;Stentz \\ &Fedora 8 \\ \multirow{-2}{*}{Multi-Row} &;Werewolf \\ \end{tabular}\

Landscape table

There may be times when your table has many columns and will not fit elegantly in portrait. With the rotating package in preamble you will be able to create a sideways table. The below example demonstrates this.

For the landscape table, we will use the sidewaystable environment and add the tabular environment within it, we also specified additional options.

  • \centering to position the table in the centre of the page
  • \caption{} to give our table a name
  • \label{} this enables us to reference the table in our document
\begin{sidewaystable}
\centering
\caption{Sideways Table}
\label{sidetable}
\begin{tabular}{ll} \rowcolor{darkblue}\textsc{\color{white}Release} &;\textsc{\color{white}Codename} \\ \rowcolor{gray!10}Fedora Core 4 &;Stentz \\ \rowcolor{gray!40} &;Fedora 8 \\ \rowcolor{gray!40}\multirow{-2}{*}{Multi-Row} &;Werewolf \\ \end{tabular}\vspace{3mm}
\end{sidewaystable}

Lists in tables

To include a list into a table you can use tabularx and include the list in the column where the X is specified. Another option will be to use tabular but you must specify the column width.

List in tabularx

\begin{tabularx}{\textwidth}{|l|X|} \hline Fedora Version &;Editions \\ \hline Fedora 32 &;\begin{itemize}[noitemsep] \item CoreOS \item Silverblue \item IoT \end{itemize} \\ \hline
\end{tabularx}\vspace{3mm}

List in tabular

\begin{tabular}{|l|m{6cm}|}\hline Fedora Version &;Editions \\ \hline Fedora 32 &;\begin{itemize}[noitemsep] \item CoreOS \item Silverblue \item IoT \end{itemize} \\ \hline
\end{tabular}

Conclusion

LaTeX offers many ways to customise your table with tabular and tabularx, you can also add both tabular and tabularx within the table environment (\begin\table) to add the table name and to position the table.

The packages used in this series are:

\usepackage{fullpage}
\usepackage{blindtext} % add demo text
\usepackage{array} % used for column positions
\usepackage{tabularx} % adds tabularx which is used for text wrapping
\usepackage{multirow} % multi-row and multi-colour support
\usepackage[table]{xcolor} % add colour to the columns \usepackage{rotating} % for landscape/sideways tables

Additional Reading

This was an intermediate lesson on tables. For more advanced information about tables and LaTex in general, you can go to the LaTeX Wiki.