02-22-2020, 12:35 PM
Demonstrating PERL with Tic-Tac-Toe, Part 1
<div><p><a rel="noreferrer noopener" aria-label="Larry Wall (opens in a new tab)" href="https://en.wikipedia.org/wiki/Larry_Wall" target="_blank">Larry Wall</a>’s <a rel="noreferrer noopener" aria-label="Practical Extraction and Reporting Language (PERL) (opens in a new tab)" href="https://en.wikipedia.org/wiki/Perl" target="_blank">Practical Extraction and Report Language (PERL)</a> was originally developed in 1987 as a general-purpose Unix scripting language that borrowed features from C, sh, awk, sed, BASIC, and LISP. In the late 1990s, before PHP became more popular, PERL was commonly used for <a rel="noreferrer noopener" aria-label="CGI scripting (opens in a new tab)" href="https://en.wikipedia.org/wiki/Common_Gateway_Interface" target="_blank">CGI scripting</a>. PERL is still the go-to tool for many sysadmins who need something more powerful than sed or awk when writing complex parsing and automation scripts. It has a somewhat steep learning curve due to its dense notation. However, a recent survey indicates that <a rel="noreferrer noopener" aria-label="PERL developers earn 54 per cent more than the average developer (opens in a new tab)" href="https://www.theregister.co.uk/2020/02/06/developers_pay_review/" target="_blank">PERL developers earn 54 per cent more than the average developer</a>, so it may still be a worthwhile language to learn.</p>
<p>PERL is far too complex to cover in any significant detail in this magazine. But this short series of articles will attempt to demonstrate a few of the most basic features of the language so that you can get a sense of what the language is like and the kind of things it can do.</p>
<h2>An example PERL program</h2>
<p>PERL was originally a language optimized for <a rel="noreferrer noopener" aria-label="scanning arbitrary text files, extracting information from those text files, and printing reports based on that information (opens in a new tab)" href="https://perldoc.perl.org/perl.html#DESCRIPTION" target="_blank">scanning arbitrary text files, extracting information from those text files, and printing reports based on that information</a>. To demonstrate how this core feature of PERL works, a very simple <a href="https://en.wikipedia.org/wiki/Tic-tac-toe" target="_blank" rel="noreferrer noopener" aria-label="Tic-Tac-Toe (opens in a new tab)">Tic-Tac-Toe</a> game is provided below. The program scans a textual representation of a Tic-Tac-Toe board, extracts and manipulates the numbers on the board, and prints the modified result to the console.</p>
<pre class="wp-block-preformatted">00 #!/usr/bin/perl
01 
02 use feature 'state';
03 
04 use constant MARKS=>[ 'X', 'O' ];
05 use constant BOARD=>'
06 ┌───┬───┬───┐
07 │ 1 │ 2 │ 3 │
08 ├───┼───┼───┤
09 │ 4 │ 5 │ 6 │
10 ├───┼───┼───┤
11 │ 7 │ 8 │ 9 │
12 └───┴───┴───┘
13 ';
14 
15 sub get_mark {
16    my $game = shift;
17    my @nums = $game =~ /[1-9]/g;
18    my $indx = (@nums+1) % 2;
19 
20    return MARKS->[$indx];
21 }
22 
23 sub put_mark {
24    my $game = shift;
25    my $mark = shift;
26    my $move = shift;
27 
28    $game =~ s/$move/$mark/;
29 
30    return $game;
31 }
32 
33 sub get_move {
34    return (<> =~ /^[1-9]$/) ? $& : '0';
35 }
36 
37 PROMPT: {
38    state $game = BOARD;
39 
40    my $mark;
41    my $move;
42 
43    print $game;
44 
45    last PROMPT if ($game !~ /[1-9]/);
46 
47    $mark = get_mark $game;
48    print "$mark\'s move?: ";
49 
50    $move = get_move;
51    $game = put_mark $game, $mark, $move;
52 
53    redo PROMPT;
54 }</pre>
<p>To try out the above program on your PC, you can copy-and-paste the above text into a plain text file and save and run it. The line numbers will have to be removed before the program will work. Of course, the command that one uses to perform that sort of textual extraction and reporting is <em>perl</em>.</p>
<p>Assuming that you have saved the above text to a file named <em>game.txt</em>, the following command can be used to strip the leading numbers from all the lines and write the modified version to a new file named <em>game</em>:</p>
<pre class="wp-block-preformatted">$ perl -pe 's/^\d\d ?//' game.txt > game</pre>
<p>The above command is a very small PERL script and it is an example of what is called a <a rel="noreferrer noopener" aria-label="one-liner (opens in a new tab)" href="https://en.wikipedia.org/wiki/One-liner_program" target="_blank">one-liner</a>.</p>
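<p>The one-liner relies on the <em>-p</em> switch, which makes <em>perl</em> read its input one line at a time, run the given code on each line, and print the result. The following is only a rough sketch of that per-line work written as an ordinary subroutine. The name <em>strip_prefix</em> is hypothetical, and the pattern assumes that every line starts with a two-digit number, optionally followed by one space.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: strip a two-digit line number and the space after it
# from the start of each line, roughly what the one-liner does per line.
sub strip_prefix {
    my @lines = @_;               # copy, so the caller's data is untouched
    s/^\d\d ?// for @lines;       # s/// acts on $_, which aliases each line
    return @lines;
}

print strip_prefix("00 #!/usr/bin/perl\n", "16    my \$game = shift;\n");
```

<p>Any indentation that follows the number and its single space is left in place, which is why the indented lines of the program keep their layout.</p>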
<p>Now that the line numbers have been removed, the program can be run by entering the following command:</p>
<pre class="wp-block-preformatted">$ perl game</pre>
<h2>How it works</h2>
<p>PERL is a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Procedural_programming" target="_blank">procedural</a> programming language. A program written in PERL consists of a series of commands that are executed sequentially. With few exceptions, most commands alter the state of the computer’s memory in some way.</p>
<p>Line 00 in the Tic-Tac-Toe program isn’t technically part of the PERL program and it can be omitted. It is called a <a rel="noreferrer noopener" aria-label="shebang (opens in a new tab)" href="https://en.wikipedia.org/wiki/Shebang_(Unix)" target="_blank">shebang</a> (the letter <strong>e</strong> is pronounced soft as it is in the word sh<span style="text-decoration: underline">e</span>ll). The purpose of the shebang line is to tell the operating system what interpreter the remaining text should be processed with if one isn’t specified on the command line.</p>
<p>Line 02 could also have been avoided, but only by writing the program differently; the program as written depends on it. It makes available an advanced keyword named <em>state</em>. The <em>state</em> keyword creates a <a rel="noreferrer noopener" aria-label="variable (opens in a new tab)" href="https://en.wikipedia.org/wiki/Variable_(computer_science)" target="_blank">variable</a> that can retain its value after it has gone out of <a rel="noreferrer noopener" aria-label="scope (opens in a new tab)" href="https://en.wikipedia.org/wiki/Scope_(computer_science)" target="_blank">scope</a>. I’m using it here as a way to avoid declaring a <a rel="noreferrer noopener" aria-label="global variable (opens in a new tab)" href="https://en.wikipedia.org/wiki/Global_variable" target="_blank">global variable</a>. It is considered good practice in computer programming to avoid using global variables where possible because they allow for <a rel="noreferrer noopener" aria-label="action at a distance (opens in a new tab)" href="https://en.wikipedia.org/wiki/Action_at_a_distance_(computer_programming)" target="_blank">action at a distance</a>. If you didn’t follow all of that, don’t worry about it. It’s not important at this point.</p>
<h3>PERL scopes, blocks and subroutines</h3>
<p>Scope is a very important concept that one needs to be familiar with when reading and writing procedural programs. In PERL, scope is often delineated by a pair of <a rel="noreferrer noopener" aria-label="curly brackets (opens in a new tab)" href="https://en.wikipedia.org/wiki/Bracket#Curly_bracket" target="_blank">curly brackets</a>. Within the <a rel="noreferrer noopener" aria-label="global scope (opens in a new tab)" href="https://en.wikipedia.org/wiki/Scope_(computer_science)#Global_scope" target="_blank">global scope</a>, the above Tic-Tac-Toe program defines four sub-scopes on lines 15-21, 23-31, 33-35 and 37-54. The first three scopes are prefixed with subroutine declarations and the last scope is prefixed with the label <em>PROMPT</em>.</p>
<p>Scopes serve multiple purposes in programming languages. One purpose of a scope is to group a set of commands together as a unit so that they can be <a rel="noreferrer noopener" aria-label="called (opens in a new tab)" href="https://en.wikipedia.org/wiki/Subroutine" target="_blank">called</a> repeatedly with a single command rather than having to repeat several lines of code each time in the program. Another purpose is to enhance the readability of the program by denoting a restricted area where the value of a variable can be updated.</p>
<p>Within the scope that is labeled <em>PROMPT</em> and defined on lines 37-54 of the above Tic-Tac-Toe program, a variable named <em>mark</em> is created using the <em>my</em> keyword (line 40). After it is created, it is assigned a value by calling the <em>get_mark</em> subroutine (line 47). Later, the <em>put_mark</em> subroutine is called (line 51) to change the value in the square that was chosen by the <em>get_move</em> subroutine on line 50.</p>
<p>Hopefully it is obvious that the mark that <em>put_mark</em> is setting is meant to be the same mark that <em>get_mark</em> retrieved earlier. As a programmer though, how do I know that the value of <em>mark</em> wasn’t changed when the <em>get_move</em> subroutine was called? This example program is small enough that every line can be examined to make that determination. But most programs are much larger than this example, and having to know exactly what is going on at all points in the program’s execution can be overwhelming and error-prone. Because <em>mark</em> was created with the <em>my</em> keyword, its value can only be accessed and modified within the scope in which it was created (or a sub-scope). It doesn’t matter what subroutines at parallel or higher scopes do, even if they change variables with the same name in their own scopes. This property of scopes (restricting the range of lines on which the value of a variable can be updated) improves the readability of the code by allowing the programmer to focus on a smaller section of the program without having to be concerned about what is happening elsewhere in the program.</p>
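<p>A minimal sketch of that guarantee follows. The subroutine <em>meddle</em> is hypothetical and is not part of the Tic-Tac-Toe program; it exists only to show that a variable it creates with <em>my</em> is separate from a variable of the same name elsewhere.</p>

```perl
use strict;
use warnings;

# Hypothetical subroutine with its own variable named $mark.
sub meddle {
    my $mark = 'O';    # a separate variable; invisible outside this block
    return $mark;
}

my $mark = 'X';
meddle();              # changes only meddle's private $mark
print "$mark\n";       # prints "X"
```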
<p>Lines 04 and 05 define the <em>MARKS</em> and <em>BOARD</em> constants, respectively. Because they are not within any curly bracket pairing, they exist in the global scope. It is permissible to create constants in the global scope because they are read-only and therefore not subject to the action at a distance concern. (Strictly speaking, <em>MARKS</em> holds a reference to an array and only the reference is constant, but this program never modifies the array’s elements.) <a href="https://en.wikipedia.org/wiki/Naming_convention_(programming)#Perl" target="_blank" rel="noreferrer noopener" aria-label="In PERL, it is traditional to name constants in all upper case letters (opens in a new tab)">In PERL, it is traditional to name constants in all upper case letters</a>.</p>
<p>Notice that scopes can be nested such that variables defined in outer scopes can be accessed and modified from within inner scopes. This is why the <em>MARKS</em> and <em>BOARD</em> variables can be accessed within the <em>get_mark</em> subroutine and <em>PROMPT</em> block respectively — they are sub-scopes of the global scope.</p>
<p>The statements in the program are executed in order from top to bottom and left to right. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlsyn.html#Simple-Statements" target="_blank">Each statement is terminated with a semi-colon (<strong>;</strong>)</a>. The semi-colon can be omitted from the last statement in any scope and from after the last block of many statements that define the flow of the program such as <em>sub</em>, <em>if</em> and <em>while</em>.</p>
<p>In PERL nomenclature, <em>scopes</em> are called <em>blocks</em>. Scope is the more general term that is typically used in online references like Wikipedia, but the remainder of this article will use the more <a rel="noreferrer noopener" aria-label="perlish (opens in a new tab)" href="https://en.wiktionary.org/wiki/Perlish" target="_blank">perlish</a> term <em>blocks</em>.</p>
<p>The statements within the first three blocks are not immediately executed as the program is evaluated from top to bottom. Rather, they are associated with the subroutine name preceding the block. This is the function of the <em><a href="https://perldoc.perl.org/5.30.0/perlsub.html#DESCRIPTION" target="_blank" rel="noreferrer noopener" aria-label="sub (opens in a new tab)">sub</a></em> keyword — it associates a subroutine name with a block of statements so that they can be called as a unit elsewhere in the program. The three subroutines <em>get_mark</em> (lines 15-21), <em>put_mark</em> (lines 23-31), and <em>get_move</em> (lines 33-35) are called on lines 47, 51 and 50 respectively.</p>
<p>The <em>PROMPT</em> block is not associated with a subroutine definition <a href="https://perldoc.perl.org/5.30.0/perlsyn.html#Compound-Statements" target="_blank" rel="noreferrer noopener" aria-label="or other flow-control statement (opens in a new tab)">or other flow-control statement</a>, so the statements within it are immediately executed in sequence when the program is run.</p>
<h3>PERL regular expressions</h3>
<p>If there is one feature that is more central to PERL than any other it is regular expressions. Notice that in the example Tic-Tac-Toe program every block contains a <strong>=~</strong> (or <strong>!~</strong>) operator followed by some text surrounded with forward slashes (<strong>/</strong>). The text within the forward slashes is called a <a href="https://en.wikipedia.org/wiki/Regular_expression" target="_blank" rel="noreferrer noopener" aria-label="regular expression (opens in a new tab)">regular expression</a> and the operator binds the regular expression to a variable or <a href="https://en.wikipedia.org/wiki/Stream_(computing)" target="_blank" rel="noreferrer noopener" aria-label="data stream (opens in a new tab)">data stream</a>.</p>
<p>It is important to note that there are different regular expression syntaxes. Some editors and command-line tools (for example, <a rel="noreferrer noopener" aria-label="grep (opens in a new tab)" href="https://www.gnu.org/software/grep/manual/grep.html#grep-Programs" target="_blank">grep</a>) allow the user to select which regular expression syntax they prefer to use. <a rel="noreferrer noopener" aria-label="PERL-Compatible Regular Expressions (PCRE) (opens in a new tab)" href="https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions" target="_blank">PERL-Compatible Regular Expressions (PCRE)</a> are among the most expressive of the commonly available syntaxes.</p>
<h4>Regular expressions used in matching operations</h4>
<p>The result of applying the regular expression to a variable or data stream is usually a value that, when used in a flow-control statement such as <em>if</em> or <em>while</em>, will evaluate to <strong>true</strong> or <strong>false</strong> depending on whether or not the match succeeded. There are <a rel="noreferrer noopener" aria-label="modifiers (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlre.html#Modifiers" target="_blank">modifiers</a> that can be appended to the closing slash of a regular expression to change its return value.</p>
<p>Line 45 of the Tic-Tac-Toe program provides a typical example of how a regular expression is used in a PERL program. The regular expression <strong>[1-9]</strong> is being applied to the variable <em>game</em> which holds the in-memory representation of the Tic-Tac-Toe game board. The expression is a <a rel="noreferrer noopener" aria-label="character class (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlrecharclass.html#Character-Ranges" target="_blank">character class</a> that matches any character in the range from 1 to 9 (inclusive). The result of the regular expression will be <strong>true</strong> only if a character from 1 to 9 is present in what is being evaluated. On line 45, the <strong>!~</strong> operator applies the regular expression to the <em>game</em> variable and negates its sense such that the result will be <strong>true</strong> only if none of the characters from 1 to 9 are present. Because the regular expression is embedded within the conditional clause of the <em>if</em> <a rel="noreferrer noopener" aria-label="statement modifier (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlsyn.html#Statement-Modifiers" target="_blank">statement modifier</a>, the statement <em>last PROMPT</em> is only executed if there are no characters in the range from 1 to 9 left on the game board.</p>
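<p>The test on line 45 can be tried in isolation. In the sketch below, <em>board_full</em> is a hypothetical helper name and the boards are shortened to a single row for brevity.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: true when no digit from 1 to 9 remains on the board.
sub board_full {
    my $game = shift;
    return $game !~ /[1-9]/;
}

print board_full("X O X") ? "full\n" : "open\n";    # prints "full"
print board_full("X 5 X") ? "full\n" : "open\n";    # prints "open"
```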
<p>The <a rel="noreferrer noopener" aria-label="last (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/last.html" target="_blank"><em>last</em></a> statement is one of a few flow-control statements in PERL that allow the program execution sequence to jump from the current line to another line somewhere else in the program. Other flow-control statements that work in a similar fashion include <a rel="noreferrer noopener" aria-label="next (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/next.html" target="_blank"><em>next</em></a>, <a rel="noreferrer noopener" aria-label="continue (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/continue.html" target="_blank"><em>continue</em></a>, <a rel="noreferrer noopener" aria-label="redo (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/redo.html" target="_blank"><em>redo</em></a> and <a rel="noreferrer noopener" aria-label="goto (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/goto.html" target="_blank"><em>goto</em></a> (the <em>goto</em> statement should be avoided whenever possible because it allows for <a rel="noreferrer noopener" aria-label="spaghetti code (opens in a new tab)" href="https://en.wikipedia.org/wiki/Spaghetti_code" target="_blank">spaghetti code</a>).</p>
<p>In the example Tic-Tac-Toe program, the <em>last PROMPT</em> statement on line 45 causes program execution to resume just after the <em>PROMPT</em> block. Because there are no more statements in the program, the program will terminate.</p>
<p>The label <em>PROMPT</em> was chosen arbitrarily. Any label (or none at all) could have been used.</p>
<p>The <em>redo PROMPT</em> statement at the end of the <em>PROMPT</em> block causes program execution to jump back to the beginning of the <em>PROMPT</em> block.</p>
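<p>The same <em>last</em> and <em>redo</em> pattern can be seen in a smaller setting. This sketch uses a hypothetical label, <em>COUNT</em>, and a counter in place of the game board.</p>

```perl
use strict;
use warnings;

my $count = 0;    # declared outside the block so it survives each redo

COUNT: {
    $count++;
    last COUNT if $count >= 3;    # leave the block after the third pass
    redo COUNT;                   # otherwise start the block again
}

print "$count\n";    # prints "3"
```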
<p>Notice that the <em>state</em> keyword, like the <em>my</em> keyword, creates a variable that can only be accessed or modified within the block in which it is created (or a nested sub-block if any exist). Unlike variables created with <em>my</em>, variables created with <em>state</em> keep their former value when the block they are in is run repeatedly. This is the behavior that is needed for the <em>game</em> variable because it is being updated incrementally each time the <em>PROMPT</em> block is run. The <em>mark</em> and <em>move</em> variables are meant to be different on each iteration of the <em>PROMPT</em> block, so they do not need to be created with the <em>state</em> keyword.</p>
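<p>The difference between the two keywords is easiest to see side by side. The two counters below are hypothetical and differ only in the keyword used to create <em>$n</em>.</p>

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical counters: one declared with state, one with my.
sub next_state { state $n = 0; return ++$n; }
sub next_my    { my    $n = 0; return ++$n; }

my @with_state = map { next_state() } 1 .. 3;
my @with_my    = map { next_my() }    1 .. 3;

print "@with_state\n";    # prints "1 2 3"
print "@with_my\n";       # prints "1 1 1"
```

<p>The <em>state</em> variable is initialized only once, whereas the <em>my</em> variable is created afresh on every call.</p>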
<h4>Regular expressions used for input validation</h4>
<p>Another common use of regular expressions is for <a rel="noreferrer noopener" aria-label="input validation (opens in a new tab)" href="https://en.wikipedia.org/wiki/Data_validation" target="_blank">input validation</a>. Line 34 of the example Tic-Tac-Toe program provides an example of a regular expression being used for input validation. The expression on line 34 is similar to the one on line 45. It is also checking for characters from 1 to 9. However, it is performing the check against <a rel="noreferrer noopener" aria-label="the null filehandle (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#I%2fO-Operators" target="_blank">the null filehandle</a> (<strong><></strong>); it is using the <strong>=~</strong> operator; and it is prefixed and suffixed with the <a rel="noreferrer noopener" aria-label="zero-width assertions (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlre.html#Assertions" target="_blank">zero-width assertions</a> <strong>^</strong> and <strong>$</strong> respectively.</p>
<p>The null filehandle, when accessed as it is on line 34, will cause the program to pause until one line of input is provided. The regular expression will evaluate to <strong>true</strong> only if the line consists of exactly one character in the range from 1 to 9 (the newline that ends the line is permitted because <strong>$</strong> also matches just before a trailing newline). The assertions <strong>^</strong> and <strong>$</strong> do not match any characters. Rather, they match the beginning and end positions, respectively, on the line. The regular expression effectively reads: “Begin (<strong>^</strong>) with one character in the range from 1 to 9 (<strong>[1-9]</strong>) and end (<strong>$</strong>)”.</p>
<p>Because it is embedded in the conditional clause of <a rel="noreferrer noopener" aria-label="the ternary operator (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Conditional-Operator" target="_blank">the ternary operator</a>, line 34 will return either what was matched (<a rel="noreferrer noopener" aria-label="$& (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlvar.html#Variables-related-to-regular-expressions" target="_blank"><strong>$&</strong></a>) if the match succeeded or the character zero (<strong>0</strong>) if it failed. If the input were not validated in this way, then the user could submit their opponent’s mark rather than a number on the board.</p>
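<p>To experiment with the validation without typing at a prompt, the check can be applied to a string instead. The sketch below is a hypothetical variant named <em>validate_move</em>; it takes the line as an argument and, as an assumption made for illustration, uses a capture group and <strong>$1</strong> where the original uses <strong>$&</strong>.</p>

```perl
use strict;
use warnings;

# Hypothetical variant of get_move that validates a line passed as an
# argument instead of reading it from the null filehandle.
sub validate_move {
    my $line = shift;
    return ($line =~ /^([1-9])$/) ? $1 : '0';
}

print validate_move("5\n"),  "\n";    # prints "5"
print validate_move("X\n"),  "\n";    # prints "0"
print validate_move("12\n"), "\n";    # prints "0"
```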
<h4>Regular expressions used for filtering data</h4>
<p>Line 17 demonstrates using the <a rel="noreferrer noopener" aria-label="global modifier (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlretut.html#Using-regular-expressions-in-Perl" target="_blank">global modifier</a> (<strong>g</strong>) on a regular expression. With the global modifier, the result of a match depends on the context in which it is evaluated. In scalar context, each evaluation finds the next match and returns <strong>true</strong> or <strong>false</strong>. In <a rel="noreferrer noopener" aria-label="list context (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perldata.html#Context" target="_blank">list context</a>, it returns a list of all the matched <a href="https://en.wikipedia.org/wiki/Substring" target="_blank" rel="noreferrer noopener" aria-label="substrings (opens in a new tab)">substrings</a>.</p>
<p>Line 17 uses a regular expression to copy all the numbers in the range from 1 to 9 from the <em>game</em> variable into the array named <em>nums</em>. Line 18 then uses the <a rel="noreferrer noopener" aria-label="modulo operator (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Multiplicative-Operators" target="_blank">modulo operator</a> with the integer <strong>2</strong> as its second argument to determine whether the length of the <em>nums</em> array is even or odd. The formula on line 18 will result in <strong>0</strong> if the length of <em>nums</em> is odd and <strong>1</strong> if the length of <em>nums</em> is even. Finally, the computed index (<em>indx</em>) is used to access an element of the <em>MARKS</em> array and return it. Using this formula, the <em>get_mark</em> function will alternately return <strong>X</strong> or <strong>O</strong> depending on whether there are an odd or even number of positions left on the board.</p>
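<p>The arithmetic can be checked by hand on a small example. The sketch below assumes a board flattened to one line on which two squares have already been taken; the variable names <em>@marks</em> and <em>$left</em> are introduced here for illustration only.</p>

```perl
use strict;
use warnings;

my @marks = ('X', 'O');
my $game  = "X 2 3 4 O 6 7 8 9";    # two squares already taken

my @nums = $game =~ /[1-9]/g;       # list context: every matching digit
my $left = @nums;                   # scalar context: the array's length
my $indx = ($left + 1) % 2;

print "@nums\n";                        # prints "2 3 4 6 7 8 9"
print "$left $indx $marks[$indx]\n";    # prints "7 0 X"
```

<p>With seven squares left the index is 0, so it is X’s turn, as expected after one move by each player.</p>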
<h4>Regular expressions used for substituting data</h4>
<p>Line 28 demonstrates yet another common use of regular expressions in PERL. Rather than being used in a <a rel="noreferrer noopener" aria-label="match operator (m) (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Regexp-Quote-Like-Operators" target="_blank">match operator (<strong>m</strong>)</a>, the regular expression on line 28 is being used in a <a rel="noreferrer noopener" aria-label="substitution operator (s) (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Regexp-Quote-Like-Operators" target="_blank">substitution operator (<strong>s</strong>)</a>. If the value in the <em>move</em> variable is found in the <em>game</em> variable, it will be substituted with the value of the <em>mark</em> variable.</p>
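<p>A short sketch of the substitution on a single row follows. It also shows why an invalid move is harmless in the example program: <em>get_move</em> returns <strong>0</strong> for bad input, no <strong>0</strong> appears on the board, and so the substitution changes nothing and the same player is prompted again. The variable names below are hypothetical.</p>

```perl
use strict;
use warnings;

my $row = "1 2 3";

# s/// returns the number of substitutions made (a false value if none).
my $changed = ($row =~ s/2/X/);
print "$row\n";                   # prints "1 X 3"

my $missed = ($row =~ s/0/O/);    # no 0 on the board, so nothing changes
print "$row\n";                   # prints "1 X 3"
```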
<h3>PERL sigils and data types</h3>
<p>The last things of note that are used in the example Tic-Tac-Toe program are <a rel="noreferrer noopener" aria-label="the sigils (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perldata.html#DESCRIPTION" target="_blank">the sigils</a> (<strong>$</strong> and <strong>@</strong>) that are placed before the variable names. When creating a variable, the sigil indicates the type of variable being created. It is important to note that <a rel="noreferrer noopener" aria-label="a different sigil can be prefixed to the variable name when it is accessed (opens in a new tab)" href="https://perldoc.perl.org/perlfaq4.html#What-is-the-difference-between-%24array%5b1%5d-and-%40array%5b1%5d%3f" target="_blank">a different sigil can be prefixed to the variable name when it is accessed</a> to indicate whether one or many items should be returned from the variable.</p>
<p><a href="https://perldoc.perl.org/perlintro.html#Perl-variable-types" target="_blank" rel="noreferrer noopener" aria-label="There are three built-in data types in PERL (opens in a new tab)">There are three built-in data types in PERL</a>: scalars (<strong>$</strong>), arrays (<strong>@</strong>) and associative arrays (<strong>%</strong>), which the PERL documentation usually calls hashes. Scalars hold a single data item such as a number, character or string of characters. Arrays are ordered lists of scalars that are indexed by number. Associative arrays are unordered collections of scalars that are indexed by string keys rather than by numbers.</p>
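<p>The following sketch shows the three types and how the sigil changes on access. All of the variable names are hypothetical.</p>

```perl
use strict;
use warnings;

my $name   = 'X';                  # scalar
my @marks  = ('X', 'O');           # array
my %scores = (X => 2, O => 1);     # associative array (hash)

print "$marks[1]\n";       # one element, so the $ sigil: prints "O"
print "$scores{X}\n";      # one value, so the $ sigil: prints "2"
my @pair = @marks[0, 1];   # several elements, so the @ sigil (a slice)
print "@pair\n";           # prints "X O"
```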
</div>
https://www.sickgaming.net/blog/2020/02/...oe-part-1/
<div><p><a rel="noreferrer noopener" aria-label="Larry Wall (opens in a new tab)" href="https://en.wikipedia.org/wiki/Larry_Wall" target="_blank">Larry Wall</a>’s <a rel="noreferrer noopener" aria-label="Practical Extraction and Reporting Language (PERL) (opens in a new tab)" href="https://en.wikipedia.org/wiki/Perl" target="_blank">Practical Extraction and Reporting Language (PERL)</a> was originally developed in 1987 as a general-purpose Unix scripting language that borrowed features from C, sh, awk, sed, BASIC, and LISP. In the late 1990s, before PHP became more popular, PERL was commonly used for <a rel="noreferrer noopener" aria-label="CGI scripting (opens in a new tab)" href="https://en.wikipedia.org/wiki/Common_Gateway_Interface" target="_blank">CGI scripting</a>. PERL is still the go-to tool for many sysadmins who need something more powerful than sed or awk when writing complex parsing and automation scripts. It has a somewhat high learning curve due to its dense notation. But a recent survey indicates that <a rel="noreferrer noopener" aria-label="PERL developers earn 54 per cent more than the average developer (opens in a new tab)" href="https://www.theregister.co.uk/2020/02/06/developers_pay_review/" target="_blank">PERL developers earn 54 per cent more than the average developer</a>. So it may still be a worthwhile language to learn.</p>
<p>PERL is far too complex to cover in any significant detail in this magazine. But this short series of articles will attempt to demonstrate a few of the most basic features of the language so that you can get a sense of what the language is like and the kind of things it can do.</p>
<h2>An example PERL program</h2>
<p>PERL was originally a language optimized for <a rel="noreferrer noopener" aria-label="scanning arbitrary text files, extracting information from those text files, and printing reports based on that information (opens in a new tab)" href="https://perldoc.perl.org/perl.html#DESCRIPTION" target="_blank">scanning arbitrary text files, extracting information from those text files, and printing reports based on that information</a>. To demonstrate how this core feature of PERL works, a very simple <a href="https://en.wikipedia.org/wiki/Tic-tac-toe" target="_blank" rel="noreferrer noopener" aria-label="Tic-Tac-Toe (opens in a new tab)">Tic-Tac-Toe</a> game is provided below. The below program scans a textual representation of a Tic-Tac-Toe board, extracts and manipulates the numbers on the board, and prints the modified result to the console.</p>
<pre class="wp-block-preformatted">00 #!/usr/bin/perl
01 02 use feature 'state';
03 04 use constant MARKS=>[ 'X', 'O' ];
05 use constant BOARD=>'
06 ┌───┬───┬───┐
07 │ 1 │ 2 │ 3 │
08 ├───┼───┼───┤
09 │ 4 │ 5 │ 6 │
10 ├───┼───┼───┤
11 │ 7 │ 8 │ 9 │
12 └───┴───┴───┘
13 ';
14 15 sub get_mark {
16 my $game = shift;
17 my @nums = $game =~ /[1-9]/g;
18 my $indx = (@nums+1) % 2;
19 20 return MARKS->[$indx];
21 }
22 23 sub put_mark {
24 my $game = shift;
25 my $mark = shift;
26 my $move = shift;
27 28 $game =~ s/$move/$mark/;
29 30 return $game;
31 }
32 33 sub get_move {
34 return (<> =~ /^[1-9]$/) ? $& : '0';
35 }
36 37 PROMPT: {
38 state $game = BOARD;
39 40 my $mark;
41 my $move;
42 43 print $game;
44 45 last PROMPT if ($game !~ /[1-9]/);
46 47 $mark = get_mark $game;
48 print "$mark\'s move?: ";
49 50 $move = get_move;
51 $game = put_mark $game, $mark, $move;
52 53 redo PROMPT;
54 }</pre>
<p>To try out the above program on your PC, you can copy-and-paste the above text into a plain text file and save and run it. The line numbers will have to be removed before the program will work. Of course, the command that one uses to perform that sort of textual extraction and reporting is <em>perl</em>.</p>
<p>Assuming that you have saved the above text to a file named <em>game.txt</em>, the following command can be used to strip the leading numbers from all the lines and write the modified version to a new file named <em>game</em>:</p>
<pre class="wp-block-preformatted">$ cat game.txt | perl -npe 's/...//' > game</pre>
<p>The above command is a very small PERL script and it is an example of what is called a <a rel="noreferrer noopener" aria-label="one-liner (opens in a new tab)" href="https://en.wikipedia.org/wiki/One-liner_program" target="_blank">one-liner</a>.</p>
<p>Now that the line numbers have been removed, the program can be run by entering the following command:</p>
<pre class="wp-block-preformatted">$ perl game</pre>
<h2>How it works</h2>
<p>PERL is a <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://en.wikipedia.org/wiki/Procedural_programming" target="_blank">procedural</a> programming language. A program written in PERL consists of a series of commands that are executed sequentially. With few exceptions, most commands alter the state of the computer’s memory in some way.</p>
<p>Line 00 in the Tic-Tac-Toe program isn’t technically part of the PERL program and it can be omitted. It is called a <a rel="noreferrer noopener" aria-label="shebang (opens in a new tab)" href="https://en.wikipedia.org/wiki/Shebang_(Unix)" target="_blank">shebang</a> (the letter <strong>e</strong> is pronounced soft as it is in the word sh<span style="text-decoration: underline">e</span>ll). The purpose of the shebang line is to tell the operating system what interpreter the remaining text should be processed with if one isn’t specified on the command line.</p>
<p>Line 02 isn’t strictly necessary for this program either. It makes available an advanced command named <em>state</em>. The <em>state</em> command creates a <a rel="noreferrer noopener" aria-label="variable (opens in a new tab)" href="https://en.wikipedia.org/wiki/Variable_(computer_science)" target="_blank">variable</a> that can retain its value after it has gone out of <a rel="noreferrer noopener" aria-label="scope (opens in a new tab)" href="https://en.wikipedia.org/wiki/Scope_(computer_science)" target="_blank">scope</a>. I’m using it here as a way to avoid declaring a <a rel="noreferrer noopener" aria-label="global variable (opens in a new tab)" href="https://en.wikipedia.org/wiki/Global_variable" target="_blank">global variable</a>. It is considered good practice in computer programming to avoid using global variables where possible because they allow for <a rel="noreferrer noopener" aria-label="action at a distance (opens in a new tab)" href="https://en.wikipedia.org/wiki/Action_at_a_distance_(computer_programming)" target="_blank">action at a distance</a>. If you didn’t follow all of that, don’t worry about it. It’s not important at this point.</p>
<h3>PERL scopes, blocks and subroutines</h3>
<p>Scope is a very important concept that one needs to be familiar with when reading and writing procedural programs. In PERL, scope is often delineated by a pair of <a rel="noreferrer noopener" aria-label="curly brackets (opens in a new tab)" href="https://en.wikipedia.org/wiki/Bracket#Curly_bracket" target="_blank">curly brackets</a>. Within the <a rel="noreferrer noopener" aria-label="global scope (opens in a new tab)" href="https://en.wikipedia.org/wiki/Scope_(computer_science)#Global_scope" target="_blank">global scope</a>, the above Tic-Tac-Toe program defines four sub-scopes on lines 15-21, 23-31, 33-35 and 37-54. The first three scopes are prefixed with subroutine declarations and the last scope is prefixed with the label <em>PROMPT</em>.</p>
<p>Scopes serve multiple purposes in programming languages. One purpose of a scope is to group a set of commands together as a unit so that they can be <a rel="noreferrer noopener" aria-label="called (opens in a new tab)" href="https://en.wikipedia.org/wiki/Subroutine" target="_blank">called</a> repeatedly with a single command rather than having to repeat several lines of code each time in the program. Another purpose is to enhance the readability of the program by denoting a restricted area where the value of a variable can be updated.</p>
<p>Within the scope that is labeled <em>PROMPT</em> and defined on lines 37-54 of the above Tic-Tac-Toe program, a variable named <em>mark</em> is created using the <em>my</em> keyword (line 40). After it is created, it is assigned a value by calling the <em>get_mark</em> subroutine (line 47). Later, the <em>put_mark</em> subroutine is called (line 51) to change the value in the square that was chosen by the <em>get_move</em> subroutine on line 50.</p>
<p>Hopefully it is obvious that the mark that <em>put_mark</em> is setting is meant to be the same mark that <em>get_mark</em> retrieved earlier. As a programmer though, how do I know that the value of <em>mark</em> wasn’t changed when the <em>get_move</em> subroutine was called? This example program is small enough that every line can be examined to make that determination. But most programs are much larger than this example, and having to know exactly what is going on at all points in the program’s execution can be overwhelming and error-prone. Because <em>mark</em> was created with the <em>my</em> keyword, its value can only be accessed and modified within the scope in which it was created (or a sub-scope). It doesn’t matter what subroutines at parallel or higher scopes do, even if they change variables with the same name in their own scopes. This property of scopes, restricting the range of lines on which the value of a variable can be updated, improves the readability of the code by allowing the programmer to focus on a smaller section of the program without having to be concerned about what is happening elsewhere in the program.</p>
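<p>A minimal sketch of this property follows. The names <em>noisy_helper</em> and <em>demo</em> are hypothetical and are not part of the Tic-Tac-Toe program. The helper assigns to its own <em>$mark</em>, and the caller’s <em>$mark</em> is left untouched.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: it declares and changes its OWN lexical $mark.
sub noisy_helper {
    my $mark = 'O';
    return $mark;
}

# Hypothetical caller: its $mark lives in a different scope.
sub demo {
    my $mark = 'X';
    noisy_helper();    # cannot see or modify the caller's $mark
    return $mark;
}

print demo(), "\n";
```

<p>Running the sketch prints <strong>X</strong>, because the assignment inside <em>noisy_helper</em> only affects the variable that was declared in that subroutine’s own block.</p>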
<p>Lines 04 and 05 define the <em>MARKS</em> and <em>BOARD</em> variables, respectively. Because they are not within any curly bracket pairing, they exist in the global scope. It is permissible to create constant variables in the global scope because they are read-only and therefore not subject to the action at a distance concern. <a href="https://en.wikipedia.org/wiki/Naming_convention_(programming)#Perl" target="_blank" rel="noreferrer noopener" aria-label="In PERL, it is traditional to name constants in all upper case letters (opens in a new tab)">In PERL, it is traditional to name constants in all upper case letters</a>.</p>
<p>Notice that scopes can be nested such that variables defined in outer scopes can be accessed and modified from within inner scopes. This is why the <em>MARKS</em> and <em>BOARD</em> variables can be accessed within the <em>get_mark</em> subroutine and <em>PROMPT</em> block respectively — they are sub-scopes of the global scope.</p>
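<p>The nesting described above can be sketched as follows. The constant <em>LIMIT</em> and the subroutine <em>count_to_limit</em> are hypothetical names chosen for illustration only.</p>

```perl
use strict;
use warnings;

use constant LIMIT => 3;    # hypothetical constant in the global scope

sub count_to_limit {
    my $count = 0;          # belongs to the subroutine's block

    {
        # This inner block can read and modify $count from the enclosing
        # block, and it can read LIMIT from the global scope.
        $count++ while $count < LIMIT;
    }

    return $count;
}

print count_to_limit(), "\n";
```

<p>The inner block changes a variable that was declared one level up, while <em>$count</em> itself remains invisible outside of <em>count_to_limit</em>.</p>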
<p>The statements in the program are executed in order from top to bottom and left to right. <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlsyn.html#Simple-Statements" target="_blank">Each statement is terminated with a semi-colon (<strong>;</strong>)</a>. The semi-colon can be omitted after the last statement in a block. It is also not required after the closing curly bracket of a block that belongs to a construct that defines the flow of the program, such as <em>sub</em>, <em>if</em> or <em>while</em>.</p>
<p>In PERL nomenclature, <em>scopes</em> are called <em>blocks</em>. Scope is the more general term that is typically used in online references like Wikipedia, but the remainder of this article will use the more <a rel="noreferrer noopener" aria-label="perlish (opens in a new tab)" href="https://en.wiktionary.org/wiki/Perlish" target="_blank">perlish</a> term <em>blocks</em>.</p>
<p>The statements within the first three blocks are not immediately executed as the program is evaluated from top to bottom. Rather, they are associated with the subroutine name preceding the block. This is the function of the <em><a href="https://perldoc.perl.org/5.30.0/perlsub.html#DESCRIPTION" target="_blank" rel="noreferrer noopener" aria-label="sub (opens in a new tab)">sub</a></em> keyword — it associates a subroutine name with a block of statements so that they can be called as a unit elsewhere in the program. The three subroutines <em>get_mark</em> (lines 15-21), <em>put_mark</em> (lines 23-31), and <em>get_move</em> (lines 33-35) are called on lines 47, 51 and 50 respectively.</p>
<p>The <em>PROMPT</em> block is not associated with a subroutine definition <a href="https://perldoc.perl.org/5.30.0/perlsyn.html#Compound-Statements" target="_blank" rel="noreferrer noopener" aria-label="or other flow-control statement (opens in a new tab)">or other flow-control statement</a>, so the statements within it are immediately executed in sequence when the program is run.</p>
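<p>The difference between a subroutine’s block and a bare labeled block can be sketched with a small script that records the order in which its statements run. The names <em>greet</em>, <em>DEMO</em> and <em>@order</em> are hypothetical.</p>

```perl
use strict;
use warnings;

my @order;    # records the order in which statements actually run

sub greet {                         # defined here, but not run yet
    push @order, 'inside greet';
}

push @order, 'before block';

DEMO: {                             # a labeled bare block runs immediately
    push @order, 'inside DEMO';
    greet();                        # the subroutine's statements run only now
}

print join(', ', @order), "\n";
```

<p>Although <em>greet</em> appears first in the file, its entry is recorded last, because its block is only executed when it is called from inside <em>DEMO</em>.</p>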
<h3>PERL regular expressions</h3>
<p>If there is one feature that is more central to PERL than any other it is regular expressions. Notice that in the example Tic-Tac-Toe program every block contains a <strong>=~</strong> (or <strong>!~</strong>) operator followed by some text surrounded with forward slashes (<strong>/</strong>). The text within the forward slashes is called a <a href="https://en.wikipedia.org/wiki/Regular_expression" target="_blank" rel="noreferrer noopener" aria-label="regular expression (opens in a new tab)">regular expression</a> and the operator binds the regular expression to a variable or <a href="https://en.wikipedia.org/wiki/Stream_(computing)" target="_blank" rel="noreferrer noopener" aria-label="data stream (opens in a new tab)">data stream</a>.</p>
<p>It is important to note that there are different regular expression syntaxes. Some editors and command-line tools (for example, <a rel="noreferrer noopener" aria-label="grep (opens in a new tab)" href="https://www.gnu.org/software/grep/manual/grep.html#grep-Programs" target="_blank">grep</a>) allow the user to select which regular expression syntax they prefer to use. <a rel="noreferrer noopener" aria-label="PERL-Compatible Regular Expressions (PCRE) (opens in a new tab)" href="https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions" target="_blank">PERL-Compatible Regular Expressions (PCRE)</a>, a syntax modeled on PERL’s own regular expressions, are among the most feature-rich of these.</p>
<h4>Regular expressions used in matching operations</h4>
<p>The result of applying the regular expression to a variable or data stream is usually a value that, when used in a flow-control statement such as <em>if</em> or <em>while</em>, will evaluate to <strong>true</strong> or <strong>false</strong> depending on whether or not the match succeeded. There are <a rel="noreferrer noopener" aria-label="modifiers (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlre.html#Modifiers" target="_blank">modifiers</a> that can be appended to the closing slash of a regular expression to change its return value.</p>
<p>Line 45 of the Tic-Tac-Toe program provides a typical example of how a regular expression is used in a PERL program. The regular expression <strong>[1-9]</strong> is being applied to the variable <em>game</em> which holds the in-memory representation of the Tic-Tac-Toe game board. The expression is a <a rel="noreferrer noopener" aria-label="character class (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlrecharclass.html#Character-Ranges" target="_blank">character class</a> that matches any character in the range from 1 to 9 (inclusive). The result of the regular expression will be <strong>true</strong> only if a character from 1 to 9 is present in what is being evaluated. On line 45, the <strong>!~</strong> operator applies the regular expression to the <em>game</em> variable and negates its sense such that the result will be <strong>true</strong> only if none of the characters from 1 to 9 are present. Because the regular expression is embedded within the conditional clause of the <em>if</em> <a rel="noreferrer noopener" aria-label="statement modifier (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlsyn.html#Statement-Modifiers" target="_blank">statement modifier</a>, the statement <em>last PROMPT</em> is only executed if there are no characters in the range from 1 to 9 left on the game board.</p>
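<p>As a rough illustration of the negated match, the following sketch wraps the same kind of test in a hypothetical subroutine named <em>board_is_full</em> and applies it to two made-up strings.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when no digit from 1 to 9 remains in the text.
sub board_is_full {
    my $text = shift;
    return $text !~ /[1-9]/ ? 1 : 0;
}

for my $text ('X 2 O', 'X O X') {
    # The statement modifier runs the print only when the condition is true.
    print "$text: no free squares\n" if board_is_full($text);
}
```

<p>Only the second string produces output, because the first one still contains the digit <strong>2</strong>.</p>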
<p>The <a rel="noreferrer noopener" aria-label="last (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/last.html" target="_blank"><em>last</em></a> statement is one of a few flow-control statements in PERL that allow the program execution sequence to jump from the current line to another line somewhere else in the program. Other flow-control statements that work in a similar fashion include <a rel="noreferrer noopener" aria-label="next (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/next.html" target="_blank"><em>next</em></a>, <a rel="noreferrer noopener" aria-label="continue (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/continue.html" target="_blank"><em>continue</em></a>, <a rel="noreferrer noopener" aria-label="redo (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/redo.html" target="_blank"><em>redo</em></a> and <a rel="noreferrer noopener" aria-label="goto (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/functions/goto.html" target="_blank"><em>goto</em></a> (the <em>goto</em> statement should be avoided whenever possible because it allows for <a rel="noreferrer noopener" aria-label="spaghetti code (opens in a new tab)" href="https://en.wikipedia.org/wiki/Spaghetti_code" target="_blank">spaghetti code</a>).</p>
<p>In the example Tic-Tac-Toe program, the <em>last PROMPT</em> statement on line 45 causes program execution to resume just after the <em>PROMPT</em> block. Because there are no more statements in the program, the program will terminate.</p>
<p>The label <em>PROMPT</em> was chosen arbitrarily. Any label (or none at all) could have been used.</p>
<p>The <em>redo PROMPT</em> statement at the end of the <em>PROMPT</em> block causes program execution to jump back to the beginning of the <em>PROMPT</em> block.</p>
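<p>The way <em>last</em> and <em>redo</em> turn a labeled bare block into a loop can be sketched with a hypothetical countdown. The label <em>TICK</em> and the subroutine <em>countdown</em> are illustrative names only.</p>

```perl
use strict;
use warnings;

# Hypothetical countdown: the labeled bare block behaves like a loop
# because redo jumps back to its top and last jumps past its end.
sub countdown {
    my $n = shift;
    my @seen;

    TICK: {
        last TICK if $n <= 0;    # leave the block once the count runs out
        push @seen, $n;
        $n--;
        redo TICK;               # start the block again from the top
    }

    return @seen;
}

print join(' ', countdown(3)), "\n";
```

<p>Calling <em>countdown(3)</em> collects 3, 2 and 1 before the <em>last TICK</em> statement ends the block.</p>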
<p>Notice that the <em>state</em> keyword, like the <em>my</em> keyword, creates a variable that can only be accessed or modified within the block in which it is created (or a nested sub-block if any exist). Unlike variables created with the <em>my</em> keyword, variables created with the <em>state</em> keyword keep their former value when the blocks they are in are run repeatedly. This is the behavior that is needed for the <em>game</em> variable because it is being updated incrementally each time the <em>PROMPT</em> block is run. The <em>mark</em> and <em>move</em> variables are meant to be different on each iteration of the <em>PROMPT</em> block, so they do not need to be created with the <em>state</em> keyword.</p>
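<p>A small side-by-side sketch may make the difference between <em>state</em> and <em>my</em> easier to see. The subroutine and variable names below are hypothetical.</p>

```perl
use strict;
use warnings;
use feature 'state';

sub next_with_state {
    state $calls = 0;    # initialized once; keeps its value between calls
    return ++$calls;
}

sub next_with_my {
    my $calls = 0;       # re-created and reset to 0 on every call
    return ++$calls;
}

my @state_results = map { next_with_state() } 1 .. 3;
my @my_results    = map { next_with_my() } 1 .. 3;

print "state: @state_results\n";
print "my:    @my_results\n";
```

<p>The <em>state</em> version counts 1, 2, 3 across the three calls, while the <em>my</em> version returns 1 each time because its variable starts over on every call.</p>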
<h4>Regular expressions used for input validation</h4>
<p>Another common use of regular expressions is for <a rel="noreferrer noopener" aria-label="input validation (opens in a new tab)" href="https://en.wikipedia.org/wiki/Data_validation" target="_blank">input validation</a>. Line 34 of the example Tic-Tac-Toe program provides an example of a regular expression being used for input validation. The expression on line 34 is similar to the one on line 45. It is also checking for characters from 1 to 9. However, it is performing the check against <a rel="noreferrer noopener" aria-label="the null filehandle (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#I%2fO-Operators" target="_blank">the null filehandle</a> (<strong>&lt;&gt;</strong>); it is using the <strong>=~</strong> operator; and it is prefixed and suffixed with the <a rel="noreferrer noopener" aria-label="zero-width assertions (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlre.html#Assertions" target="_blank">zero-width assertions</a> <strong>^</strong> and <strong>$</strong> respectively.</p>
<p>The null filehandle, when accessed as it is on line 34, will cause the program to pause until one line of input is provided. The regular expression will evaluate to <strong>true</strong> only if the line consists of exactly one character in the range from 1 to 9. The assertions <strong>^</strong> and <strong>$</strong> do not match any characters. Rather, they match the beginning and end positions, respectively, on the line (<strong>$</strong> also matches just before the newline that terminates a line of input, so the trailing newline does not cause the match to fail). The regular expression effectively reads: “Begin (<strong>^</strong>) with one character in the range from 1 to 9 (<strong>[1-9]</strong>) and end (<strong>$</strong>)”.</p>
<p>Because it is embedded in the conditional clause of <a rel="noreferrer noopener" aria-label="the ternary operator (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Conditional-Operator" target="_blank">the ternary operator</a>, line 34 will return either what was matched (<a rel="noreferrer noopener" aria-label="$& (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlvar.html#Variables-related-to-regular-expressions" target="_blank"><strong>$&</strong></a>) if the match succeeded or the character zero (<strong>0</strong>) if it failed. If the input were not validated in this way, then the user could submit their opponent’s mark rather than a number on the board.</p>
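<p>To experiment with this kind of validation without typing input, the same pattern can be applied to a string argument instead of the null filehandle. The subroutine <em>validate_move</em> below is a hypothetical stand-in written for this sketch, not the program’s own <em>get_move</em>.</p>

```perl
use strict;
use warnings;

# Hypothetical validator: takes a line of text instead of reading the
# null filehandle, so it can be tried without typing any input.
sub validate_move {
    my $line = shift;
    return ($line =~ /^[1-9]$/) ? $& : '0';
}

for my $line ("5\n", "X\n", "12\n", "\n") {
    (my $shown = $line) =~ s/\n/\\n/;    # make the newline visible when printing
    printf "%-5s -> %s\n", $shown, validate_move($line);
}
```

<p>Only the line containing a single digit from 1 to 9 is passed through. A letter, a two-digit number and an empty line all fall back to <strong>0</strong>.</p>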
<h4>Regular expressions used for filtering data</h4>
<p>Line 17 demonstrates using the <a rel="noreferrer noopener" aria-label="global modifier (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlretut.html#Using-regular-expressions-in-Perl" target="_blank">global modifier</a> (<strong>g</strong>) on a regular expression. In <a rel="noreferrer noopener" aria-label="list context (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perldata.html#Context" target="_blank">list context</a>, a match with the global modifier returns a list of all the matched <a href="https://en.wikipedia.org/wiki/Substring" target="_blank" rel="noreferrer noopener" aria-label="substrings (opens in a new tab)">substrings</a> instead of <strong>true</strong> or <strong>false</strong>. In scalar context, it still returns <strong>true</strong> or <strong>false</strong>, but each successive evaluation resumes searching from the position where the previous match ended.</p>
<p>Line 17 uses a regular expression to copy all the numbers in the range from 1 to 9 from the <em>game</em> variable into the array named <em>nums</em>. Line 18 then uses the <a rel="noreferrer noopener" aria-label="modulo operator (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Multiplicative-Operators" target="_blank">modulo operator</a> with the integer <strong>2</strong> as its second argument to determine whether the length of the <em>nums</em> array is even or odd. The formula on line 18 will result in <strong>0</strong> if the length of <em>nums</em> is odd and <strong>1</strong> if the length of <em>nums</em> is even. Finally, the computed index (<em>indx</em>) is used to access an element of the <em>MARKS</em> array and return it. Using this formula, the <em>get_mark</em> function will alternately return <strong>X</strong> or <strong>O</strong> depending on whether there are an odd or even number of positions left on the board.</p>
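<p>The two steps described above can be sketched as follows. The subroutine name <em>next_index</em> is hypothetical, and the formula <em>(@nums + 1) % 2</em> is an assumption chosen to match the described behavior (0 for an odd count, 1 for an even count).</p>

```perl
use strict;
use warnings;

# Hypothetical helper: collect the remaining digits, then turn their
# count into an index of 0 or 1.
sub next_index {
    my $text = shift;
    my @nums = $text =~ /[1-9]/g;    # list context: every matched digit
    my $indx = (@nums + 1) % 2;      # @nums in numeric context is its length
    return $indx;
}

print next_index('1 2 3 4 5 6 7 8 9'), "\n";    # nine digits left
print next_index('X 2 3 4 5 6 7 8 9'), "\n";    # eight digits left
```

<p>With nine digits remaining the result is <strong>0</strong>, and with eight remaining it is <strong>1</strong>, so the index alternates as squares are taken.</p>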
<h4>Regular expressions used for substituting data</h4>
<p>Line 28 demonstrates yet another common use of regular expressions in PERL. Rather than being used in a <a rel="noreferrer noopener" aria-label="match operator (m) (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Regexp-Quote-Like-Operators" target="_blank">match operator (<strong>m</strong>)</a>, the regular expression on line 28 is being used in a <a rel="noreferrer noopener" aria-label="substitution operator (s) (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perlop.html#Regexp-Quote-Like-Operators" target="_blank">substitution operator (<strong>s</strong>)</a>. If the value in the <em>move</em> variable is found in the <em>game</em> variable, it will be substituted with the value of the <em>mark</em> variable.</p>
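<p>A minimal sketch of a substitution follows, using a hypothetical subroutine named <em>place</em> and a made-up one-row string rather than the full game board.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first occurrence of $move in $text
# with $mark and return the modified copy.
sub place {
    my ($text, $mark, $move) = @_;
    $text =~ s/$move/$mark/;
    return $text;
}

my $row = '1 | 2 | 3';
print place($row, 'X', 2), "\n";    # the square numbered 2 becomes X
print place($row, 'X', 7), "\n";    # no 7 in the text, so nothing changes
```

<p>Because <em>place</em> works on its own copy of the text, the original <em>$row</em> is not modified. The caller has to store the returned value if it wants to keep the change.</p>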
<h3>PERL sigils and data types</h3>
<p>The last things of note that are used in the example Tic-Tac-Toe program are <a rel="noreferrer noopener" aria-label="the sigils (opens in a new tab)" href="https://perldoc.perl.org/5.30.0/perldata.html#DESCRIPTION" target="_blank">the sigils</a> (<strong>$</strong> and <strong>@</strong>) that are placed before the variable names. When creating a variable, the sigil indicates the type of variable being created. It is important to note that <a rel="noreferrer noopener" aria-label="a different sigil can be prefixed to the variable name when it is accessed (opens in a new tab)" href="https://perldoc.perl.org/perlfaq4.html#What-is-the-difference-between-%24array%5b1%5d-and-%40array%5b1%5d%3f" target="_blank">a different sigil can be prefixed to the variable name when it is accessed</a> to indicate whether one or many items should be returned from the variable.</p>
<p><a href="https://perldoc.perl.org/perlintro.html#Perl-variable-types" target="_blank" rel="noreferrer noopener" aria-label="There are three built-in data types in PERL (opens in a new tab)">There are three built-in data types in PERL</a>: scalars (<strong>$</strong>), arrays (<strong>@</strong>) and associative arrays (<strong>%</strong>), which are more commonly called hashes. Scalars hold a single data item such as a number, character or string of characters. Arrays are ordered lists of scalars that are indexed by number. Associative arrays are unordered collections of scalars that are indexed by string keys rather than by numbers.</p>
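<p>The following sketch shows the three data types and how the sigil changes depending on whether one item or several items are being accessed. All variable names and values are made up for illustration.</p>

```perl
use strict;
use warnings;

my $mark   = 'X';                 # scalar: a single value
my @marks  = ('X', 'O');          # array: an ordered list of scalars
my %scores = (X => 2, O => 1);    # associative array (hash): keyed by strings

# Accessing ONE element yields a scalar, so the sigil becomes $.
my $second = $marks[1];
my $x_wins = $scores{X};

# Accessing SEVERAL elements (a slice) yields a list, so the sigil is @.
my @both   = @marks[0, 1];
my @totals = @scores{'X', 'O'};

print "$mark $second $x_wins @both @totals\n";
```

<p>Note that <em>$marks[1]</em> reads from the array <em>@marks</em> and <em>$scores{X}</em> reads from the hash <em>%scores</em>. The sigil describes what is being returned, and the brackets identify the kind of container.</p>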
</div>
https://www.sickgaming.net/blog/2020/02/...oe-part-1/