From Tiny Python Projects by Ken Youens-Clark

In this article, the author tries to convince a sysadmin who might know bash to transfer systems-level tasks to Python.




On 30 April 2020, I had a chance to talk with Dustin Reybrouck for The SysAdminShow Podcast. We mostly discussed why sysadmins might like to add Python as a tool in addition to shell scripting. Most of his audience is probably familiar with bash or PowerShell, so I showed how I might write a somewhat simple but parameterized and documented version of head in bash and how it would be translated to Python 3. Following is a summary of our discussion.

Sysadmins spend much of their lives on the command line, so it makes sense that they would seek to automate tasks using the language of their command line, something like bash or PowerShell. While it is possible to write many useful programs in these languages, a higher-level language like Python might prove to be a better choice, especially given that a Python program is portable between systems that natively understand bash (e.g., Linux or Mac) and PowerShell (Windows). As an exercise, let’s write a bash implementation of the head command, then we’ll compare how we could write it in Python.

Writing head in bash

Let’s start by imagining how we might write our own implementation of the venerable head command. Given a file such as the text of the US Constitution, we would expect to see its first few lines, usually 10:

 
 $ head const.txt
 We the People of the United States, in Order to form a more perfect Union,
 establish Justice, insure domestic Tranquility, provide for the common
 defence, promote the general Welfare, and secure the Blessings of Liberty to
 ourselves and our Posterity, do ordain and establish this Constitution for the
 United States of America.
  
 Article 1.
  
 Section 1
 All legislative Powers herein granted shall be vested in a Congress of the

And we would expect to be able to modify that number using an option like -n:

 
 $ head -n 3 const.txt
 We the People of the United States, in Order to form a more perfect Union,
 establish Justice, insure domestic Tranquility, provide for the common
 defence, promote the general Welfare, and secure the Blessings of Liberty to
 

It’s commonplace for command-line tools to respond to -h or --help with a “usage” statement about how the program should be invoked. In the case of head, it gives a usage not because we ask for it but because it does not recognize these as valid options. Still, it manages to produce a “usage” under some circumstances, which is better than nothing:

 
 $ head -h
 head: illegal option -- h
 usage: head [-n lines | -c bytes] [file ...]
 

bash version that handles one file

Let’s start off with a version in bash that handles just one file and a possible number of lines which will default to 10. If run with no arguments, it will print a “usage” statement:

 
 $ ./simple-head.sh
 Usage: simple-head.sh FILE [NUM]
 

When run with a file as the only argument, it will print the first 10 lines:

 
 $ ./simple-head.sh const.txt
 We the People of the United States, in Order to form a more perfect Union,
 establish Justice, insure domestic Tranquility, provide for the common
 defence, promote the general Welfare, and secure the Blessings of Liberty to
 ourselves and our Posterity, do ordain and establish this Constitution for the
 United States of America.
  
 Article 1.
  
 Section 1
 All legislative Powers herein granted shall be vested in a Congress of the
 

We can provide an optional second argument to change the number of lines we show:

 
 $ ./simple-head.sh const.txt 2
 We the People of the United States, in Order to form a more perfect Union,
 establish Justice, insure domestic Tranquility, provide for the common
 

Note that the program fails rather gracelessly. For instance, if we give it a non-existent file:

 $ ./simple-head.sh foo
 ./simple-head.sh: line 24: foo: No such file or directory

Our program also fails to show an error if the second argument is not a number and will, in fact, print the entire file. Try running the program like so:

 
 $ ./simple-head.sh const.txt foo
 

Still, it’s instructive to look at this program:

 
 #!/bin/bash (1)
  
 #
 # Author : Ken Youens-Clark <[email protected]> (2)
 # Purpose: simple bash implementation of `head`
 #
  
 # Check number of arguments is 1 or 2
 if [[ $# -lt 1 ]] || [[ $# -gt 2 ]]; then       (3)
     echo "Usage: $(basename "$0") FILE [NUM]"   (4)
     exit 1                                      (5)
 fi
  
 FILE=$1       (6)
 NUM=${2:-10}  (7)
 LINE_NUM=0    (8)
  
 while read -r LINE; do                (9)
     echo "$LINE"                      (10)
     LINE_NUM=$((LINE_NUM+1))          (11)
     if [[ $LINE_NUM -eq $NUM ]]; then (12)
         break                         (13)
     fi
 done < "$FILE"
 
  1. This line is often called the “shebang,” and it is common to see the path to bash hard-coded like this. It’s not necessarily best practice, however, as bash might well be located at /usr/local/bin/bash.
  2. Any text following # is ignored by bash. Here we add comments to the program, but you can also use this to temporarily disable code. It’s polite to document your code so that others might contact you with questions.
  3. Everything in bash is a string, but we can use operators like -lt (less than) and -gt (greater than) to get numeric comparisons. The $# variable holds the number of arguments to our program, so we’re trying to see if we do not have exactly 1 or 2 arguments.
  4. We print a “usage”-type statement to explain to the user how to invoke the program. The FILE is a required positional argument while the [NUM] is shown in [] to indicate that it is optional.
  5. We exit with a non-zero value (1 is fine) to indicate that the program failed to run as expected.
  6. Since we know we have at least 1 argument, we can copy the value of the first argument in $1 to our FILE variable.
  7. We may or may not have a second argument, so we can either copy $2 or a default value of 10 to our NUM variable.
  8. Initialize a LINE_NUM variable to 0 so we can count how many lines of our file we have shown.
  9. A while loop is a common idiom for reading a file line-by-line in bash.
  10. The echo command will print text to the terminal.
  11. The $(()) evaluation will allow us to perform a bit of arithmetic with what is otherwise a string value. Here we want to add 1 to the value of LINE_NUM.
  12. The -eq is the numeric equality operator in bash. Here we check if the LINE_NUM is equal to the number of lines we mean to show.
  13. The break statement will cause the while loop to exit.

A complete implementation in bash

The previous simple-head.sh version shows some basic ideas of how to handle many systems-level tasks in bash such as:

  • Documenting the language of the program with a shebang line
  • Documenting the author and purpose of the program with comments
  • Parameterizing your program so as to accept values as arguments rather than hard-coding them
  • Documenting the program parameters with an automatically generated “usage” when needed by the user
  • Exiting the program with non-zero values when the program does not complete as normally expected
  • Defining reasonable default values for optional arguments

Still, this is a rather sophomoric replacement for head because:

  • It does not handle multiple files
  • It fails to validate if the arguments are actually readable files
  • There is no -n option because the program handles only positional arguments and so cannot handle options
  • The program will not print a “usage” for -h, again because it fails to handle options

Let’s write a better implementation that is a complete replacement for head:

 
 #!/usr/bin/env bash (1)
  
 #
 # Author : Ken Youens-Clark <[email protected]> (2)
 # Purpose: bash implementation of `head`
 #
  
 # Die on use of uninitialized variables
 set -u (3)
  
 # Default value for the argument
 NUM_LINES=10 (4)
  
 # A function to print the "usage"
 function USAGE() { (5)
     printf "Usage:\n  %s -n NUM_LINES [FILE ...]\n\n" "$(basename "$0")"
  
     echo "Options:"
     echo " -n NUM_LINES"
     echo
     exit "${1:-0}"
 }
  
 # Die if we have no arguments at all
 [[ $# -eq 0 ]] && USAGE 1 (6)
  
 # Process command line options
 while getopts :n:h OPT; do (7)
     case $OPT in           (8)
         n)
             NUM_LINES="$OPTARG" (9)
             shift 2             (10)
             ;;
         h)
             USAGE               (11)
             ;;
         :)
             echo "Error: Option -$OPTARG requires an argument." (12)
             exit 1
             ;;
         \?)
             echo "Error: Invalid option: -${OPTARG:-""}" (13)
             exit 1
     esac
 done
  
 # Verify that NUM_LINES looks like a positive integer
 if [[ $NUM_LINES -lt 1 ]]; then            (14)
     echo "-n \"${NUM_LINES}\" must be > 0"
     exit 1
 fi
  
 # Process the positional arguments
 FNUM=0                (15)
 for FILE in "$@"; do  (16)
     FNUM=$((FNUM+1))  (17)
  
     # Verify this argument is a readable file
     if [[ ! -f "$FILE" ]] || [[ ! -r "$FILE" ]]; then (18)
         echo "\"${FILE}\" is not a readable file"
         continue (19)
     fi
  
     # Print a header in case of multiple files
     [[ $# -gt 1 ]] && echo "==> ${FILE} <==" (20)
  
     # Initialize a counter variable
     LINE_NUM=0 (21)
  
     # Loop through each line of the file
     while read -r LINE; do (22)
         echo "$LINE"
  
         # Increment the counter and see if it's time to break
         LINE_NUM=$((LINE_NUM+1))
         [[ $LINE_NUM -eq $NUM_LINES ]] && break (23)
     done < "$FILE"
  
     [[ $# -gt 1 ]] && [[ $FNUM -lt $# ]] && echo (24)
 done
  
 exit 0
  1. Using the env program (which is pretty universally located at /usr/bin/env) to find bash is more flexible than hard-coding the path as /bin/bash.
  2. Same documentation as comments.
  3. This will cause bash to die if we attempt to use an uninitialized variable and is one of the few safety features offered by the language.
  4. Here we set a default value for the NUM_LINES to show which can be overridden by an option.
  5. Since there are multiple times I might want to show the usage and exit (e.g., with an error when there are no arguments, or normally when requested by -h), I can put this into a function to call later.
  6. If the number of arguments to the program $# is 0, then exit with a “usage” statement and a non-zero value.
  7. We can use getopts in bash to manually parse the command-line arguments. We are specifically looking for flags -n which takes a value and -h which does not.
  8. $OPT will have the flag value such as n for -n or h for -h.
  9. The $OPTARG will have the value for the -n flag. We can copy that to our NUM_LINES variable to save it.
  10. Now that we have processed -n 3, for instance, we use shift 2 to discard those two values from the program arguments $@.
  11. If processing the -h flag, call the USAGE function which will cause the program to exit.
  12. This handles when an option like -n does not have an accompanying value.
  13. This handles an option we didn’t define.
  14. This uses the -lt operator to coerce the NUM_LINES to a numeric value. If it is less than 1 (-lt 1), we print an error and exit with a non-zero value.
  15. Now that we have handled the optional arguments, we can handle the rest of the positional arguments found in $@. We start off by defining a FNUM so we can track the file number we are working with. That is, this is the index value of the current file.
  16. We can use a for loop to iterate through the positional arguments found in $@.
  17. Add 1 to the FNUM variable.
  18. The -f test will return a “true” value if the given argument is a file, and ! will negate this. Likewise, -r will report whether the argument is readable.
  19. The continue statement will cause the for loop to immediately advance to the next iteration, skipping all the code below.
  20. If the number of positional arguments is greater than 1 (-gt 1), then print a header showing the current file’s name.
  21. Initialize a line count variable for reading the file.
  22. This is the same loop as before that we used to read a given number of lines from the file. This one is improved, however, because we check if the number argument from the user is actually a positive integer!
  23. This is a shorter way to write a single-line if statement.
  24. If there are multiple files to process and we’re not currently on the last file, then print an extra newline to separate the outputs.

 

If you are new to bash programming, the syntax will probably look rather cryptic! The entirely manual handling of the command-line options and positional arguments is especially cumbersome. I will admit this is not an easy program to write correctly, and, even when it finally works on my Linux and Mac machines, I won’t be able to give it to a Windows user unless they have something like WSL (Windows Subsystem for Linux) or Cygwin installed.

Still, this program works rather well! It will print nice documentation if we run it with no arguments or as ./head.sh -h, which is actually an improvement over head:

 
 $ ./head.sh
 Usage:
   head.sh -n NUM_LINES [FILE ...]
  
 Options:
  -n NUM_LINES
 

It rejects bad options:

 
 $ ./head.sh -x 8 const.txt
 Error: Invalid option: -x
 

It can handle both options and positional arguments, provides a reasonable default for the -n option, and correctly skips non-file arguments:

 
 $ ./head.sh -n 3 foo const.txt
 "foo" is not a readable file
 ==> const.txt <==
 We the People of the United States, in Order to form a more perfect Union,
 establish Justice, insure domestic Tranquility, provide for the common
 defence, promote the general Welfare, and secure the Blessings of Liberty to
 

And it mimics the output from head for multiple files:

 
 $ ./head.sh -n 1 const.txt simple-head.sh head.sh
 ==> const.txt <==
 We the People of the United States, in Order to form a more perfect Union,
  
 ==> simple-head.sh <==
 #!/bin/bash
  
 ==> head.sh <==
 #!/usr/bin/env bash
 

For what it’s worth, I used the included new_bash.py program to create this program. If you find yourself stuck writing a bash program and don’t wish to start from scratch, this program might be useful to you.

Testing head.sh

I have included a test.py that is a Python program that will run the head.sh program to ensure it actually does what it is supposed to do. If you look at the contents of this program, you will see a number of functions with names that start with test_. This is because I use the pytest module/program to run these functions as a test suite. I like to use the -x flag to indicate that testing should halt at the first failing test and the -v flag for “verbose” output. These can be specified individually or combined like -xv or -vx:

 
 $ pytest -xv test.py
 ============================= test session starts ==============================
 ...
  
 test.py::test_exists PASSED                                              [ 14%]
 test.py::test_usage PASSED                                               [ 28%]
 test.py::test_bad_file PASSED                                            [ 42%]
 test.py::test_bad_num PASSED                                             [ 57%]
 test.py::test_default PASSED                                             [ 71%]
 test.py::test_n PASSED                                                   [ 85%]
 test.py::test_multiple_files PASSED                                      [100%]
  
 ============================== 7 passed in 0.56s ===============================
 

It’s a bit of a nuisance to have to write the tests for a program in a different language from the program itself, but I know of no testing framework in bash that I could use (or would like to learn) that can run a test suite such as the above!

Writing head.py in Python 3

To write a similar version in Python, we’ll rely heavily on the standard argparse module to handle the validation of all the command-line arguments as well as generating the “usage” statements. Here is a version that, similar to simple-head.sh, will handle just one file:

 
 #!/usr/bin/env python3 (1)
 """                    (2)
 Author : Ken Youens-Clark
 Purpose: Python implementation of head
          This version only handles one file!
 """
  
 import argparse        (3)
 import os
 import sys
  
  
 # --------------------------------------------------
 def get_args():        (4)
     """Get command-line arguments"""  (5)
  
     parser = argparse.ArgumentParser( (6)
         description='Python implementation of head',
         formatter_class=argparse.ArgumentDefaultsHelpFormatter)
  
     parser.add_argument('file',       (7)
                         metavar='FILE',
                         type=argparse.FileType('rt'), (8)
                         help='Input file')
  
     parser.add_argument('-n',         (9)
                         '--num',
                         help='Number of lines',
                         metavar='int',
                         type=int,     (10)
                         default=10)   (11)
  
     args = parser.parse_args()        (12)
  
     if args.num < 1:                  (13)
         parser.error(f'--num "{args.num}" must be > 0') (14)
  
     return args                       (15)
  
  
 # --------------------------------------------------
 def main():                           (16)
     """Make a jazz noise here"""
  
     args = get_args()                 (17)
  
     for i, line in enumerate(args.file, start=1): (18)
         print(line, end='') (19)
         if i == args.num:   (20)
             break           (21)
  
  
 # --------------------------------------------------
 if __name__ == '__main__':  (22)
     main()
 
  1. The “shebang” uses the env program to find the first python3 in our $PATH.
  2. The triple quotes allow us to create a string that spans multiple lines. Here we’re creating a string but not assigning it to a variable. This is a convention for creating documentation also called a “docstring.” This docstring summarizes the program itself. I like to document at least who wrote it and why.
  3. We can import code from other modules. While we can import several modules separated by commas, it’s recommended to put each on a separate line. Specifically we want to use argparse to handle the command-line arguments, and we’ll also use the os (operating system) and sys (systems) modules.
  4. I like to always define a get_args() function that exclusively deals with argparse for creating the program’s parameters and validating the arguments. I always place this first so I can see it immediately when I’m reading the program.
  5. This is a docstring for the function. It’s ignored like a comment would be, but it has significance to Python and would appear if I were to import this module and ask for help(get_args).
  6. This creates a parser that will handle the command-line arguments. I add a description for the program that will appear in any “usage” statements, and I always like to have argparse display any default values for the user.
  7. Positional arguments have no leading dashes in their names. Here we define a single positional argument that we can refer to internally as file.
  8. The default type for all arguments is a str (string). We can ask argparse to enforce a different type like int and it will print an error when the user fails to provide a value that can be parsed into an integer value. Here we are using the special argparse type that defines a “readable” ('r') “text” ('t') file. If the user provides anything other than a readable text file, argparse will halt the program, print an error and usage, and exit with a non-zero value.
  9. The leading - on -n (short name) and --num (long name) for the “number” argument means this will be an option.
  10. The user must provide a value that can be parsed into an int value.
  11. The default value will be 10.
  12. After defining the program’s parameters, we ask the parser to parse the arguments. If there are any problems like the wrong number or types of arguments, argparse will stop the program here.
  13. If we get to this point, the arguments were valid as far as argparse is concerned. We can perform additional manual checks such as verifying that args.num is greater than 0.
  14. The parser.error() function is a way for us to manually invoke the error-out function of argparse.
  15. Functions in Python must explicitly return a value or None will be returned by default. Here we want to return the args to the calling function.
  16. Convention dictates the starting function be called main(), but this is not a requirement, and Python will not automatically call this function to start the program. Neither get_args() nor main() accept arguments, but, if they did, they would be listed in the parens.
  17. All the work to define the parameters, validate the arguments, and handle help and usage has now been hidden in the get_args() function. We can think of this as a “unit” that encapsulates those ideas. If our program successfully calls get_args() and returns with some args, then we can move forward knowing the arguments are actually correct and useful.
  18. We don’t have to initialize a counting variable like in bash as we can use the enumerate() function to return the index and value of any sequence of items. Here the args.file is actually an open file handle provided by argparse because we defined the args.file as a “file” type. That means I’ll be iterating over the lines in the file handle. I can use the start option to enumerate() to start counting at 1 instead of 0.
  19. The print() function is like the echo statement in bash. Here there will be a newline stuck to the line from the file, so I use the end='' to indicate that print() should not add the customary newline to the output.
  20. While bash uses -eq for numeric comparison and == for string equality, Python uses == for both.
  21. Both Python and bash use continue and break in loops to skip and leave loops, respectively.
  22. This is the idiom in Python to detect when a program/module is being run from the command line. Here we want to execute the main() function to start the program running.
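The type checking described in the notes above can be tried without touching the filesystem, because parse_args() accepts an explicit list of arguments. The following is a small sketch under that assumption. The build_parser() function is a hypothetical, cut-down parser with only the -n/--num option, not part of the program above.

```python
import argparse


def build_parser():
    """A cut-down, illustrative parser: only -n/--num, no FILE argument."""
    parser = argparse.ArgumentParser(description='argparse type checking demo')
    parser.add_argument('-n', '--num', metavar='int', type=int, default=10)
    return parser


parser = build_parser()
print(parser.parse_args([]).num)           # no -n given, so the default is used
print(parser.parse_args(['-n', '3']).num)  # the string '3' is converted to an int

try:
    parser.parse_args(['-n', 'foo'])       # not parseable as an integer
except SystemExit as err:
    # argparse has already printed a usage and an error message to STDERR
    print(f'argparse exited with code {err.code}')
```

Catching SystemExit is only useful for a demonstration like this one. In a real program we would normally let argparse stop the program.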

The above program has been contributed as py-head/solution1.py, and you can run it to see how it will create usage for no arguments:

 
 $ ./solution1.py
 usage: solution1.py [-h] [-n int] FILE
 solution1.py: error: the following arguments are required: FILE
 

Note that we did not define the -h and --help flags to argparse as those are reserved specifically for generating help:

 
 $ ./solution1.py -h
 usage: solution1.py [-h] [-n int] FILE
  
 Python implementation of head
  
 positional arguments:
   FILE               Input file
  
 optional arguments:
   -h, --help         show this help message and exit
   -n int, --num int  Number of lines (default: 10)
 

Note that argparse can actually handle options following positional arguments:

 
 $ ./solution1.py const.txt -n 3
 We the People of the United States, in Order to form a more perfect Union,
 establish Justice, insure domestic Tranquility, provide for the common
 defence, promote the general Welfare, and secure the Blessings of Liberty to
 

A complete implementation in Python

The previous Python version demonstrates many shortcuts to creating usable and documented programs that are easier to write and more reliable than the bash version. Still, this program is not yet a full replacement either for head or even head.sh. Let’s see how we can expand the program to handle one or more positional arguments:

 
 #!/usr/bin/env python3
 """
 Author : Ken Youens-Clark
 Purpose: Python implementation of head
          This version handles multiple files
          and is very similar to the bash version.
 """
  
 import argparse
  
  
 # --------------------------------------------------
 def get_args():
     """Get command-line arguments"""
  
     parser = argparse.ArgumentParser(
         description='Python implementation of head',
         formatter_class=argparse.ArgumentDefaultsHelpFormatter)
  
     parser.add_argument('file',
                         metavar='FILE',
                         type=argparse.FileType('rt'),
                         nargs='+', (1)
                         help='Input file')
  
     parser.add_argument('-n',
                         '--num',
                         help='Number of lines',
                         metavar='int',
                         type=int,
                         default=10)
  
     args = parser.parse_args()
  
     if args.num < 1:
         parser.error(f'--num "{args.num}" must be > 0')
  
     return args
  
  
 # --------------------------------------------------
 def main():
     """Make a jazz noise here"""
  
     args = get_args()
     num_files = len(args.file) (2)
  
     for fnum, fh in enumerate(args.file, start=1):    (3)
         if num_files > 1:                             (4)
             print(f'==> {fh.name} <==')               (5)
  
         for line_num, line in enumerate(fh, start=1): (6)
             print(line, end='')
             if line_num == args.num:
                 break
  
  
         if num_files > 1 and fnum < num_files:        (7)
             print()
  
  
 # --------------------------------------------------
 if __name__ == '__main__':
     main()
 
  1. The nargs option can take * to indicate zero or more, + for one or more, and ? for zero or one.
  2. Since I’ll refer to the number of files (num_files) several times, I put it into a variable. The args.file argument is now a list of open file handles. I can use the len() function to ask the length of this list which will tell me the number of files provided as arguments. I know they are actually readable text files because of the type constraint I added to this argument.
  3. Again I want both the index (position) and value of each element in args.file, so I can use enumerate(), starting the counting at 1 instead of 0.
  4. Decide whether to print a header.
  5. I call the variable fh to remind me that this is an open file handle. I can get the name of the file itself using fh.name. The f'' (f-string) allows me to interpolate the fh.name value inside the string given to print().
  6. A second loop to iterate over the lines in the file. This is the same code as above.
  7. Decide whether to print() an extra newline between multiple files.
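To make the difference between the nargs values more concrete, here is a short sketch that parses a few argument lists with each setting. The parse() helper is hypothetical and exists only for this illustration. It uses plain strings rather than argparse.FileType so that no real files are needed.

```python
import argparse


def parse(nargs, argv):
    """Parse argv with a single positional 'file' using the given nargs."""
    parser = argparse.ArgumentParser()
    parser.add_argument('file', metavar='FILE', nargs=nargs)
    return parser.parse_args(argv).file


print(parse('+', ['a.txt', 'b.txt']))  # one or more values, returned as a list
print(parse('*', []))                  # zero values are allowed, giving an empty list
print(parse('?', []))                  # zero or one value; None when absent
print(parse('?', ['a.txt']))           # a single value, not a list
```

With nargs='+', an empty argument list would instead make argparse print a usage and exit, which is the behavior we want for head.py.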

This version is a full implementation of a typical head program and demonstrates many common systems-level programming concepts a sysadmin might need. The os and sys modules are particularly rich in functions for dealing with files and directories and permissions and the like. The argparse code allows one to outsource program validation to another module allowing the coder to focus on the tasks at hand rather than the implementation of tedious and repetitive tasks.
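If we did want to check a path by hand, roughly as the bash version does with its -f and -r tests, the os module offers close equivalents. This is only a sketch: is_readable_file() is a hypothetical helper and is not needed in the programs above, where argparse.FileType already performs the check.

```python
import os


def is_readable_file(path):
    """True if path is an existing regular file the current user may read."""
    # os.path.isfile() plays the role of bash's -f test,
    # and os.access(path, os.R_OK) the role of -r.
    return os.path.isfile(path) and os.access(path, os.R_OK)
```

A caller could use this to skip bad arguments with a warning, as head.sh does, rather than halting on the first unreadable file.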

Testing the head.py program

“Without requirements or design, programming is the art of adding bugs to an empty text file.” – Louis Srygley

Python’s pytest module provides a rather simple and elegant way to construct a test suite. I have included a test.py to demonstrate how I typically write integration tests which exercise a program externally and verify that it works as intended. In my own programs, I also tend to write many unit tests that similarly exercise individual functions (the “units” of programming) to ensure they work as expected.

By combining both unit and integration tests, I come to have greater confidence that my code works. More importantly, I feel free to refactor my code to improve algorithms and add features without fearing I will break features that worked previously. When testing, I always run the entire test suite in such a way that testing halts at the first failure:

 
 $ make test
 pytest -xv test.py
 ============================= test session starts ==============================
 ...
  
 test.py::test_exists PASSED                                              [ 14%]
 test.py::test_usage PASSED                                               [ 28%]
 test.py::test_bad_file PASSED                                            [ 42%]
 test.py::test_bad_num PASSED                                             [ 57%]
 test.py::test_default PASSED                                             [ 71%]
 test.py::test_n PASSED                                                   [ 85%]
 test.py::test_multiple_files PASSED                                      [100%]
  
 ============================== 7 passed in 0.84s ===============================
 

This test.py is almost identical to the one for the head.sh. The differences mostly account for how I felt it best to handle errors in the two programs. The best improvement, of course, is that now my tests are in the same language as the program, so it’s easier for me to go back and forth between them!

A more advanced version

The previous version of head.py is reflected in the solution2.py version. We can briefly look at solution3.py to explore more advanced ideas in Python like function definition, list comprehensions, mock file handles, and unit testing.

 
 #!/usr/bin/env python3
 """
 Author : Ken Youens-Clark
 Purpose: Python implementation of head
          This version handles multiple files
          and uses more advanced ideas like function definition,
          list comprehensions, mock file handles, unit testing, etc.
 """
  
 import argparse
 import io
  
  
 # --------------------------------------------------
 def get_args():
     """Get command-line arguments"""
  
     parser = argparse.ArgumentParser(
         description='Python implementation of head',
         formatter_class=argparse.ArgumentDefaultsHelpFormatter)
  
     parser.add_argument('file',
                         metavar='FILE',
                         type=argparse.FileType('rt'),
                         nargs='+',
                         help='Input file')
  
     parser.add_argument('-n',
                         '--num',
                         help='Number of lines',
                         metavar='int',
                         type=int,
                         default=10)
  
     args = parser.parse_args()
  
     if args.num < 1:
         parser.error(f'--num "{args.num}" must be > 0')
  
     return args
  
  
 # --------------------------------------------------
 def main():
     """Make a jazz noise here"""
  
     args = get_args()
     show_header = len(args.file) > 1 (1)
     heads = [head(fh, args.num, show_header) for fh in args.file] (2)
     print('\n'.join(heads)) (3)
  
  
 # --------------------------------------------------
 def head(fh, num, show_header): (4)
     """Return num lines from file handle"""
  
     lines = [f'==> {fh.name} <==\n'] if show_header else [] (5)
     for line_num, line in enumerate(fh, start=1): (6)
         lines.append(line)
         if line_num == num:
             break
  
     return ''.join(lines) (7)
  
  
 # --------------------------------------------------
 def test_head(): (8)
     """Test head"""
  
     assert head(io.StringIO('foo\nbar\nbaz\n'), 1, False) == 'foo\n' (9)
     assert head(io.StringIO('foo\nbar\nbaz\n'), 2, False) == 'foo\nbar\n' (10)
  
  
 # --------------------------------------------------
 if __name__ == '__main__':
     main()
 
  1. Whether or not we show a header for each file depends on there being more than one file.
  2. We use a list comprehension to create the head of each file.
  3. print() all the values of head() for each file.
  4. Here we define a function called head() that takes three arguments.
  5. We initialize a lines variable using an if expression that hinges on whether to show_header.
  6. This is the same logic as before, but now rather than calling print() on each line, we append it to the list of lines.
  7. Return the lines joined on the empty string. Note that each line still has any newline from the file.
  8. Any function starting with test_ will be run by pytest. Here I define a unit test to run the head() function.
  9. The head() function expects something like an open file handle. I can use io.StringIO to make a sort of mock file handle from a string. Each value ending with a newline \n will appear as a “line” of text.
  10. The assert statement will raise an AssertionError if the given value does not evaluate as “true” (but not necessarily as True). Here I want the head() function to return the string foo\nbar\n when I run the function with the given values.

I can copy this version to head.py (or modify test.py to use solution3.py instead of head.py) and run pytest -xv test.py to validate the program. Additionally, I can run pytest directly on the program to run my unit tests:

 
 $ pytest -xv solution3.py
 ============================= test session starts ==============================
 ...
  
 solution3.py::test_head PASSED                                           [100%]
  
 ============================== 1 passed in 0.01s ===============================
 

As programs grow in length and complexity, it makes more sense to write small, tested functions that can be composed into larger, more stable programs.
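As a rough sketch of how such tests might grow, the block below adds two hypothetical tests for cases that test_head() above does not exercise: the header branch and a file shorter than the requested number of lines. The names test_head_with_header and test_head_short_file are illustrative and not part of the book's test suite, and head() is copied from the listing above only so that the block runs on its own. One wrinkle is that io.StringIO has no name attribute by default, so the first test assigns one (foo.txt is an arbitrary choice) before asking for a header.

```python
import io


def head(fh, num, show_header):
    """Return num lines from file handle (copied from the listing above)"""

    lines = [f'==> {fh.name} <==\n'] if show_header else []
    for line_num, line in enumerate(fh, start=1):
        lines.append(line)
        if line_num == num:
            break

    return ''.join(lines)


def test_head_with_header():
    """Hypothetical test: the header branch needs a name on the handle"""

    fh = io.StringIO('foo\nbar\nbaz\n')
    fh.name = 'foo.txt'  # assumption: any string will do for the mock name
    assert head(fh, 2, True) == '==> foo.txt <==\nfoo\nbar\n'


def test_head_short_file():
    """Hypothetical test: asking for more lines than the file contains"""

    fh = io.StringIO('foo\n')
    assert head(fh, 10, False) == 'foo\n'
```

Because both functions start with test_, pytest should pick them up alongside test_head() without any further configuration.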

Creating new programs with argparse and new.py

Similar to the new_bash.py program I mentioned in the bash section, I have included the new.py program I use to create new Python programs that use argparse to validate arguments. You run it with -h for help, of course:

 
 $ ./new.py -h
 usage: new.py [-h] [-n NAME] [-e EMAIL] [-p PURPOSE] [-f] program
  
 Create Python argparse program
  
 positional arguments:
   program               Program name
  
 optional arguments:
   -h, --help            show this help message and exit
   -n NAME, --name NAME  Name for docstring (default: Ken Youens-Clark)
   -e EMAIL, --email EMAIL
                         Email for docstring (default: [email protected])
   -p PURPOSE, --purpose PURPOSE
                         Purpose for docstring (default: Rock the Casbah)
   -f, --force           Overwrite existing (default: False)
 

If I wanted to create, for instance, a Python implementation of cat now, I might do this:

 
 $ ./new.py cat.py
 Done, see new script "cat.py."
 

And now I have a cat.py program that I can immediately execute that shows the typical kinds of positional and optional parameters a program might have:

 
 $ ./cat.py -h
 usage: cat.py [-h] [-a str] [-i int] [-f FILE] [-o] str
  
 Rock the Casbah
  
 positional arguments:
   str                   A positional argument
  
 optional arguments:
   -h, --help            show this help message and exit
   -a str, --arg str     A named string argument (default: )
   -i int, --int int     A named integer argument (default: 0)
   -f FILE, --file FILE  A readable file (default: None)
   -o, --on              A boolean flag (default: False)
 

You can modify the get_args() function to reflect the needs of your new program. Place new.py in a directory on your $PATH so you can use it system-wide whenever you need to create a new Python program. I hope it speeds up your development of new Python programs!
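To make that concrete, here is a minimal sketch of what a modified get_args() might look like for the cat.py example. Everything in it is an assumption for illustration rather than the output of new.py: the -n/--number flag, the description string, the cat() helper, and the optional argv parameter (added only so the function can be called from a test instead of reading sys.argv).

```python
import argparse


def get_args(argv=None):
    """Get command-line arguments for a hypothetical cat.py"""

    parser = argparse.ArgumentParser(
        description='Concatenate files',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    parser.add_argument('file',
                        metavar='FILE',
                        type=argparse.FileType('rt'),
                        nargs='+',
                        help='Input file(s)')

    parser.add_argument('-n',
                        '--number',
                        help='Number the lines',
                        action='store_true')

    # argv=None makes argparse fall back to sys.argv[1:]; accepting a list
    # here is an illustrative addition to make the function easy to test.
    return parser.parse_args(argv)


def cat(fh, number):
    """Return the text of fh, optionally prefixing each line with its number"""

    if not number:
        return ''.join(fh)

    return ''.join(f'{line_num:6}\t{line}'
                   for line_num, line in enumerate(fh, start=1))
```

A main() function could then loop over args.file and print cat(fh, args.number) with end='' for each handle, in the same way the head program composes its output from head().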

Going further

  • Tiny Python Projects: All of these ideas about argparse and testing are discussed in greater detail in this book available now from Manning Publications.
  • Make tutorial: Dustin and I also briefly discussed using make and a Makefile to document and automate running tests, shortcuts, and other commands.
  • GitHub repo: All the code and tests for Tiny Python Projects
  • YouTube: Videos for the chapters of Tiny Python Projects that demonstrate how to start writing Python programs and work through the requirements and tests and solutions.

That’s all for this article. If you want to see more, you can preview the book’s contents on our browser-based liveBook reader here.

All the code described in the article is available here.