
From Tiny C Projects by Dan Gookin

In this article, we will discuss the NATO phonetic alphabet and how to write a simple program that will translate it to plain text.


Take 35% off Tiny C Projects by entering fccgookin into the discount code box at checkout at manning.com.


Count yourself blessed if you’ve never had to spell your name over the phone. Or perhaps you’re named Mary Smith, but you live on a street or in a city you must constantly spell out loud. If so, you resort to your own spelling alphabet, something like, “N as in Nancy” or “K as in knife.” As a programmer, you can ease this frustration by reading this chapter, where you

  • Understand the NATO phonetic alphabet and why they even bother.
  • Translate words into the spelling alphabet.
  • Read a file to translate words into the phonetic alphabet.
  • Go backward and translate the NATO alphabet into words.
  • Read a file to translate the NATO alphabet.
  • Learn that natto in Japanese is a delicious, fermented soybean paste.

The last bullet point isn’t covered in this chapter. I just enjoy eating natto, and now I can write it off as a business expense.

Anyway.

The glorious conclusion to all this mayhem is to not only learn some new programming tricks but also proudly spell words aloud by saying “November” instead of “Nancy.”

The NATO Alphabet

Beyond being a handy nickname for anyone named Nathaniel, NATO stands for the North Atlantic Treaty Organization. It’s a group of countries that are members of a mutual defense pact. For example, if another country (I dunno, say, Russia, not for any particular reason) attacks a NATO country like Poland (again, no reason), all the other NATO countries are supposed to gang up to defend Poland. This scenario has never happened, thankfully.

NATO was established after World War II. I could wax on, but the point is that NATO requires some commonality between the member states. You know, so that when Hans is short on ammo, Pierre can offer him bullets and they fit into the gun. Stuff like that.

One common item shared between NATO countries is a way to spell things out loud. That way, Hans doesn’t need to say, “Bullets! That’s B as in bratwurst. U as in über. L as in lederhosen.” And so on. Instead, Hans says: “Bravo, Uniform, Lima, Lima, Echo, Tango, Sierra.” This way, Pierre can understand Hans over all the gunfire.

Table 1 lists the NATO phonetic alphabet, describing each letter with its corresponding word. The words are chosen to be unique and not easily misunderstood. Two of the words (Alfa and Juliett) are misspelled on purpose to avoid confusion in written form.

Table 1. The NATO phonetic alphabet.

Letter  NATO      Letter  NATO
A       Alfa      N       November
B       Bravo     O       Oscar
C       Charlie   P       Papa
D       Delta     Q       Quebec
E       Echo      R       Romeo
F       Foxtrot   S       Sierra
G       Golf      T       Tango
H       Hotel     U       Uniform
I       India     V       Victor
J       Juliett   W       Whiskey
K       Kilo      X       Xray
L       Lima      Y       Yankee
M       Mike      Z       Zulu

NATO isn’t the only phonetic alphabet, but it’s perhaps the most common. The point is consistency. As a programmer, you don’t need to memorize any of these words, though as a nerd you probably will. Still, it’s the program that outputs NATO code, or translates it back into words, depending on how you write your C code. Oscar Kilo.

The NATO Translator Program

Any NATO translator program you write must have a string array like the one shown here:

 
 const char *nato[] = {
     "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot",
     "Golf", "Hotel", "India", "Juliett", "Kilo", "Lima",
     "Mike", "November", "Oscar", "Papa", "Quebec", "Romeo",
     "Sierra", "Tango", "Uniform", "Victor", "Whiskey",
     "Xray", "Yankee", "Zulu"
 };
  

The array’s notation, *nato[], implies an array of pointers, which is how the compiler builds this construction in memory. The array’s data type is char*, so the pointers reference character arrays — strings — stored in memory. The classification is constant, const; you do not want to mess with any strings declared this way — which is fine in this code. The *nato[] array is filled with these addresses, the memory locations of the strings, as illustrated in Figure 1.


Figure 1. How an array of pointers references strings as they sit in memory.


For example, in the figure, the string Alfa (terminated with a null character, \0) is stored at address 0x404020. This memory location is stored in the nato[] array, not the string itself. Yes, the string appears in the array’s declaration, but it’s stored elsewhere in memory at runtime. The same structure holds true for all elements in the array: Each one corresponds to a string’s memory location, Alfa to Zulu.

The beauty of the nato[] array is that its contents are sequential, matching up to ASCII values 'A' through 'Z'. This coincidence makes finding the NATO word that corresponds to a character embarrassingly easy: subtract 'A' from an uppercase letter and the result is the array index.
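To make the offset arithmetic concrete, here is a minimal sketch of the lookup on its own. The helper name nato_word() is an assumption made for illustration and isn’t part of the book’s code. It also assumes an ASCII-style character set, where the letters A through Z are contiguous.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the book): map a letter to its NATO word.
   Returns NULL when ch is not a letter in the range A to Z.
   Assumes a character set where 'A' through 'Z' are contiguous (ASCII). */
const char *nato_word(int ch)
{
    static const char *nato[] = {
        "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot",
        "Golf", "Hotel", "India", "Juliett", "Kilo", "Lima",
        "Mike", "November", "Oscar", "Papa", "Quebec", "Romeo",
        "Sierra", "Tango", "Uniform", "Victor", "Whiskey",
        "Xray", "Yankee", "Zulu"
    };

    ch = toupper((unsigned char)ch);    /* fold lowercase to uppercase */
    if (ch < 'A' || ch > 'Z')
        return NULL;                    /* not a letter, no NATO word */
    return nato[ch - 'A'];              /* 'A'-'A' is 0, 'Z'-'A' is 25 */
}
```

Calling nato_word('h') yields the string "Hotel", and any non-letter such as '!' yields NULL, so the caller decides what to do with characters that have no phonetic term.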

Writing the NATO translator

A simple NATO translator is shown in Listing 1. It prompts for input, using the fgets() function to gather a word from standard input. A while loop churns through the word letter by letter. Along the way, any alphabetic characters are detected by the isalpha() function. If found, the letter is used as a reference into the nato[] array. The result is the NATO phonetic alphabet term output.

Listing 1. nato01.c

 
 #include <stdio.h>
 #include <ctype.h>
  
 int main()
 {
     const char *nato[] = {
         "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot",
         "Golf", "Hotel", "India", "Juliett", "Kilo", "Lima",
         "Mike", "November", "Oscar", "Papa", "Quebec", "Romeo",
         "Sierra", "Tango", "Uniform", "Victor", "Whiskey",
         "Xray", "Yankee", "Zulu"
     };
     char phrase[64];
     char ch;
     int i;
  
     printf("Enter a word or phrase: ");
     fgets(phrase,64,stdin);    #A
  
     i = 0;
     while(phrase[i])    #B
     {
         ch = toupper((unsigned char)phrase[i]);    #C
         if(isalpha((unsigned char)ch))    #D
             printf("%s ",nato[ch-'A']);    #E
         i++;
         if( i==64 )    #F
             break;
     }
     putchar('\n');
  
     return(0);
 }
  

#A Read up to 63 characters (plus the null character) from stdin, standard input, and store them in phrase

#B Loop until the null char is found in the string

#C Convert ch to uppercase

#D True when character ch is alphabetic

#E ch-'A' transforms the letters to values 0 through 25, matching the corresponding array element. This expression works because char variables are considered integers.

#F The fgets() function always ends the string with a null character, so this test is only an extra safeguard against running past the end of the buffer

When built and run, the program prompts for input. Whatever text is typed is translated and output in the phonetic alphabet. For example, “Howdy” becomes:

Hotel Oscar Whiskey Delta Yankee

Typing a longer phrase, such as “Hello, World!” yields:

Hotel Echo Lima Lima Oscar Whiskey Oscar Romeo Lima Delta

Because non-alpha characters are ignored in the code, no output for them is generated.

Translation into another phonetic alphabet is easy with this code. All you do is replace the nato[] array with your own phonetic alphabet. For example, here is the array you can use for the law enforcement phonetic alphabet:

 
 const char *fuzz[] = {
     "Adam", "Boy", "Charles", "David", "Edward", "Frank",
     "George", "Henry", "Ida", "John", "King", "Lincoln",
     "Mary", "Nora", "Ocean", "Paul", "Queen", "Robert",
     "Sam", "Tom", "Union", "Victor", "William",
     "X-ray", "Young", "Zebra"
 };
  

Reading and converting a file

I’m unsure of the need to translate all the text from a file into the NATO phonetic alphabet. It’s a C project you can undertake, primarily for practice, but practically speaking, it makes little sense. I mean, it would be tedious to hear three hours of Antony and Cleopatra done entirely in the NATO alphabet, though if you’re a theatre/IT dual major, give it a shot.

Listing 2 presents code that devours a file and translates each character into its NATO phonetic alphabet counterpart. The filename is supplied at the command prompt. If not, the program bails with an appropriate error message. Otherwise, similar to the code in nato01.c, the code churns through the file one character at a time, spewing out the matching NATO words.

Listing 2. nato02.c

 
 #include <stdio.h>
 #include <stdlib.h>
 #include <ctype.h>
  
 int main(int argc, char *argv[])
 {
     const char *nato[] = {
         "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot",
         "Golf", "Hotel", "India", "Juliett", "Kilo", "Lima",
         "Mike", "November", "Oscar", "Papa", "Quebec", "Romeo",
         "Sierra", "Tango", "Uniform", "Victor", "Whiskey",
         "Xray", "Yankee", "Zulu"
     };
     FILE *n;
     int ch;
  
     if( argc<2 )    #A
     {
         fprintf(stderr,"Please supply a text file argument\n");
         exit(1);
     }
  
     n = fopen(argv[1],"r");    #B
     if( n==NULL )
     {
         fprintf(stderr,"Unable to open '%s'\n",argv[1]);
         exit(1);
     }
  
     while( (ch=fgetc(n))!=EOF )     #C
     {
         if(isalpha(ch))     #D
             printf("%s ",nato[toupper(ch)-'A']);     #E
     }
     putchar('\n');
  
     fclose(n);
  
     return(0);
 }
  

#A If fewer than two arguments are present, the filename option is missing

#B Open the filename supplied at the command prompt, referenced as argv[1]

#C Read one character at a time from the file, storing it in variable ch. The EOF marks the end of the file

#D Process only text characters

#E Use the uppercase version of the character, minus the value of 'A', to index the nato[] array

Remember to use integer variables when processing text from a file. The EOF flag that marks the end of a file is an int value, not a char value. The while statement in the code is careful to extract a character from the file as well as evaluate the character to determine when the operation is over.
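The reason for the int can be sketched in a few lines. The helper below, count_letters(), is a hypothetical name used only for illustration. If ch were declared as a char, the result would depend on the platform: where char is signed, a byte value of 0xFF in the file typically compares equal to EOF and ends the loop early; where char is unsigned, no value ever compares equal to EOF and the loop never ends.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: count the alphabetic characters in an open stream. */
long count_letters(FILE *fp)
{
    long total = 0;
    int ch;    /* int, not char: holds every byte value plus EOF */

    while ((ch = fgetc(fp)) != EOF)    /* read and test in one step */
    {
        if (isalpha(ch))
            total++;
    }
    return total;
}
```

Because fgetc() returns each byte as an unsigned char converted to int, the value is also safe to pass directly to isalpha() and toupper().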

To run the program, type a filename argument after the program name. Text files are preferred. The output appears as a single line of text reflecting the phonetic alphabet words for every dang doodle character in the file.

For extra fun on the Macintosh, pipe the program’s output through the say command:

 
 nato02 anthony_and_cleopatra.txt | say
  

This way, the phonetic alphabet contents of the file given are read aloud by the Mac, start to end. Sit back and enjoy.

From NATO to English

Phonetic alphabet translation is supposed to happen in your head. Someone spells their hometown: India, Sierra, Sierra, Alfa, Quebec, Uniform, Alfa, Hotel. And the listener knows how to write down the word, spelling it properly. The word is Issaquah, which is a city where I once lived. I had to spell the name frequently. The beauty of this operation is that even a person who doesn’t know the NATO alphabet can understand what’s being spelled, thanks to the initial letter.

More difficult, however, is to write code that scans for phonetic alphabet words and translates them into the proper single characters. This process involves parsing input and examining it word-by-word to see whether one of the words matches a term found in the lexicon.

Converting NATO input to character output

To determine whether a phonetic alphabet term appears in a chunk of text, you must parse the text. The string is separated into word-chunks. Only after you pull out the words can you compare them with the phonetic alphabet terms.

To do the heavy lifting, use the strtok() function to parse words in a stream of text. I assume the function name translates as “string tokenizer” and not “string to kilograms,” which makes no sense.

The strtok() function parses a string into chunks based on one or more separator characters. Defined in the string.h header file, the man page format is:

 
 char *strtok(char *str, const char *delim);
  

The first argument, str, is the string to scan. The second argument, delim, is a string containing the individual characters that can separate, or delimit, the character-chunks you want to parse. The value returned is a char pointer referencing the character chunk found. For example:

 
 match = strtok(string," ");
  

This statement scans characters held in buffer string, stopping when the space character is encountered. Yes, the second argument is a full string, even when only a single character is required. The char pointer match holds the address of the word (or text chunk) found, terminated with a null character where the space or another delimiter would otherwise be. The NULL constant is returned when nothing is found.

To continue scanning the same string, the first argument is replaced with the NULL constant:

 
 match = strtok(NULL," ");
  

The code shown in Listing 3 illustrates how to put the strtok() function to work.

Listing 3. word_parse01.c

 
 #include <stdio.h>
 #include <string.h>
  
 int main()
 {
     char sometext[64];
     char *match;
  
     printf("Type some text: ");
     fgets(sometext,64,stdin);
  
     match = strtok(sometext," ");    #A
     while(match)    #B
     {
         printf("%s\n",match);
         match = strtok(NULL," ");    #C
     }
  
     return(0);
 }
  

#A The initial call to strtok(), with the string to search.

#B Loop as long as the return value isn’t NULL.

#C In the second call to strtok(), NULL is used to keep searching the same string.

In this code, the user is prompted for a string. The strtok() function extracts words from the string, using a single space as the separator. Here’s a sample run:

 
 Type some text: This is some text
 This
 is
 some
 text
  

When separators other than the space appear in the string, they’re included in the character-chunk match:

 
 Type some text: Hello, World!
 Hello,
 World!
  

To avoid capturing the punctuation characters, you can set this delimiter string:

 
 match = strtok(sometext," ,.!?:;\"'");
  

Here, the second argument lists common punctuation characters. The result is that the delimited words are truncated, as in:

 
 Type some text: Hello, World!
 Hello
 World
  

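You may find some trailing blank lines in the program’s output. They appear because fgets() keeps the newline typed at the end of the input, and that newline isn’t one of the delimiters, so it ends up in the final chunk. These extra newline characters are fine for matching text, as the blank lines won’t match anything anyhow.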
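If the stray newline bothers you, one option is to treat whitespace characters as delimiters too. The following is a minimal sketch under that assumption; the split_words() helper and its parameters are hypothetical names, not code from the book. It collects pointers to the words rather than printing them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split text in place with strtok() and store up to
   max word pointers in words[]. Returns the number of words stored.
   Note that strtok() overwrites the delimiters in text with '\0'. */
size_t split_words(char *text, char *words[], size_t max)
{
    /* Common punctuation plus space, tab, carriage return, and newline */
    const char *delims = " \t\r\n,.!?:;\"'";
    size_t count = 0;
    char *match;

    match = strtok(text, delims);        /* first call: pass the string */
    while (match != NULL && count < max)
    {
        words[count++] = match;
        match = strtok(NULL, delims);    /* later calls: pass NULL */
    }
    return count;
}
```

With the input "Hello, World!" followed by a newline, this sketch stores two words, Hello and World, and no empty chunk for the newline. The text must be a writable buffer, such as the one filled by fgets(), never a string literal.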

To create a phonetic alphabet input translator, you modify this code to perform string compares with an array of NATO phonetic alphabet terms. The strcmp() function handles this task, but two things must be considered.

First, strcmp() is case-sensitive. Some C libraries feature a strcasecmp() function that performs case-insensitive comparisons, though this function isn’t part of the C standard. (For example, the MinGW compiler lacks this function.)
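Where strcasecmp() is unavailable, a portable substitute takes only a few lines of standard C. This is a rough sketch, and the name equal_ignore_case() is an assumption chosen for illustration. Unlike strcasecmp(), it reports only whether the strings are equal, not how they are ordered.

```c
#include <ctype.h>

/* Hypothetical stand-in for strcasecmp(): returns 1 when a and b are equal
   ignoring letter case, 0 otherwise. Uses only standard C library calls. */
int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        /* Cast to unsigned char so tolower() never sees a negative value */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only when both strings end together */
}
```

This compares whole strings, so "TANGO" matches "Tango", but "Zulu!" still does not match "Zulu".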

Second, the string length may vary. For example, if you choose not to list the punctuation characters (" ,.!?:;\"'") as delimiters in the strtok() function, or when an unanticipated punctuation character appears, the parsed word carries extra characters and the comparison fails.

Given these two situations, I figure it’s best to concoct a unique string comparison function, one designed specifically to check parsed words for a match with a phonetic alphabet term. This function, isterm(), is shown in Listing 4.

Listing 4. The isterm() function

 
 char isterm(char *term)
 {
     const char *nato[] = {
         "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot",
         "Golf", "Hotel", "India", "Juliett", "Kilo", "Lima",
         "Mike", "November", "Oscar", "Papa", "Quebec", "Romeo",
         "Sierra", "Tango", "Uniform", "Victor", "Whiskey",
         "Xray", "Yankee", "Zulu"
     };
     int x;
     const char *n ;
     char *t;
  
     for( x=0; x<26; x++)
     {
         n = nato[x];    #A
         t = term;    #B
         while( *n!='\0' )    #C
         {
             if( (*n|0x20)!=(*t|0x20) )    #D
                 break;    #E
             n++;    #F
             t++;    #F
         }
         if( *n=='\0' )    #G
             return( *nato[x] );    #H
     }
     return('\0');
 }
  

#A Set pointer n to the current NATO word

#B Pointer t references the term passed

#C Loop until the NATO term ends

#D Logically convert each letter to lowercase and compare. Refer to Chapter 5 for more info on this and other ASCII tricks.

#E For no match, the loop breaks and the next term in nato[] is compared

#F Increment through each letter

#G When pointer n references the null character, the terms have matched

#H Return the first letter of the NATO term

The isterm() function accepts a word as its argument. The return value is a single character if the word matches a NATO phonetic alphabet term; otherwise, the null character is returned.
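The |0x20 comparison inside isterm() relies on a property of ASCII: an uppercase letter and its lowercase partner differ only in bit 5. For example, 'A' is 0x41 and 'a' is 0x61. Setting that bit in both characters makes the case difference vanish. The sketch below isolates the trick; the same_letter() name is hypothetical, and the code assumes ASCII.

```c
/* Hypothetical helper: compare two characters the way isterm() does,
   by forcing bit 5 (0x20) on in both before comparing. Assumes ASCII. */
int same_letter(char a, char b)
{
    return (a | 0x20) == (b | 0x20);
}
```

The trick isn’t a general-purpose case conversion, because it also alters characters that aren’t letters: '@' (0x40) and '`' (0x60) compare as equal. That is harmless in isterm(), where one side of every comparison is a letter taken from the nato[] array.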

To create a new NATO translation program, add the isterm() function to your source code file. You must include both the stdio.h and string.h header files. Then add the following main() function to build a new program, nato03.c:

Listing 5. nato03.c, main() function

 
 int main()
 {
     char phrase[64];
     char *match;
     char ch;
  
     printf("NATO word or phrase: ");
     fgets(phrase,64,stdin);
  
     match = strtok(phrase," ");
     while(match)
     {
         if( (ch=isterm(match))!='\0' )
             putchar(ch);
         match = strtok(NULL," ");
     }
     putchar('\n');
  
     return(0);
 }
  

The code scans the line input for any matching phonetic alphabet terms. The isterm() function handles the job. The matching character is returned and output. Here’s a sample run:

 
 NATO word or phrase: india tango whiskey oscar romeo kilo sierra
 ITWORKS
  

An input sentence with no matching characters outputs a blank line. Mixed characters are output like this:

 
 NATO word or phrase: Also starring Zulu as Kono
 Z
  

If you want to add code to translate special characters, such as punctuation characters, you can do so on your own. Keep in mind that the NATO phonetic alphabet lacks terms for punctuation. Still, if you’re creating your own text-translation program, checking for special characters might be required.

That’s all for this article. If you want to learn more about the book, check it out on Manning’s liveBook platform here.