Using Termcap – part 3

See also: Part 1 and Part 2.

The termcap file is a text file found in /etc/termcap on a Unix system. As termcap was ported to other operating systems, the default location changed accordingly. The contents of the termcap file are basically a series of entries, one for each terminal supported. Each entry contains various elements separated by colons. To make the file more readable, a backslash can be used at the end of a line to break it up.

See the Wikipedia page for a summary, or the GNU Software page for a more descriptive summary… Or see this for a vague summary:

For example:

2characterterminalname|longerterminalname|verboseterminalname:\
  :capability=charactersequence:\
  :capability=charactersequence:

The 2 character name is legacy and is no longer used, but remains for ancient backwards compatibility. For a DEC VT-52 terminal, it might look like this:

dw|vt52|DEC vt52:\
  :cr=^M:do=^J:nl=^J:bl=^G:\
  ...etc...

Each capability has a two character abbreviation. Above, we see that to generate a carriage return (cr) we send a control-M (enter key). A new line (nl) is ^J. The bell (bl) is a ^G (beep!). There are many other simple codes.
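To make the entry layout concrete, here is a minimal sketch of pulling one capability out of an entry. It assumes the entry has already been joined onto a single line (backslash continuations removed), and find_raw_capability() is a made-up helper name, not part of the termcap library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a termcap library call): find a two-letter
 * string capability such as "cr" in one termcap entry that is already
 * on a single line. Copies the raw, still undecoded value into out and
 * returns 1, or returns 0 if the capability is absent or does not fit. */
int find_raw_capability(const char *entry, const char *name,
                        char *out, size_t outsize)
{
    const char *p = strchr(entry, ':');   /* skip the names field */

    while (p != NULL) {
        p++;                              /* step past the ':' */
        if (p[0] != '\0' && p[0] == name[0] &&
            p[1] == name[1] && p[2] == '=') {
            const char *value = p + 3;
            size_t len = strcspn(value, ":");   /* value ends at next ':' */

            if (len >= outsize)
                return 0;
            memcpy(out, value, len);
            out[len] = '\0';
            return 1;
        }
        p = strchr(p, ':');               /* move on to the next field */
    }
    return 0;
}
```

Called with the VT52 line above and "bl", this copies the two characters ^ and G into out. Turning that into a real control-G is a separate step.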

To move the cursor, a DEC VT100 terminal used the sequence: ESCAPE followed by [ (left bracket), then the line, a semicolon, the column, and H.

ESCAPE [ 10 ; 4 H

That would mean move the cursor to line 10, column 4. To represent sequences like this with variables inside of them (line, column, etc.), there are more complex termcap entries:

:cm=\E[%i%d;%dH:

Above, \E represents ESCAPE (just like ^ represents CONTROL). %i is a special flag that means “increment the two values supplied” (base 1 numbering), and the two %d codes are the variables, similar to a C printf() call.

The %i is there because termcap assumes base 0, so an 80 column screen would be 0-79. The VT terminals (and PC ANSI, I think) assume base 1, 1-80. To make things universal, all termcap applications work with a base 0 screen (0-79), and the entry knows whether to output 0-79 or 1-80. Fun.
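The substitution can be sketched in a few lines of C. This is only a simplified stand-in for what the library does with a cm string: expand_cursor_motion() is a hypothetical name, and it understands only %i, %d and %%, while real termcap entries can use more codes than that.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified, hypothetical stand-in for cm expansion. row and col are
 * 0-based, as termcap applications expect. Handles only %i, %d and %%.
 * Returns 0 on success, -1 on an unsupported code or a full buffer. */
int expand_cursor_motion(const char *cm, int row, int col,
                         char *out, size_t outsize)
{
    int args[2];
    int next = 0;        /* which argument the next %d consumes */
    size_t used = 0;

    args[0] = row;       /* the line is output first */
    args[1] = col;       /* then the column */

    while (*cm != '\0') {
        if (*cm != '%') {                 /* ordinary byte: copy it */
            if (used + 1 >= outsize)
                return -1;
            out[used++] = *cm++;
            continue;
        }
        cm++;                             /* look at the code after '%' */
        if (*cm == 'i') {                 /* switch to base 1 numbering */
            args[0]++;
            args[1]++;
        } else if (*cm == 'd') {          /* print next value as decimal */
            int n;

            if (next >= 2)
                return -1;
            n = snprintf(out + used, outsize - used, "%d", args[next++]);
            if (n < 0 || (size_t)n >= outsize - used)
                return -1;
            used += (size_t)n;
        } else if (*cm == '%') {          /* literal percent sign */
            if (used + 1 >= outsize)
                return -1;
            out[used++] = '%';
        } else {
            return -1;                    /* code this sketch does not know */
        }
        cm++;
    }
    if (used >= outsize)
        return -1;
    out[used] = '\0';
    return 0;
}
```

With the cm string ESC [ %i%d;%dH and a 0-based row 9, column 3, the result is ESC [ 10 ; 4 H, which matches the line 10, column 4 example.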

Termcap has pages of codes for all kinds of features, like cursor up, delete line, clear screen, clear to end of line, etc. If a terminal does not support a feature, the entry is not present. Applications that use termcap will query these capabilities then use what they can. In my situation, I needed “cm” for cursor movement — and if that feature was not there, I couldn’t work (or, better, I could default to a mode of just lines of text).

There are more advanced features where a termcap entry can reference another entry. For instance, some terminals were made as a series, and as new models came out they added new features while keeping the earlier ones. The first version of the terminal would have an entry, then the “v2” terminal would have an entry that described only the new features. By adding a capability of “tc=terminal-v1” or whatever, it would get any other capabilities from the “terminal-v1” entry.

This cuts down on redundant information, but it also means you can’t just look at one termcap entry and necessarily know everything the terminal does. If you were writing your own code to parse a termcap file, you would have to take this into consideration.
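If you did write your own parser, following the tc reference might look something like this sketch. Everything here is made up for illustration: the tc_entry table stands in for an already-parsed termcap file, the terminal-v1 and terminal-v2 names come from the example above, and the depth limit of 8 is an arbitrary guard against entries that point at each other.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a parsed termcap file. */
struct tc_entry {
    const char *name;   /* terminal name */
    const char *caps;   /* ":xx=value:yy=value:" */
};

/* Return a pointer to the raw value of cap inside caps and store its
 * length in *len, or return NULL if the capability is not listed. */
static const char *field_value(const char *caps, const char *cap, size_t *len)
{
    const char *p = caps;
    size_t caplen = strlen(cap);

    while ((p = strchr(p, ':')) != NULL) {
        p++;                                    /* first byte of a field */
        if (strncmp(p, cap, caplen) == 0 && p[caplen] == '=') {
            *len = strcspn(p + caplen + 1, ":");
            return p + caplen + 1;
        }
    }
    return NULL;
}

static const struct tc_entry *find_entry(const struct tc_entry *table,
                                         size_t count,
                                         const char *name, size_t namelen)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strlen(table[i].name) == namelen &&
            strncmp(table[i].name, name, namelen) == 0)
            return &table[i];
    }
    return NULL;
}

/* Look up cap for term, falling back through tc= references. */
const char *lookup_with_tc(const struct tc_entry *table, size_t count,
                           const char *term, const char *cap, size_t *len)
{
    const struct tc_entry *e = find_entry(table, count, term, strlen(term));
    int depth;

    for (depth = 0; e != NULL && depth < 8; depth++) {
        size_t tclen;
        const char *tc;
        const char *value = field_value(e->caps, cap, len);

        if (value != NULL)
            return value;                       /* found in this entry */
        tc = field_value(e->caps, "tc", &tclen);
        if (tc == NULL)
            return NULL;                        /* nothing left to follow */
        e = find_entry(table, count, tc, tclen);
    }
    return NULL;
}
```

Asking for “bl” on terminal-v2 would fall through to terminal-v1 if only the older entry defines it.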

In a C program that links to the termcap library, loading the terminal type you want requires a buffer for the entry to be loaded into (2K is the defined max size):


char term_buffer[2048];

…and then you just use the termcap tgetent() function:


status = tgetent(term_buffer, "ANSI");

If the termcap file is found, and there is an entry called “ANSI”, it will be copied into the term_buffer. By checking for errors (always a good idea), you will know if the entry was not found.
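As for what that status looks like: as I understand the GNU termcap and ncurses documentation, tgetent() returns a negative value when the database cannot be accessed, 0 when the database is fine but the terminal type is not in it, and a positive value on success. Here is a small sketch that turns the status into a message. describe_tgetent_status() is a hypothetical helper, and it does not call the library itself.

```c
/* Hypothetical helper: map a tgetent() return value to a message.
 * Assumes the documented convention: negative means the database could
 * not be accessed, 0 means no such entry, positive means success. */
const char *describe_tgetent_status(int status)
{
    if (status < 0)
        return "could not access the termcap database";
    if (status == 0)
        return "terminal type is not defined in the database";
    return "entry loaded";
}
```

Treating anything above zero as success, rather than testing for exactly 1, keeps the check safe across implementations.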

But hard coding is bad. What if this code ever runs on a non-ANSI terminal? Termcap programs typically read the TERM environment variable, then load whatever entry it names. In Windows you might “set TERM=ansi” and on Linux you might “export TERM=vt100”. Then the C program would query that environment variable first:


char *termtype;
termtype = getenv("TERM");
if (termtype == NULL) { /* handle error if env var not set */ }

termtype will come back pointing to whatever the TERM environment variable is set to (“ansi” in the Windows example above, or “vt100” in the Linux example above). Then the tgetent() call is made using that value:
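One way to handle the “not set” case is to fall back to a default name instead of giving up. This is just a sketch: terminal_type_or() is a made-up helper, and which fallback makes sense (for example “dumb”) is an assumption that depends on what your termcap file actually contains.

```c
#include <stdlib.h>

/* Hypothetical helper: return the TERM setting, or the caller's
 * fallback name when TERM is unset or empty. */
const char *terminal_type_or(const char *fallback)
{
    const char *termtype = getenv("TERM");

    if (termtype == NULL || termtype[0] == '\0')
        return fallback;
    return termtype;
}
```

The result can be passed straight to tgetent() in place of the hard coded name.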


status = tgetent(term_buffer, termtype);

If both of those are successful, the individual capabilities can be loaded using the tgetstr() function. tgetstr() will parse capabilities in the loaded termcap entry and write them to a buffer that is processed to be the actual output (less any variables that get substituted when the actual sequence is used later). For instance, the termcap entry might say:

:bl=^G:

…but when you use tgetstr() to parse for the “bl” entry, it will write out the control-G (ASCII 7) character in the output buffer. Basically, it converts all the \E (escape) and ^X (control) notations to the ASCII characters they really represent. This saves work later when they are output to the screen.
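The conversion itself is simple enough to sketch. This is not the library's code: decode_capability() is a hypothetical name, and it handles only the two notations mentioned here, while real termcap entries use other escapes as well.

```c
#include <stddef.h>

/* Hypothetical sketch of the decoding step: turn \E into ESC (27) and
 * ^X into the matching control character, and copy everything else.
 * Returns the number of bytes written, not counting the terminating
 * NUL, or -1 if the result does not fit in out. */
int decode_capability(const char *raw, char *out, size_t outsize)
{
    size_t used = 0;

    while (*raw != '\0') {
        char c;

        if (raw[0] == '\\' && raw[1] == 'E') {
            c = 0x1B;                      /* ESCAPE */
            raw += 2;
        } else if (raw[0] == '^' && raw[1] != '\0') {
            c = (char)(raw[1] & 0x1F);     /* ^G -> 7, ^M -> 13, ... */
            raw += 2;
        } else {
            c = *raw++;                    /* ordinary character */
        }
        if (used + 1 >= outsize)
            return -1;
        out[used++] = c;
    }
    if (used >= outsize)
        return -1;
    out[used] = '\0';
    return (int)used;
}
```

So the two characters ^G become a single byte with the value 7, and \E[H becomes three bytes starting with 27.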

A second buffer (that must remain around) needs to be allocated to store the resulting output. Most examples also do a 2K buffer:


char capbuff[2048]; // output sequences are stored here

Then, as each capability is obtained, you pass in a pointer to where the output should be written. When the call returns, that pointer has been advanced to the next free spot in the buffer, where the next capability will go. As tgetstr() is called over and over, the pointer keeps moving forward, filling up the output buffer with entries, and each call returns the location of the capability you asked for within that buffer.


char *tempPtr = capbuff; // start out pointing to our output buffer

If you want to know the code that clears the screen, it would be:


char *cl; // clear screen sequence

cl = tgetstr("cl", &tempPtr);

If cl comes back non-NULL, you now have a pointer to the byte sequence that will clear the screen. tempPtr returns with a higher value, so when you get the next capability you use it again:


ce = tgetstr("ce", &tempPtr);
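To show what is happening to tempPtr, here is a sketch of the same “advancing pointer” idea without the library. store_sequence() is a hypothetical function, and unlike tgetstr() it takes an end pointer so it can refuse to overflow the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration of the advancing pointer: copy seq
 * (including its terminating NUL) to *area, move *area past it, and
 * return where the copy starts. Returns NULL if it would run past end. */
char *store_sequence(const char *seq, char **area, char *end)
{
    char *start = *area;
    size_t len = strlen(seq) + 1;          /* include the NUL */

    if ((size_t)(end - start) < len)
        return NULL;
    memcpy(start, seq, len);
    *area = start + len;                   /* next sequence goes here */
    return start;
}
```

After two calls you hold two separate pointers into the same buffer, and the area pointer sits just past the second one. That is why the buffer has to stay around for as long as the sequences are in use.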

This is repeated over and over for every code you wish to send. You check for NULL to know which capabilities actually exist, so you could write functions like this:


void clearscreen()
{
    if (cl==NULL)
    {
        printf("Sorry, I cannot clear the screen...");
    } else {
        tputs(cl, 1, outchar);
    }
}

And now we see how these pointers get used. The tputs() function is a special output routine that handles padding (time delays for slower terminals) and other features (though it ends up writing the character out using a function you specify — such as outchar() in this example).
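The output function is about as simple as it gets: it takes one character as an int and writes it. Here is a possible outchar(), plus a stripped-down, hypothetical send_sequence() that shows the callback pattern. It is not a replacement for tputs(), because it ignores padding completely.

```c
#include <stdio.h>

/* Output callback of the kind tputs() expects: write one character. */
int outchar(int c)
{
    return putchar(c);
}

/* Hypothetical, simplified stand-in for tputs(): pass each byte of the
 * sequence to the callback. No padding or delays are handled.
 * Returns the number of bytes sent, or -1 if the callback fails. */
int send_sequence(const char *seq, int (*put)(int))
{
    int count = 0;

    while (*seq != '\0') {
        if (put((unsigned char)*seq++) == EOF)
            return -1;
        count++;
    }
    return count;
}
```

Keeping the actual write behind a callback means the same code can send to stdout, a serial port, or a buffer, depending on what function you hand it.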

For the cursor movement (cm) capability, it uses a special tgoto() function that knows how to substitute the X and Y values:


void setcursor(int x, int y)
{
    tputs(tgoto(cm, x, y), 1, outchar);
}

tgoto() processes the cm output string and returns one that has everything set up with the x and y coordinate in it.

By now, you may see where I am going with this… Read the termcap entry, parse the capabilities you care about, then create simple functions that output the screen code sequences:

void clearscreen();
void setcursor(int x, int y);
void underlineon();
void underlineoff();

…etc…

In the next installment, I will share with you my very basic and simple 1995 code that let me convert OS-9 L2 (and MM/1 K-Windows) text programs to run under Termcap on any supported type of terminal.

And then I will explain why I decided NOT to use termcap for my current project.

To be continued…
