The Linux Modem HOWTO: Serial Port & Modem Basics

2. Serial Port & Modem Basics

You don't have to understand the basics to use and install a modem. But understanding them may help you determine what is wrong if you run into problems. After reading this section, if you want to understand it even better you may want to see How Modems Work in this document (not yet complete). A future version of Serial-HOWTO (expected by Jan. 1999) should cover more on the serial port itself.

2.1 Modem Converts Digital to Analog (and conversely)

Almost all telephone main lines are digital already, but the lines leading to your house (or business) are usually analog, which means that they were designed to transmit a voltage wave which is an exact replica of the sound wave coming out of your mouth. Such a voltage wave is called "analog". If viewed on an oscilloscope it looks like a sine wave of varying frequency and amplitude. A digital signal is like a square wave. For example 4 v (volts) might be a 1-bit and 0 v could be a 0-bit. For most serial ports (used by external modems) +12 v is a 0-bit and -12 v is a 1-bit (some are + or - 5 v).

To send data from your computer over the phone line, the modem takes the digital signal from your computer and converts it to "analog". It does this by creating an analog sine wave and then "MODulating" it. Since the result still represents digital data, it could also be called a digital signal instead of analog. But it looks something like an analog signal and almost everyone calls it analog. At the other end of the phone line another modem "DEModulates" this signal and the pure digital signal is recovered. Put together the "Mod" and "Dem" parts of the two words above and you get "modem" (if you drop one of the two D's). A "modem" is thus a MODulator-DEModulator. Just what modulation is may be found in the section Modulation Details.

2.2 What is a Serial Port?

Introduction

Since modems have a serial port between them and the computer, it's necessary to understand the serial port as well as the modem. The serial port is an I/O (Input/Output) device. Most PC's have two serial ports and each has a 9-pin connector on the back of the computer. Computer programs can send data (bytes) to the transmit pin and receive bytes from the receive pin. The other pins are for control purposes and ground.

The serial port is much more than just a connector. It converts the data from parallel to serial and changes the electrical representation of the data. Inside the computer, data bits flow in parallel (using many wires at the same time). Serial flow is a stream of bits over a single wire (such as on the transmit or receive pin of the serial connector). For the serial port to create such a flow, it must convert data from parallel (inside the computer) to serial (and conversely).
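
The parallel-to-serial conversion can be pictured with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it shows just the 8 data bits of a byte (the extra start and stop bits are left out). It relies on the fact that a serial port sends the least significant bit first.

```python
def byte_to_serial_bits(value):
    """Return the 8 data bits of one byte in the order they go out on the wire."""
    if not 0 <= value <= 255:
        raise ValueError("a byte must be in the range 0..255")
    # The least significant bit is sent first.
    return [(value >> i) & 1 for i in range(8)]


def serial_bits_to_byte(bits):
    """Reassemble a byte from 8 bits that arrived least significant bit first."""
    value = 0
    for i, bit in enumerate(bits):
        value |= bit << i
    return value


# The letter "A" is 0x41, or 01000001 in binary.
bits = byte_to_serial_bits(ord("A"))
print(bits)                             # [1, 0, 0, 0, 0, 0, 1, 0]
print(chr(serial_bits_to_byte(bits)))   # A
```

The receiving serial port does the second step: it collects the bits one at a time from the single wire and hands a whole byte back to the computer in parallel.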

Pins and Wires

Old PC's used 25 pin connectors but only about 9 pins were actually used so today most connectors are only 9-pin. Each of the 9 pins connects to a wire. Besides the two wires used for transmitting and receiving data, another pin (wire) is signal ground. The voltage on any wire is measured with respect to this ground. There are still more wires which are for control purposes (signalling) only and not for sending bytes. All of these signals could have been sent on a single wire, but instead, there is a separate dedicated wire for every type of signal. Some (or all) of these control wires are called "modem control lines". Modem control wires are either in the asserted state (on) of +12 volts or in the negated state (off) of -12 volts. There is a wire to signal the computer to stop sending bytes to the modem. Conversely, another wire signals the modem to stop sending bytes to the computer. Other wires may tell the modem to hang up the telephone line or tell the computer that a connection has been made or that the telephone line is ringing (someone is attempting to call in).

Internal Modems

For an internal modem there is no 9-pin connector but the behavior is exactly as if the above mentioned cable wires existed. Instead of a 12 volt signal in a wire giving the state of a modem control line, the internal modem may just use a status bit in its memory to determine the state of this non-existent "wire". The internal modem's serial port looks just like a real serial port to the computer. It even includes the speed limits that one may set at ordinary serial ports such as 115200 bits/sec. Unfortunately, many internal modems today use MS Windows software to do their job and will not work under Linux. See Avoid: Winmodems.

2.3 Address & IRQ

Since the computer needs to communicate with each serial port, the operating system must know that the serial port exists, where it is (its I/O address) and what wire (IRQ number) the serial port is to use to request service from the computer's CPU. Thus every serial port device must store in its non-volatile memory both its I/O address and its Interrupt ReQuest number: IRQ. The IRQ determines what wire is used to request service using interrupt signals. See Interrupts.

The serial ports are labeled ttyS0, ttyS1, etc. (corresponding to COM1, COM2, etc. in DOS). Which one of these names refers to a certain physical serial port is determined by the I/O address stored inside the hardware chip of the physical port. This mapping of names (such as ttyS1) to I/O addresses and IRQ's may be set by the "setserial" command. See What is Setserial. This does not set the I/O address and IRQ on the hardware itself (which is set by jumpers or by plug-and-play).

2.4 Interrupts

Bytes come in over the phone line to the modem, are converted from analog to digital by the modem and passed along to the serial port on their way to their destination inside your computer. When the serial port gets a byte (or sometimes after it gets say 8 bytes) it signals the CPU to fetch it (them) by sending an electrical signal known as an interrupt on a dedicated conductor.

Each interrupt conductor (inside the computer) has a number (IRQ) and the serial port must know which conductor to use to signal on. For example, ttyS0 normally uses IRQ number 4 known as IRQ4 (or IRQ 4). A list of them and more will be found in "man setserial" (search for "Configuring Serial Ports"). Interrupts are issued whenever the serial port needs to get the CPU's attention. It's important to do this in a timely manner since the buffer inside the serial port can hold only 16 (1 in old modems) incoming bytes. If the CPU fails to remove such received bytes promptly, then there will not be any space left for any more incoming bytes and overflow will occur (bytes will be lost). There is no flow control to prevent this.

Interrupts are also issued when the serial port has just sent out all of the bytes in its small transmit buffer to the modem. It then has space for more outgoing bytes. The interrupt notifies the CPU of that fact. Also, when a modem control line changes state an interrupt is issued.

Interrupts convey a lot of information but only indirectly. The interrupt itself just tells a chip called the interrupt controller that a certain serial port needs attention. The interrupt controller then signals the CPU. The CPU runs a special program to service the serial port called an interrupt service routine (part of the serial driver software). It tries to find out what has happened at the serial port and then deals with the problem such as transferring bytes from (or to) the serial port's hardware buffer. This program can easily find out what has happened since the serial port has registers at I/O addresses known to the serial driver software. These registers contain status information about the serial port. The software reads these registers and by inspecting the contents, finds out what has happened and takes appropriate action.
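
The logic of an interrupt service routine can be sketched as a toy model in Python. Everything here is hypothetical: the FakePort class, the three status flags and their bit values are invented for illustration and do not match the register layout of any real serial chip or the code of the real Linux serial driver. The sketch only shows the idea of reading a status register and acting on what it says.

```python
from collections import deque

# Hypothetical status flags (not a real chip's register layout).
RX_READY = 0x01      # received bytes are waiting in the hardware buffer
TX_EMPTY = 0x02      # the hardware transmit buffer has room
MODEM_CHANGE = 0x04  # a modem control line changed state


class FakePort:
    """A pretend serial port with a status register and two small buffers."""

    def __init__(self):
        self.status = 0
        self.rx_fifo = deque()   # bytes received from the modem
        self.tx_fifo = deque()   # bytes waiting to go out to the modem


def service_interrupt(port, rx_buffer, tx_buffer, fifo_size=16):
    """Inspect the status register and take the appropriate action."""
    events = []
    status = port.status
    if status & RX_READY:
        # Empty the small hardware buffer promptly, before it overflows.
        while port.rx_fifo:
            rx_buffer.append(port.rx_fifo.popleft())
        events.append("received")
    if status & TX_EMPTY:
        # Refill the small hardware buffer from the large one in main memory.
        while tx_buffer and len(port.tx_fifo) < fifo_size:
            port.tx_fifo.append(tx_buffer.popleft())
        events.append("transmit refilled")
    if status & MODEM_CHANGE:
        events.append("modem line changed")
    port.status = 0
    return events


port = FakePort()
port.status = RX_READY | TX_EMPTY
port.rx_fifo.extend(b"hi")
rx, tx = deque(), deque(b"x" * 20)
print(service_interrupt(port, rx, tx))   # ['received', 'transmit refilled']
```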

2.5 Data Compression (by the Modem)

Before continuing with the basics of the serial port, one needs to understand about something done by the modem: data compression. In some cases this task is actually done by software run on the computer's CPU but unfortunately at present, such software only works for MS Windows. The discussion here will be for the case where the modem itself does the compression since this is what must happen under Linux.

In order to send data faster over the phone lines, one may compress it (encode it) using a custom encoding scheme which is different for each chunk (often an entire file) of data. The encoded data is smaller than the original (fewer bytes) and can be sent over the Internet in less time. This process is called "data compression".

If you download files from the Internet, they are likely already compressed and it is not feasible for the modem to try to compress them further. Your modem may sense that what is passing thru has already been compressed and refrain from trying to compress it any more. If you are receiving data which has been compressed by the other modem, your modem will decompress it and create many more bytes than were sent over the phone line. Thus the flow of data from your modem into your computer will be higher than the flow over the phone line to you. The ratio of these two flows is called the compression ratio. Compression ratios as high as 4 are possible, but not very likely.
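
You can see for yourself why compressing already-compressed data is pointless. The sketch below uses Python's zlib module as a stand-in; it is not the compression method a modem uses, and the word list and sample size are arbitrary choices made for illustration. The point is only the comparison between the two ratios.

```python
import random
import zlib


def compression_ratio(data):
    """Original size divided by compressed size (higher means better)."""
    return len(data) / len(zlib.compress(data))


# Build some plain text out of a handful of words (arbitrary sample data).
WORDS = [b"modem", b"serial", b"telephone", b"computer",
         b"interrupt", b"buffer", b"compression", b"connection"]
rng = random.Random(0)
text = b" ".join(rng.choice(WORDS) for _ in range(2000))

already_compressed = zlib.compress(text)

print("plain text:         %.2f" % compression_ratio(text))
print("already compressed: %.2f" % compression_ratio(already_compressed))
```

The first ratio comes out well above 1, while the second stays close to 1: the second pass gains nothing and may even add a few bytes.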

2.6 Error Correction

Similar to data compression, modems may be set to do error correction. While there is some overhead cost involved which slows down the byte/sec flow rate, the fact that error correction strips off start and stop bits actually increases the data byte/sec flow rate.

For the serial port's interface with the external world, each 8-bit byte has 2 extra bits added to it: a start-bit and a stop-bit. Without error correction, these extra start and stop bits usually go right thru the modem and out over the phone lines. But when error correction is enabled, these extra bits are stripped off and the 8-bit bytes are put into packets. This is more efficient and results in higher byte/sec flow in spite of the fact that there are a few more bytes added for packet headers and error correction purposes.
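
The arithmetic behind this can be worked out in a few lines. This is a rough sketch: the 33600 bits/sec line speed is just an example, and the 5% packet overhead is an assumed figure picked for illustration, not a measured value for any modem or protocol.

```python
def bytes_per_second(line_bps, bits_per_byte, overhead_fraction=0.0):
    """Data bytes per second on a line, after subtracting any packet overhead."""
    return line_bps / bits_per_byte * (1.0 - overhead_fraction)


LINE_BPS = 33600   # example phone-line speed in bits/sec

# No error correction: start bit + 8 data bits + stop bit = 10 bits per byte.
without_ec = bytes_per_second(LINE_BPS, 10)

# Error correction: 8 bits per byte, minus an ASSUMED 5% for packet headers
# and error-check bytes.
with_ec = bytes_per_second(LINE_BPS, 8, overhead_fraction=0.05)

print("without error correction: about %d bytes/sec" % round(without_ec))
print("with error correction:    about %d bytes/sec" % round(with_ec))
```

Even after giving up the assumed 5% to packet overhead, dropping 2 bits out of every 10 leaves error correction ahead.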

2.7 Data Flow (Speeds)

Data (bytes representing letters, pictures, etc.) flows from your computer to your modem and then out on the telephone line (and conversely). Flow rates (such as 56k (56000) bits/sec) are (incorrectly) called "speed". But almost everyone says "speed" instead of "flow rate". If there were no data compression the flow rate from the computer to the modem would be about the same as the flow rate over the telephone line.

Actually there are two different speeds to consider at your end of the phone line:

The speed on the phone line itself (DCE speed) modem-to-modem
The speed from your computer's serial port to your modem (DTE speed)

When you dial out and connect to another modem on the other end of the phone line, your modem often sends you a message like "CONNECT 28800" or "CONNECT 115200". What do these mean? Well, it's either the DCE speed or the DTE speed. If it's higher than the advertised modem speed it must be the DTE modem-to-computer speed. This is the case for the 115200 speed shown above. The 28800 must be a DCE (modem-to-modem) speed since the serial port has no such speed. One may configure the modem to report either speed. Do some modems report both speeds ??
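
The reasoning above can be written down as a small Python function. It is only a sketch under some assumptions: the message is assumed to be exactly the word CONNECT followed by a number (real modems may append other text), the function name is made up, and the set of serial port speeds lists only the common standard ones.

```python
# Common standard serial port (DTE) speeds in bits/sec.
STANDARD_PORT_SPEEDS = {300, 1200, 2400, 4800, 9600,
                        19200, 38400, 57600, 115200}


def classify_connect(message, modem_max_bps):
    """Guess whether a 'CONNECT <speed>' message reports DTE or DCE speed."""
    speed = int(message.split()[1])
    if speed > modem_max_bps:
        # Faster than the modem can go on the phone line.
        return speed, "DTE"
    if speed not in STANDARD_PORT_SPEEDS:
        # The serial port has no such speed.
        return speed, "DCE"
    return speed, "ambiguous"


print(classify_connect("CONNECT 115200", 33600))   # (115200, 'DTE')
print(classify_connect("CONNECT 28800", 33600))    # (28800, 'DCE')
print(classify_connect("CONNECT 9600", 33600))     # (9600, 'ambiguous')
```

A speed like 9600 could be either one, so the only way to be sure is to check how the modem has been configured to report.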

If you have an internal modem you would not expect that there would be any speed limit on the DTE speed from your modem to your computer since your modem is inside your computer and is almost part of your computer. But there is, since the modem contains a dedicated serial port within it.

It's important to understand that the average speed is often less than the specified speed, especially on the short DTE computer-to-modem line. Waits (or idle time) result in a lower average speed. These waits may include long waits of perhaps a second due to Flow Control. At the other extreme there may be very short waits (idle time) of several micro-seconds between bytes. In addition, modems will fall back to lower speeds if the telephone line conditions are less than pristine.

2.8 Flow Control

Flow control means the ability to stop the flow of bytes in a wire. It also includes provisions to restart the flow without any loss of bytes. Flow control is needed for modems to allow a jump in flow rates.

Flow Control Explained by an Example

For example, consider the case where your 33.6k modem is not doing any data compression or error correction, you have set the serial port speed to 115,200 bits/sec (bps), and you are sending data from your computer to the phone line. Then the flow from your computer to your modem is at 115.2k bps. However the flow from your modem out the phone line is at best only 33.6k bps. Since a faster flow (115.2k) is going into your modem than is coming out of it, the modem is storing the excess flow (115.2k - 33.6k = 81.6k bps) in one of its buffers. This buffer would eventually overflow (run out of storage space) unless the 115.2k flow is stopped.

But now flow control comes to the rescue. When the modem's buffer is almost full, the modem sends a stop signal to the serial port. The serial port passes on the stop signal to the device driver and the 115.2k bps flow is halted. Then the modem continues to send out data at 33.6k bps drawing on the data it previously accumulated in its buffer. Since nothing is coming into the buffer, the level of bytes in it starts to drop. When almost no bytes are left in the buffer, the modem sends a start signal to the serial port and the 115.2k flow from the computer to the modem resumes. In effect, flow control creates an average flow rate (in this case 33.6k) which is significantly less than the "on" flow rate of 115.2k bps. This is "start-stop" flow control.
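
This start-stop behavior is easy to simulate. The Python sketch below steps time forward in hundredths of a second and tracks the level of the modem's buffer in bits. The two flow rates come from the example above, but the buffer capacity and the "almost full" and "almost empty" levels are assumed values chosen for illustration, and the model ignores the few bytes still in transit when the stop signal arrives.

```python
def simulate_flow(seconds, flow_control=True, in_bps=115200, out_bps=33600,
                  capacity=16000, high=12000, low=2000, steps_per_second=100):
    """Return (average computer-to-modem bits/sec, peak buffer level in bits)."""
    in_step = in_bps // steps_per_second     # bits arriving per time step
    out_step = out_bps // steps_per_second   # bits leaving per time step
    level = peak = total_in = 0
    sending = True
    for _ in range(seconds * steps_per_second):
        if sending:
            level += in_step
            total_in += in_step
        if level > capacity:
            raise OverflowError("modem buffer overflowed; bytes would be lost")
        peak = max(peak, level)
        level -= min(level, out_step)
        if flow_control:
            if sending and level >= high:
                sending = False          # modem sends "stop"
            elif not sending and level <= low:
                sending = True           # modem sends "start"
    return total_in / seconds, peak


average, peak = simulate_flow(60)
print("average flow into modem: %d bps" % average)
print("peak buffer level: %d of 16000 bits" % peak)
```

With flow control on, the average flow into the modem settles at just over 33600 bps even though every burst runs at 115200 bps. Calling simulate_flow(1, flow_control=False) raises OverflowError within the first second, which is the overflow described in the example.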

The above is an example of flow control for flow from the computer to the modem, but there is also flow control which is used for the opposite direction of flow: from a modem to a computer. This is the essence of flow control but there are many more details to explain. More details on this topic may eventually be put into the Serial-HOWTO.

Symptoms of No Flow Control

Understanding flow-control theory can be of practical use. For example I used my modem to access the Internet and it seemed to work fine. But after a few months I tried to send long files from my PC to an ISP and a huge amount of retries and errors resulted (but eventually kermit could send a long file after many retries). Receiving in the other direction (from my ISP to me) worked fine. The problem turned out to be a hardware defect in my modem that had resulted in disabling flow control. My modem's buffer was overflowing on long outgoing files since no "stop" signal was ever sent to the computer to halt sending to the modem. There was no problem in the direction from the modem to my computer since the capacity (say 115.2k) was always higher than the flow over the telephone line. The fix was to enable flow control by putting an enable-flow-control command for the modem last in the init string.

Hardware vs. Software Flow Control

For modems, it's best to use "hardware" flow control that uses two dedicated "modem control" wires to send the "stop" and "start" signals. Software flow control uses the main receive and transmit wires to send the start and stop signals. It uses the ASCII control characters DC1 (start) and DC3 (stop) for this purpose. They are just inserted into the regular stream of data. Software flow control is not only slower in reacting but also does not allow the sending of binary data thru the modem which will likely contain the control characters DC1 and DC3 used for flow control.
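
To see why binary data and software flow control don't mix, one can scan some data for the two control characters. In this Python sketch the function name is made up; the byte values 0x11 for DC1 and 0x13 for DC3 are the standard ASCII codes.

```python
XON = 0x11    # ASCII DC1, the "start" character
XOFF = 0x13   # ASCII DC3, the "stop" character


def flow_control_collisions(data):
    """List (position, byte) for every data byte that looks like DC1 or DC3."""
    return [(i, b) for i, b in enumerate(data) if b in (XON, XOFF)]


text = b"Hello, modem!\r\n"
binary = bytes(range(256))   # stand-in for a binary file: every byte value once

print(flow_control_collisions(text))     # []
print(flow_control_collisions(binary))   # [(17, 17), (19, 19)]
```

Ordinary text contains neither character, but binary data sooner or later contains both, and each occurrence would be mistaken for a flow control signal.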

Modem-to-Modem Flow Control

This is the flow control of the data sent over the telephone lines between two modems. Practically speaking, it only exists when you have error correction set. Actually, there is a command to enable software flow control between modems but it will interfere with sending binary data so it's not often used.

2.9 Data Flow Path; Buffers

Although much has been explained about this including flow control, a pair of 16-byte serial port buffers (in the hardware), and a pair of buffers inside the modem, there is still another pair of buffers. These are large buffers in main memory also known as serial port buffers. When an application program sends bytes to the serial port (and modem), they first get stashed in the transmit serial port buffer in main memory. The size of this buffer is about 8k. The pair consists of both this buffer and a receive buffer for the opposite direction of byte-flow.

The serial device driver takes out bytes from this transmit buffer, one byte at a time and puts them into the small transmit buffer in the serial hardware for transmission. Once in that transmit buffer, there is no way to stop them from being transmitted. They are then transmitted to the modem which also has a fair sized buffer. When the device driver (on orders from flow control) stops the flow of outgoing bytes from the computer, what it actually stops is the flow of outgoing bytes from the transmit buffer. Even after this has happened and the flow to the modem has stopped, an application program may keep sending bytes to the 8k transmit buffer until it becomes full. When it gets full, the application program can't send any more bytes to it (a "write" statement in a C program blocks) and the application program temporarily stops running and waits until some buffer space becomes available. Thus a flow control "stop" is ultimately able to stop the program that is sending the bytes.

2.10 Complex Flow Control Example

For many situations, there is a transmit path involving several links, each with its own flow control. For example, I type at a text-terminal connected to a PC with a modem to access a BBS. For this I use the application program "minicom" which deals with 2 serial ports: one connected to a modem and another connected to a text-terminal. What I type at the text terminal goes into the first serial port to minicom, then from minicom out the second serial port to the modem, and then onto the telephone line to the BBS. The text-terminal has a limit to the speed at which bytes can be displayed on its screen and issues a flow control "stop" from time to time to slow down the flow. What happens when such a "stop" is issued? Let's consider a case where the "stop" is long enough to get thru to the BBS and stop the program at the BBS which is sending out the bytes.

Let's trace out the flow of this "stop" (which may be "hardware" on some links and "software" on others). First, suppose I'm "capturing" a long file from the BBS which is being sent simultaneously to both my text-terminal and to a file on my hard-disk. The bytes are coming in faster than the terminal can handle them so it sends a "stop" out its serial port to the first serial port on my PC. The device driver detects it and stops emptying the 8k serial buffer in main memory to which minicom has been sending bytes. When this 8k transmit buffer (on the first serial port) is full, minicom stops writing to it. But this also causes minicom to stop reading from the 8k receive buffer on the 2nd serial port connected to the modem. Flow from the modem continues until this 8k buffer too fills up and sends a different "stop" to the modem. Now the modem's buffer ceases to send to the serial port and also fills up. The modem (assuming error correction is enabled) sends a "stop signal" to the other modem at the BBS. This modem stops sending bytes out of its buffer and when this buffer gets full, another stop signal is sent to the serial port of the BBS. At the BBS, the 8-k (or whatever) buffer fills up and the program at the BBS temporarily halts.

Thus a stop signal from a text terminal has halted a program on a BBS computer. What a Rube Goldberg sequence of events! Note that the stop signal passed thru 4 serial ports, 2 modems, and one application program (minicom). Each serial port has 2 buffers (in one direction of flow): the 8k one and the hardware 16-byte one. The application program may have a buffer in its C code. This adds up to 11 different buffers the data is passing thru. Since the small serial hardware buffers do not participate directly in flow control no mention of them was made in the previous paragraph.

If the terminal speed limitation is the bottleneck in the flow from the BBS to the terminal, then its flow control "stop" is actually stopping the program that is sending from the BBS as explained above. But you may ask, how can a "stop" last so long that 11 buffers (some of them large) all get filled up? It can actually happen this way if all the buffers were near their upper limits when the terminal sent out the "stop".

But if you were to run a simulation on it you would discover that it's usually more complicated than this. At an instant of time some links are flowing and others are stopped (due to flow control). A "stop" from the terminal seldom propagates back to the BBS neatly as described above. It may take a few "stops" from the terminal to result in a "stop" at the BBS. To understand what is going on you really need to observe a simulation which can be done for a simple case with coins on a table. Use only a few buffers and set the upper level for each buffer at only a few coins.
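
If you don't have coins handy, the same experiment can be run in Python. This is a much simplified sketch with assumed numbers: three buffers holding at most 4 "coins" each, a BBS that offers one coin per tick, and a terminal that accepts one coin every third tick. A "stop" is modeled simply as a link refusing to pass a coin into a buffer that is already at its upper level.

```python
def simulate_chain(ticks, n_buffers=3, upper=4, terminal_period=3):
    """Pass coins from a BBS thru a chain of buffers to a slow terminal."""
    levels = [0] * n_buffers      # levels[0] is nearest the BBS
    produced = consumed = 0
    first_block = None            # tick at which the BBS was first stopped
    for tick in range(ticks):
        # The slow terminal takes a coin only every few ticks.
        if tick % terminal_period == 0 and levels[-1] > 0:
            levels[-1] -= 1
            consumed += 1
        # Each link passes one coin downstream unless the next buffer is full.
        for i in range(n_buffers - 1, 0, -1):
            if levels[i - 1] > 0 and levels[i] < upper:
                levels[i - 1] -= 1
                levels[i] += 1
        # The BBS sends a coin unless the first buffer is full ("stopped").
        if levels[0] < upper:
            levels[0] += 1
            produced += 1
        elif first_block is None:
            first_block = tick
    return produced, consumed, levels, first_block


produced, consumed, levels, first_block = simulate_chain(100)
print("sent by BBS:", produced, " shown on terminal:", consumed)
print("left in buffers:", levels, " BBS first stopped at tick", first_block)
```

Every coin the BBS sent is either on the terminal or still in a buffer, so nothing is lost. Try changing the upper level or the terminal's period to see how long it takes for a slow terminal to stop the BBS.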

Does one really need to understand all this? Well, understanding this explained to me why capturing text from a BBS was losing text. The situation was exactly the above example but modem-to-modem flow control was disabled. Chunks of captured text that were supposed to get to the hard-disk never got there because of an overflow at the modem buffer due to flow control "stops" from the terminal. Even though the BBS had a flow path to the hard-disk without bottlenecks, the overflow due to the terminal happened on this path and chunks of text were lost.

2.11 Modem Commands

Commands to the modem are sent to it from the communication software over the same conductor as used to send data. The commands are short ASCII strings. Examples are "AT&K3" for enabling hardware flow control (RTS/CTS) between your computer and modem; and "ATDT5393401" for dialing the number 5393401. Note that all commands are prefaced by "AT". Some commands such as enabling flow control help configure the modem. Other commands such as dialing a number actually do something. There are about a hundred or so different possible commands. When your communication software starts running it first sends an "init" string of commands to the modem to configure it. All commands are sent on the ordinary data line before the modem dials (or receives a call).
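
Since commands are just short ASCII strings, building them in a program is simple. The Python sketch below only constructs the bytes; it does not open a serial port or talk to a modem. The function names are made up for illustration, the two commands are the ones quoted above, and the carriage return at the end is what normally terminates a command line (remember that the exact commands differ between modems).

```python
def at_command(body):
    """Build the bytes for one modem command line: "AT" + body + carriage return."""
    return ("AT" + body + "\r").encode("ascii")


def dial_command(number):
    """Build a tone-dial command for a phone number given as digits only."""
    if not number.isdigit():
        raise ValueError("expected digits only, got %r" % number)
    return at_command("DT" + number)


print(at_command("&K3"))         # b'AT&K3\r'
print(dial_command("5393401"))   # b'ATDT5393401\r'
```

A communication program would write such bytes to the serial port while the modem is still in command mode, that is, before it dials.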

Once the modem is connected to another modem (on-line mode), everything that is sent from your computer to your modem goes directly to the other modem and is not interpreted by the modem as a command. There is a way to "escape" from this mode of operation and go back to command mode where everything sent to the modem will be interpreted as a command. The computer just sends "+++" with a specified time spacing before and after it. If this time spacing is correct, the modem reverts to command mode. Another way to do this is by a signal on a certain modem control line.

There are a number of lists of modem commands on the Internet. Web Sites has links to a couple of such web-sites. Different models and brands of modems have slight (but sometimes critical) differences in such commands. A few common commands are listed in this HOWTO: Other Modem Commands.