Understanding Personal Computer Hardware: Everything you need to know to be an informed · PC User · PC Buyer · PC Upgrader [1st ed.]
 978-0-387-98531-2;978-1-4684-6419-1

Table of contents :
Front Matter ....Pages i-xxi
Introduction (Steven Roman)....Pages 1-2
Overview (Steven Roman)....Pages 3-22
Bits, Bytes and Words (Steven Roman)....Pages 23-40
The PC Hierarchy (Steven Roman)....Pages 41-72
Motherboards and Buses (Steven Roman)....Pages 73-94
Input/Output (Steven Roman)....Pages 95-108
PC Supporting Systems (Steven Roman)....Pages 109-116
The Microprocessor (Steven Roman)....Pages 117-137
Memory (Steven Roman)....Pages 139-154
Keyboards (Steven Roman)....Pages 155-166
Mice (Steven Roman)....Pages 167-171
Display Monitors (Steven Roman)....Pages 173-186
Display Adaptors (Steven Roman)....Pages 187-202
Device Interfaces: Floppy, IDE and SCSI (Steven Roman)....Pages 203-229
Hard Drives I: Physical Characteristics (Steven Roman)....Pages 231-249
Hard Drives II: Logical Characteristics (Steven Roman)....Pages 251-280
Floppy Drives (Steven Roman)....Pages 281-290
The Parallel Interface (Steven Roman)....Pages 291-298
Printers (Steven Roman)....Pages 299-317
Asynchronous and Synchronous Transmission (Steven Roman)....Pages 319-324
The Serial Interface (Steven Roman)....Pages 325-334
Modems (Steven Roman)....Pages 335-358
Optical Storage (Steven Roman)....Pages 359-368
Back Matter ....Pages 369-440


Understanding

Personal Computer Hardware

Springer Science+Business Media, LLC

Understanding

Personal Computer Hardware
Everything you need to know to be an informed • PC User • PC Buyer • PC Upgrader

Steven Roman
With 150 Illustrations
CD-ROM Included

Springer

Steven Roman

Cover photograph: David Muir/Master File.

Library of Congress Cataloging-in-Publication Data
Roman, Steven. Understanding personal computer hardware: everything you need to know to be an informed PC user/buyer/upgrader / Steven Roman. p. cm. Includes bibliographical references and index. ISBN 978-0-387-98531-2 ISBN 978-1-4684-6419-1 (eBook) DOI 10.1007/978-1-4684-6419-1 1. Microcomputers-Popular works. I. Title. TK7885.4.R65 1998 621.39'16-dc21 98-17536

Printed on acid-free paper.

© 1998 Steven Roman
All rights reserved. This work consists of a printed book and a CD-ROM packaged with the book, both of which are protected by federal copyright law and international treaty. The book and CD-ROM may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Production managed by Victoria Evarretta; manufacturing supervised by Jeffrey Taub. Photocomposed copy prepared from author's Microsoft Word files.

987654321

ISBN 978-0-387-98531-2

To Donna

Preface

In general, there seem to be two kinds of books on computer "literacy." The first type provides only superficial explanations of very general computer operations-albeit with very pretty pictures. However, these books hardly scratch the surface. While such a book may look impressive on the coffee table, it serves no real practical purpose (it doesn't help the reader make buying decisions, for instance), nor does it satisfy the truly curious. The second type of computer literacy book goes into far more detail than most interested persons would care to see-discussing at length such things as the actual electrical signals and pinouts of computer chips, or including long tables of statistical data on certain types of hardware. These books tend to be written with relatively little regard for those of us who do not carry soldering irons around in our pockets. This book attempts to strike a happy medium between these two extremes. It is for the reasonably intelligent, and definitely curious, person who wants to understand the workings of his or her personal computer, but who doesn't even own a soldering iron! The book is aimed at those who have at least a nodding acquaintance with an IBM-style personal computer, enough to copy a file or use a word processor. While computer experience is not as important as the desire to understand, it is easier to stay motivated after having some experience with personal computers. I do not intend this book to be an encyclopedia. It is definitely for reading, and then hopefully will serve as a future reference. The discussion will be heavily slanted to the operation of more recent PCs. Thus, for instance, while we will discuss the entire line of Intel microprocessors that were commonly used in PCs since their inception in 1981, we will go into some detail only for the Pentium and Pentium Pro microprocessors.

Why Read This Book?

This book is devoted to perhaps the single most important and exciting technological event to occur in our lifetimes: the advent of the electronic computer.


Understanding Personal Computers

Throughout history, there have been only a handful of technological events that can be called truly revolutionary. The inventions of the printing press (circa 1450), the steam engine (circa 1700), the internal-combustion engine (circa 1800), the telephone (circa 1870), the automobile (circa 1890), the airplane (circa 1900), the radio (circa 1900) and the television (circa 1925) are examples that come to mind (their dates did not come to mind; I had to look them up). However, since these inventions took place before most of us were born, they are second nature to us, and we cannot imagine what it would be like to live without them. We are now witnessing the beginning of a new technological revolution that will change the course of history. I don't know about you, but this is enough to pique my curiosity. I want to be, if not directly involved in contributing to the revolution, at least knowledgeable about what is happening.

It is one thing to be able to use a computer, say to construct a letter, or enter a few numbers into a spreadsheet, or compute the balance in a checking account. This is certainly a valuable skill that everyone who wants to be successful in the modern world will sooner or later need to acquire. This level of familiarity may be likened to the ability to drive a car. Now, it may be argued that knowing how to drive a car is enough, but of course, you might get some disagreement from the person stopped at the side of the road because his or her car refuses to run. Maybe it is just a simple fuse that needs changing. But if the person doesn't know that a car uses fuses and that many cars carry spare fuses in their fuse boxes, then it may as well be a frozen engine!

In any case, there is good reason to have more than just a passing acquaintance with the operation of a personal computer. Here are some of the reasons that come to mind:

• Normal human curiosity as to how things work, especially those things that seem like they can't possibly work!

• Increased productivity. The more you know about how your computer works, the better you will be able to simplify its operation or squeeze out that extra bit of performance.

• Frugality. Sooner or later you will probably be called upon to buy a PC, or to upgrade an existing PC (or even to make the decision whether it is better to upgrade or simply buy a new PC), if not for yourself then perhaps for your children. The more you know about PCs, the less you will be at the mercy of those hordes of computer salespeople, most of whom, it has been my sad experience, know almost nothing about what they are saying! Thus, knowledge is money.

• Survival. What if everybody else in your work environment knows more about computers than you do?

A Note About the Experiments

I have included a number of do-it-yourself experiments throughout the book. These experiments are designed to allow you to actually get involved in the learning process. I suspect that you may find these experiments more fun than you might first think, but not unless you take the plunge. The only thing you need to do before conducting your first experiment is read the appendix entitled Trying the Experiments. Be assured that all experiments are short and I will lead you through the steps one-by-one. All you really need to do is follow the instructions. Also, none of the experiments tamper with anything that should not be tampered with. They are primarily confined to reading memory, or displaying characters on the screen, or fiddling with a floppy diskette. Have fun and become empowered.

A Note About Writing Style

I dislike books that are written in the first person singular (except for the preface). Perhaps it is just habit on my part, or perhaps it makes me feel that I am somehow not involved in what is going on. In any case, I want you to feel involved, so I shall use the term "we" to refer to you and me. On the other hand, there will be several occasions where I will want to refer to my own PC. I don't much like the sound of "the author's PC" and it sounds stupid to say "our PC" and so I will refer to "my PC."

A Very Brief History of Computers

The development of modern computing is taking place at what seems like an incredible pace. The first large-scale digital computer was conceived by Professor Howard Aiken of Harvard University in 1937, and was built by IBM in 1944. It was called the Automatic Sequence Controlled Calculator, and mercifully referred to as the Mark I. This mostly mechanical computer contained more than 750,000 parts, was 51 feet long and weighed over 5 tons! It took about 6 seconds to perform a single multiplication. (A modern Pentium PC can perform this multiplication in about one ten-millionth of a second!)

The Mark II was built by Aiken in 1947, and IBM's SSEC computer was completed in 1948. The top speed of these machines was about 1 multiplication per second.


The first all-electronic computer (with no mechanical parts) was constructed in the winter of 1944-1945. It was called the Electronic Numerical Integrator and Computer, or ENIAC. The ENIAC could do 300 multiplications per second, but it was still a cumbersome monster, containing about 18,000 vacuum tubes. Machine design continued to improve, and a computer industry developed in the 1950s. By 1960, there were about 5000 computers in existence throughout the world.

The invention of the transistor (a miniature electronic switch) in 1947 sparked a revolution in computer design. Transistors were able to replace vacuum tubes in computers during the late 1950s. Since early transistors were about 1/200th the size of vacuum tubes, not only could computers be built on a much smaller scale, but they were much faster, since electricity did not need to travel as far. Computers based on solid-state transistors could do about 100,000 multiplications per second, and were far more reliable than those based on vacuum tubes.

In the 1970s, it became possible to place tens of thousands, or even hundreds of thousands of transistors on small silicon chips no larger than the size of a thumbnail. These so-called integrated circuits produced another revolution in computer design, and made possible the world of personal computing. In 1981, IBM introduced its first personal computer. Since then, personal computers, or PCs for short, have gone through roughly five generations, each marked by improved performance and capabilities. Today's personal computers are no larger than a suitcase, and can perform over 100 million multiplications per second!

I would like to express my thanks to several product managers and engineers at several companies, who have graciously spent time helping me to clarify some of the technical issues discussed in this book. In particular, my thanks go to: National Semiconductor Corporation, Adaptec, Buslogic, Intel Corporation, Micron Technologies, VESA, EIA, Power Magic, NEC, ViewSonic and SyQuest.

Chapter Headings

Preface
Contents
Introduction
1. Overview
2. Bits, Bytes and Words
3. The PC Hierarchy
4. Motherboards and Buses
5. Input/Output
6. PC Supporting Systems
7. The Microprocessor
8. Memory
9. Keyboards
10. Mice
11. Display Monitors
12. Display Adaptors
13. Device Interfaces: Floppy, IDE and SCSI
14. Hard Drives I: Physical Characteristics
15. Hard Drives II: Logical Characteristics
16. Floppy Drives
17. The Parallel Interface
18. Printers
19. Asynchronous and Synchronous Transmission
20. The Serial Interface
21. Modems
22. Optical Storage

Appendix
A1. Trying the Experiments
A2. Cache Designs
A3. SIMM Chip Counts
A4. How Memory Works
A5. Real and Protected Modes
A6. Sector Translation
A7. How Data Is Encoded on a Disk
A8. Intel Microprocessor Quick Reference Guide

Index

Contents

Preface

Introduction

1. Overview
  Physical Overview of the PC
  The System Unit
  Inside the System Unit
  Heat Problems
  The Power Supply
  The Back of the System Unit
  Ports
  Connector Types
  A Functional Overview of the PC
  A Peripheral Overview of the PC
  Connecting the PC Components-Buses
  Support for Peripherals
  Hardware Support-The Adaptor/Controller Card
  The Drawbacks of Built-In Support
  Software Support-The Driver
  The I/O Revolution
  Standards Organizations
  The PC Hierarchy
  The Hardware Level
  The BIOS Level
  The Operating System Level
  The Application Level

2. Bits, Bytes and Words
  Binary Strings
  Binary Strings as Characters
  Binary Strings as Numbers
  Binary Strings as Colors
  Binary Strings as Instructions
  The Word Length of a Computer
  Lots of Bytes
  A Byte of Confusion
  Binary Strings as Memory Addresses
  Hexadecimal Notation
  Converting Between Bases
  A Quick Peek at Memory
  Measuring the Performance of PC Components

3. The PC Hierarchy
  Registers and I/O Ports
  Microprocessor Registers
  Parallel Port Registers
  Addressing Registers
  The Language of the Microprocessor
  The System BIOS
  DOS Services
  Device BIOS and Device Drivers
  Summary
  The Interrupt Vector Table
  Trapping an Interrupt
  High-Level Languages
  More on the BIOS
  More on DOS
  Microsoft Windows Also Has Service Routines
  The PC Startup Process
  The Disk Boot Process
  The Built-In Setup Program and CMOS
  The BIOS Data Area

4. Motherboards and Buses
  General Remarks About Buses
  General Bus Types
  Data, Address and Control Lines
  Bus Arbitration
  Multiplexing
  Bus Timing
  Wait States
  Burst Mode
  Pipelining
  The Motherboard
  Logical View of the Motherboard

  PC Bus Types
  The ISA Bus
  The MCA Bus
  The EISA Bus
  The VESA Local Bus (VL Bus)
  The PCI Bus
  How Bus Speed Affects Performance
  PCs Are Interrupt-Driven
  Hardware Interrupts
  Installing New Devices in a PC
  Plug-and-Play

5. Input/Output
  I/O Modules
  I/O Processors
  The Mechanism of Communication
  Memory-Mapped I/O
  I/O-Mapped I/O
  CPU Involvement in the I/O Process
  Programmed I/O
  Interrupt I/O
  Direct Memory Access (DMA)
  Summary
  More on Ports
  Viewing Port Usage
  Parallel Port Confusion

6. PC Supporting Systems
  Chipsets
  Clocks And Timers
  Oscillators
  The Programmable Interval Timer

7. The Microprocessor
  Overview of the Intel Family of Microprocessors
  Intel 8088 Microprocessor
  Intel 80286 Microprocessor
  Intel 80386SX and DX Microprocessors
  Intel 80486SX and DX Microprocessors
  The Pentium Microprocessor
  The Pentium Pro Microprocessor
  The Pentium II Microprocessor
  Summary
  MMX

  A Detailed Look at the Pentium Processor
  Different Models and Their Speeds
  The Internal View of the Pentium
  CISC Versus RISC
  A Closer Look at the Pentium Pro Processor
  A Closer Look at the Pentium II Processor
  External (Level 2) Cache
  Cache Strategies
  Write Strategies
  Other Caches

8. Memory
  Memory Chips And SIMMs
  Random Versus Sequential Access
  Dynamic and Static RAM
  ROM
  PROM
  EPROM
  EEPROM
  Flash RAM
  Nothing Is Perfect
  VRAM
  Memory Speed
  Parity Checking
  SIMM Packaging
  SIMM Capacity
  SIMM Pin Count
  SIMM Labeling
  SIMM Chip Count
  Chip Labeling
  How Memory Works
  Logical Memory Organization
  Conventional Memory
  Upper Memory
  High Memory
  Expanded Memory
  Extended Memory
  Memory Blocks
  Real and Protected Modes
  Virtual 8086 Mode

9. Keyboards
  Physical Operation
  Logical Operation

  The Keyboard Buffer
  The Keyboard Status Bytes

10. Mice
  Physical Operation
  Mouse Protocols
  Mouse Interfaces
  Resolution

11. Display Monitors
  Monitor Features
  Display Size
  Tube Shape
  Power Management
  Radiation Emissions
  Monitor Controls
  Anti-Glare Coating
  Connector Types
  Display Data Control
  Monitor Construction
  Shadow Mask Monitors
  Aperture Grill Monitors
  Comparing the Two
  Display Resolution
  Higher Resolution Is Not Always Better
  How Images Are Displayed - Raster Scanning
  VESA
  Interlacing
  Relationship Between Scan Rates and Resolution
  Multiscanning Monitors

12. Display Adaptors
  Display Systems
  Overview of a Video Card
  Video Memory
  Text Modes
  Storing Text in Video Memory
  Paging
  ROM Character Generators
  Graphics Modes
  Video Memory Organization
  Describing Colors in Memory
  Graphics Accelerators and Processors
  Performance of a Video Card

  Dual-Ported Video Memory and Bandwidth

13. Device Interfaces: Floppy, IDE And SCSI
  Interface Philosophies
  The Floppy Drive Interface
  The Digital Output Register
  The IDE Interface
  Enhanced IDE
  Fast ATA
  Ultra ATA
  The SCSI Interface
  Device Classes
  SCSI Standards
  A Note About Performance
  SCSI Cabling
  SCSI Communication
  SCSI Drivers

14. Hard Drives I: Physical Characteristics
  General Operation of a Hard Drive
  Platters
  Head Actuators
  Floating Heads and Parking Heads
  Bit Density
  How Read/Write Heads Work
  Cylinder/Head/Sector Geometry
  Multiple Zone Recording
  Drive Performance Issues
  Physical Issues
  Logical Issues
  Low-Level Formatting

15. Hard Drives II: Logical Characteristics
  Disk Partitioning
  Primary and Extended Partitions
  The Master Boot Record and the Partition Table
  Extended Partition Tables
  High-Level Formatting and File Systems
  Making a Partition Bootable
  Visibility
  Multiple Primary Partitions
  Drive Letter Assignments
  Multi-Boot Configurations
  The DOS (FAT) File System

  Logical Sector Addressing
  The Location of Important File System Files
  The Boot Record
  DOS Partition Size
  The Root Directory
  Windows 95 Long Filename Directory Entries
  The File Allocation Table (FAT)
  The Large Cluster Size Problem
  FAT File System Errors
  FAT32

16. Floppy Drives
  The Floppy Disk
  Floppy Drive Operation
  Implementation of a 12-Bit FAT
  Floppy Cables
  Floppy Drive Alternatives
  Removable Hard Drives
  Other Types of Removable Storage
  CD-ROM Drives
  Magneto-Optical Drives

17. The Parallel Interface
  The Parallel Interface Cable and Connectors
  The Standard Parallel Port Registers
  Parallel Interface Standards
  Standard Parallel Port (SPP)
  Nibble Mode
  Byte Mode
  Enhanced Parallel Port (EPP)
  Extended Capability Port (ECP)
  Using a Nonstandard Parallel Port Mode

18. Printers
  Daisy Wheel Printers
  Dot Matrix Printers
  Ink Jet Printers
  Ink Jet Operation
  Ink Jet Characteristics
  Laser Printers
  Laser Printer Control Languages
  Laser Print Engine Operation
  Laser Printer Characteristics

19. Asynchronous And Synchronous Transmission

  Asynchronous And Synchronous Transmission
  Asynchronous Data Transfer
  Parity Check Bit
  Asynchronous Synchronization
  Synchronous Data Transmission

20. The Serial Interface
  Serial Port Address Assignments
  The Serial Interface
  The UART
  Super I/O Chips
  The EIA/TIA-232 Protocol
  PC-to-Modem Communication
  Flow Control

21. Modems
  Digital Versus Analog Signals
  Modems
  Duplex And Echoplex
  Amplitude, Frequency And Phase
  Modulation Types
  Multibit Modulation
  Wave Harmonics
  Telephone Bandwidth
  Distortion
  Error Detection
  Block Parity Checking
  Data Compression
  Run Length Encoding
  Huffman Encoding
  Measuring Data Transmission Rates
  Restrictions on Bit Rate
  ITU Modem Standards
  V.32
  V.32bis
  V.34
  V.42
  V.42bis
  MNP Modem Standards
  Modes of Operation
  The Modem Command Set
  Modem Registers
  Result Codes
  Initialization Strings

22. Optical Storage
  CD-ROM Drives
  CD-ROM Read Mechanism
  CD-ROM Disk Construction
  Laser Light
  The CD-ROM Read Operation
  CD-ROM Characteristics
  CD-R Drives
  Magneto-Optical Drives

Appendix

A1. Trying the Experiments
  Hexadecimal Numbers
  Registers
  Segmented Addressing
  Debug
  QBASIC

A2. Cache Designs
  Direct-Mapped Caches
  Associative Caches
  Set Associative Caches

A3. SIMM Chip Counts

A4. How Memory Works
  More on DRAM Refresh
  Memory Cycle Time
  Improvements on Ordinary DRAM

A5. Real And Protected Modes
  Segmented Addressing

A6. Sector Translation
  Enhanced CHS Addressing-An Example First
  Enhanced CHS Addressing-The Details
  Logical Block Addressing (LBA)

A7. How Data Is Encoded on a Disk

A8. Intel Microprocessor Quick Reference Guide
  Intel Microprocessor Evolution
  Microprocessor Ratings

Index

Introduction

To many people, it seems that computers are miracle workers. For instance, how can a word processor possibly know how to format a document as it is being typed? How can it check your spelling as you type? The answer is simple. A computer has an incredible amount of available time. Could you check the spelling of your document if you paused for, say 10 minutes, between each word that you typed? Of course.

Now consider that a modern microprocessor can execute, say 100 million instructions per second. A fast typist can type at, say 100 words per minute. This is about 600 characters per minute, or 1 character every 1/10 of a second. It follows that the microprocessor can perform 10 million instructions between characters. That's a lot of instructions to use for such things as on-the-fly spell checking. To give this a human perspective, if each instruction took 1/2 second to execute, then 10 million instructions would require about 58 days to complete. Imagine how much you could accomplish if you paused 58 days between each character that you typed. You could not only do the spell checking, but you could research the etymology of each word as you typed and even translate it into Papiamento!

A computer is a device that does only very simple things with very simple objects. However, it can do them very rapidly, well beyond what humans can effectively imagine. This is one reason why the feats of a computer seem so miraculous. We will do our best to clarify the operation of many aspects of a personal computer, explain some of the plethora of current computer jargon and give you an overall feeling for how a computer works.

Keep in mind, however, that very little in the computer world is categorical: things change rapidly, and this is compounded by the fact that terms are often used with subtle (or not so subtle) variation. As an example, we may say that a local bus is a bus that is attached directly to the microprocessor, but that does not prevent a computer company from designing a bus that is connected only indirectly to the microprocessor, and referring to it as a local bus! Of course, we can only hope that the designers have at least remained true to the purpose of local buses, which is to provide a high-speed access path to the microprocessor.
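The typing arithmetic earlier in this introduction can be checked with a few lines of Python. This is only a sketch: the constants are the round figures quoted in the text, the 6 characters per word is the value implied by "100 words per minute" and "600 characters per minute", and the function names are made up for illustration.

```python
# Round figures quoted in the text, treated here as assumptions.
INSTRUCTIONS_PER_SECOND = 100_000_000  # "100 million instructions per second"
WORDS_PER_MINUTE = 100                 # a fast typist
CHARS_PER_WORD = 6                     # implied by 600 characters per minute


def instructions_between_keystrokes(ips, wpm, chars_per_word):
    """How many instructions the processor can run between two typed characters."""
    chars_per_second = wpm * chars_per_word / 60
    return ips / chars_per_second


def human_scale_days(instruction_count, seconds_per_instruction=0.5):
    """Days needed if each instruction took a human-scale half second."""
    seconds_per_day = 24 * 60 * 60
    return instruction_count * seconds_per_instruction / seconds_per_day


budget = instructions_between_keystrokes(
    INSTRUCTIONS_PER_SECOND, WORDS_PER_MINUTE, CHARS_PER_WORD
)
days = human_scale_days(budget)

print(f"{budget:,.0f} instructions between keystrokes")
print(f"about {days:.0f} days at half a second per instruction")
```

Changing the typing speed or the processor speed shows how generous the budget stays: even at a tenth of the quoted processor speed, a million instructions are available per keystroke.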

1. Overview

The first step in understanding a personal computer, or PC, is to get an overall view of the components, or devices, that make up such a system. This can be done in a variety of ways. We can view the devices strictly from a physical point of view. For instance, some devices are internal to the system unit and some are external. While this is useful (and we will do it in a moment), it is equally important to take a functional viewpoint, grouping devices according to their general purpose or function.

Physical Overview of the PC

Figures 1.1 and 1.2 show the most basic externally visible components of a PC.

Figure 1.1 - The basic components of a PC system (tower format)
Labeled parts: monitor, floppy drive, CD-ROM drive, exposed drive bays, empty exposed bays, hard drive (not exposed), system unit, mouse.

Figure 1.2 - The basic components of a PC system (desktop format)
Labeled parts: monitor, floppy drive, CD-ROM drive, printer, keyboard, mouse.

The System Unit

The system unit houses many of the main components of the computer. System units have two orientations. A vertical orientation is referred to as a tower configuration and a horizontal orientation is referred to as a desktop configuration. There is really no functional difference between the two configurations, with the exception that full-sized tower units generally have more room for adding additional components. However, the more popular mini-towers have about the same room as desktop units. With the recent onslaught of exciting new peripheral devices, yours truly wishes he had a full-sized tower unit, although they have become increasingly hard to find in recent times.

In any case, all system units have drive bays, which hold devices such as floppy drives, hard drives, CD-ROM drives and tape drives. Some bays are exposed so that the user has access to the device. This is clearly necessary for devices such as floppy drives, which require the insertion and removal of cartridges. On the other hand, a hard drive requires no such access, and can be placed in a nonexposed drive bay. The number of exposed and hidden drive bays varies among computer systems, and can be an important purchasing issue, since it may limit future expansion possibilities.

Inside the System Unit

Figure 1.3 shows most of the physical components inside a system unit. Generally speaking, we can identify the following components.

• The motherboard, or system board, which is a large printed circuit board that forms the connecting backbone of the PC system. Two of the most important parts of a PC are directly connected to the motherboard: the microprocessor (or CPU) and memory. All other PC components are connected through cables, either directly or indirectly, to the motherboard. The motherboard also contains a variety of support chips, to perform various functions. For instance, there is a clock on the motherboard to keep various PC functions synchronized.

• Expansion cards, which are printed circuit boards that "plug into" the motherboard, thus in effect, extending the functionality of the motherboard.

• The power supply, which supplies power to those devices that require it.

• Internal peripheral devices, such as floppy drives, hard disk drives and CD-ROM players.



If you have ever opened the PC system unit, you will have no doubt encountered a cable or two. The cabling inside a PC can be divided into two groups. Control cables are designed to carry information, including data, addresses and command signals. These are usually wide, flat ribbon cables that contain many individually insulated wires. On the other hand, power cables supply power, from the power supply, to those components that need it. This includes the motherboard itself, all drives and any extra cooling fans. The actual mass of cabling inside a PC is so extensive that it would obscure the view of most of the system, so we have omitted all cabling in Figure 1.3. (The cabling nightmare inside a PC accounts for a significant proportion of the frustration involved in trying to add or upgrade an internal component!)

Figure 1.3 - Inside view of a system unit
Labeled parts: memory (SIMMs), drive, motherboard support chips (chipset).

Heat Problems

Having mentioned cooling fans, this might be a good time to also mention that heat is the major problem for the electronic components in a PC. (Moving electrons produce lots of heat.) The power supply in a PC has a cooling fan, designed to help circulate air throughout the case. However, with today's component-packed PCs, this fan is generally not adequate by itself. Many systems now come with at least one other fan, perhaps placed behind an opening in the front of the case, or near the CPU.

The microprocessor is one of the major heat-generating components of a PC. Early Pentium microprocessors ran at 5 volts and produced a large amount of heat. Newer Pentiums run at a cooler 3.3 volts. In either case, two devices are often used to keep the microprocessor from overheating (the symptoms of which are erratic behavior). One is the heat sink, which is a hunk of metal that sits in contact with the microprocessor and has a number of thin protruding metal fingers or plates, designed to increase the surface area and thus promote heat dissipation. The other is a CPU fan, which is a small fan that sits atop the microprocessor (or heat sink).


It is also worth mentioning that superfluous noise can often be generated by a poor placement of fans in the system unit, creating an air turbulence problem. For instance, if a fan lies just behind a grate in the front cover, it may be dragging air past that grate, thus producing extra noise. Note also that a fan is not the only noise producer inside a system unit. Hard drives spin constantly, and generate a high-pitched noise.

The Power Supply

Power supplies are pretty boring, but also of obvious necessity. Power supplies vary in the amount of power (measured in watts) they produce. One assumes that computer manufacturers include a sufficiently powerful unit for the needs of their PCs, but if you plan on loading down a system with a lot of extra components, it may be worth looking into a more powerful power supply, or at least making a call to technical support to see if the existing power supply is adequate.

The Back of the System Unit

Figure 1.4 shows the rear view of a system unit.

Figure 1.4 - Rear view of a system unit
Labeled parts: power connector, SCSI adaptor card (50 pin connector), lock, parallel port, serial ports (25 pin and 9 pin), two PS2 connectors, expansion card access, rear of a sound card (with connectors).


Ports

The back of a system unit has connectors, sometimes called ports, for a variety of peripheral devices. (As we will see, the term port is used in two different senses. The sense here is that of an actual physical connector, but it also refers to a special memory location.) For instance, there is a port for the keyboard, and most new PCs have an additional port dedicated to the mouse. The most common type of port for the keyboard and mouse is a small circular one called a PS2 port.

The most well-known ports are the serial ports and the parallel ports. We will discuss serial and parallel ports in detail later in the book. For now, let us simply note that a serial port allows the connection of a serial device to the computer. Serial devices communicate with the system one bit at a time. The most common serial devices are mice, keyboards and modems. A parallel port allows the connection of a parallel device to the computer. Parallel devices communicate with the system 8 bits at a time. The most common parallel device is a printer, although many of the newer storage devices (tape drives and removable hard drives) have a parallel interface.

A PC can support up to four serial ports, denoted by the names COM1, COM2, COM3 and COM4, where COM is an abbreviation for communication. PCs can also support up to three parallel ports, denoted by LPT1, LPT2 and LPT3, where LPT is an abbreviation for line printer. Most computer systems have one parallel and two serial ports built into the system, but it is possible to add additional parallel and serial ports by installing an I/O port expansion card in an expansion slot. (I/O stands for input/output.) Note that Microsoft Windows supports all four serial ports, but only two parallel ports. We will have much more to say about ports later in the book.
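The difference between "one bit at a time" and "8 bits at a time" can be made concrete with a small Python sketch. It is a simplified model, not a description of real port hardware: the function names are invented for illustration, the least-significant-bit-first ordering is an assumption, and the framing bits that a real serial link adds around each byte are ignored.

```python
def to_bits(byte):
    """Return the 8 bits of a byte, least significant bit first (assumed order)."""
    return [(byte >> i) & 1 for i in range(8)]


def send_serial(data):
    """Serial link model: each transfer step carries a single bit on one wire."""
    steps = []
    for byte in data:
        for bit in to_bits(byte):
            steps.append([bit])
    return steps


def send_parallel(data):
    """Parallel link model: each transfer step carries all 8 bits on 8 wires."""
    return [to_bits(byte) for byte in data]


message = b"PC"
serial_steps = send_serial(message)
parallel_steps = send_parallel(message)

print(len(serial_steps), "transfer steps over the serial link")
print(len(parallel_steps), "transfer steps over the parallel link")
```

In this model the two-character message needs eight times as many transfer steps on the serial link as on the parallel link, which is the whole distinction the text draws at this point.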

Connector Types PCs have a variety of external connectors (not limited to PS2, parallel and serial), and the sizes and shapes can be confusing at times, especially if you need to buy a cable. Figures 1.5-1.8 illustrate some of the most common types of connectors.


[Connectors shown: 25-pin DB25 parallel (IEEE 1284 Type A); 36-pin Centronics parallel (IEEE 1284 Type B); 36-pin parallel mini-connector (IEEE 1284 Type C)]

Figure 1.5 - Parallel port connectors

[Connector shown: RS-232 serial 9-pin D-shell]

Figure 1.6 - Serial port connectors

[Connectors shown: 50-pin/68-pin SCSI connector; 50-pin SCSI connector]

Figure 1.7 - Some SCSI connectors

[Connectors shown: PS2 connector; IBM-style keyboard connector]

Figure 1.8 - Other types of connectors


A Functional Overview of the PC In general terms, the purpose of a computer is to manipulate data: that is, to accept data as input, to compute with that data and to display or print data. With this view in mind, a computer can be thought of as consisting of three functional components, or subsystems, as illustrated in Figure 1.9.



• The microprocessor (or CPU), which does calculations with the data. These calculations not only take the form of the usual arithmetic operations (addition, subtraction, multiplication and division), but also include logical operations (and, or, not, etc.) and comparisons (equal to, greater than, etc.). The CPU is also responsible for calculating memory addresses and issuing commands to other parts of the PC.

• A storage subsystem, which is used to store data, either temporarily or persistently, where the latter term simply means that the data will remain valid even if the power is turned off. (Very few computer devices store data in a truly permanent state, where the data cannot be changed.)

• The external communication subsystem. This subsystem allows the computer user (or another PC) to communicate with the PC.

[Figure 1.9 labels: Microprocessor; Storage; External]

If you look closely at every other symbol in the keyboard buffer on the far right, you can see the x's. You can also make out the very last keystrokes we entered in order to perform the memory dump, which also ended up in the keyboard buffer. (The reason that the keystrokes occupy every other character in the keyboard buffer will be explained in Chapter 9.)


4. Now enter a different string of letters, say by holding down the z key.

-zzzzzzzzzzzzzzzzzzzzzzzz
 ^ Error

and do another dump of memory.

-d 0:41e

The results are similar, but with z's instead of x's.

0000:0410                                            7A 2C
0000:0420  7A 2C 0D 1C 64 20 20 39-30 0B 3A 27 34 05 31 02
0000:0430  65 12 0D 1C 7A 2C 7A 2C-7A 2C 7A 2C 7A 2C 81 00

[Memory dump figure, rows 0000:0410 through 0000:0490. Callouts mark the keyboard buffer, the current cursor position (column 0, row 18h = 24), the number of hard drives installed and the number of video columns (50h = 80 columns).]

We have also pointed out a few other data values in the BIOS data area.

5. Quit debug using the q command.

-q

End of Experiment
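The "every other character" pattern in the keyboard buffer can also be checked with a few lines of code. The sketch below is only an illustration: it assumes that each keystroke occupies two bytes in the buffer, with the character code first, and it simply skips the second byte of each pair. The byte values are copied from the row at 0000:0420 in the dump above; the function name decode_keystrokes is our own.

```python
def decode_keystrokes(hex_bytes: str) -> str:
    """Return the characters stored in a run of keyboard-buffer bytes.

    Assumption: each keystroke is a two-byte pair, character code first.
    The second byte of each pair is ignored here.
    """
    values = [int(token, 16) for token in hex_bytes.split()]
    # Keep bytes 0, 2, 4, ... (the character codes) and drop the rest.
    return "".join(chr(value) for value in values[0::2])


# Twelve bytes from row 0000:0420 of the dump.
row = "64 20 20 39 30 0B 3A 27 34 05 31 02"
print(decode_keystrokes(row))            # prints: d 0:41
print(decode_keystrokes("7A 2C 7A 2C"))  # prints: zz
```

The first result is the beginning of the dump command itself (d 0:41e), which is why the last keystrokes typed show up in the buffer.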

4. Motherboards and Buses

General Remarks about Buses We have already alluded to the fact that the components of a PC are connected together (both logically and physically) by buses. We now want to describe how buses operate in general terms. Generally speaking, a bus consists of somewhere between 50 and 100 individual lines, which may be etched into a printed circuit board, or may be actual wires confined within a cable. A single bus usually connects several devices, each of which receives all of the signals placed on the bus.

General Bus Types Because there is such a performance discrepancy between devices, a modern PC system usually has more than one bus. As we will see, one of the most common PC bus architectures uses a three-tier design. The terms associated with bus types are unfortunately rather vague, but here is one possible interpretation.

• A local bus, or processor bus, is a bus that connects the microprocessor to memory (either main memory, cache memory or both) and runs at the microprocessor's external speed.

• A system bus is a bus that connects major PC components, such as the microprocessor and memory, to other components of the system. A system bus generally runs more slowly than a local bus. A system bus may connect faster I/O devices, such as hard drives and video.

• An I/O bus, or expansion bus, is a bus that connects slower I/O devices, such as a printer, modem, keyboard or mouse. Expansion buses typically have expansion slots, to accommodate expansion cards, such as sound cards. However, a system bus may also have expansion slots.

• A peripheral bus is a bus that is designed specifically to accommodate one or more types of peripheral devices. Unlike expansion buses, peripheral buses generally do not have expansion slots. For example, the SCSI bus (pronounced "scuzzy") is designed to accommodate hard drives, CD-ROM drives, scanners and other devices. The IDE peripheral bus (although usually referred to as an interface, rather than a bus) was originally designed to accommodate only hard drives, but the enhanced version of IDE can accommodate some other devices as well.

We emphasize again that these terms are used rather loosely in the PC world, and thus serve only as a rough guide to bus types. Of course, the individual buses in a PC system must be connected to each other. This is generally done by a device referred to as a bridge. On the other hand, a peripheral bus is usually connected to the system bus using an adaptor card, or through adaptor circuitry built into the motherboard.

Data, Address and Control Lines Despite the number of different bus types, the lines in a bus can generally be classified into three functional groups, referred to as the data bus, address bus and control bus. The lines in the data bus are called data lines, and similarly for the lines in the address and control buses. (Perhaps the terms data subbus, address subbus and control subbus would have been a better choice, but they are not used.) The width of a bus refers to the number of lines in the bus. For instance, a modern Pentium PC has a system data bus width of 64, meaning that 64 bits of data can be transmitted at one time on the system bus, and an address bus width of 32, meaning that there are 2^32 = 4,294,967,296 ≈ 4 billion possible addresses.

Of course, the data lines are used to transmit data from one device to another. For example, since a modern PCI system bus has a data bus width of 64, it follows that 64 bits of data can be transmitted at one time on the system bus. The data bus also carries microprocessor instructions, that is, code, to the microprocessor.

The address lines are used to transmit the addresses of data. Addresses may be memory addresses, or addresses of data on a storage device, such as a hard drive. They may also be I/O port addresses, such as a printer port address, or a serial port address. As already mentioned, the address bus on a modern Pentium computer is 32 bits wide, resulting in 2^32 = 4 Gig possible addresses. If each address represents a byte of data (8 bits), then a maximum of 4 GB of memory can be addressed.

The control lines are used to control the operations of the PC. Here are some examples of the types of signals that can occur on the control bus.

• Memory read: place data located at the address on the address bus onto the data bus.

• Memory write: place data that is on the data bus at the location specified on the address bus.

• I/O read: retrieve a word of data from the I/O port whose address is on the address bus.

• I/O write: send the data on the data bus to the port whose address is on the address bus.

• Acknowledge (ACK): indicates that data has been placed on the data bus, or retrieved from the data bus.

• Bus request: request to gain access to the bus.

• Bus grant: indicates that bus control has been granted to a requesting device.

• Interrupt request: indicates a pending interrupt.

• Interrupt acknowledge: acknowledges that a pending interrupt has been recognized.

• Clock: indicates a "tick" of the bus clock.

• Reset: reset the bus.

Note that more specific bus types will have more specific control signals. For instance, a parallel bus will have control signals such as BUSY, PAPER OUT or ERROR to indicate certain conditions.
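The relationship between address bus width and the number of possible addresses, described earlier in this section, is just a power of two. The following sketch works it out for three widths: the 32-line address bus discussed above, and the 20-line and 24-line widths that this chapter later gives for the original PC bus and the ISA bus. The function name address_space is our own, and the megabyte figures assume that each address refers to one byte.

```python
def address_space(address_lines: int) -> int:
    """Number of distinct addresses that the given number of lines can express."""
    return 2 ** address_lines


buses = [("original PC bus", 20), ("ISA bus", 24), ("Pentium address bus", 32)]
for name, width in buses:
    count = address_space(width)
    megabytes = count // 2**20  # assumes one byte per address
    print(f"{name}: {width} lines -> {count:,} addresses ({megabytes:,} MB)")
```

Each additional address line doubles the number of addresses, which is why going from 20 to 24 lines raises the limit from 1 MB to 16 MB.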


Bus Arbitration Since only one device can have control of the bus at one time, when several devices are attached to the same bus, there will be contention for the use of the bus. To deal with this problem, some method of arbitration is necessary. In general, there are two approaches to bus arbitration. In one approach, a dedicated unit (or dedicated circuitry in the CPU) takes charge of deciding which device gets to control the bus at a given time. This unit is called a bus controller, or bus arbiter. In the other approach, each device has built-in bus arbitration logic, and the devices act together to decide which one should gain control of the bus. The latter approach is more common in networks, the former in PC systems. Once a device has control of the bus, it is referred to as a bus master. It can then initiate a data transfer. The device with which it communicates is then called the bus slave. Of course, these roles are only temporary.
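To make the first approach (a dedicated bus arbiter) concrete, here is a minimal sketch. It is not a model of any particular PC chipset: the fixed priority order, the device names and the class name BusArbiter are all assumptions made for the illustration. The only behavior taken from the discussion above is that one device at a time becomes bus master, and that the role is temporary.

```python
class BusArbiter:
    """Toy centralized arbiter: at most one device is bus master at a time."""

    def __init__(self, priority_order):
        # Assumed policy: devices earlier in the list win ties.
        self.priority_order = list(priority_order)
        self.requests = set()
        self.master = None

    def request(self, device):
        """A device raises its bus request."""
        self.requests.add(device)

    def grant(self):
        """If the bus is free, grant it to the highest-priority requester."""
        if self.master is None:
            for device in self.priority_order:
                if device in self.requests:
                    self.requests.remove(device)
                    self.master = device
                    break
        return self.master

    def release(self):
        """The current bus master gives up the bus."""
        self.master = None


arbiter = BusArbiter(["cpu", "dma", "scsi"])
arbiter.request("scsi")
arbiter.request("dma")
print(arbiter.grant())   # prints: dma
arbiter.release()
print(arbiter.grant())   # prints: scsi
```

In the second approach, the same decision would be made by logic built into each device rather than by a single object like this one.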

Multiplexing In many PC bus systems, the address and data buses do not consist of physically separate lines. Rather, in order to save cost and space, a single set of lines serves both purposes. This is referred to as multiplexing. In this case, the control bus (which is not multiplexed) is used to determine the current purpose of the common address/data lines. As an illustration, the writing of data proceeds as follows on a multiplexed bus. First, the CPU places the address on the address/data lines of the bus and asserts a special address valid signal on the control bus. This signal tells all of the devices connected to the bus that the address/data lines contain a valid address. The bus components now have a certain amount of time to recognize the address as one of their own and acknowledge. Then the CPU deasserts the address valid line and places the data on the address/data lines, along with a write memory control signal. Since the same lines are used for address and data at two different times, this is referred to as time multiplexing. Clearly, multiplexing saves on bus width (which saves money), but only at the expense of more complicated bus logic. There is also a performance hit, since data and addresses cannot "overlap" on the bus.
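The two-phase write described above can be sketched in a few lines. This is a simplified illustration, not real bus logic: the shared address/data lines are represented by a single number, the acknowledgment step is left out, and the function names multiplexed_write and memory_device are our own.

```python
def multiplexed_write(address: int, data: int):
    """Yield (shared_lines, control_signal) for each phase of a write."""
    # Phase 1: the shared lines carry the address; ADDRESS VALID is asserted.
    yield address, "ADDRESS VALID"
    # Phase 2: ADDRESS VALID is deasserted; the same lines now carry the data.
    yield data, "MEMORY WRITE"


def memory_device(phases):
    """Latch the address in phase 1, then store the data in phase 2."""
    memory = {}
    latched_address = None
    for lines, control in phases:
        if control == "ADDRESS VALID":
            latched_address = lines
        elif control == "MEMORY WRITE":
            memory[latched_address] = lines
    return memory


print(memory_device(multiplexed_write(0x1234, 0xAB)))  # prints: {4660: 171}
```

Note that the receiving device must remember (latch) the address, since it is no longer on the lines when the data arrives. This is part of the more complicated bus logic mentioned above.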


Bus Timing Events that occur on a bus are tied to some form of timing. The bus usually includes a clock signal that does nothing but change voltage from low to high, and vice versa, at regular intervals, as shown in the timing diagram given in Figure 4.1.

[Figure: a square-wave clock signal plotted against time, alternating between high voltage (binary 1) and low voltage (binary 0), with one clock cycle, a leading edge and a trailing edge marked]

Figure 4.1 - A bus clock signal

The transition from low to high is called the leading edge of the cycle, and the reverse transition from high to low is called the trailing edge. Note that the leading and trailing edges are drawn as vertical lines. Although it does take a small amount of time for the signal to change, it is easier to read the diagram when we ignore this small transition time. A clock cycle is the time between successive leading edges (transitions from low to high). In general, PC events (such as reading memory) occur either synchronously or asynchronously. In synchronous operation, each event begins at the leading edge of a clock signal. To illustrate, Figure 4.2 shows the timing diagram for a simple read operation. In this figure, we have shown two complete read cycles.

[Figure: timing diagram of Read Cycle 1 and Read Cycle 2, with traces for Clock, Address Valid, Read, Address, Data and ACK]

Figure 4.2 - Two memory read cycles (one wait state)

During the first clock cycle, the CPU places an address on the address bus. It also asserts the ADDRESS VALID control line and the MEMORY READ control line. After a one-cycle delay, the device that recognizes the address places the requested data on the data bus and asserts the ACK (acknowledge) line. As you might imagine, timing diagrams for actual PC operations are quite a bit more complicated. The latest Pentium microprocessor sits inside a package with 273 pins, of which 267 are used for data, addresses and control signals (about 50 pins are grounded). That's a lot of control signals! We have shown you the timing diagram above in the hopes of clarifying the bus process, and in order to illustrate some fundamental concepts, such as wait states, burst mode and pipelining (coming next). We will not go into the rather complex details of actual bus operations, however.

Wait States Notice that, during the second clock cycle of each read cycle, the CPU sits idly, waiting for memory to place the data on the data bus and assert the ACK signal. This waiting period is referred to as a wait state. Depending upon the speed of the microprocessor and of memory, one or even two wait states may be required. If memory is fast enough, however, then a no wait state operation is possible. This is illustrated in Figure 4.3.

[Figure: timing diagram of Read Cycle 1 and Read Cycle 2, with traces for Clock, Address Valid, Read, Address and Data]

Figure 4.3 - Two memory read cycles (no wait states)

Note that, in this case, memory can place the requested data on the data bus after only a short delay: less than one-half of a clock cycle.
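The cost of a wait state is easy to put into numbers. The sketch below assumes a very simple model read off Figures 4.2 and 4.3: a read takes one clock cycle when no wait states are needed, and each wait state adds one more full cycle. The 66 MHz clock is only an example value, and the function name read_time_ns is our own.

```python
def read_time_ns(bus_mhz: float, wait_states: int) -> float:
    """Time for one read under the assumed model:
    one base clock cycle plus one extra cycle per wait state."""
    cycle_ns = 1000.0 / bus_mhz  # length of one clock cycle in nanoseconds
    return (1 + wait_states) * cycle_ns


for wait_states in (0, 1, 2):
    print(f"{wait_states} wait state(s): {read_time_ns(66, wait_states):.1f} ns")
```

Under this model, a single wait state doubles the time of every read, which is why memory that is fast enough for no wait state operation matters.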

Burst Mode Normally, for each piece (word) of data that the CPU requests, it must first supply an address. However, most data transfers involve more than one word, and usually these words are read from (or written to) successive memory locations. With this in mind, some buses support block transfers, in which a single address is placed on the bus, and then a predefined number of successive data transfers occur, starting at the given address. Block transfers save considerable time, and are also referred to as burst mode operation, as we will see when we discuss memory. (You may have seen the term burst mode in advertisements.) Figure 4.4 illustrates burst mode.

[Figure: timing diagram with traces for Clock, Address and Data]

Figure 4.4 - Burst mode data transfer
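A rough cycle count shows why block transfers save time. The model below is an assumption made for illustration, not a description of any particular bus: an ordinary read is taken to cost two clock cycles (one for the address, one for the data, as in the one wait state case), while a burst costs one address cycle followed by one cycle per word. The function names are our own.

```python
def cycles_without_burst(words: int, cycles_per_read: int = 2) -> int:
    """Every word needs its own address, so every word pays the full read cost."""
    return words * cycles_per_read


def cycles_with_burst(words: int, address_cycles: int = 1) -> int:
    """One address is sent, then one word is transferred per clock cycle."""
    if words == 0:
        return 0
    return address_cycles + words


for words in (1, 4, 8):
    print(words, cycles_without_burst(words), cycles_with_burst(words))
```

For four successive words, this model gives 8 cycles without burst mode and 5 with it. The advantage grows with the length of the block.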

Pipelining Notice in Figure 4.4 that the second read cycle does not begin until the first cycle is completed. However, the ADDRESS VALID, ADDRESS and READ lines are all free after the first clock cycle of the first read cycle. Hence, it is possible for the CPU to start a second read operation before the first one has finished. This is known as pipelining, and is shown in Figure 4.5. As you can see, pipelining saves one clock cycle out of four, which is a 25% savings. Pentium processors use this strategy to improve performance.

[Figure: timing diagram of Read Cycle 1 and Read Cycle 2, with traces for Clock, Address Valid, Read, Address and Data]

Figure 4.5 - Pipelining
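The 25% figure quoted above can be reproduced with a small calculation. The sketch assumes that each read takes two clock cycles and that, with pipelining, a new read can start one clock cycle after the previous one started. The function name total_cycles is our own.

```python
def total_cycles(reads: int, cycles_per_read: int = 2, pipelined: bool = False) -> int:
    """Clock cycles needed to complete a sequence of reads."""
    if reads == 0:
        return 0
    if pipelined:
        # The first read takes its full time; each later read starts one
        # cycle after the previous one, so it adds only one cycle.
        return cycles_per_read + (reads - 1)
    return reads * cycles_per_read


plain = total_cycles(2)                       # 4 cycles
overlapped = total_cycles(2, pipelined=True)  # 3 cycles
print(f"saving: {100 * (plain - overlapped) / plain:.0f}%")  # prints: saving: 25%
```

With longer runs of reads, the saving under this model is larger than 25%, since only the first read pays the full two cycles.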

The Motherboard Figure 4.6 provides a close look at the components of a motherboard. We will discuss each of these components in due course. For now, let us discuss a few of the components in general terms. (Note that motherboard design differs substantially among manufacturers.)

[Figure: motherboard layout with labels for the mouse connector, system BIOS, ISA expansion slots, I/O controller, memory controller, L2 cache, power connector and floppy connector]

Figure 4.6 - An example motherboard layout

Logical View of the Motherboard Figure 4.7 shows a logical view of a motherboard that is based on the PCI system bus. Note the three main buses (local bus, PCI bus and expansion bus), in addition to a SCSI peripheral bus.

[Figure: block diagram with labels for PCI expansion slots, local bus, PCI bus, bridge, monitor, floppy drive and 16-bit expansion slots]

Figure 4.7 - Internal logical view of system unit (PCI bus type)

In a loose sense, PC buses tend to group devices by how fast they can communicate with the microprocessor. For instance, memory can communicate with the microprocessor faster than other devices. For this reason, memory is directly connected to the microprocessor via the local bus. At the next level of operation, we find the video and hard disk, for instance. These devices are connected to the PCI bus, either directly (in the case of video) or through a peripheral bus (in the case of the hard drive). Finally, we have slower devices, such as sound cards and I/O adaptor cards, that would be connected to the ISA expansion bus. In a PC with a Pentium processor running at a clock speed of 166 MHz, for example, the local bus runs at 66 MHz, the PCI bus at 33 MHz and the ISA bus at a lowly 8 MHz, for compatibility with older devices. The three main buses are connected together through bridges that mediate the speed differences. Thus, the entire system forms a single, interconnected unit.

Incidentally, there is a reason why the local bus runs only at 66 MHz, even though the Pentium is running at 200 MHz or more and memory is also quite fast. The main problem is that a bus must be of a certain physical length in order to reach the various components. However, current flowing over a distance tends to produce electromagnetic interference. At the present time, a balance between the state of technology and the issue of cost-effectiveness dictates that maximum bus speeds are 66 MHz, but 100 MHz buses may have made their appearance by the time you read this.

PC Bus Types The bus design in Figure 4.7 is not the only one used in modern PCs. Over the years, several bus types have evolved. While it is not important to be familiar with the very early bus types (for the original 8088-based PC), which are no longer used, some of the early bus types (ISA, EISA and MCA) are still employed in today's computers, primarily for compatibility reasons. Thus, we should briefly discuss these bus types.

The ISA Bus Early PCs had only a single bus. The original AT bus (circa 1984) is called the Industry Standard Architecture bus, or ISA bus (also called the AT bus). This bus runs at 8 MHz and can transfer data at a rate of about 5-8 MB per second. The data width of the ISA bus is 16 bits and the address bus width is 24 bits. While it is certainly not our intention to go into the details of PC bus operation, it is instructive to take a quick look at the actual lines (the so-called pinouts) for the ISA bus, which is the simplest of the currently used PC buses. This is shown in Figure 4.8, in the format that shows how an ISA expansion card would connect to the bus. (The thick vertical line represents the opening in the expansion slot.) The ISA lines are labeled A1-A31, B1-B31, C1-C18 and D1-D18. Thus, there are a total of 98 lines on the ISA bus.

Signals on lines B1-B31: Ground, Reset, +5V, IRQ9, -5V, DMA_Request2, -12V, 0WaitStatesOK, +12V, Ground, MemoryWrite(0-1MB), MemoryRead(0-1MB), I/OPortWrite, I/OPortRead, DMA_Ack3, DMA_Request3, DMA_Ack1, DMA_Request1, MemoryRefreshInProgress, SystemClock, IRQ7, IRQ6, IRQ5, IRQ4, IRQ3, DMA_Ack2, TC, ALE, +5V, OSC, Ground

Signals on lines D1-D18: Request16BitDataBus, IOCS16, IRQ10, IRQ11, IRQ12, IRQ15, IRQ14, DMA_Ack0, DMA_Request0, DMA_Ack5, DMA_Request5, DMA_Ack6, DMA_Request6, DMA_Ack7, DMA_Request7, +5V, RequestToBeBusMaster, Ground

Signals on lines A1-A31: I/O Clock, DataBit7, DataBit6, DataBit5, DataBit4, DataBit3, DataBit2, DataBit1, DataBit0, I/O READY, AddressEnable, Address19, Address18, Address17, Address16, Address15, Address14, Address13, Address12, Address11, Address10, Address9, Address8, Address7, Address6, Address5, Address4, Address3, Address2, Address1, Address0

Signals on lines C1-C18: HighAddress23, HighAddress22, HighAddress21, HighAddress20, HighAddress19, HighAddress18, HighAddress17, MemoryRead(0-16MB), MemoryWrite(0-16MB), DataBit8, DataBit9, DataBit10, DataBit11, DataBit12, DataBit13, DataBit14, DataBit15

Figure 4.8 - The ISA bus pinouts


We have shaded the lines related to addressing, data, IRQ and DMA (the latter two concepts will be explained in due course). Some of the other lines shed light on the functioning of the ISA control bus. Note, in particular, the following lines:

• 0WaitStatesOK: a signal from a peripheral device on this line indicates that the device is fast enough so that wait states are not necessary.

• AddressEnable: indicates that the value on the address bus is a valid address.

• MemoryWrite, MemoryRead, I/OPortWrite and I/OPortRead: these lines control the reading and writing from and to memory and the I/O ports. The reason there are separate lines marked MemoryRead(0-1MB) and MemoryRead(0-16MB) is that the ISA bus is an extension of the 8-bit bus that was used in the original PC. In order to maintain backward compatibility, the lines labeled C1-C18 and D1-D18 were added to the original PC bus, extending the data path from 8 to 16 bits and the address path from 20 to 24 bits.

• Request16BitDataBus: a request to use the entire 16-bit data bus, rather than just the 8 bits of the original PC data bus.

• RequestToBeBusMaster: a request to be the bus master.

The MCA Bus In 1987, Compaq Computer Corporation introduced a two-bus PC design that placed faster memory on a separate bus. This was the beginning of multiple-bus PC systems. Around the same time, IBM introduced its MCA bus (micro channel architecture bus). This bus was far more sophisticated than the ISA bus, embodying some features of the much larger mainframe computers that are IBM's mainstay. However, due to lack of backward compatibility, and perhaps to marketing peccadilloes, the MCA bus did not catch on.

The EISA Bus In 1988, a group of computer manufacturers, known as the Gang of Nine, consisting of AST, Compaq, Epson, HP, NEC, Olivetti, Tandy, Wyse and Zenith, developed the Extended Industry Standard Architecture bus, or EISA bus. This bus also runs at 8 MHz, but can transfer data at the rate of 33 MB per second, a 500% improvement over ISA and a 50% improvement over MCA. IBM's response was to improve the MCA bus to a data transfer rate of about 80 MB per second. The EISA bus has a data and address width of 32 bits.

The VESA Local Bus (VL Bus) The next major advance in bus design was to add yet a third bus to the system, in the form of a local bus used strictly by the video system. Thus, PCs built on this design had two local buses, one for memory and one for video, along with the standard ISA expansion bus. In 1992, the Video Electronics Standards Association (VESA) produced a local bus design called the VESA local bus or VL bus. This design placed memory and video on the same local bus, with an ISA bus for additional expansion.

The PCI Bus At the same time, Intel developed the Peripheral Component Interconnect bus, or PCI bus. The PCI bus seems to have won the bus war, as you can see by looking at current computer ads. As you can also see from Figure 4.7, the PCI bus itself is not a local bus, in the sense of being connected to the microprocessor. However, the PCI design specification includes the local bus component and provides for a high-speed, semidirect connection of PCI devices to the microprocessor. Thus, we may choose to think of the PCI/local bus combination as the PCI local bus. PCI is capable of coexisting with older bus technologies, such as ISA or EISA. Thus, the PCI specification allows for a three-tier bus system: an ultra-high speed local bus for processor-to-memory communication, a high speed PCI bus for fast components, and a traditional ISA bus for slower components. The PCI bus design can produce peak transfer rates of an impressive 132 MB per second. The current version of the PCI bus has a data width of 64 bits and an address width of 32 bits. (Earlier PCI buses had a data width of 32 bits.)
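Peak transfer rates like these come from multiplying the bus clock speed by the number of bytes moved per transfer. The sketch below is a back-of-the-envelope calculation only. It assumes that 1 MB means one million bytes, and the clocks_per_transfer parameter (how many clock cycles a single transfer occupies) is our own simplification. The function name peak_mb_per_s is also our own.

```python
def peak_mb_per_s(clock_mhz: float, data_bits: int, clocks_per_transfer: int = 1) -> float:
    """Peak transfer rate in MB per second (1 MB taken as one million bytes)."""
    bytes_per_transfer = data_bits / 8
    return clock_mhz * bytes_per_transfer / clocks_per_transfer


# A 32-bit transfer on every cycle of a 33 MHz clock gives the 132 MB per second
# quoted for PCI.
print(peak_mb_per_s(33, 32))       # prints: 132.0

# Assumption: if a 16-bit ISA transfer occupies two cycles of the 8 MHz clock,
# the result lands at the top of the 5-8 MB per second range quoted for ISA.
print(peak_mb_per_s(8, 16, 2))     # prints: 8.0
```

These are peak figures. Sustained rates are lower, since not every clock cycle carries data.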

How Bus Speed Affects Performance The issue of how bus speed affects performance requires a bit of discussion. Take the case of a local bus, which connects the processor and memory. The speed of the bus is governed by the local bus clock, say for the sake of illustration, running at 66 MHz. This means that each clock cycle takes 1/66,000,000 seconds, or about 15 ns (where ns stands for nanosecond and is 1 billionth of a second).

Now consider a memory chip. In order to read a single bit (0 or 1) of data from that chip, several things must happen. For instance, the bits of data in a memory chip are organized in a rectangular array of rows and columns. In order to access a bit, the processor (actually the memory controller) must first send the row number of the bit in question to the chip. Then it sends the column number of the bit. This identifies the desired bit. Moreover, each of these two steps must happen at the beginning of a clock cycle. (This is synchronous operation.) Thus, the time between sending the row address and sending the column address, known as the RAS to CAS time (which is short for row address strobe to column address strobe time), must be a multiple of 15 ns; that is, it must equal one of 15 ns, 30 ns, 45 ns, 60 ns, etc.

On the other hand, to save space within the memory chip, the row and column addresses are accepted at the same location within the chip. Thus, the chip must finish decoding the row address before accepting the column address. Put another way, the RAS to CAS time must be greater than or equal to the row address decode time. This time will no doubt be improved by further chip technology, but will never be equal to 0, under the current chip design.

Now suppose, for instance, that the row address decode time is 20 ns. Thus, we have two constraints. The RAS to CAS time must be a multiple of 15 ns, and it must also be at least 20 ns. Hence, the shortest possible RAS to CAS time is 30 ns. Thus, the design of the memory chip causes a 10 ns waste. Now, if the bus is sped up so that each clock cycle takes 10 ns, then the time interval must be a multiple of 10 ns but not less than 20 ns, so it can actually equal 20 ns, resulting in no wasted time. Put another way, shortening the bus clock cycle from 15 ns to 10 ns produces a 10 ns improvement in memory reads!

On the other hand, if the clock cycle is further shortened to 5 ns, no further improvement is realized. Thus, the effect of bus speed on performance is a bit more subtle than you might first imagine. Put another way, the bus clock does not control the time it takes for an individual event to unfold. That is controlled by the design of the component, the laws of physics and the state of technology. However, most major events (such as reading from memory) are made up of smaller subevents, which must follow a certain order and which are triggered by the cycles of a clock. The trick is to find a clock speed that reduces wasted time as much as possible. This is very difficult, because a great many things are happening at the same time inside a PC.
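The reasoning above amounts to rounding the row address decode time up to the next whole number of clock cycles. The following sketch repeats the calculation for clock cycles of 15 ns, 10 ns and 5 ns, using the 20 ns decode time from the example. The function name ras_to_cas_ns is our own.

```python
def ras_to_cas_ns(clock_cycle_ns: int, decode_ns: int) -> int:
    """Shortest RAS to CAS time: the smallest multiple of the clock cycle
    that is at least as long as the row address decode time."""
    cycles = -(-decode_ns // clock_cycle_ns)  # integer ceiling division
    return cycles * clock_cycle_ns


DECODE_NS = 20
for cycle_ns in (15, 10, 5):
    total = ras_to_cas_ns(cycle_ns, DECODE_NS)
    print(f"{cycle_ns} ns cycle: RAS to CAS = {total} ns, wasted = {total - DECODE_NS} ns")
```

The three lines of output show 30 ns (10 ns wasted), 20 ns (none wasted) and 20 ns (none wasted), matching the discussion: the faster clock helps only until the wasted time reaches zero.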


PCs Are Interrupt-Driven We have seen that the devices in a PC system are connected to one type of bus or another, and are thus able to communicate with each other, either directly or indirectly. We will discuss the communication process (the I/O process) in the next chapter. For now, however, we want to consider how a peripheral device, such as a keyboard, initiates a communication with the CPU. The short answer is that the peripheral interrupts the CPU, using a hardware interrupt. In theory, there are two common ways in which a peripheral device can get the attention of the CPU. One way is to have the CPU constantly monitor all peripheral devices, looking for a message from a device. This time-consuming process is called polling and is not used by PCs. The alternative approach is to have the peripheral device issue a hardware interrupt when it requires the attention of the CPU. Since the PC uses this approach, it is referred to as an interrupt-driven system. We have discussed at some length the fact that the hardware is managed by a collection of software programs called service routines. We have also discussed the fact that these routines may be called in one of two ways. Recall that a software interrupt occurs when the microprocessor executes an instruction of the form int x from within a running program. On the other hand, a hardware interrupt is an unpredictable event requesting the attention of the microprocessor. Recall also that we made a philosophical distinction between these two types of interrupts as follows. In a hardware interrupt, a peripheral device is saying to the microprocessor "drop what you are doing and tend to a hardware issue." In a software interrupt, the program (or programmer) is saying to the microprocessor "include a certain service routine as part of this program, and execute that routine at this time." Let us take a look at the mechanism that is used for issuing hardware interrupts.

Hardware Interrupts PCs use a special chip called a programmable interrupt controller, or PIC, to process hardware interrupts. For example, when a user strikes a key on the keyboard, a small processor inside the keyboard sends a signal to the keyboard controller on the motherboard. The keyboard controller, in turn, sends a signal to the interrupt controller. The interrupt controller, which may be receiving other interrupts at the same time, prioritizes these requests for microprocessor attention and sends the interrupt of highest priority to the microprocessor. The CPU then executes the appropriate service routine to service the interrupt.

The interrupt controller that is used in a PC has a total of eight interrupt levels, which are designated by the names IRQ0 through IRQ7. Thus, IRQx is interrupt request x. The number indicates the priority level: interrupts coming in on IRQ0 have the highest priority; those coming in on IRQ7 have the lowest. For the original PC, this may have been enough IRQ lines, but later models (starting with the 80286) needed more, so a second interrupt controller chip was added, giving an additional eight interrupt levels. Each device that uses the interrupt mechanism needs to be assigned a unique IRQ level. There are circumstances under which an interrupt request level can be shared among devices, but this procedure can be problematic. Certain interrupt request levels are customarily assigned to standard devices, such as the keyboard. Other levels are available for such devices as a sound card or CD-ROM. Table 4.1 shows the standard IRQ line assignments, along with the actual interrupts (service routines) that are triggered by these IRQ lines. It is important not to confuse the interrupt request level with the interrupt number of the service routine.

Table 4.1 - IRQ Levels and Their Interrupts

IRQ line   Description                                     Interrupt
0          Reserved for system timer                       Int 8
1          Reserved for keyboard                           Int 9
2          Cascade from second PIC                         Int 10
3          Serial port 2                                   Int 11
4          Serial port 1                                   Int 12
5          Parallel port 2                                 Int 13
6          Floppy drive                                    Int 14
7          Parallel port 1                                 Int 15
8          Real-time clock                                 Int 112
9          Video                                           Int 113
10         Available                                       Int 114
11         Available                                       Int 115
12         Onboard mouse port if present else available    Int 116
13         Reserved for math coprocessor                   Int 117
14         Primary IDE if present else available           Int 118
15         Secondary IDE if present else available         Int 119

As you can see, most of the interrupts are assigned to familiar devices, such as the keyboard, system timer (discussed later), real-time clock, serial and parallel ports, and floppy drive. An IDE-type hard drive or CD-ROM (discussed later) would use one of the IRQs (14 or 15) dedicated to the IDE bus.

Note that there is a special interrupt, called the nonmaskable interrupt, or NMI, which has its own dedicated channel (no IRQ number). It is for very serious events and is called nonmaskable because it is not possible to turn off the NMI interrupt through software programming, as it is with the other interrupts. Programmers sometimes turn off hardware interrupts, using the assembly language instruction CLI (which stands for clear interrupts), when they do not want a certain portion of their program to be interrupted under any circumstances (except for a fatal NMI error).

IRQ level 2 needs a bit of explaining. When the second interrupt controller was added to the AT system, the designers decided that, rather than have it interrupt the microprocessor directly, which might get confused if two PICs were both interrupting it, the second PIC should interrupt the first PIC, which in turn would interrupt the microprocessor on its behalf. Thus, IRQ2 is used by the second PIC to interrupt the first PIC. This process is called cascading. If, for instance, the mouse controller causes an interrupt on IRQ12 of the second PIC, then this PIC issues an interrupt to the first PIC, on IRQ2. Then the first PIC interrupts the microprocessor.

It is worth noting that there are not a lot of available IRQ lines in a PC. In fact, only two lines are marked as available, after the standard devices get their IRQ lines. This can be a problem when a user wants to add additional devices that require IRQ lines. Fortunately, many systems do not have (or do not use) two serial ports and two parallel ports, so one or more of these IRQs can often be assigned to another device, such as a sound card (which, incidentally, can sometimes require more than one IRQ line). Unfortunately, many expansion cards limit the range of possible IRQ assignments, so even if you have a free IRQ line, a given card may not be able to use that line.

Once the microprocessor is interrupted through an IRQ line, it needs to execute a service routine. For instance, when the microprocessor is interrupted by the keyboard on IRQ1, it executes interrupt 9, the so-called low-level keyboard service. We will have more to say about this service in the chapter on keyboards. As another example, when a PS/2 mouse interrupts the microprocessor on IRQ12, the processor executes interrupt 116, a low-level mouse service.
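The relationship between an IRQ line and its interrupt number in Table 4.1 follows a simple pattern, and the cascade through IRQ2 can be traced in the same way. The following Python lines are only a sketch of that pattern; the function names are our own inventions and are not part of any PC software.

```python
def irq_to_interrupt(irq: int) -> int:
    """Return the (decimal) interrupt number triggered by an IRQ line, as in Table 4.1."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ lines run from 0 through 15")
    if irq < 8:
        return 8 + irq            # first PIC: IRQ0-IRQ7 trigger Int 8-Int 15
    return 112 + (irq - 8)        # second PIC: IRQ8-IRQ15 trigger Int 112-Int 119


def delivery_path(irq: int) -> list[str]:
    """List the hops an interrupt request makes on its way to the microprocessor."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ lines run from 0 through 15")
    if irq < 8:
        return [f"IRQ{irq} on first PIC", "microprocessor"]
    # Requests on the second PIC are cascaded through IRQ2 of the first PIC.
    return [f"IRQ{irq} on second PIC", "IRQ2 on first PIC", "microprocessor"]


print(irq_to_interrupt(1))    # keyboard
print(irq_to_interrupt(12))   # PS/2 mouse
print(delivery_path(12))
```

The two printed interrupt numbers should agree with the keyboard and mouse examples just given in the text.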

Installing New Devices in a PC

You might be wondering at this point how the assignment of IRQ levels is made. Let us discuss this in a more general setting.


We discussed earlier the fact that a newly installed device requires support, usually in the form of both hardware (a controller card) and software (a driver). When you add a new device, such as a sound card, to an existing system, the device may require one or more of the following resources:

• An IRQ level (perhaps more than one),
• A DMA channel,
• An I/O port address (perhaps more than one),
• Some memory addresses in upper memory for the device's BIOS.

We have not yet discussed DMA channels, but we will later. In any case, this should not affect the present discussion. In general, it is up to the user to assign the needed resources in a way that avoids resource conflicts, that is, IRQ conflicts, DMA conflicts, I/O port conflicts, and memory conflicts. These occur when two or more devices are trying to use the same resources, such as the same IRQ lines.

The documentation that comes with a device should clearly explain which resources are needed and suggest default choices for those resources (along with alternative choices). However, all too often, the documentation is confusing, especially to users who do not have a clear understanding of resource issues. Also, all too often, the default choices are taken by previously installed devices. This means some experimentation is in order.

Unfortunately, the process of assigning resources is often a two-step process: you need to physically set some jumper settings or block switches (see Figure 4.9) on the controller card itself, which cannot be done while the card is in the computer. Then you need to make some software settings. This two-step process is tedious and often highly frustrating. (If possible, when buying, look for controller cards that are jumper-free and whose settings are completely controlled by software. Such cards are much easier to install.)
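To make the idea of a resource conflict concrete, here is a small Python sketch that checks a list of devices for clashing IRQ lines, DMA channels and I/O port ranges. The device names and settings are hypothetical, chosen only to produce one conflict, and the function `find_conflicts` is our own illustration rather than a real utility.

```python
from collections import defaultdict

# Hypothetical resource settings, for illustration only.
devices = {
    "sound card": {"irq": [5], "dma": [1], "ports": [(0x220, 0x22F)]},
    "second parallel port": {"irq": [5], "dma": [], "ports": [(0x278, 0x27A)]},
    "network card": {"irq": [10], "dma": [], "ports": [(0x300, 0x31F)]},
}


def find_conflicts(devices):
    """Return a list of (kind, resource, device names) for every shared resource."""
    conflicts = []
    # IRQ lines and DMA channels conflict when two devices name the same number.
    for kind in ("irq", "dma"):
        users = defaultdict(list)
        for name, resources in devices.items():
            for value in resources[kind]:
                users[value].append(name)
        for value, names in sorted(users.items()):
            if len(names) > 1:
                conflicts.append((kind, value, names))
    # I/O port ranges conflict when they overlap, even partially.
    names = list(devices)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            for low_a, high_a in devices[first]["ports"]:
                for low_b, high_b in devices[second]["ports"]:
                    if low_a <= high_b and low_b <= high_a:
                        overlap = (max(low_a, low_b), min(high_a, high_b))
                        conflicts.append(("port", overlap, [first, second]))
    return conflicts


for kind, resource, names in find_conflicts(devices):
    print(kind, resource, "is claimed by", " and ".join(names))
```

A paper list of resource settings serves the same purpose as the `devices` dictionary here: it lets you spot the clash before the card goes into the machine.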

Figure 4.9 - Jumpers and switch blocks

Sagacious computer users should keep a list (on paper) of the resource settings for the various peripheral devices in their computer, starting at the time of purchase if possible, and add to that list whenever a new device is installed. (You can keep this list next to the list of CMOS configuration settings!) This can save a lot of grief during the installation of new devices.

To create the initial list, there are programs available that will poll your system and produce a list of used resources. Microsoft Windows 3.1 comes with such a program, called MSD. Under Windows 95, you can see a list of used resources by choosing the System icon in the Control Panel. Then choose Device Manager, click once on Computer and choose the Properties button. You can print resource data for your system using the Print button on the Device Manager tab.

Plug-and-Play

The problems associated with assigning resources to a newly installed device are no doubt one area that gave the PC a bad name when it comes to user friendliness. This is where Plug-and-Play (also cryptically denoted by PnP) comes in. The purpose of Plug-and-Play is to provide automatic allocation of resources to the expansion cards in your system.

Plug-and-Play support requires several things. First, it requires an operating system that supports Plug-and-Play. Windows 95 supports PnP, but Windows NT does not. It also requires a system BIOS that supports Plug-and-Play. Finally, it requires adaptor cards that are Plug-and-Play compliant. (Microsoft refers to a BIOS that does not support Plug-and-Play as a legacy BIOS and a device that does not support Plug-and-Play as a legacy device.)


If the BIOS in a PC is Plug-and-Play compliant, then Windows 95 will automatically configure the resources of all of the compliant adaptor cards in the system. It will not configure legacy devices, however, but it will try to determine their resource usage, possibly by asking you, and store that information so that it does not attempt to assign these resources to a compliant device. (Plug-and-Play is by no means perfect; it has its own frustrations, such as when it refuses to recognize a device.)

Someday, when PCs are truly Plug-and-Play compliant, and when all peripherals are also compliant, the difficulties associated with the assignment of resources may become a thing of the past (assuming it all works correctly, which is a nontrivial assumption). Until then, you may be able to resist the temptation to take a sledgehammer to your computer by staying ahead of the problem as much as possible, by keeping a list of currently used resources.
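The bookkeeping described above (set aside what the legacy devices use, then hand out what is left to the compliant cards) can be sketched in a few lines of Python. This is a simple first-fit illustration under our own assumptions, not the actual Plug-and-Play algorithm; the function name, the card names and the IRQ lists are all hypothetical.

```python
def allocate_irqs(free_irqs, legacy_usage, pnp_requests):
    """Give each compliant card one IRQ line, never touching lines held by legacy devices.

    free_irqs    -- IRQ lines not used by standard devices
    legacy_usage -- {legacy device name: IRQ line it is set to}
    pnp_requests -- {compliant card name: list of IRQ lines the card can accept}
    """
    reserved = set(legacy_usage.values())
    available = [irq for irq in free_irqs if irq not in reserved]
    assignments = {}
    for card, acceptable in pnp_requests.items():
        # First-fit: take the first remaining line that this card can use.
        choice = next((irq for irq in available if irq in acceptable), None)
        if choice is None:
            raise RuntimeError(f"no acceptable IRQ line left for {card}")
        assignments[card] = choice
        available.remove(choice)
    return assignments


legacy = {"old scanner card": 10}                       # hypothetical legacy device
requests = {"sound card": [5, 7, 10], "network card": [10, 11]}
print(allocate_irqs([5, 10, 11], legacy, requests))
```

Note how the network card ends up on IRQ 11 even though it would accept IRQ 10, because that line was recorded as belonging to a legacy device.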

5. Input/Output

The term input/output, or just I/O, generally refers to communication between the microprocessor/memory component of a PC and a peripheral component, such as a hard disk, floppy disk, monitor, keyboard, modem, printer and so on. The purpose of this chapter is to give an overview of this I/O process, speaking both in general terms and giving a few specific examples. However, it is important to keep in mind that, while there are many general procedures for I/O, the specifics can vary considerably from device to device. The best we can hope for is to gain a general understanding of the principles of the input/output operation.

I/O Modules

In simple terms, the microprocessor and a peripheral device speak different languages. There is no way that the microprocessor can directly control a printer, for example. The microprocessor cannot execute an instruction such as EJECT A PAGE. What is needed is an intermediary device (perhaps more than one) that can understand and translate the languages of both parties to the communication process. In general terms, the intermediary between the PC core (microprocessor/memory) and a peripheral device is referred to as an I/O module. Figure 5.1 shows the general scheme of an I/O module.

[Diagram: Motherboard (Microprocessor/Memory), bus, I/O Module (Host Adaptor, Controller), cable, Peripheral Device]

Figure 5.1 - An I/O module


The host adaptor understands the languages of both the host PC and the controller (which is the same as the language of the peripheral device), and thus takes care of communication between the two components. Information travels between the host PC and the host adaptor over one or more of the PC buses (PCI, VL, ISA, EISA). The controller has several jobs.

• The controller controls the peripheral device, sending it commands to perform various operations. Most device controllers have microprocessors of their own (not nearly as powerful as the CPU), which can be programmed to carry out various functions pertinent to that device, such as fetching data from a hard disk, printing a page, and so on.

• The controller translates the electromagnetic signals coming from the peripheral device into digital data that the host PC can understand. For example, the data on a hard disk is stored magnetically. When data is read from the disk, it takes the form of a continuously varying current. This analog signal must be translated into a sequence of 0's and 1's. A device that performs this type of translation is called a transducer. In the PC world, the term data separator, or data synchronizer, is also used.

• The controller must mediate the speed difference between the host and the peripheral device, usually through data buffering. For example, suppose the microprocessor wants to read some data from a hard disk into memory. Memory is much faster than a hard disk. Rather than have the CPU wait as each byte is retrieved from the disk and sent over the bus to memory, the CPU can make its read request to the hard disk controller and then go about other business. The hard disk controller then retrieves the data from the disk byte-by-byte and places it in its own data buffer. When the data buffer is full, the disk controller can interrupt the CPU to tell it that the data is ready. The CPU can then transfer the data from the controller's buffer into main memory. In this way, the CPU is only tied up during a buffer-to-memory transfer. In an effort to further improve data transfer efficiency, some controllers have built-in caches. These are similar to data buffers, but are generally much larger and also much smarter. We will discuss caches later in the book.

• The controller is responsible for error detection and/or error correction. Because error correction is a function of the controller, it can be done without the host system even being aware that an error has occurred. For instance, some hard disk controllers are so smart that they can detect when a certain portion of the hard disk is deteriorating and move the data on that portion to another location on the disk, without any intervention (or even knowledge) on the part of the host system!

It is worth pointing out that the distinction between the host adaptor and the controller is somewhat vague. Often, it is a logical distinction rather than a physical one. Put another way, often one physical device serves both functions.

Let us consider some examples of I/O modules. Figure 5.2 shows a video I/O interface. In this case, the adaptor and controller are combined into a single component, usually called a video controller or a video adaptor. (Most computer users, and computer salespersons, are not aware of the distinction between an adaptor and a controller, and often the terms are used interchangeably.)

[Diagram: Motherboard (Microprocessor/Memory, Video Adaptor/Controller), cable, Monitor]

Figure 5.2 - A video I/O interface

Older hard drive types (ST506 and ESDI) use a similar I/O scheme, as we will see when we discuss hard drive interfaces in detail. Figure 5.3 shows the two interfaces used by today's most popular hard drives. In the SCSI interface, the host adaptor is typically an expansion card that plugs into a system bus. The controller electronics are physically part of the drive. In the IDE interface, the host adaptor is often built into the motherboard (although it does not need to be, and there are IDE adaptor cards). This stems from the fact that IDE is very popular, and that the host adaptor, which does very little, can be included on the motherboard very inexpensively. (We will discuss these interfaces in detail in a later chapter.)

[Diagram, SCSI: Motherboard (Microprocessor/Memory), bus, SCSI Host Adaptor, SCSI Controller/SCSI Hard Drive. Diagram, IDE: Motherboard (Microprocessor/Memory, IDE Host Adaptor), cable, IDE Controller/IDE Hard Drive]

Figure 5.3 - Some hard drive I/O interfaces

Figure 5.4 shows a keyboard and a floppy drive interface. Both of these peripheral devices use a similar I/O interface design, where the host adaptor and controller are one unit, typically built into the motherboard, and simply referred to as a controller.

[Diagram, keyboard: Motherboard (Microprocessor/Memory, Keyboard Controller & Host Adaptor), cable, Keyboard. Diagram, floppy: Motherboard (Microprocessor/Memory, Floppy Drive Controller & Host Adaptor), cable, Floppy Drive]

Figure 5.4 - Keyboard and floppy drive interfaces

I/O Processors

We have alluded to the fact that some controllers are a lot smarter than others. A good illustration of this comes from the video system. In the old days, when the CPU wanted to display a line or circle on the monitor, it had to compute which dots on the screen needed to be turned on in order to display the image. However, most modern video adaptors have their own microprocessors (called graphics coprocessors) and dedicated memory that can do this job for the CPU, which only needs to tell it the endpoints of the line, or the center and radius of the circle!

Another example comes in the hard disk area. Smart hard disk controllers, such as IDE (which stands for Intelligent Drive Electronics) controllers and SCSI controllers, are able to perform many tasks (such as transferring data directly to memory) without the intervention of the CPU. A controller that has its own processor is often referred to as an I/O processor.

The Mechanism of Communication

In a PC, the I/O communication mechanism usually takes one of two forms: memory-mapped or I/O-mapped.

Memory-Mapped I/O

In memory-mapped I/O, the CPU and the peripheral device share a common set of memory addresses (although the CPU has additional addresses that the device cannot access). In a PC, the video system is the primary user of memory-mapped I/O. For example, when the video system is in one of its simplest video modes, corresponding to color text only (no graphics), it is using a block of memory starting at address B8000. (We will discuss all of this in detail in the chapter on video controllers.) Physically, this memory lies on the video controller card (and not on the motherboard). Any text characters that the CPU places in this memory address range will be automatically displayed on the monitor by the video controller! Why don't we test this out now?

Experiment - Memory-Mapped I/O

In this experiment, we put some characters in shared video memory.

1. Save all current work and start debug, as described in the appendix (Trying the Experiments).

   C:\WIN95>debug

2. Your PC should be in video mode 3, which is a color text mode. Hence, video memory starts at address B8000 or, in segment:offset format, B800:0000. Now, every even memory location is used to display a character. The corresponding odd memory locations describe the attributes of the character, such as its color or whether it is boldface or underlined. Hence, the following debug command

   -e b800:0000 45,4,45,4,45,4,45,4

   should alternately place the letter E (ASCII 45h) and the attribute value 4 (for red text) into memory. Thus, you should see four red E's appear at the top of the display.

3. Now try changing the character and the attribute byte.

   -e b800:0000 46,2d,46,2d,46,2d,46,2d

   This should display four F's in bright magenta on a green background!

4. Quit debug using the q command.

   -q
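For readers who want to see where the numbers in this experiment come from, the Python sketch below converts a segment:offset pair into a linear address, locates the character byte for a given screen position, and splits an attribute byte into its foreground and background colors. It assumes the usual 80-column layout of video mode 3 and the conventional 16-color text palette; the color names and function names are our own labels, and the sketch only does arithmetic (it does not touch video memory).

```python
COLOR_NAMES = [
    "black", "blue", "green", "cyan", "red", "magenta", "brown", "light gray",
    "dark gray", "bright blue", "bright green", "bright cyan",
    "bright red", "bright magenta", "yellow", "white",
]


def linear_address(segment: int, offset: int) -> int:
    """Convert a real-mode segment:offset pair to a linear address."""
    return segment * 16 + offset


def cell_address(row: int, col: int, columns: int = 80) -> int:
    """Address of the character byte for (row, col); its attribute byte is one higher."""
    return linear_address(0xB800, 2 * (row * columns + col))


def decode_attribute(attribute: int) -> tuple[str, str]:
    """Split an attribute byte into (foreground, background) color names."""
    foreground = COLOR_NAMES[attribute & 0x0F]        # low four bits
    background = COLOR_NAMES[(attribute >> 4) & 0x07] # bits 4-6 (bit 7 usually means blink)
    return foreground, background


print(hex(linear_address(0xB800, 0x0000)))
print(decode_attribute(0x04))
print(decode_attribute(0x2D))
```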

End of Experiment

I/O-Mapped I/O

In I/O-mapped I/O, the I/O module has its own memory space that is completely separate from the main memory of the PC. The term I/O port (or just port) is used for these memory locations. The PC sets aside a total address space of 64 KB for I/O ports. The port addresses are 0-0FFFFh, and so it is important not to confuse, say, port address 0378h, which is used by the printer on LPT1, with main memory address 0378h, which is used by the system BIOS. To this end, when the CPU places a memory address on the address bus, it also sends a control signal that tells the bus whether the address is intended as a main-memory address or as an I/O port address. I/O ports are generally used for three purposes:

• to send data to and retrieve data from the controller,
• to send control signals to the controller, and
• to retrieve status information about the device from the controller.


For this reason, most controllers have more than one port, some of which may be read-only, write-only or bidirectional.

From a programmer's perspective, the instructions to place data in main memory or to send data to a port are quite different. For instance, to place the value 17h in the main memory location with address 378h, the following instructions are used

   mov di,378             ;place address in DI register
   mov byte ptr [di],17   ;move the value to the address in DI

This is a two-step process. The first instruction stores the memory address in the di register and the second instruction says "put the value 17h in memory at the address contained in the di register."

On the other hand, to send the value 17h to the I/O port with address 378h, we use the following instructions

   mov al,17    ;place value in AL register
   mov dx,378   ;place port address in DX register
   out dx,al    ;issue the OUT instruction

Thus, we can see that a programmer is not likely to confuse the two memory types.

As we mentioned, I/O ports (that is, I/O port memory addresses) generally refer to memory locations that are physically on a controller card (which may or may not be on the motherboard). There is often a cable, with corresponding physical connectors, between the controller and the device. This is the case, for instance, with a printer or external modem. Many computer users refer to the connector on the back of the system unit as a port. Hence, the term parallel port, for instance, is used to refer to this connector. Perhaps the simplest way to avoid confusion is to refer to the I/O port memory address as a logical port and the physical connector as a physical port, but this terminology is not standard. (Actually, the parallel interface requires three consecutive 16-bit port addresses.)
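The separation between main memory addresses and port addresses can be pictured with a toy model. In the Python sketch below, the `io` flag plays the role of the control signal that accompanies each address on the bus. The class `ToyBus` is purely illustrative and is our own invention; it models nothing more than two independent sets of addresses.

```python
class ToyBus:
    """A toy model of a bus with two separate address spaces."""

    PORT_SPACE = 0x10000  # 64 KB of I/O port addresses (0 through 0FFFFh)

    def __init__(self):
        self.memory = {}  # main memory locations that have been written
        self.ports = {}   # I/O port locations that have been written

    def write(self, address: int, value: int, io: bool) -> None:
        """Write a byte; `io` selects the port space instead of main memory."""
        if io and not 0 <= address < self.PORT_SPACE:
            raise ValueError("port addresses run from 0 through 0FFFFh")
        target = self.ports if io else self.memory
        target[address] = value & 0xFF

    def read(self, address: int, io: bool) -> int:
        """Read a byte back from the selected address space (0 if never written)."""
        source = self.ports if io else self.memory
        return source.get(address, 0)


bus = ToyBus()
bus.write(0x378, 0x17, io=False)  # like: mov byte ptr [di],17 with DI = 378
bus.write(0x378, 0x41, io=True)   # like: out dx,al with DX = 378
print(hex(bus.read(0x378, io=False)), hex(bus.read(0x378, io=True)))
```

The same number, 378h, names two unrelated locations, and a write to one leaves the other untouched.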

Experiment - Communicating with the Floppy Drive Through One of Its Ports

Let us try communicating with a floppy disk drive using one of its ports. Port number 3F2h is used to control certain functions of the floppy drive, as we will see in more detail in the chapter on the floppy drive interface. This port is connected to a register on the floppy drive controller, called the digital output register. This is an 8-bit register of the following form. (Each entry represents a single bit.)

   Bit:   7        6        5        4        3    2      1    0
          Motor D  Motor C  Motor B  Motor A  DMA  Reset  DR1  DR0

The important thing to note about this now is that bit number 4 (Motor A) turns the motor for drive A on and off. So let's put a 1 in this bit to turn on the motor, wait a few seconds and then put a 0 in this bit to turn off the motor! To turn on the motor, we want the register to contain the binary number 00010000, which is 16 in decimal. To turn it off, we want to put 00000000 in the register.

1. Save your work and start QBASIC, as described in the appendix (Trying the Experiments).

2. Type in the following program. The QuickBasic OUT instruction is used to send data to a port.

   OUT &H3F2, 16
   FOR I = 1 TO 10000: NEXT I
   OUT &H3F2, 0
   END

   The first OUT instruction sends data (in this case 16) to the port whose address is given in the instruction (in this case 3F2h). The second line just does nothing 10,000 times, in effect causing a short delay so we can see the drive light go on. The third line sends 0 to the same port.

3. Run the program while watching the drive light on drive A. It should turn on for a moment and then turn off.
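The value 16 used in the program is simply bit 4 of the digital output register set to 1. The following Python sketch builds such values from bit names, which may make it easier to see how other combinations would be formed. It only performs the arithmetic and does not write to any port; the names in `DOR_BITS` are our own labels for the bits in the register diagram.

```python
# Bit positions in the digital output register, named after the register diagram.
DOR_BITS = {
    "DR0": 0, "DR1": 1, "RESET": 2, "DMA": 3,
    "MOTOR_A": 4, "MOTOR_B": 5, "MOTOR_C": 6, "MOTOR_D": 7,
}


def dor_value(*bit_names: str) -> int:
    """Return the byte that has exactly the named bits set to 1."""
    value = 0
    for name in bit_names:
        value |= 1 << DOR_BITS[name]
    return value


motor_a_on = dor_value("MOTOR_A")
print(motor_a_on, format(motor_a_on, "08b"))  # decimal and binary forms
print(dor_value())                            # no bits set
```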

End of Experiment

CPU Involvement in the I/O Process

In general, there are three levels at which the CPU can be involved in the I/O process: programmed I/O, interrupt I/O and DMA.


Programmed I/O

In programmed I/O, the CPU takes complete charge of the I/O process, and devotes itself exclusively to the task. To illustrate, suppose that the CPU wants to read some data from a floppy disk into memory. A simplified version of programmed I/O would proceed as follows:

• The CPU places the starting address and number of bytes to read into the appropriate registers on the floppy disk controller.

• The CPU then issues a READ command to the floppy disk controller, by setting the value in the controller's control register.

• The CPU then goes into a loop, constantly checking the contents of the controller's status register for a signal that the transfer of a single byte from disk to controller has been completed. The byte of data is then placed in the controller's data register. Note that this process might take quite a while, since the controller must move the drive's read/write heads to the required location and wait for the disk to rotate until the data lies underneath the heads. At a speed of 360 RPM, or 1 revolution every 0.167 seconds, the average time for this rotation is about 0.083 seconds. During this time, the CPU could have executed over 1 million instructions!

• Once the data is ready to be retrieved, the CPU takes the data from the data register and moves it to main memory.

• The cycle then repeats as long as there is more data to retrieve.
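The cost of waiting for the disk to rotate is easy to work out from the rotation speed. The Python sketch below does the arithmetic for a drive turning at 360 RPM. The instruction rate is a round number that we have assumed purely for illustration; it is not a measurement of any particular microprocessor.

```python
def rotation_times(rpm: float) -> tuple[float, float]:
    """Return (seconds per revolution, average rotational wait in seconds)."""
    seconds_per_revolution = 60.0 / rpm
    # On average the wanted data is half a revolution away from the heads.
    average_wait = seconds_per_revolution / 2
    return seconds_per_revolution, average_wait


# Assumed, hypothetical CPU speed: 100 million instructions per second.
ASSUMED_INSTRUCTIONS_PER_SECOND = 100_000_000

per_revolution, average_wait = rotation_times(360)
instructions_lost = average_wait * ASSUMED_INSTRUCTIONS_PER_SECOND

print(f"{per_revolution:.3f} seconds per revolution")
print(f"{average_wait:.3f} seconds average wait")
print(f"roughly {instructions_lost / 1e6:.1f} million instructions in that time")
```

Even with a much slower processor than the one assumed here, the count stays well above a million instructions per rotational wait.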

Interrupt I/O

Programmed I/O is reasonably straightforward, but can consume a major portion of the CPU's processing time, especially with slow devices such as a floppy drive, CD-ROM drive, printer or modem. One alternative is to use the PC's interrupt system. The same task might proceed as follows using interrupt I/O.

• The CPU places the starting address and number of bytes to read from the disk into the appropriate registers on the floppy disk controller.

• The CPU then issues a READ command to the floppy disk controller, by setting the value in the controller's control register.

• The CPU then goes about other business.

• When the data has reached the controller's data buffer (perhaps a million CPU instructions later), the controller issues an interrupt to the CPU, which then fetches the data from the data buffer and places it in memory.

• This process is repeated until no more data is to be transferred.

In the interrupt I/O scheme, the CPU is only involved in the I/O process when it needs to do something. This is generally a much more efficient use of the CPU's time than under programmed I/O.
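The difference between the two schemes can be made tangible with a very crude simulation that counts how the CPU spends its time while waiting for each byte. The Python sketch below is a toy under our own assumptions: time is measured in abstract "ticks", every byte takes the same number of ticks to arrive, and the cost of handling an interrupt is ignored.

```python
def programmed_io(num_bytes: int, ticks_per_byte: int) -> dict:
    """CPU polls the status register on every tick until each byte is ready."""
    ticks_polling = 0
    memory = []
    for byte_number in range(num_bytes):
        for _ in range(ticks_per_byte):
            ticks_polling += 1          # one wasted check of the status register
        memory.append(byte_number)      # move the byte from data register to memory
    return {"bytes_moved": len(memory),
            "ticks_polling": ticks_polling,
            "ticks_for_other_work": 0}


def interrupt_io(num_bytes: int, ticks_per_byte: int) -> dict:
    """CPU does other work and touches the device only when interrupted."""
    ticks_for_other_work = 0
    memory = []
    for byte_number in range(num_bytes):
        ticks_for_other_work += ticks_per_byte  # CPU is free until the interrupt
        memory.append(byte_number)              # interrupt handler moves the byte
    return {"bytes_moved": len(memory),
            "ticks_polling": 0,
            "ticks_for_other_work": ticks_for_other_work}


print(programmed_io(4, 1000))
print(interrupt_io(4, 1000))
```

Both schemes move the same data; what changes is whether the waiting time is spent polling or is available for other work.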

Direct Memory Access (DMA)

Direct Memory Access, or DMA, is a technique that allows the CPU to initiate a data transfer from a peripheral device to memory, and then let the DMA controller do all the work. The CPU can then go about other business, and will be informed, via an interrupt, only after the entire transaction is completed. This is clearly the most efficient use of the CPU's time.

There is one wrinkle in the DMA system, however. There is only one bus, and so the CPU and DMA controller must share. This leads to a tradeoff between data transfer speed and efficient use of the CPU. There are two forms of DMA transfer, sometimes referred to as third party DMA and bus-mastering DMA. A third party DMA transfer takes place roughly as follows.

• An application makes a request to, for example, transfer some data from the hard disk into memory. The operating system sends this request to the device driver for the hard disk.

• The device driver asks the CPU to initialize the DMA controller, giving it

   + the starting address in memory where the data is to go,
   + the amount of data to be transferred,
   + the type of transfer (write, read or verify; in this case, write),
   + the DMA channel, which specifies the device from which to get the data (each device must have its own DMA channel; modern PCs have 7 available DMA channels).

• The device driver then tells the hard disk controller to request DMA service (by activating a control line on the bus).

• The DMA controller requests that the CPU refrain from using the bus. The CPU acknowledges the request and frees the bus (when it can do so).

• The DMA controller determines the memory address for the next byte of data coming from the drive controller and transfers the data directly from disk controller to memory. After the byte is transferred, the CPU is given control of the bus again. The previous step is repeated. Thus, the CPU and DMA controller constantly swap control of the bus.

In bus-mastering DMA, the DMA controller is either built into the I/O device controller (in this case, the hard disk controller), or may be included in the chipset on the motherboard. A bus-mastering DMA controller takes complete control of the bus (making the CPU relinquish the bus) and performs the entire data transfer without CPU intervention. This process leads to much faster data transfers.

Speaking in general terms, any device that can take control of the bus for purposes of data transfer is called a bus master. Thus, the microprocessor and (bus mastering) DMA controller are bus masters. These days, there are bus mastering IDE hard disk controllers, bus mastering SCSI hard disk controllers, and even bus mastering parallel ports. The decision as to which bus master gets control of the bus at a given time is called bus arbitration, and may be handled by a bus controller.
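The byte-by-byte swapping of the bus in a third party DMA transfer can be sketched as follows. This Python fragment is only a toy: the `DmaRequest` fields mirror the four items that the text says are given to the DMA controller, while the class, the function and the log of bus owners are our own illustrative inventions.

```python
from dataclasses import dataclass


@dataclass
class DmaRequest:
    start_address: int   # where in memory the data is to go
    count: int           # amount of data to be transferred, in bytes
    transfer_type: str   # "write", "read" or "verify" ("write" = device to memory)
    channel: int         # DMA channel, which identifies the device


def third_party_dma(request: DmaRequest, device_data: bytes, memory: bytearray) -> list:
    """Copy bytes from the device into memory, logging who owns the bus at each step."""
    if request.transfer_type != "write":
        raise ValueError("this sketch only models device-to-memory transfers")
    bus_owner_log = []
    address = request.start_address
    for byte in device_data[:request.count]:
        bus_owner_log.append("DMA controller")  # CPU has freed the bus
        memory[address] = byte                  # one byte goes straight to memory
        address += 1
        bus_owner_log.append("CPU")             # bus is handed back after each byte
    return bus_owner_log


memory = bytearray(16)
log = third_party_dma(DmaRequest(start_address=4, count=3, transfer_type="write", channel=2),
                      b"abc", memory)
print(memory)
print(log)
```

In the bus-mastering form, by contrast, the log would show the controller holding the bus for the whole transfer instead of alternating with the CPU.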

Summary

In summary, we can see that there are many strategies for implementing the I/O process. Moreover, emphasis is constantly shifting toward smarter controllers that can take much of the work away from the CPU. At the same time, modern CPUs are becoming more capable of doing the work that is being taken away from them. This leads to a certain level of backlash that adds to the confusion concerning modern I/O processes.

More on Ports

Let us make a few more remarks about I/O ports.


Viewing Port Usage

If you use Windows 95, you can take a peek at which ports are currently being used by choosing the System icon in the Control Panel. Then choose Device Manager, click once on Computer and choose the Properties button. Table 5.1 gives an example of some of what you might see.

Table 5.1 - Common PC Ports and Their Uses

Port Addresses  Use
0000-000F       Direct memory access controller (DMA controller)
0020-0021       Programmable interrupt controller (PIC)
0040-0043       System timer
0060-0060       Standard 101/102-key keyboard
0061-0063       System speaker
0070-0071       System CMOS/Real-time clock
0378-037A       Printer port (LPT1)
03F8-03FF       Communications port (COM1)

The numbers in Table 5.1 represent a range of addresses, since many devices use more than one 16-bit port. You can print this data, and more, using the Print button on the Device Manager tab.

Parallel Port Confusion

The PC design sets aside three I/O port address ranges to use as parallel ports. These address ranges start at 3BCh, 378h and 278h. During the PC's startup procedure, the BIOS checks each of these port addresses, in the order just given, to see if there is an actual physical port (that is, a parallel port connector on the back of the PC or on an expansion card) connected to it. Let us refer to a logical port that is connected to a physical port as a logical/physical port pair. Any data sent to the logical port is transmitted through the corresponding physical port in the pair, through whatever cable is attached to the connector, and then to whatever device is on the other end of the cable.

The BIOS copies the port addresses of any logical/physical port pair, in the order encountered, to the BIOS data area. Recall that the BIOS data area is an area of main memory (just after the interrupt vector table) containing important system configuration information. The first part of the BIOS data area is shown below.

   400:  COM1 port address
   402:  COM2 port address
   404:  COM3 port address
   406:  COM4 port address
   408:  LPT1 port address
   40A:  LPT2 port address
   40C:  LPT3 port address

This assignment of logical/physical port addresses fixes the names LPT1, LPT2 and LPT3 to the corresponding logical/physical port pairs. The first logical port address 3BCh is usually not connected to a physical port in modern PCs. It was used with the original IBM PC, which had a physical parallel port built into its monochrome video adaptor card. Thus, a system with two physical parallel ports usually has LPT1 assigned to port address 378h and LPT2 to port address 278h. Note that, if you buy an I/O expansion card that has a second physical parallel port on it, you usually need to set some switches or jumpers to assign the correct logical port address to that physical port.

Incidentally, in the original PC, the address 40Eh was available to map to a fourth physical parallel port. However, in subsequent PCs, this port address was marked as reserved by the system BIOS, which does not check the address for a parallel device. Thus, PCs now officially support only three parallel devices (although Windows supports only two parallel ports).

The same process assigns COM ports. Note that some PCs mark the two built-in physical serial ports on the back of the system unit with the words Serial1 and Serial2. Which of these is COM1 or COM2 (or COM3 or COM4, for that matter) depends on what value lies in the BIOS data area, and these values are easily changed with a little programming.

Experiment - Looking at Parallel and Serial Port Assignments

To check out your parallel and serial port assignments, do the following:

1. Save all current work and start debug, as described in the appendix (Trying the Experiments).

   C:\WIN95>debug

2. Do a memory dump at address 400h.

   On my PC, which has two physical serial and two physical parallel ports, the results are

   -d 40:0
   0040:0000  ?? ?? ?? ?? 00 00 00 00-78 03 78 02 00 00 ?? ??   ........x.x.....
              COM1  COM2              LPT1  LPT2

   Recall that on a PC, the memory bytes that make up a 16-bit word are stored in reverse order (little endian). Thus, the LPT1 port address is 0378h.
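The byte reversal is easy to check away from the PC. The Python sketch below unpacks the first seven words of a BIOS data area as little-endian 16-bit values. The LPT bytes follow the addresses discussed in the text (378h and 278h); the COM bytes are typical values that we have assumed for the sake of a complete example and may differ on your machine.

```python
import struct

# First 14 bytes of a BIOS data area. The COM1/COM2 values (03F8h, 02F8h) are
# assumed typical values; the LPT1/LPT2 values are 0378h and 0278h.
bios_data_area = bytes([
    0xF8, 0x03,   # 400: COM1
    0xF8, 0x02,   # 402: COM2
    0x00, 0x00,   # 404: COM3 (not present)
    0x00, 0x00,   # 406: COM4 (not present)
    0x78, 0x03,   # 408: LPT1
    0x78, 0x02,   # 40A: LPT2
    0x00, 0x00,   # 40C: LPT3 (not present)
])

names = ["COM1", "COM2", "COM3", "COM4", "LPT1", "LPT2", "LPT3"]
# "<7H" means seven unsigned 16-bit words, low byte first (little endian).
addresses = dict(zip(names, struct.unpack("<7H", bios_data_area)))

for name, address in addresses.items():
    status = f"{address:04X}h" if address else "not assigned"
    print(name, status)
```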

End of Experiment

By way of anecdote, in the old days, some programs were written to support only LPT1. That is, the program's print command always printed to LPT1. The knowledgeable computer user could fool the program into printing to either physical port, by writing a small program that would swap the port addresses in the BIOS data area. This swaps the assignment of logical/physical ports to the names LPT1 and LPT2, thus fooling the program into printing to the other physical port! See what a knowledge of computer operation enables one to do? Note finally that some operating systems (such as Windows) support only two parallel ports, which is why there are only two choices (LPT1 and LPT2) when you configure a printer under Windows.
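The swap described in the anecdote amounts to exchanging two 16-bit entries. The Python sketch below shows the idea on an ordinary byte array that stands in for the BIOS data area; it does not touch real memory, and the function name is our own. The offsets 8 and 0Ah are those of the LPT1 and LPT2 entries (addresses 408h and 40Ah).

```python
def swap_lpt1_lpt2(bios_data_area: bytearray) -> None:
    """Exchange the two-byte LPT1 and LPT2 entries in a copy of the BIOS data area."""
    lpt1 = bytes(bios_data_area[0x08:0x0A])
    lpt2 = bytes(bios_data_area[0x0A:0x0C])
    bios_data_area[0x08:0x0A] = lpt2
    bios_data_area[0x0A:0x0C] = lpt1


# A stand-in for the first 14 bytes of the BIOS data area, with LPT1 = 0378h
# and LPT2 = 0278h (stored low byte first); the other entries are left at zero.
area = bytearray(14)
area[0x08:0x0C] = bytes([0x78, 0x03, 0x78, 0x02])

swap_lpt1_lpt2(area)
print(area[0x08:0x0C].hex(" "))
```

After the swap, a program that always prints to LPT1 would find the address 0278h in the LPT1 entry and so would send its output to the other physical port.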

6. PC Supporting Systems

Chipsets

A modern PC requires several so-called support chips to perform various routine but vital functions, such as refreshing memory, keeping all clocks going and processing hardware interrupts. Here is a partial list of support chips that may be found in a PC.

• Memory controller
• Cache controller
• Bus controller
• DMA controller
• Programmable interrupt controller
• Programmable interval timer controller

There has been a trend in recent times to incorporate a PC's support chips (or support circuitry) into one or two chips, called the chipset for the PC. These are integrated into the motherboard by its manufacturer. For instance, the Intel 82430HX PCIset consists of two chips: the 82439HX Xcelerated Controller (TXC) and the 82371SB PCI ISA/IDE Xcelerator (PIIX3) bridge chip. The former chip includes a CPU interface controller, a level 2 cache controller, a memory controller, and a PCI bus interface. The latter chip includes a bridge between the PCI bus and the ISA bus, a universal serial bus controller, an IDE interface, a DMA controller, a programmable interrupt controller, a programmable interval timer and the necessary circuitry for the NMI (nonmaskable interrupt). All this on just two chips!

We have seen how the PIC and DMA controllers function, in general terms. Let us consider the issue of clock support in a PC.


Clocks and Timers

A modern PC contains oscillating crystals, clocks, and timers, and the differences among them can be a bit confusing.

Oscillators

An oscillator is a quartz crystal that produces a regularly oscillating signal, called a clock signal. Clock signals are used for a variety of functions in a PC. For instance, the microprocessor moves in step with a clock signal of a certain frequency. The buses also require a clock signal. Even the keyboard runs to the beat of a clock signal (to check for keystrokes).

The first PCs (based on the 8088 microprocessor) used a single oscillating crystal with an output at 14.31818 MHz. This frequency was chosen because it is four times the subcarrier frequency of a color television, and it was thought at the time that PCs would use color televisions for multimedia. Special circuitry in the PC reduced this clock signal by various amounts to run different components of the PC. For instance, the clock frequency was divided by 3 to produce the 4.77 MHz clock signal for the 8088 microprocessor itself. This is why the original microprocessor ran at this unusual clock speed.

Later PCs use more than one crystal to generate the needed frequencies. Some modern PCs use special chips that input a single frequency signal and output several signals, at different (but fixed) frequencies. One output can be used to pace the microprocessor, for instance, whereas another can be used to pace a bus. A chip that produces one or more frequencies as output, using the signal from an oscillator as input, is sometimes referred to as a clock, or clock generator. The clock in a PC is also sometimes called the system clock.
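The arithmetic behind the original PC's clock speeds is easy to check. The following is a small sketch; the color subcarrier value of 3.579545 MHz is the standard NTSC figure, which we supply here as an assumption because the text does not state it, and the variable names are our own.

```python
# Frequencies in MHz. The NTSC color subcarrier value is a well-known
# constant supplied here as an assumption; it is not given in the text.
NTSC_SUBCARRIER_MHZ = 3.579545

crystal_mhz = 4 * NTSC_SUBCARRIER_MHZ   # the original PC's oscillator
cpu_mhz = crystal_mhz / 3               # clock signal for the 8088

print(round(crystal_mhz, 5))
print(round(cpu_mhz, 2))
```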

The Programmable Interval Timer

The frequencies generated by a clock generator are fixed by the hardware. On the other hand, a timer is a device that produces regular signals, whose frequencies can be programmed through software. PCs have a timer that is referred to as the programmable interval timer or PIT, as shown in Figure 6.1.

[Figure: a clock input of 1.19318 MHz feeds the programmable interval timer. Its outputs go to the programmable interrupt controller (which interrupts the processor so that the clock tick count at 46Ch in the BIOS data area is updated), to memory refresh, and to the PC speaker.]

Figure 6.1 - The programmable interval timer

The PIT receives a clock signal at a frequency of 1.19318 MHz, which is 1/12 of the frequency of the 14.31818 MHz oscillating crystal. It uses this signal to produce signals of other frequencies, for the following purposes (also shown in Figure 6.1).

• To supply a timer tick to the microprocessor at a rate of 18.2 times per second. This is done using IRQ0 to interrupt the CPU. This clock signal is used to keep track of the time and is also used by programmers to write programs that must perform certain functions at regular intervals.
• To supply a clock signal for the purpose of refreshing the dynamic RAM chips that are used in the PC's main memory. (We will discuss this in the chapter on memory.)
• For use with the PC's speaker.

The PIT contains three distinct 16-bit counters, which are 16-bit registers (memory areas inside the chip) that can hold numbers. These counters can work just like stop watches. A number is placed into a counter. When each tick of the input signal arrives at the PIT, it decrements the value in the counter. When the counter reaches 0, the PIT outputs a signal. Then the counter is automatically reloaded and the countdown begins again. Thus, for instance, if a counter is loaded with the value 2, then every other tick of the input signal will produce an output signal. Hence, the output will have a frequency of 1.19318/2 = 0.59659 MHz.
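The countdown-and-reload behavior can be modeled in a few lines of Python. This is a deliberately simplified sketch of the idea described above, not a model of the real chip's operating modes, and the function name pit_outputs is our own.

```python
def pit_outputs(reload_value, input_ticks):
    """Return the input tick numbers on which a simple down-counter fires.

    Simplified model: each input tick decrements the counter, and when the
    counter reaches 0 it emits one output signal and is reloaded.
    """
    fired = []
    counter = reload_value
    for tick in range(1, input_ticks + 1):
        counter -= 1
        if counter == 0:
            fired.append(tick)
            counter = reload_value
    return fired

INPUT_HZ = 1_193_180            # 1.19318 MHz, as stated in the text

print(pit_outputs(2, 10))       # fires on every other input tick
print(INPUT_HZ / 2)             # output frequency in Hz for a count of 2
print(14_318_180 / 12)          # the PIT input is 1/12 of the crystal
```

Loading a larger count simply stretches the interval between output signals, which is why the output frequency is the input frequency divided by the count.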


The System Timer

The largest value that a 16-bit counter can contain is 65,535, but a counter loaded with 0 counts through a full 65,536 input ticks, so the slowest output frequency of the PIT is 1.19318 MHz/65,536 = 18.2 Hz, that is, 18.2 cycles per second. Counter 0 uses this value and so produces a clock tick 18.2 times per second. As illustrated in Figure 6.1, each of these ticks is sent to the programmable interrupt controller along IRQ0. The PIC thus interrupts the CPU 18.2 times per second. For each such interruption, the CPU executes interrupt number 8.

The purpose of interrupt 8 is to update the so-called system timer. This is simply a 4-byte memory location in the BIOS data area (at address 46Ch) that keeps track of the number of clock ticks since midnight. By this means, DOS can determine the time of day, for use in time-stamping new files, for instance. The real-time clock, mentioned earlier, is used during the startup process to initialize this timer tick count (as well as determine the date). It is then kept current by interrupt 8.

The interrupt 8 service routine also calls another interrupt, number 1Ch. This interrupt does nothing by itself, but can be used by programmers who wish to trap the interrupt, for use in their programs. Thus, a programmer can arrange it so that his or her program receives a regular signal 18.2 times per second!

Experiment: Check out the System Timer

To check out the system timer, proceed as follows.

1. Save your work and start debug, as described in the appendix (Trying the Experiments).

C:\WIN95>debug

2. Do a memory dump at address 46Ch to see the current value of the system timer.

-d 40:6C

At the time this was done on my PC, the result was (leaving out the ASCII portion)

-d 40:6C
0040:0060                                      8C AB 0C 00
0040:0070  00 00 00 00 00 01 00 00-14 14 14 28 01 01 01 01
0040:0080  1E 00 3E 00 18 10 00 60-F9 11 0B 03 00 00 00 77
0040:0090  80 07 07 00 00 00 10 12-A0 00 40 00 B0 FD FF FF
0040:00A0  00 00 00 00 00 00 00 00-C2 04 00 C0 00 00 00 00
0040:00B0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
0040:00C0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
0040:00D0  00 00 00 00 00 00 00 00-FC FC 30 10 00 00 00 00
0040:00E0  00 00 00 00 00 00 00 00-00 00 00 00

You should follow along using the value you get. The system timer is the first four bytes. As usual, PC memory holds these bytes in reverse order (little endian). Thus, the actual value of the system timer is 000CAB8Ch. Using a calculator, we get the decimal equivalent

000CAB8Ch = 830,348

Dividing this by 18.2 gives the number of seconds since midnight, which is 45,623 seconds, or 12.67 hours. Thus, the time was about 12:40 PM. (About time to break for lunch!)

If you now do another memory dump, the value of the system timer should have increased by 18.2 times the number of seconds since the previous dump. Give it a try.
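The same calculation can be written as a short Python sketch. The four bytes are the ones from the dump above; the function name timer_to_clock is our own, and we use the slightly more precise tick rate 1,193,180/65,536 instead of the rounded 18.2, which gives the same answer to the minute.

```python
PIT_INPUT_HZ = 1_193_180
TICKS_PER_SECOND = PIT_INPUT_HZ / 65536      # about 18.2

def timer_to_clock(raw_bytes):
    """Convert the 4 little-endian bytes at 0040:006C to (ticks, hour, minute)."""
    ticks = int.from_bytes(raw_bytes, "little")
    seconds = ticks / TICKS_PER_SECOND
    hour = int(seconds // 3600)
    minute = int(seconds % 3600 // 60)
    return ticks, hour, minute

ticks, hour, minute = timer_to_clock(bytes([0x8C, 0xAB, 0x0C, 0x00]))
print(ticks, hour, minute)
```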

End of Experiment

Experiment: Programming the Speaker

In this experiment, we program the speaker using the PIT and its ports. This experiment is slightly more involved than the previous ones, but demonstrates very clearly the differences between programming the hardware and using the features of a high-level language.

1. Save any current work and start QBASIC. Enter the following program, exactly as written.

' A program to activate the speaker using the
' Programmable Interval Timer
' This program will produce a middle C.

' Put countdown value into counter 2 of PIT
' the frequency 261.63 is middle C
' So 1193180 / 261.63 = 4561 = 11D1h
OUT &H43, &HB6      'prepare PIT
OUT &H42, &HD1      'low order byte into port first
OUT &H42, &H11      'followed by high order byte

' Save value at port 61h
oldval = INP(&H61)
' set bits 0 and 1
newval = oldval OR 3
' output new port value to turn on speaker
OUT &H61, newval
' wait
FOR i = 1 TO 10000: NEXT i
' turn off speaker
OUT &H61, oldval
END

Let us explain the meaning of these lines of code. First, we must determine the counter value that will produce the desired tone frequency. Since middle C has a frequency of 261.63 Hz, we get the correct countdown by dividing the input frequency 1193180 Hz by the desired output frequency 261.63 Hz and rounding to the nearest whole number. This value is 4561, or 11D1h. This value is placed in counter 2 of the PIT in three steps. First, we tell the PIT to expect the value (the control byte B6h selects counter 2 and announces that the low-order byte will arrive first, followed by the high-order byte). Then we send the value, one byte at a time, as shown below.

OUT &H43, &HB6      'prepare PIT
OUT &H42, &HD1      'low order byte first
OUT &H42, &H11      'followed by high order byte
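The countdown value for any other tone can be worked out the same way. The sketch below is an illustration only: the function name tone_divisor is our own, and it merely computes the numbers that would be sent to port 42h; it does not touch the hardware.

```python
PIT_INPUT_HZ = 1_193_180   # 1.19318 MHz PIT input clock

def tone_divisor(frequency_hz):
    """Return (count, low_byte, high_byte) for a desired tone frequency."""
    count = round(PIT_INPUT_HZ / frequency_hz)
    if not 1 <= count <= 0xFFFF:
        raise ValueError("frequency out of range for a 16-bit counter")
    return count, count & 0xFF, count >> 8

count, low, high = tone_divisor(261.63)      # middle C
print(count, hex(count), hex(low), hex(high))
```

The low byte is the one written to port 42h first, followed by the high byte.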

The next step is to send the output generated by the PIT to the speaker. This is done by placing a value into the port at address 61h. However, we want to set the two lowest-order bits (two rightmost bits) ONLY. This is done by first getting the value in that port, using an INP instruction


oldval = INP(&H61)

and then using the logical operation OR to change the two lowest bits:

newval = oldval OR 3

To explain how this works, we need to look at the numbers in binary. The OR operation works as follows:

0 OR 0 = 0
0 OR 1 = 1
1 OR 0 = 1
1 OR 1 = 1

The important thing to note here is that

0 OR 1 = 1

and

1 OR 1 = 1

so when we OR a bit with a 1, the result is always a 1. Moreover,

0 OR 0 = 0

and

1 OR 0 = 1

so when we OR a bit with a 0, it does not change the bit. Finally, then, if we OR a byte with 00000011, the result will be to change the two rightmost bits to 1, but not affect the other bits. Since 00000011 is the decimal number 3, you now see the purpose of the instruction

newval = oldval OR 3

Then we send newval to port 61h, which turns on the speaker.

OUT &H61, newval

Then we wait for 10000 nothings

FOR i = 1 TO 10000: NEXT i

and finally turn off the speaker by returning the register at port 61h back to its original value, which we saved earlier!

OUT &H61, oldval
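The bit manipulation can be tried out without any hardware. In the following Python sketch the starting value 00110000 is a hypothetical port reading chosen for illustration, and the function names are our own. The speaker_off function shows an alternative to restoring a saved value: clearing the two bits with AND.

```python
SPEAKER_BITS = 0b00000011     # bits 0 and 1 of port 61h

def speaker_on(port_value):
    """Set the two lowest bits and leave the other six unchanged."""
    return port_value | SPEAKER_BITS

def speaker_off(port_value):
    """Clear the two lowest bits and leave the other six unchanged."""
    return port_value & ~SPEAKER_BITS & 0xFF

oldval = 0b00110000           # hypothetical value read from the port
newval = speaker_on(oldval)
print(format(oldval, "08b"), format(newval, "08b"))
```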


2. Save the program (File, Save) and run it (F5 function key). You should hear the speaker play a reasonable facsimile of middle C. You can vary the duration of the tone by changing the value 10000 in the FOR instruction. You can also vary the frequency of the tone by changing the value sent to port 42h.

3. We can now really drive home the point that high level programming is much easier than programming the hardware directly. Start a new program in QBASIC by choosing the File, New option. Enter the one line

SOUND 236, 10

Run the program! Need we say more?

End of Experiment

7. The Microprocessor

Overview of the Intel Family of Microprocessors

The microprocessor, or central processing unit (abbreviated CPU), is the centerpiece of the computer. It is responsible for handling arithmetic calculations, logical operations and comparisons, as well as calculating memory addresses and issuing commands to other components of the PC. The microprocessor is located inside the system unit, and is no bigger than about 2 square inches, including its package (the actual chip itself is no bigger than a fingernail). Yet it may contain upwards of 7 million transistors, and be capable of executing well over 200 million instructions per second! As mentioned earlier in the book, microprocessors have internal clocks that pace the operations of the chip. The speed rating of this clock is referred to as clock speed, and is usually measured in megahertz.

Most PCs use microprocessors made by a company called Intel. Since 1979, Intel has produced several different models of microprocessors, and personal computers have been designed around most of these models. Let us discuss the main Intel CPUs, starting with the least sophisticated (and slowest). Later in the chapter, we will take a more detailed look at the Pentium, Pentium Pro and Pentium II processors. The appendix contains a list of Intel's microprocessors (used both as CPUs and as support chips), along with some statistical data about each processor.

Intel 8088 Microprocessor

The first IBM PC (1981) was based on the Intel 8088 microprocessor. This microprocessor contains 29,000 transistors and has a clock speed of 4.77 MHz. Even though the word length of the 8088 is 16 bits (that is, it has 16-bit registers), the data bus width for this microprocessor is only 8 bits, so the processor can only transfer its data 8 bits at a time. This was done in an effort to keep the price of peripheral devices down, since it was cheaper to build 8-bit devices (and data buses) than 16-bit devices.

The fastest 8088 based PCs have a clock speed of 10 MHz. However, the 8088 is extremely slow by today's standards, and is capable of executing only about 330,000 instructions per second. Moreover, it is not capable of running much of today's software (including Microsoft Windows) and is therefore obsolete.

The 8088 was not very good at doing numerical calculations with nonintegers (so-called floating point arithmetic). Thus, Intel produced a numeric coprocessor called the 8087, which could be purchased and installed on the motherboard in a special socket. This chip would cooperate with the 8088 to produce much better speed for floating point calculations.

Incidentally, we should mention that the immediate predecessor to the 8088 was the Intel 8086. The 8088 and 8086 are identical as far as internal operations are concerned, but the 8086 has a 16-bit data path. Intel also produced earlier processors by the names 4004, 8008, 8080 and 8085 (see the appendix).

Intel 80286 Microprocessor

In 1984, IBM introduced a PC based on a microprocessor known as the Intel 80286 (or 286 for short). IBM called this computer the AT, which stands for Advanced Technology. The clock speed of the microprocessor in the original AT was 6 MHz, but clock speeds reached 20 MHz in later models. The 80286 is a 16-bit processor with a 16-bit wide data bus. The 80286 microprocessor has a much more efficient design than the 8088, and it is inherently much faster than the 8088, even when running at the same clock speed. An 80286 contains 134,000 transistors and can execute about 1.2 million instructions per second. An 80287 numeric coprocessor was also available. Although you may still run into a few 80286-based PCs, they are outdated by the fact that Windows 95 and Windows NT will not run on such a PC (and even if they did, you probably wouldn't have the time to wait for them).

Intel 80386SX and DX Microprocessors

In 1986, personal computers based on the Intel 80386DX microprocessor (or 386 for short) were introduced. Compaq Corporation was the first to introduce such a computer. The 80386DX has a more sophisticated design than the 80286, and is specifically designed for multitasking, that is, for executing more than one program at a time. Clock speeds on the 80386 range from 16 to 33 MHz. An 80386 contains 275,000 transistors and can execute about 6 million instructions per second (or 6 MIPS).

Because 80386 based computer systems tended to be a bit expensive when they first appeared, Intel introduced a less expensive version of the 80386 known as the 80386SX (or 386SX for short). Internally, the 80386SX is just as fast as the 80386DX. However, it has a 16-bit data bus, rather than a 32-bit data bus. Thus, 80386SX based computers provided a compromise between speed and cost. Intel also produced an optional 80387 numeric coprocessor for floating point calculations.

The 80386 microprocessor is also outdated by today's standards. Theoretically, Windows 95 can run on a 20 MHz 80386, and Windows NT (version 3.51) on a 25 MHz 80386, but from a practical point of view, these computers are really not powerful enough to run today's software in a productive fashion.

Intel 80486SX and DX Microprocessors

In 1989, the first PCs appeared based on the Intel 80486DX microprocessor (or 486 for short), running at 25 and 33 MHz. This microprocessor has 1.2 million transistors and can execute about 20 million instructions per second (20 MIPS). The 80486DX has a built-in numeric coprocessor, making an 80487 unnecessary. An SX version of the 80486 was also released in an effort to reduce the cost. The only difference between an 80486DX and an 80486SX is that the SX does not have a functioning numeric processor component. Intel subsequently introduced 50 and 66 MHz versions of the 80486DX.

In order to gain more speed from its microprocessors, but not require computer manufacturers to alter their systems, Intel introduced a clock doubling technology in 1992. A clock-doubled 80486 is an ordinary 80486 that executes instructions internally at twice the speed of an ordinary 80486, but communicates with the bus at the normal speed. For instance, a 25 MHz 80486DX whose internal clock is doubled, and thus executes instructions at 50 MHz, is called an 80486DX2/50. Note that this is not the same as an 80486DX/50, which is actually a faster chip, since it moves at 50 MHz both internally and externally. Intel has also introduced an 80486DX4/100, which is a 33 MHz 80486 whose internal clock is tripled (despite the fact that it is denoted by DX4 and not DX3).

The 80486 microprocessor has been made outdated by the Pentium processor. A fast 80486-based PC (66-100 MHz) is capable of running Windows 95 or Windows NT and their applications moderately well. However, a 486-based PC may very well lag behind in its video and hard disk performance.
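The clock-multiplying arithmetic described above for the 80486 can be checked with a few lines of Python. This is a minimal sketch with our own function name; the bus speeds and multipliers are the ones given in the text, and 33 MHz is a nominal figure, which is why the tripled result comes out as 99 and not exactly 100.

```python
def internal_clock_mhz(bus_mhz, multiplier):
    """Internal (core) clock of a clock-multiplied processor."""
    return bus_mhz * multiplier

# (name, external bus speed in MHz, multiplier) as described in the text
chips = [
    ("80486DX/50", 50, 1),
    ("80486DX2/50", 25, 2),
    ("80486DX4/100", 33, 3),
]

for name, bus, mult in chips:
    print(name, bus, "MHz bus ->", internal_clock_mhz(bus, mult), "MHz internally")
```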

The Pentium Microprocessor

The Pentium processor was introduced by Intel in 1993. The first group of Pentiums had clock speeds of 60 and 66 MHz (and ran at a hot 5 volts). There followed 75, 90, 120, 133, 166 and 200 MHz processors. The 133 MHz Pentium, for instance, can perform 218.9 million instructions per second. After the 133 MHz processor, Intel stopped rating their microprocessors (at least publicly) using MIPS.

Pentium processors contain 3.2 million transistors. The latest group of Pentium chips (133 MHz and above) use what is referred to as 0.35 micron technology to cram these 3.2 million transistors into a rectangular chip that measures 0.392 by 0.326 inches. This chip is encased in a package that measures 1.95 inches square. (See Figure 7.1.) The term micron technology refers to the width of the conductive and insulating lines in the chip. A micron is one millionth of a meter! In contrast, a human hair is about 50 microns thick and bacteria range from 0.5 to 10 microns in size. We will take a detailed look at the Pentium processor a bit later.

Figure 7.1 - Actual size of a Pentium chip (in black) and its package


The Pentium Pro Microprocessor

The next processor in the Intel microprocessor line, following the Pentium, is the Pentium Pro microprocessor, with 5.5 million transistors and clock speeds of 150, 166, 180 and 200 MHz. These chips, introduced at the end of 1995, are specially tuned for the latest 32-bit programs, but actually do not perform significantly better than a Pentium on 16-bit applications. We will take a closer look at the Pentium Pro later in the chapter.

The Pentium II Microprocessor

The latest processor in the Intel microprocessor line, introduced in May of 1997, is the Pentium II microprocessor, with about 7.5 million transistors and clock speeds of 233, 266 and 300 MHz. This processor is especially tuned for 32-bit applications, and for multimedia applications in particular, since it incorporates Intel's MMX technology, which we discuss later in the chapter. We will also take a closer look at the Pentium II later in the chapter.

Summary

Table 7.1 summarizes, for comparison, some of the statistics of the Intel line of processors.


Table 7.1 - Microprocessor Statistics

Chip         Speeds (MHz)                              Number of transistors  MIPS                Data bus width  Address bus width
8088         4.77-10                                   29K                    0.33                8               20
80286        10-20                                     134K                   1.2                 16              24
80386        16-33                                     275K                   6                   32              32
80486        25-100                                    1.2M                   20                  32              32
Pentium      60, 66, 75, 90, 100, 120, 133, 150,       3.2M                   200+                64              32
             166, 200
Pentium Pro  150, 166, 180, 200                        5.5M                   no longer measured  64              32
Pentium II   233, 266, 300                             7.5M                   no longer measured  64              32

MMX

In March of 1996, Intel announced a new enhancement, called MMX, to their line of processors. MMX is a trademark of Intel and is, according to Intel, not an acronym. Nevertheless, based on the expressed purpose of the technology, it appears that MMX may have at one time stood for multimedia extensions.

The purpose of MMX is to accelerate multimedia and communications applications. To prepare the new technology, Intel analyzed several types of software applications, such as those related to graphics, video, music synthesis, speech synthesis, image processing and games, to determine what processor enhancements would increase their performance. The results of these studies led Intel to add 57 new instructions to the microprocessor that are designed to manipulate groups of data with a single instruction. This is referred to as Single Instruction, Multiple Data (or SIMD).

For instance, graphics pixels (picture dots on the screen) are sometimes represented in memory as bytes. An MMX-compliant microprocessor can place eight of these bytes into a single 64-bit word, called a packed byte. When an MMX instruction operates on a pair of packed bytes, it performs the same operation on each of the eight byte pairs at the same time, and stores the results in a packed byte. This "eight-for-the-price-of-one" technique is a form of parallel processing. MMX technology includes four new packed data types, which are shown in Figure 7.2.


[Figure: a 64-bit MMX value viewed as a packed byte (eight 8-bit bytes), a packed word (four 16-bit words), a packed doubleword (two 32-bit words), or a packed quadword (one 64-bit word).]

Figure 7.2 - The MMX packed data types

The 57 new MMX instructions extend the usual operations (arithmetic, logical and so on) to these packed data types. For instance, the assembly language instruction PADDB will add the corresponding bytes of two packed-byte operands. There are also instructions for packing and unpacking these data types. For those readers who know some math, we mention that there are MMX instructions for performing dot products, which also simplifies matrix multiplication.
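To make the idea of a packed-byte addition concrete, here is a small Python model of what PADDB computes. It is only a sketch: the function loops over the eight byte lanes one at a time, whereas the processor handles all eight at once, and the sample operands are made up for illustration. Note that each lane wraps around at 256 and no carry passes from one lane into the next.

```python
def paddb(a, b):
    """Add two 64-bit packed-byte values lane by lane, wrapping at 256."""
    result = 0
    for lane in range(8):
        shift = 8 * lane
        x = (a >> shift) & 0xFF
        y = (b >> shift) & 0xFF
        result |= ((x + y) & 0xFF) << shift
    return result

a = 0x01_02_03_04_05_06_07_FF   # eight sample bytes (illustrative values)
b = 0x10_10_10_10_10_10_10_02

print(hex(paddb(a, b)))         # lane-by-lane sum
print(hex(a + b))               # ordinary 64-bit sum, for comparison
```

The two printed values differ in the second-lowest byte, because the ordinary addition lets the carry from FFh + 02h spill into its neighbor.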

A Detailed Look at the Pentium Processor

Let us take a more detailed look at the Pentium processor.

Different Models and Their Speeds

Currently, there are several different models of the Pentium available, distinguished by their clock speeds. Table 7.2 shows the different clock speeds, along with the corresponding local bus speed and PCI bus speeds. Note that the PCI bus runs at exactly half the speed of the local bus. (Of course, by the time this book reaches you, there may be some additional clock speeds available.)


Table 7.2 - Pentium and Bus Speeds

Processor speed (MHz)  Local bus speed (MHz)  PCI bus speed (MHz)
75                     50                     25
90                     60                     30
100                    66                     33
120                    60                     30
133                    66                     33
150                    60                     30
166                    66                     33
200                    66                     33

It is interesting to observe, for instance, that because the bus speed of a 133 MHz Pentium is faster than that of a 150 MHz Pentium, the overall speed of otherwise identical systems based on these two processors is quite close, even though the 150 MHz processor is 13% faster. (See the discussion of iCOMP coming next.)
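The relationships in Table 7.2 can be explored with a short Python sketch. The PCI speed is simply half the local bus speed. The ratio of processor speed to local bus speed is a quantity we derive ourselves (the text does not list it), rounded to the nearest half step because the table's speeds are nominal; the function name is our own.

```python
# (processor MHz, local bus MHz) pairs from Table 7.2
speeds = [(75, 50), (90, 60), (100, 66), (120, 60),
          (133, 66), (150, 60), (166, 66), (200, 66)]

def nearest_half_step(ratio):
    """Round a ratio to the nearest multiple of 0.5."""
    return round(ratio * 2) / 2

for cpu, bus in speeds:
    pci = bus // 2                           # PCI runs at half the local bus
    ratio = nearest_half_step(cpu / bus)     # derived processor-to-bus ratio
    print(cpu, bus, pci, ratio)
```

The 133 MHz and 150 MHz rows show the point made above: the faster processor is paired with the slower 60 MHz bus.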

iCOMP

The most common measures of microprocessor performance are clock speed and MIPS (millions of instructions per second). Unfortunately, neither of these statistics is very good at measuring performance. In fact, Intel has stopped computing (at least publicly) the MIPS rating of their processors, after the 133 MHz Pentium.

In 1992, Intel introduced a different measure of performance, called iCOMP, in order to mitigate the widespread misconception among PC users (and buyers) that a processor's clock speed is, as Intel puts it, "a linear measure of its performance." In other words, if we double the clock speed, we do not double the performance.

The iCOMP rating, now in version 2.0, is a single number that represents a weighted average of several processor benchmarks. It should be noted, however, that the benchmarks use only 32-bit software. Thus, the iCOMP numbers would be different if it used a mix of 16-bit and 32-bit software, especially in the values for the Pentium Pro. One nice thing about iCOMP is that it can be used to compare processors of different designs, for instance, a 200 MHz Pentium against a 200 MHz Pentium Pro. Table 7.3 shows the iCOMP ratings of some of Intel's latest offerings.


Table 7.3 - iCOMP Ratings

Processor             iCOMP Rating
Pentium II 266 MHz    303
Pentium II 233 MHz    267
Pentium Pro 200 MHz   220
Pentium Pro 180 MHz   197
Pentium Pro 150 MHz   168
Pentium 200 MHz       142
Pentium 166 MHz       127
Pentium 150 MHz       114
Pentium 133 MHz       111
Pentium 120 MHz       100
Pentium 100 MHz       90
Pentium 90 MHz        81
Pentium 75 MHz        67
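The point that clock speed is not a linear measure of performance can be checked against the Pentium ratings in Table 7.3. The following Python sketch compares the 133 MHz and 150 MHz Pentiums; the dictionary and function names are our own.

```python
# iCOMP ratings keyed by Pentium clock speed in MHz (values from Table 7.3)
icomp = {75: 67, 90: 81, 100: 90, 120: 100,
         133: 111, 150: 114, 166: 127, 200: 142}

def gain_percent(new, old):
    """Percentage increase from old to new."""
    return (new / old - 1) * 100

clock_gain = gain_percent(150, 133)
rating_gain = gain_percent(icomp[150], icomp[133])
print(round(clock_gain, 1), round(rating_gain, 1))
```

The clock is about 13% faster, but the rating rises by less than 3%, which is consistent with the slower local bus of the 150 MHz model.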

The Internal View of the Pentium

Figure 7.3 shows a simplified picture of the internal view of a Pentium processor.


[Figure: block diagram of the Pentium. Labeled parts: 64-bit data bus, 32-bit address bus and control bus entering the bus interface unit; code cache (8 KB); data cache (8 KB); instruction prefetch and decode unit (fed by a 256-bit path); microcode ROM; execution unit; floating point unit; registers (16 and 32 bits wide).]

Figure 7.3 - Internal view of a Pentium (simplified)

The bus interface unit (or BIU) is connected to the system's local bus, and receives and sends data, addresses and control commands through this interface. From there, data is taken to a small holding area of size 8 KB called the data cache. Instructions are taken to a different holding area called the code cache. The two caches are referred to as level 1 caches, to distinguish them from caches that are external to the microprocessor. We will discuss these so-called level 2 caches a bit later.

Most of the components of the Pentium are designed to get data and instructions into the execution unit, which is where the action really takes place. For instance, the instruction prefetch unit grabs instructions from the code cache. Some instructions are "hard wired" into the processor and can be executed directly. Other, more complicated instructions need to be further decoded, by the decode unit, into a series of simpler instructions called microcode, which the ALUs can execute.

The Arithmetic and Logical Units (ALUs)

The Pentium execution unit has two arithmetic and logical units, or ALUs. Each of these can perform arithmetic and logical operations, but one of the ALUs, being less powerful than the other, can execute only some of the simpler instructions. Under some circumstances, both ALUs can be operating at the same


time, allowing the Pentium to execute two instructions simultaneously! This is called superscalar technology.

Pipelines

Each of the ALUs is actually a pipeline. This refers to the fact that an ALU can be working on more than one instruction at a time, in a manner similar to an assembly line, as shown in Figure 7.4. In fact, up to five different instructions may be in different locations within an ALU, undergoing different phases of execution.

         Get Instruction  Decode         Get Operands   Execute        Write Results
Cycle 1  Instruction 1
Cycle 2  Instruction 2    Instruction 1
Cycle 3  Instruction 3    Instruction 2  Instruction 1
Cycle 4  Instruction 4    Instruction 3  Instruction 2  Instruction 1
Cycle 5  Instruction 5    Instruction 4  Instruction 3  Instruction 2  Instruction 1

Figure 7.4 - Instruction pipelining
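The assembly-line behavior in Figure 7.4 can be reproduced with a short Python sketch. It assumes an idealized pipeline with no stalls or flushes, and the function names are our own.

```python
STAGES = ["Get Instruction", "Decode", "Get Operands", "Execute", "Write Results"]

def pipeline_state(cycle, n_instructions):
    """Return, for each stage, the number of the instruction in it (or None)."""
    state = []
    for stage_index in range(len(STAGES)):
        instr = cycle - stage_index
        state.append(instr if 1 <= instr <= n_instructions else None)
    return state

def total_cycles(n_instructions):
    """Cycles needed to finish n instructions when nothing stalls."""
    return len(STAGES) + n_instructions - 1

for cycle in range(1, 6):
    print(cycle, pipeline_state(cycle, 5))

print(total_cycles(5))   # compare with 5 * 5 = 25 cycles without pipelining
```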

Pipelining is a great boost to performance, but does have its difficulties. For instance, five instructions may be in a pipeline at a given time, when the ALU finally realizes that the first instruction in the pipeline is actually a jump instruction. This is an instruction that orders the microprocessor to execute instructions in a different location in the program (rather than in order). Thus, the other four partially executed instructions in the pipeline are not really the instructions that the processor should be executing! In this case, the processor must flush the pipeline, causing it to waste execution time. The Pentium processor has certain built-in logic, called branch prediction logic, that attempts to look ahead for jump (or branch) instructions, in order to avoid the aforementioned problem.

If an instruction requires the manipulation of noninteger numbers (also called floating point numbers), then a special unit called the floating point unit does the computations.

The Registers

Inside the execution unit, we also find the registers, which are small "scratch pad" memory locations to hold the data and addresses that are used in


calculations. These registers are very, very fast. Most of the Pentium's registers that a programmer might use are shown in Figure 7.5.

[Figure: each short block is 8 bits long. General registers: EAX (containing AX, which splits into AH and AL), EBX (BX: BH and BL), ECX (CX: CH and CL), EDX (DX: DH and DL), EBP (BP), ESI (SI), EDI (DI) and ESP (SP). Segment registers: CS, DS, SS and ES. Flags register: EFlags. Instruction pointer register: EIP.]

Figure 7.5 - Some of the registers in the Pentium

The first group of registers, called general registers, are for general purpose use by the programmer, although they have specific use with certain instructions. Let us briefly discuss each general register.

• AX - a 16-bit register called the accumulator. It is used for doing arithmetic, for executing I/O instructions (IN, OUT) and in calling the BIOS and DOS service routines. The AH and AL registers are for 8-bit operations and the EAX register, called the extended accumulator, is for 32-bit operations. (The 32-bit versions were first introduced in the 80386 microprocessor.)
• BX - the base register is sometimes used to point to the base (or beginning) of a group of data. The base register also has 8-bit (BH and BL) and extended 32-bit (EBX) versions.
• CX - the count register is used for doing repetitive, or looping, operations. Specifically, it is used to keep a count of the number of times to perform the operation. The count register also has 8-bit and 32-bit versions.
• DX - the data register is used in arithmetic and I/O operations. It also has 8-bit and 32-bit versions.
• SI - the source index register is used in moving strings of data from one location to another. It has a 32-bit (extended) version.
• DI - the destination index register is used for the destination of the string moving operations. It also has a 32-bit version.
• BP - the base pointer is used as a base address, in conjunction with the SP register.
• SP - the stack pointer is used to point to a special location for temporary storage of data, called a stack.
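The way the 8-bit, 16-bit and 32-bit versions of a register overlap can be illustrated with a toy Python model of the accumulator. The class below is purely our own illustration (the class name and its layout are not from the text); it stores one 32-bit value and derives AX, AH and AL from it by masking and shifting.

```python
class Accumulator:
    """Toy model of EAX and its 16-bit and 8-bit sub-registers."""

    def __init__(self, value=0):
        self.eax = value & 0xFFFFFFFF        # full 32-bit register

    @property
    def ax(self):                            # low 16 bits of EAX
        return self.eax & 0xFFFF

    @property
    def ah(self):                            # bits 8-15 (high half of AX)
        return (self.eax >> 8) & 0xFF

    @property
    def al(self):                            # bits 0-7 (low half of AX)
        return self.eax & 0xFF

    @al.setter
    def al(self, value):                     # writing AL leaves other bits alone
        self.eax = (self.eax & 0xFFFFFF00) | (value & 0xFF)

acc = Accumulator(0x12345678)
print(hex(acc.ax), hex(acc.ah), hex(acc.al))
acc.al = 0xFF
print(hex(acc.eax))
```

Changing AL changes AX and EAX as well, since all three names refer to overlapping parts of the same storage.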

The segment registers are used to hold the segment portions of an address. The CS register is called the code segment register, DS is the data segment register, ES is the extra segment register and SS is the stack segment register. Thus, a program can have access to four different segments of memory at one time.

The individual bits of the flags register hold status values. A flag is a term used for a bit that is used to signal a particular event. For example, after an arithmetic operation, the carry flag will be set to 1 if the operation resulted in a carry. The overflow flag will be set to 1 if the operation resulted in a signed value that is too big to hold in a register.

Finally, the instruction pointer (IP and EIP) is used to hold the offset of the address of the next instruction to execute. The segment of the next instruction is kept in the code segment register (CS).
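As a rough illustration of the carry and overflow flags mentioned above, the following Python sketch adds two 16-bit values and reports both flags. The function name `add16` is hypothetical, and the sketch assumes the usual convention that the carry flag tracks unsigned results and the overflow flag tracks signed results.

```python
def add16(a, b):
    """Add two 16-bit values; return (result, carry_flag, overflow_flag)."""
    a &= 0xFFFF
    b &= 0xFFFF
    total = a + b
    result = total & 0xFFFF                  # what actually fits in the register
    carry = 1 if total > 0xFFFF else 0       # a carry out of the top bit
    # Signed overflow: both operands share a sign bit, but the result's differs.
    sign_a, sign_b, sign_r = a & 0x8000, b & 0x8000, result & 0x8000
    overflow = 1 if (sign_a == sign_b and sign_r != sign_a) else 0
    return result, carry, overflow


print(add16(0xFFFF, 0x0001))   # wraps around to 0: carry set, no signed overflow
print(add16(0x7FFF, 0x0001))   # largest positive value plus 1: overflow set
```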

CISC Versus RISC

There are essentially two philosophies that one can take in designing a microprocessor, with regard to its instruction set, that is, the set of instructions that it understands.




Design the microprocessor to perform relatively few instructions, most of which are simple. Hardwire all of these instructions, so that they can be executed directly (and thus very quickly) by the processor and do not need decoding into microcode. More complex tasks are performed by combining these simpler, hardwired instructions.



Design the microprocessor to accept more complex instructions, thus creating a larger instruction set, and requiring the use of a decoder to decode these instructions into microcode before execution.

A processor designed to have a small instruction set with simple instructions is called a Reduced Instruction Set Computer, or RISC for short. A microprocessor designed to have a large instruction set including more involved instructions is called a Complex Instruction Set Computer, or CISC. There are no absolute numbers that determine when a processor is a RISC and when it is a CISC, and some processors are said to have characteristics of both.

Note that RISC processors tend to have an overall simpler design, since the instructions are simpler. CISC processors tend to be more complex. For instance, RISC processors do not, in general, use microcode. In a way, the ordinary instructions are microcode. On the other hand, the unit that executes the microcode in a CISC processor is a form of RISC processor.

On the whole, RISC processors tend to be faster than CISC processors. This is due to the general principle that most of the work done by a microprocessor is done by relatively few of the instructions. In fact, extensive statistical studies at IBM in the mid 1970s showed that approximately 20% of the instructions in a CISC processor do approximately 80% of the work. (Up to 30% of the instructions in a "typical" program are some form of jump instruction.) Thus, it makes sense to design a processor to be as efficient as possible in executing the 20% most commonly used instructions, and require it to combine simpler instructions to achieve the effect of the other instructions.

As we mentioned earlier, there is no clearly defined boundary between RISC processors and CISC processors. The early offerings from Intel (8088, 80286 and 80386) are CISC processors. The 80486 and Pentium processors employ features of both technologies. For instance, these processors execute some instructions directly, whereas others are translated into microcode. Pipelining (employed by the 486 and Pentium processors) is also a general characteristic of RISC processors. RISC processors generally execute their instructions in one clock cycle, another characteristic partially shared by Pentium processors. In any case, the Pentium and Pentium Pro processors are so well designed that their performance rivals that of more traditional RISC processors.


A Closer Look at the Pentium Pro Processor

The Pentium Pro incorporates some interesting new features in its design, such as the ability to execute certain instructions out of order, when possible. This is referred to as Dynamic Execution. In addition, the Pentium Pro incorporates a level 2 cache directly in the processor package (the right interior portion of the chip shown in Figure 7.6). (We will discuss level 2 caches in detail a bit later in this chapter.) The level 1 code and data caches are each 16 KB in size, double that of the Pentium processor.

The Pentium Pro package contains a dedicated 64-bit bus connecting the processor with the internal level 2 cache. This bus is only about 0.5 inches long and can transfer data at over 1.2 GB per second at peak efficiency! Because the processor-to-L2 cache bus is separate from the processor-to-system bus (not shown in Figure 7.6), Intel refers to this bus design as the Dual Independent Bus Architecture (a term that is seen more often in connection with the Pentium II processor, discussed later).

[Figure 7.6 - Pentium Pro with internal cache. The package holds the processor and a 256 KB L2 cache, joined by a bus about 0.5 in. long.]

Figure 7.7 shows a portion of the Pentium Pro's internal design.


[Figure 7.7 - Pentium Pro internal design]

To illustrate the out-of-order (Dynamic Execution) features of the Pentium Pro, consider the instructions

    mov ax, amemorylocation
    add bx, ax
    add cx, dx

In order to execute the first instruction, a memory access must take place, which is a slow process. Normally, the processor must wait until the data is retrieved from memory. However, the Pentium Pro will look ahead at the next instructions. It will notice that the second instruction cannot be executed, since it depends on the eventual value in the AX register. But the third instruction can (and will) be executed by the dispatch/execute unit, before the first instruction is completed. Typically, the Pentium Pro can look ahead 20-30 instructions.

Instructions enter the code cache in order and are fetched into the instruction pool by the fetch/decode unit. The dispatch/execute unit can then execute these instructions, perhaps out of order, as long as no data dependencies are violated. The results are temporarily stored in the instruction pool (and not in the microprocessor's registers, since these are visible to the programmer). At some point, the instructions can be "retired" in their proper order by the retire unit and then placed in the L1 data cache for export through the BUI.

The Pentium Pro is also specially designed to be used in combination with other Pentium Pro processors within a single PC. This is referred to as symmetric multiprocessing or SMP. There is special circuitry on the processor to help manage the increased requests to memory due to the presence of more than one processor.
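The dependency check behind the three-instruction example earlier in this section can be sketched in Python. This is a deliberately simplified model, not a description of the Pentium Pro's actual circuitry: the tuple layout and the function name `ready_while_waiting` are assumptions made for illustration, and the rule used (an instruction is held back if it touches any register whose value is still pending) is more conservative than what a real processor does.

```python
# Each instruction: (text, destination register, source registers).
program = [
    ("mov ax, amemorylocation", "ax", []),            # slow: waits on memory
    ("add bx, ax",              "bx", ["bx", "ax"]),
    ("add cx, dx",              "cx", ["cx", "dx"]),
]


def ready_while_waiting(program, stalled_index=0):
    """Return the later instructions that do not depend on the stalled one."""
    pending = {program[stalled_index][1]}   # registers whose values are not ready
    ready = []
    for text, dest, sources in program[stalled_index + 1:]:
        if dest in pending or pending & set(sources):
            pending.add(dest)               # its result is not available either
        else:
            ready.append(text)
    return ready


print(ready_while_waiting(program))
```

Running the sketch reports that only `add cx, dx` can go ahead while the memory access is outstanding, which matches the reasoning in the text.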

A Closer Look at the Pentium II Processor

Like the Pentium Pro, the Pentium II employs the Dual Independent Bus Architecture (separate processor-to-L2 cache and processor-to-system buses) as well as Dynamic Execution (out-of-order execution). However, unlike the Pentium Pro, the Pentium II is MMX capable, and runs at higher clock speeds, with versions running at 233, 266, 300 and 333 MHz. According to Intel, the Pentium II delivers a performance boost of between 1.6 and 2 times that of the Pentium Pro, with over twice the performance boost on multimedia applications. Figure 7.8 shows the Pentium II processor package.

[Figure 7.8 - Pentium II processor. Labeled parts: processor core, cache tag, connectors.]

The Pentium II processor package accommodates 512 KB of level 2 cache, which is twice that of the Pentium Pro. (The level 1 caches are the same as in the Pentium Pro.) Moreover, the cache speed is tied to the processor core speed, being one-half the core speed. For instance, the cache speed of the 300 MHz Pentium II is 150 MHz, which is more than twice that of the Pentium (66 MHz). The Pentium II has level 1 code and data caches, each of size 16 KB (twice that of the Pentium).

As shown in Figure 7.9, the Pentium II has a totally new package design. In particular, the processor core and the cache memory are packaged in a Single Edge Contact (S.E.C.) cartridge. In this design, the components are mounted on a small board and completely enclosed in a plastic and metal cartridge. It is this cartridge that is connected to the motherboard, much like an ordinary expansion card. The board has a thermal plate to which a heat sink and fan can be attached. The total package size is approximately 5.5 inches wide by 2.5 inches tall.

[Figure 7.9 - Pentium II package]

As expected, processor benchmarks show that the Pentium II processor is faster than the Pentium and the Pentium Pro. For instance, the iCOMP rating for the 233 MHz version is 267. (See the iCOMP rating table earlier in this chapter for a comparison with other processors.) Because the Pentium II processor supports MMX, whereas the Pentium Pro does not, it performs significantly better than the Pentium Pro with respect to multimedia applications. Indeed, Table 7.4 shows Intel's media benchmark for several processors.


Table 7.4 - Benchmarks

Processor                     Media Benchmark
Pentium II 266 MHz            354
Pentium II 233 MHz            312
Pentium Pro 200 MHz           197
Pentium 200 MHz (MMX)         257
Pentium 200 MHz (no MMX)      157

Note that because the Pentium Pro does not support MMX, its benchmark is even lower than that of the Pentium 200 with MMX.

External (Level 2) Cache

A Pentium-style processor is by far the fastest component in a computer system. In particular, it is much faster than main memory. Thus, the CPU will often sit by idly during part of a memory access, waiting for the data that it needs to perform an operation. As discussed in an earlier chapter, these waiting periods are called wait states.

In order to alleviate the need for a large number of wait states, modern systems include a special quantity of very fast memory called an external cache (also called a level 2 or L2 cache). This usually consists of 256 KB to 512 KB of very fast memory, known as static RAM, or SRAM. (We will discuss SRAM in the chapter on memory.) This memory is also very expensive, which is why it is not used for main memory.

Cache memory is used as follows. When the microprocessor requests data from memory, a special chip (or special circuitry in the PC's chipset) called a cache controller checks first to see if that data already lies in the cache. If so, then it can be sent to the processor very quickly. This is referred to as a cache hit. If not, then it must be obtained from main memory. This is called a cache miss. However, the cache controller fetches not only the requested data, but also a fixed quantity of additional nearby data from memory. This additional data is also placed in the cache. In order not to waste time, at the same time as the cache controller is fetching the additional data, it sends the requested data to the microprocessor.

The theory is that, the next time the processor requests data, it is very likely (but not certain) to be data that resides close to the previously requested data, and will thus have been brought into the cache during the previous data request. This means that, after a cache miss, subsequent requests are likely to result in cache hits. In fact, by cleverly designing the procedure for deciding how much extra data to read into the cache, and which older data to remove from the cache to make room for the new data, hit rates can be as high as 80-90%! Thus, external caches can have a significant effect on performance.
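The hit-and-miss behavior just described can be imitated with a toy model in Python. The sketch below is only an illustration under simplifying assumptions: the class name `SimpleCache` is made up, the block size of 16 bytes is arbitrary, and the cache is treated as never filling up, so no data is ever removed.

```python
class SimpleCache:
    """Toy cache that fetches a whole block of nearby data on each miss."""

    def __init__(self, block_size=16):
        self.block_size = block_size
        self.blocks = set()       # numbers of the blocks currently in the cache
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // self.block_size
        if block in self.blocks:
            self.hits += 1
            return "hit"
        self.misses += 1
        self.blocks.add(block)    # bring in the requested byte and its neighbors
        return "miss"


cache = SimpleCache(block_size=16)
for address in range(100, 132):   # a program reading 32 consecutive bytes
    cache.read(address)
print(cache.hits, cache.misses)
```

Because the 32 consecutive addresses fall into only three blocks, there are 3 misses and 29 hits, a hit rate of about 90%. Reads that jump around memory at random would fare much worse.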

Cache Strategies

Of course, caches, being small, will fill up quickly and so a strategy is needed to decide which data to remove in order to make room for newer data. For instance, one possible strategy is to remove the least recently used (LRU) data. Another strategy is to remove the least frequently used (LFU) data. These strategies are certainly not the same.

Also, there is a question of how to organize the cache. This is a bit involved, so we save the details for an appendix, and only mention here that there are three general approaches, known as direct-mapped, fully associative and set associative. The so-called 4-way set associative caches seem to be the most efficient.

It is also worth mentioning that, contrary to first impressions, larger caches are not necessarily better than smaller ones. There comes a point where the size of a cache begins to slow down the cache's performance. Current studies seem to indicate that cache size should be no more than 512 KB.
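To see that the LRU and LFU strategies really can disagree, consider the following small Python sketch. The access history and the function names are hypothetical, chosen only to make the difference visible.

```python
def choose_victim_lru(last_used):
    """Pick the entry whose most recent use lies furthest in the past."""
    return min(last_used, key=last_used.get)


def choose_victim_lfu(use_count):
    """Pick the entry that has been used the fewest times."""
    return min(use_count, key=use_count.get)


accesses = ["A", "A", "A", "B", "C", "C"]   # hypothetical access history
last_used, use_count = {}, {}
for time, item in enumerate(accesses):
    last_used[item] = time                          # time of the latest access
    use_count[item] = use_count.get(item, 0) + 1    # running total of accesses

print(choose_victim_lru(last_used))   # A: used often, but not lately
print(choose_victim_lfu(use_count))   # B: used only once
```

With this history, LRU would discard A while LFU would discard B, so the two strategies lead to different cache contents from the very first replacement.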

Write Strategies

When it comes time for the CPU to write some data to memory, two strategies are possible with regard to the cache. One is to write the data through to main memory immediately, keeping any cached copy in step. This is referred to as write-through caching. The other is to write the data to the cache only. Then the CPU can go about any other urgent business, sending the data from the cache to main memory after perhaps a short delay. This is called write-back caching. The latter procedure does produce more efficient use of the CPU's time, but is more dangerous since the data in the cache will not be an exact copy of the corresponding portion of data in main memory. Cached data that has not yet been copied to main memory, leaving main memory out of date, is referred to as dirty.
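The difference between the two write strategies can be sketched as follows. This is a minimal Python model, not a description of any real cache controller: the class names are invented, main memory is represented by a dictionary, and the write-through version is assumed to update the cached copy along with main memory.

```python
class Memory:
    """Stand-in for main memory: a mapping from address to value."""

    def __init__(self):
        self.cells = {}


class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def write(self, address, value):
        self.lines[address] = value
        self.memory.cells[address] = value   # main memory is updated at once


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()                   # addresses not yet copied to memory

    def write(self, address, value):
        self.lines[address] = value
        self.dirty.add(address)              # main memory is now out of date

    def flush(self):
        """Copy every delayed write to main memory."""
        for address in self.dirty:
            self.memory.cells[address] = self.lines[address]
        self.dirty.clear()


memory = Memory()
cache = WriteBackCache(memory)
cache.write(10, 7)
print(10 in memory.cells)    # the write has not reached main memory yet
cache.flush()
print(memory.cells[10])      # now it has
```

Anything that interrupts the program between `write` and `flush` would lose the value, which is exactly the risk that write-back caching accepts in exchange for speed.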


Other Caches

While we are on the subject of caching, we may as well mention that data caching is used with hard disks as well as with memory. The principle is exactly the same. Since a hard disk is much slower than memory, placing a cache between the hard disk and memory can increase the apparent speed at which data is transferred between these two devices.

Disk caching is often done using software, rather than hardware. (Although many hard disk controllers have built-in hardware caches, Microsoft Windows also provides a software disk cache.) In particular, a portion of memory is used as a disk cache. Then, whenever data is read from disk to memory, the disk cache is checked first to see if the data is already there, in which case it can be fetched from the software cache (in other words, from memory). A memory-to-memory transfer may be 100 times faster than a disk-to-memory transfer. If there is a cache miss, then the data is read from disk, along with additional data, in anticipation of the next request.

Disk caches can also employ either write-through or write-back caching. However, the stakes are even higher for disk caches when using write-back caching. The reason is that, when the power is turned off, the contents of main memory are permanently lost. Thus, if there is an accidental power outage (or if the user accidentally turns off the computer) during the brief period when the disk has not yet been updated from the cache, data that the user thought was saved to disk will be lost! Fortunately, this brief period is usually under 1 second and power outages are rare, so for many (but not all) situations, write-back caches are a reasonable risk. (A battery backup power supply can further safeguard against the possible loss of cached data.)

8. Memory

The purpose of memory is to store information in a manner that allows as rapid reading and writing as possible. These days, a modern computer usually has between 16 MB and 128 MB of memory, which is enough to store between 2000 and 16,000 pages of single-spaced text. On the other hand, it may not be enough to store a large, full color photograph.

Memory stores bits using electric charges. These charges need a constant power supply to remain valid. Thus, as we have mentioned, all data in memory is lost when the computer is turned off and the current is removed. In short, memory is fast, temporary storage.

Memory Chips and SIMMs

Figure 8.1 shows a typical memory chip, in actual size. The memory chip itself is a small sliver of silicon no bigger than a fingernail. However, since it is so sensitive to air that exposure would destroy it, the chip is hermetically sealed in a plastic case, as shown in the figure. The larger size of the case also provides enough room for the pins.

Memory chips are generally placed in computers in memory banks consisting of 8 or 9 chips per bank. In early PC systems, adding additional memory required carefully inserting these chips, one by one, into sockets on the motherboard or on a memory expansion board. If a fragile pin was bent, it could easily break, rendering the chip useless.

These days, most PCs are designed to accept packages of chips that are permanently soldered onto their own small circuit boards. The most popular version of this is the single inline memory module, or SIMM, shown in Figure 8.2. SIMMs are much easier to install and remove than individual chips.


[Figure 8.1 - A computer memory chip (actual size)]

[Figure 8.2 - A SIMM package]

One edge of the SIMM board has the metal contacts that take the place of pins. This edge snaps easily into a SIMM socket, eliminating the problem of breaking a pin. Thus, SIMMs are much easier to install and remove than individual memory chips, bringing the process of upgrading memory into the realm of the average user. Figure 8.3 shows how the letter E (as an ASCII codeword) might be stored in a SIMM module inside a PC.

[Figure 8.3 - The letter E in memory: E = 01000101]

Random Versus Sequential Access

Data in main memory is arranged in such a way that any portion of memory can be directly read or written, without having to wade through other portions of memory. This is referred to as random access. Accordingly, memory of this type is referred to as random access memory, or RAM. Data on a tape, for instance, is not random access, for it is not possible to access data in the middle of the tape without going through the beginning of the tape first. Thus, tape is said to be a sequential access medium.

Dynamic and Static RAM

The most common, and least expensive, type of memory uses a tiny capacitor to hold an electric charge that represents a bit of information. (A capacitor is simply a device that is designed to hold an electric charge.) The presence of a charge indicates a 1 and the absence of a charge indicates a 0, or vice versa, depending upon the manufacturer of the chip. Unfortunately, a capacitor can only hold a charge for a very brief time and so it must constantly be refreshed. For this reason, this type of RAM is called dynamic RAM, or DRAM.

A faster, and more expensive, type of memory uses very tiny switches, called flip-flops, to hold the value of a bit. Flip-flops are stable, that is, they hold their position until a new current is applied, and so refreshing is not required for this type of memory. RAM that does not require refreshing is called static RAM, or SRAM. Unfortunately, SRAM is too expensive to use in main memory, but its increased speed (approximately 3-5 times faster than DRAM) makes it very useful for relatively small level 2 cache memory.

ROM

As we mentioned, when the current is removed from memory (either DRAM or SRAM), all information is lost. However, it is possible to design memory that does not lose data when the current is turned off. Read-only memory, or ROM, is memory that is designed to hold data persistently, that is, after power has been removed. The information in early ROM chips was literally burned into the chip, making the data truly permanent, but later technologies have allowed data to be stored persistently but not permanently, as we shall see.

ROM memory is used in a PC for a variety of specific purposes. For instance, the system BIOS is stored in a ROM chip, hence the name ROM BIOS. Also, the device BIOS on an adaptor card is stored in ROM chips. ROM is used inside the Pentium microprocessor to hold the microcode that we discussed earlier.

Note that there is a potential for some confusion in terminology here, since RAM stands for random access memory, but ROM is also random access. By common usage, the distinction between RAM and ROM is that RAM can be written to, but ROM cannot (at least not easily or repeatedly). There are a variety of different strategies for making ROM chips. Let us review some of the more common ones.

PROM

One form of ROM chip is called a programmable ROM, or PROM, chip. The circuits in a PROM consist of small fuses. Initially, all fuses are intact, allowing current to flow freely. The programmer of the PROM melts some of the fuses, using a strong current, to determine the operating characteristics of the PROM chip. This process is called burning a PROM. Once the PROM is burned, its information is completely determined. This is truly permanent storage.

EPROM

It is now possible to design an erasable programmable ROM chip, or EPROM. In such a chip, a strong ultraviolet light can restore the links broken by burning the chip. EPROMs can be recognized by a small window in the middle of the chip's package, to allow the light to reach the chip. (The window is covered by a label for protection from unwanted light.) Incidentally, ordinary room light is not generally sufficient to erase an EPROM, but sunlight can be!

EEPROM

Electrically erasable programmable ROM, or EEPROM, is ROM that can be erased using an exceptionally strong current, rather than ultraviolet light. This has the distinct advantage that the chip does not have to be removed from the computer to erase it. However, the current form of EEPROM can only be erased a modest number of times, and the entire contents of the chip must be erased at one time; it is all or nothing. Thus, EEPROMs are not a suitable replacement for RAM.


Flash RAM

The latest wrinkle in the ROM saga is flash RAM. This is like EEPROM, but can be erased using ordinary levels of current. Flash RAM has found its way into the system BIOS of many modern PCs, as well as into the ROM BIOS of many peripheral devices, such as modems. It still suffers from the drawbacks of EEPROM, however, and so does not provide a substitute for ordinary RAM.

Incidentally, when shopping for a peripheral device for a computer, it pays to consider whether or not the device uses a flash RAM BIOS, for this will make it possible to easily upgrade the BIOS if new versions appear, or bugs are fixed. This is certainly far superior to replacing the chip itself.

Nothing Is Perfect

The permanence of ROM carries with it some interesting consequences, stemming from the fact that nothing is perfect. For instance, early IBM PCs suffered from a certain bug in their system BIOS that affected certain types of software. (It is probably safe to say that all system BIOSes have bugs.) This raised an interesting dilemma for software manufacturers. Should a software company deliberately alter their software, in effect introducing a deliberate bug, in order to compensate for a bug in the system BIOS? What if the next version of the BIOS no longer has the bug? Then the software will no longer run and it will appear to customers that this is due to a gratuitous bug on the part of the software company! On the other hand, if the software is not altered, it certainly will not sell, since it will not work with the defective BIOS.

VRAM

As we will see when we discuss the video system of a PC, video images are formed in a special type of memory that resides on the video adaptor card, before being displayed. Special circuitry constantly moves the memory image to the monitor, translating the digital data into the analog signals required by the monitor. This typically happens 60-80 times per second. At the same time, whenever the image changes, which may be many times per second, the CPU needs to have access to the video memory to make the necessary changes. Thus, the same location in video memory may need to be accessed at the same time by two different sources. Since ordinary DRAM cannot be read from and written to at the same time, either the CPU or the display would have to wait. This may produce delays in the display of images.

Low performance video adaptors often use DRAM for video memory, but higher performance adaptors use a special type of memory known as VRAM. A VRAM chip has two separate data paths for each bit. One path can be used for both reading and writing bits. This is the path used by the CPU. The other path is read-only and is used to refresh the image on the display. Thus, both processes can take place at the same time. This two-way design is often called dual-ported memory.

Memory Speed

Memory chips come in different speeds. The access time of a memory chip is the time between the presenting of an address to the chip and the time that the data is ready for output from the chip. The cycle time is the shortest time between successive requests for data from the chip. For DRAM chips, the cycle time is the access time plus some additional time it takes for the circuits inside the chip to prepare for the next request. This often means that the cycle time is 2 or 3 times the access time. SRAM chips do not need such preparations, and so the cycle time is the same as the access time. This is one reason why SRAM chips are faster than DRAM chips.

Most DRAM access times are in the range of 60-80 ns (billionths of a second). This means that a single request for data from a chip can be honored in 60-80 billionths of a second. SRAM chips have access times in the 15-20 ns range. On the other hand, EPROMs have access times in the 120-250 ns range, and are thus slow relative to RAM. This is why many modern computers offer the option of moving important ROM BIOS code, such as video or system BIOS, from ROM to faster RAM, a process known as shadowing ROM. (The area of RAM that stores the code is called shadow RAM.)
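The relationship between access time and cycle time lends itself to a quick calculation. The Python sketch below is only an illustration: the function name and the particular figures (a 70 ns DRAM whose cycle time is twice its access time, and a 20 ns SRAM) are assumptions picked from within the ranges quoted above.

```python
def max_requests_per_second(access_time_ns, cycle_factor=1.0):
    """Upper limit on back-to-back requests a chip can honor each second.

    cycle_factor is the ratio of cycle time to access time
    (1 for SRAM, typically 2 or 3 for DRAM).
    """
    cycle_time_ns = access_time_ns * cycle_factor
    return 1e9 / cycle_time_ns        # there are one billion ns in a second


dram = max_requests_per_second(70, cycle_factor=2)   # cycle time of 140 ns
sram = max_requests_per_second(20)                   # cycle time of 20 ns
print(round(dram), round(sram))
```

Under these assumptions the DRAM chip can honor about 7 million successive requests per second and the SRAM chip 50 million, a gap that is wider than the access times alone would suggest.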

Parity Checking

Some SIMM packages have an extra chip that is used for parity checking. The purpose of parity checking is to detect errors that may occur when data is read. The principle of parity checking is extremely simple. To each byte (8 bits), we associate an extra bit, called an even parity check bit. This parity check bit is chosen so that there are always an even number of 1's in the 9 bits. Thus, if the byte already has an even number of 1's, the parity check bit is a 0, but if the byte has an odd number of 1's, the parity check bit is a 1. For instance, the ASCII code for E is 01000101 and since this has an odd number of 1's, the parity check bit is a 1 (bringing the total number of 1's to 4, which is even). Figure 8.4 illustrates a SIMM with an even parity check chip.

Now, here is the point. If a glitch should occur and exactly one of the 9 bits were to be altered, then there would be an odd number of 1's in those 9 bits. When the computer reads these 9 bits, it does a quick parity check and discovers that there are an odd number of 1's. This results in a parity check error. The user will then get a nasty error message. Repeated parity errors require that the SIMM be replaced. Fortunately, modern memory is generally quite reliable, and you may never encounter a parity check error.

[Figure 8.4 - SIMM with parity chip: E = 01000101 stored together with its even parity bit]

Note that the manufacturer of the computer decides whether or not to support parity checking. If a PC supports parity checking, all SIMMs inserted into that PC must have parity check support.

Note also that parity checking can detect the presence of any odd number of errors, because this will result in an incorrect parity. However, any even number of errors will go undetected. We should also emphasize that parity checking can only detect errors. There are other, more sophisticated approaches that can not only detect single errors, but also correct them! This technology is available in some PCs, and is referred to as error-correcting code, or ECC.
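The even parity scheme is simple enough to try out directly. The following Python sketch uses the letter E from the text; the function names are made up for this illustration, and in a real PC the check is of course done by hardware rather than by a program.

```python
def even_parity_bit(byte):
    """Return the bit that makes the total number of 1's in the 9 bits even."""
    return bin(byte & 0xFF).count("1") % 2


def passes_parity_check(byte, parity_bit):
    """True if the 8 data bits plus the parity bit hold an even number of 1's."""
    return (bin(byte & 0xFF).count("1") + parity_bit) % 2 == 0


E = 0b01000101                  # ASCII code for the letter E
p = even_parity_bit(E)
print(p)                                         # three 1's, so the bit is 1
print(passes_parity_check(E, p))                 # stored correctly
print(passes_parity_check(E ^ 0b00000100, p))    # one bit altered: detected
print(passes_parity_check(E ^ 0b00000110, p))    # two bits altered: missed
```

The last line shows the weakness noted above: when two bits change, the count of 1's is even again and the error slips through.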

SIMM Packaging

Memory SIMMs are packaged in a variety of ways. From the consumer's point of view, the issues that matter are




The capacity of the SIMM, usually measured in megabytes,



The number of pins on the SIMM,



Whether or not the SIMM supports parity checking.

SIMM Capacity

The capacity of a SIMM refers, of course, to the amount of data that the SIMM can hold. SIMMs come in a variety of capacities, commonly ranging from 2 MB to 32 MB. One important practical issue related to SIMM capacity stems from the limited number of SIMM slots on a PC motherboard. The number is usually 4 or 6.

Consider a PC with only 4 SIMM slots. If you purchase such a PC with 16 MB of RAM, the dealer may install four 4-MB SIMMs, since it may be cheaper than installing two 8-MB SIMMs. This will fill up all of the SIMM slots, which is bad news if you later decide to add more memory. For instance, on a Pentium PC, which requires that SIMMs be installed in matching pairs (we will see why in a moment), to add an additional 8 MB, you will need to remove two 4-MB SIMMs and replace them with two 8-MB SIMMs. To add 16 MB, you would need to replace all four 4-MB SIMMs! This is a major waste of chips and money.

Thus, when buying a PC, look for a motherboard with 6 SIMM slots or, failing that, make certain that the dealer installs the smallest number of SIMMs that will give you the desired memory, in this case two 8-MB SIMMs. This will leave you free to upgrade without throwing away perfectly good memory.

SIMM Pin Count

The pins (actually metal tabs) on the bottom edge of a SIMM slip into the SIMM slots on the motherboard. There are two common pin counts on SIMMs. Of course, the number of data pins on the SIMM must match the data bus width. However, a SIMM needs more than just data pins, so the pin count is larger than the data width. In fact, SIMMs designed for an 8-bit data bus have 30 pins and SIMMs designed for a 32-bit data bus have 72 pins. However, since a modern Pentium PC has a 64-bit wide data bus, 72-pin SIMMs must be installed in pairs in these PCs. (Some older Pentiums have 32-bit data buses.)
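The reason for installing SIMMs in pairs comes down to one division, sketched here in Python. The function name is hypothetical, and the second example (8-bit SIMMs on a 32-bit bus) is an assumed configuration included only to show the same arithmetic with different numbers.

```python
def simms_per_bank(data_bus_bits, simm_data_bits):
    """Number of identical SIMMs needed to span the full width of the data bus."""
    if data_bus_bits % simm_data_bits != 0:
        raise ValueError("the bus width must be a multiple of the SIMM data width")
    return data_bus_bits // simm_data_bits


print(simms_per_bank(64, 32))   # 72-pin SIMMs on a 64-bit Pentium bus
print(simms_per_bank(32, 8))    # 30-pin SIMMs on a 32-bit bus (assumed example)
```

The first result, 2, is the matching pair that the text calls for on a Pentium PC.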


SIMM Labeling

The labeling that memory sellers use to describe SIMM chips can be a bit confusing. A SIMM is often labeled using the following format:

    SIMM depth in Megs x Data width (including parity) in bits - Pin count

To explain this format, we need to explain the SIMM depth. This is simply the number you need to multiply by the data width, not including parity, to get the SIMM's capacity. For example, an (8x32-72) SIMM has 72 pins, a data width of 32 bits (that is, 4 bytes), does not use parity, and has a SIMM depth of 8 megs. Hence, its capacity is

    Capacity = 8 Megs x 32 bits = 8 Megs x 4 bytes = 32 MB

As another example, a (4x36-72) SIMM has 72 pins, 32 data bits (that is, 4 data bytes) plus 4 more bits for parity and a SIMM depth of 4 megs. The capacity is thus

    Capacity = 4 Megs x 32 bits = 4 Megs x 4 bytes = 16 MB
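The capacity calculation can be automated with a short Python sketch. The function name `simm_capacity_mb` is made up, and the rule it uses to spot parity (a width that is a multiple of 9 bits, meaning one parity bit per data byte) is an assumption inferred from the two examples above rather than an industry definition.

```python
import re


def simm_capacity_mb(label):
    """Read a label such as '8x32-72' and return (capacity in MB, parity?, pins)."""
    match = re.fullmatch(r"\(?(\d+)x(\d+)-(\d+)\)?", label.strip())
    if not match:
        raise ValueError(f"unrecognized SIMM label: {label!r}")
    depth_megs, width_bits, pins = (int(group) for group in match.groups())
    # Assumption: a width divisible by 9 carries one parity bit per data byte.
    has_parity = width_bits % 9 == 0
    data_bits = width_bits * 8 // 9 if has_parity else width_bits
    capacity_mb = depth_megs * data_bits // 8      # 8 bits to the byte
    return capacity_mb, has_parity, pins


print(simm_capacity_mb("8x32-72"))
print(simm_capacity_mb("4x36-72"))
```

The two calls reproduce the worked examples: 32 MB without parity and 16 MB with parity, both on 72 pins.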

SIMM Chip Count

It can be a bit of a surprise to buy a new SIMM for a PC and find that, while all the previously installed SIMMs in your PC have 8 chips, the new SIMM has only 2 chips! Actually, it is possible to reach a given capacity in a variety of ways that involve a different number of chips on the SIMM. Sometimes, it is less expensive for the manufacturer to use fewer chips each having a larger individual capacity. (My PC presently houses 4 SIMMs with 8 data chips each and 2 SIMMs with 2 data chips each.) The details on which chip combinations are possible are a bit involved, and so we will save them for an appendix.

Chip Labeling Each memory chip is stamped with a label to indicate its properties. There are occasions when it is useful to understand at least a portion of this labeling. Here is an example of how the chip manufacturer Micron Technology labels their chips:


    MT4LC1M16ESTG-7

The important items to note here are the characters 1M16 and the -7 (at the end). The former denotes the dimensions of the chip, which in this case is 1 Meg by 16; that is, the chip has a width of 16 bits and a depth of 1 Meg, for a total capacity of 16 megabits. (Chip width and depth are further explained in the appendix, in the discussion of SIMM chip counts.) Thus, 2 of these chips on a single SIMM board would produce a 1x32 SIMM with capacity 4 MB.

The last digit (following the hyphen) refers to the speed of the chip. By appending a 0 to the end, we get 70 ns. Unfortunately, labeling varies among chips and among manufacturers, but at least this gives you something to work from if you ever want to read a chip label.
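As a rough illustration of reading such a label, the following Python sketch pulls out the depth, width and speed. The function name is hypothetical, and the pattern is an assumption tuned to this one example; as noted above, labeling varies, so it should not be expected to handle other manufacturers' labels.

```python
import re

def decode_chip_label(label):
    """Extract depth (Megs), width (bits) and speed (ns) from a label like the one above."""
    dims = re.search(r"(\d+)M(\d+)", label)   # e.g. '1M16' = 1 Meg deep, 16 bits wide
    speed = re.search(r"-(\d+)$", label)      # e.g. '-7' = 70 ns
    if dims is None or speed is None:
        raise ValueError("label does not follow the assumed pattern")
    depth_megs, width_bits = int(dims.group(1)), int(dims.group(2))
    return {
        "depth_megs": depth_megs,
        "width_bits": width_bits,
        "capacity_megabits": depth_megs * width_bits,
        "speed_ns": int(speed.group(1)) * 10,  # append a 0 to the digit
    }

chip = decode_chip_label("MT4LC1M16ESTG-7")
# Two such chips side by side give a 32-bit wide SIMM.
simm_mb = 2 * chip["capacity_megabits"] // 8
print(chip, simm_mb)
```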

How Memory Works The story of how a memory chip actually works is a fascinating one. However, since it is a bit involved, we have placed it in an appendix. If you are curious, by all means take a look at that appendix now.

Logical Memory Organization Let us turn now from the physical layout of memory to its logical layout. The address space of a PC can be divided into three logical parts (conventional memory, upper memory and extended memory), as shown in the memory map in Figure 8.5. Note that this is a logical division; that is, a division based on how the memory is used. Note also that, as previously mentioned, not all of this address space need be populated with actual memory chips.


Figure 8.5 - A memory map

(Figure, from the bottom up: conventional memory, 0 KB to 640 KB, with roughly the first 30-140 KB reserved for the operating system and the rest for applications; upper memory, 640 KB to 1024 KB = 1 MB (384 KB), holding video RAM, ROM BIOS, other ROM, a 64 KB page frame for possible expanded memory, and some usually empty areas; the high memory area just above 1 MB; extended memory above that, up to 16 MB on the 286 and 4 GB on the 386, 486 and Pentium, none on the 8088, primarily for use by Windows-based applications.)

Conventional Memory The first 640 KB of memory space is called conventional memory. The first 1024 bytes of conventional memory is used for the interrupt vector table, discussed in a previous chapter. The next 256 bytes is used for the BIOS data area, which we also discussed earlier. The DOS operating system then occupies some memory, generally between 20 KB and 90 KB, depending upon how the system is configured (a task usually left up to the user, by the way). Next come various software drivers (mouse, sound card, etc.). Finally, the remainder of conventional memory (between about 450 KB and 620 KB) is free to be used by applications, both DOS and Windows based. In fact, conventional memory was the only available memory for the first 8088-based PCs. Note that 450 KB of conventional memory is not sufficient for many applications, a problem that can be addressed by loading some software in upper memory, as described next.

Upper Memory The next portion of memory, called upper memory, is filled in a rather eclectic way. It is generally used by hardware adaptor cards, such as video and hard disk controllers, for their device BIOS code. Thus, there is usually no physical memory on the motherboard corresponding to these addresses. But, for example, the BIOS on a video card is assigned addresses in this range. We have already discussed some of the problems that may arise with memory conflicts in this address range.

It is also often the case that some portions of upper memory are not being used. When this happens, it may be possible to move some of the operating system files and device drivers that normally place themselves in conventional memory into free "holes" in the address space of upper memory. This is referred to as loading software in upper memory. Free blocks of upper memory are called upper memory blocks, or UMBs. Note that actual physical memory must be borrowed from what would otherwise be used by running applications (extended memory) in order to populate upper memory.

High Memory There is also a very small portion of memory addresses (a little less than 64 KB) that is always available to DOS-based programs running under all processors except the 8088. It is located immediately above the 1 MB range and called the high memory area, or HMA. It is thus often possible to load software in high memory, helping to free up as much of the 640 KB of conventional memory as possible.

If you are curious about where the HMA came from, here is the story. You can skip this discussion if desired. As we have seen, the 8088 has a 20-bit address bus, allowing for a 2^20 = 1 Meg address space. The linear address range is thus 0 to 2^20 - 1 = FFFFFh. However, the 8088 registers are only 16 bits wide. For this reason, addresses are generally expressed in segmented form. We will discuss this form in detail in the appendix on real and protected mode. Suffice it to say for now that segmented addresses are usually written in the form segment:offset. For example, FFFF:0000 is an address in segment:offset form. Each portion (segment and offset) fits nicely into a 16-bit register.
To convert a segmented address to a linear address, simply put a 0 on the right end of the segment and add the offset to the resulting number, as follows:

    FFFF:0000 = FFFF0 + 0000 = FFFF0

Now, observe that the address (in segment:offset format) that corresponds to the very last linear address that the 8088 can reach (FFFFF) is FFFF:000F, since

    FFFF:000F = FFFF0 + 000F = FFFFF


Thus, because of the nature of the segment:offset scheme, the registers in an 8088 can store the addresses

    FFFF:0010 = FFFF0 + 0010 = 100000h = 1,048,576 decimal = 1 meg

through

    FFFF:FFFF = FFFF0 + FFFF = 10FFEFh = 1,114,095 decimal

which consists of 65,520 bytes of memory beyond the 1 MB range of the 8088. In summary, the registers of the 8088, through the use of segmented address format, can recognize some addresses (64 KB minus 16 bytes) beyond what the actual address lines of the 8088 can access. These addresses cannot be physically addressed because the 8088 only has 20 physical address lines.

However, the 80286 and later processors have more than 20 address lines! Thus, when an 80286, for instance, is operating in real mode, which is designed to emulate the 8088, special circuitry in the chip is used to turn off the 21st address line (known as A20, since the first address line is A0). Thus, any attempt to address memory beyond 1 MB is defeated. (Actually, addresses starting at 1 MB and above are wrapped back around to 0.) However, by reactivating the A20 address line, the 80286 can physically access the roughly 64 KB address space that the 8088 cannot physically reach. This address space is the high memory area, or HMA.
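The segment:offset arithmetic and the effect of the A20 line can be tried out in a short Python sketch. The function and parameter names are made up for illustration; the wraparound is modeled simply by masking off everything above the lowest 20 address bits, which is how the text describes the behavior when A20 is turned off.

```python
def linear_address(segment, offset, a20_enabled=True):
    """Convert segment:offset to a linear address (real-mode style)."""
    # Putting a hex 0 on the right end of the segment is a multiplication by 16.
    address = (segment << 4) + offset
    if not a20_enabled:
        # With A20 off, only 20 address lines count, so 1 MB wraps back to 0.
        address &= 0xFFFFF
    return address

first_hma = linear_address(0xFFFF, 0x0010)
last_hma = linear_address(0xFFFF, 0xFFFF)
print(hex(first_hma), hex(last_hma), last_hma - first_hma + 1)
print(hex(linear_address(0xFFFF, 0x0010, a20_enabled=False)))
```

The difference between the last and first HMA addresses, plus one, gives the 65,520 bytes mentioned above.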

Expanded Memory It did not take long in the PC's history for DOS-based programs to run into the 640 KB conventional memory barrier. This prompted some large software and hardware companies to join forces and come up with a method for allowing DOS to access more than 640 KB of memory without realizing it. This trick is called the expanded memory specification, or EMS (which should not be confused with extended memory). The idea behind EMS is relatively simple, but we cover it only briefly, since it is no longer a major tool for increasing memory. These days, it is used primarily because some older applications were written to use it.

We should emphasize before beginning that expanded memory is a technique for increasing the effective address space of a PC. It is not a region of memory in the same sense as conventional, upper or extended memory. To implement EMS, you need two things: a special memory board, called an EMS memory board, containing some physical memory; and a software program called an EMS memory manager, to manage the EMS process. It is also possible to devote some RAM to use in place of the EMS board, if you care to spare the RAM. Note that EMS will work only with applications designed to take advantage of it.

The EMS memory board is used as follows. A total of 64 KB of physical memory on the board is given addresses in a consecutive memory block of upper memory, called an EMS page frame. (See Figure 8.6.) Addresses in the page frame, being within the first 1 MB, are accessible even to an 8088 microprocessor. The rest of the memory on the EMS board is not directly accessible to the microprocessor.

Now, if an EMS-aware program requests some data, the EMS memory manager checks to see if that data happens to lie in the page frame (an EMS page frame hit). If so, the program can retrieve the data. If not (an EMS page frame miss), the EMS manager finds it on the EMS expansion board and swaps it with data that is in the page frame. The swapping takes place in chunks of 16 KB. Once the swapping is complete, the program can access the data. Thus, in effect, the program has more memory than it thinks it does! (Note the resemblance to caching, but for a different purpose: caching is done for speed, whereas EMS is used to increase the effective address space.)

Figure 8.6 - The expanded memory specification (EMS)

(Figure: an EMS board divided into 16 KB pages, some of which are mapped into the 16 KB slots of the EMS page frame.)
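The hit-and-miss behavior of the page frame can be mimicked with a small Python model. This is a simplified sketch under stated assumptions, not how any real EMS manager is written: the class and method names are invented, the page frame is taken to hold four 16 KB pages (64 KB divided by 16 KB), and the choice of which page to swap out (simple round-robin) is an assumption, since the text does not say.

```python
PAGE_SIZE = 16 * 1024      # swapping takes place in 16 KB chunks
FRAME_SLOTS = 4            # 64 KB page frame / 16 KB pages

class EmsBoard:
    def __init__(self, total_pages):
        # Fill each page with its own page number so reads are easy to check.
        self.pages = [bytearray([n % 256]) * PAGE_SIZE for n in range(total_pages)]
        self.frame = [None] * FRAME_SLOTS   # which board page sits in each frame slot
        self.next_slot = 0                  # round-robin replacement (an assumption)
        self.hits = 0
        self.misses = 0

    def _slot_for(self, page):
        if page in self.frame:              # page frame hit
            self.hits += 1
            return self.frame.index(page)
        self.misses += 1                    # page frame miss: swap the page in
        slot = self.next_slot
        self.frame[slot] = page
        self.next_slot = (slot + 1) % FRAME_SLOTS
        return slot

    def read(self, board_offset):
        """Return (byte value, offset inside the 64 KB page frame)."""
        page, offset = divmod(board_offset, PAGE_SIZE)
        slot = self._slot_for(page)
        return self.pages[page][offset], slot * PAGE_SIZE + offset

board = EmsBoard(total_pages=64)            # a 1 MB board
board.read(0)                               # miss: page 0 is swapped in
board.read(10)                              # hit: page 0 is already in the frame
print(board.hits, board.misses, board.frame)
```

However large the board, every offset the program actually touches lies inside the 64 KB page frame, which is the point of the technique.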

Extended Memory Extended memory refers to the memory addresses above the 1 MB range. Microsoft Windows and its applications are the primary users of extended memory. It is not uncommon for a computer system to have a total of 16-128 MB of memory. Thus, after taking into account the 640 KB of conventional memory, there is at least 15 MB for extended memory. Windows 3.1 requires the computer user to install an extended memory manager or EMM, called HIMEM, in order to take advantage of extended memory. However, Windows 95 and Windows NT have an extended memory manager built into the operating system.

Memory Blocks If you speak much with hardware technical support personnel, sooner or later you may discuss memory addresses. Most computer experts(?) use hexadecimal numbers for addresses and speak in terms of upper memory address blocks, as shown in Figure 8.7. Note that each block is 64 KB long. They are knowingly referred to as the A block, B block, C block, and so on. I am not aware of any analogy with the blocks of a prison, however. As we will see, the A and B blocks, and often the first half of the C block, are reserved for video memory.

    Address range    Block                  Range in KB
    F0000 - FFFFF    F block                960 KB - 1023 KB
    E0000 - EFFFF    E block                896 KB - 959 KB
    D0000 - DFFFF    D block                832 KB - 895 KB
    C0000 - CFFFF    C block                768 KB - 831 KB
    B0000 - BFFFF    B block                704 KB - 767 KB
    A0000 - AFFFF    A block                640 KB - 703 KB
    00000 - 9FFFF    Conventional memory    0 KB - 639 KB

Figure 8.7 - Upper memory address blocks (64 KB each)
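Since each upper memory block is 64 KB (10000h) long, the block letter is just the leading hex digit of the five-digit address. The Python sketch below uses this to classify a linear address. The function name is hypothetical, and treating the HMA as its own category (rather than as the first part of extended memory) is a choice made here to match the discussion above.

```python
def memory_region(address):
    """Name the logical region that a linear address falls in."""
    if address < 0xA0000:
        return "conventional memory"
    if address <= 0xFFFFF:
        # Shifting right by 16 bits leaves the leading hex digit: A through F.
        return "upper memory, %X block" % (address >> 16)
    if address <= 0x10FFEF:
        return "high memory area"
    return "extended memory"

for addr in (0x0041E, 0xB8000, 0xC8000, 0xFFFF0, 0x100000, 0x200000):
    print("%06X" % addr, memory_region(addr))
```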

Real and Protected Modes If you have been around computers for a while, there is a reasonable chance you have heard the terms real mode and protected mode. In general terms, all PC microprocessors, starting with the 80286, can run in either real or protected mode.

The main functional differences are that in real mode, the processor can only address the same 1 MB as the original 8088 processor (as well as the HMA discussed earlier, for 80286 and later processors). Moreover, a running application has access to all of this address space, and can thus, for example, overwrite a portion of memory that is currently being used by the operating system (DOS). This usually brings the system to a halt, called a system crash, and requires that the user hit the reset button (or turn off the power), losing all currently unsaved data.

On the other hand, protected mode, which is used in a multitasking environment such as Microsoft Windows, is designed to prevent running applications from encroaching upon the memory reserved either for other running applications or for the operating system (Windows and/or DOS). Moreover, protected mode operation has a feature known as virtual memory management, which allows an application to think it has access to the entire 4 GB memory address space (for an 80386 and above)!

Real mode exists for the purposes of backward compatibility with the thousands of applications that have been written for this mode. When a PC is first started, it powers up in real mode. Microsoft Windows will switch the processor into protected mode. Readers who are interested in more details about real and protected mode will find an appendix on the subject at the back of the book.

Virtual 8086 Mode Virtual 8086 mode was introduced in the 80386 in order to allow multiple DOS sessions. DOS applications cannot run in protected mode, since they do not understand the addressing scheme. On the other hand, real mode offers no memory protection. Thus, a new mode was created that uses the addressing scheme of real mode but some of the safeguards of protected mode.

The main feature of Virtual 8086 mode is that it allows the processor to create multiple virtual machines, each of which appears to operate in real mode. Each virtual machine accesses 1 MB of memory using the real mode addressing scheme and can host a DOS session. Thus, multiple DOS sessions can be running at one time. Moreover, Virtual 8086 mode allows the operating system to control the I/O operations of each virtual machine separately. Thus, for instance, each virtual machine has a virtual keyboard, virtual video memory, virtual parallel ports, and so on. In this way, the operations of each virtual machine can be kept separate, and the operating system can decide which machine currently has access to the real (physical) I/O facilities of the PC.

9. Keyboards

On the surface, a keyboard looks like a fairly simple device, and its function is quite simple. However, as we will see, underneath the surface, quite a lot happens when a key is struck. Figure 9.1 shows a close-up of a typical modern PC keyboard. Actually, keyboard configurations have varied over the years, starting with a PC keyboard that had only 83 keys. These days, most modern keyboards have a little over 100 keys. While there is still some variation in keyboard design, it does not affect the overall functioning of the keyboard, so we will not dwell on these minor differences.

(Figure labels: function keys, typewriter keys, cursor keys, numeric keypad.)

Figure 9.1 - A typical PC keyboard

The keys on a keyboard are divided into four main groups. With a few exceptions, all keys have what is referred to as a typematic action, which means that if you hold down a key, it will repeat until you release it. The original PC keyboard had a fixed typematic repeat rate of 10 characters per second and a fixed typematic delay of 0.5 seconds before the typematic action would begin. Newer keyboards have adjustable repeat rates and delays.

The main portion of the keyboard consists of the usual typewriter keys, along with a few special keys. The special keys in the lower row marked Ctrl are called the control keys. When one of these keys is held down, it changes the meaning of each of the ordinary character keys. This can be an extremely useful feature in many contexts. The keys marked Alt, and referred to as the alternate keys, have a similar purpose.

The keys on the top row of the keyboard, marked F1 through F12, are called function keys. These keys are used for different purposes by different programs. For instance, it is more or less common practice for the F1 function key to invoke the help system for an application.

The keys on the far right of the keyboard comprise the numeric keypad and are used to enter numbers. These keys are arranged in the form of a calculator keypad, thus allowing for the rapid entering of large quantities of numerical data into programs for which this is appropriate (such as spreadsheet programs and accounting programs).

The cursor control keys are used as positioning keys, which is why they are marked with arrows and with words such as Home, End, PgUp (for Page Up), and PgDn (for Page Down).

Physical Operation Keyboards contain dedicated microprocessors designed to send information about which keys are pressed to the keyboard controller on the motherboard of the PC. Inside the keyboard, there is a matrix of criss-crossed wires, referred to as x-lines and y-lines, as shown in Figure 9.2. At each intersection of an x-line and a y-line is a switch that is closed when the corresponding key is pressed. When the switch closes, it connects the underlying x and y lines. Circuitry in the keyboard constantly sends a current through the x-lines. If a switch is closed (key pressed), the current will flow onto the corresponding y-line. In this way, the keyboard processor can detect when a given key is pressed and also when it is released. In order to do this, the keyboard circuitry must constantly poll every y-line in the keyboard, to check for a current. This may seem inefficient. However, the circuitry is fast enough not to cause any serious delays.
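The polling idea can be sketched in Python. This is a toy model rather than real keyboard firmware: the function names are invented, the matrix size (8 x-lines by 16 y-lines) is an assumption, and the set of closed switches stands in for the physical key switches.

```python
def scan_matrix(closed_switches, x_lines=8, y_lines=16):
    """Drive each x-line in turn and poll every y-line for current."""
    pressed = set()
    for x in range(x_lines):
        for y in range(y_lines):
            # Current reaches y-line y only if the switch at (x, y) is closed.
            if (x, y) in closed_switches:
                pressed.add((x, y))
    return pressed

def key_events(previous_scan, current_scan):
    """Compare two scans to find newly pressed and newly released keys."""
    presses = sorted(current_scan - previous_scan)
    releases = sorted(previous_scan - current_scan)
    return presses, releases

before = scan_matrix(set())
during = scan_matrix({(2, 5)})     # one key held down
after = scan_matrix(set())
print(key_events(before, during))  # the key at (2, 5) was pressed
print(key_events(during, after))   # the key at (2, 5) was released
```

Comparing each scan with the previous one is what lets the keyboard processor report releases as well as presses.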


Figure 9.2 - Inside a keyboard

(Figure: a grid of x-lines, numbered x-line 0 through x-line 7, crossed by y-lines, with a key switch at each intersection.)

Each key on the keyboard has associated with it a special number, called a scan code (also called a make scan code). Once the keyboard processor has determined which key was pressed, it sends the make scan code for that key to the keyboard controller. Similarly, if a key is released, the keyboard processor adds 128 (= 80h) to the key's make scan code to produce the key's break scan code, which it then sends to the keyboard controller. Most scan codes are one byte long, but some are more than one byte. Note that each key has a unique scan code different from all other keys, even if the keys have the same labels. For instance, there are two shift keys on a keyboard, but they each have a different scan code. Knowing the values of the scan codes is not particularly important, because very few programmers work with these codes directly.

Each byte of keyboard data is sent through the keyboard cable along a single wire. Thus, keyboards communicate serially, that is, one bit at a time, paced by a clock signal that travels along another wire in the keyboard cable. In fact, each keyboard byte is preceded by a start bit, which is always a 0, followed in succession by each bit of the byte, followed in turn by a parity check bit, and then finally a stop bit, which is always equal to 1.
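Both ideas, the break code and the serial frame, are easy to express in Python. Treat this as a sketch with two assumptions that the text does not spell out: the data bits are sent least significant bit first, and the parity bit is chosen to make the number of 1 bits odd. The function names are invented for illustration.

```python
def break_code(make_code):
    """Break scan code for a one-byte make scan code, as described above."""
    return make_code + 0x80

def serial_frame(byte):
    """Return the bits sent for one keyboard byte, in transmission order."""
    data = [(byte >> i) & 1 for i in range(8)]   # assumption: LSB first
    parity = 1 - (sum(data) % 2)                 # assumption: odd parity
    return [0] + data + [parity] + [1]           # start bit, data, parity, stop bit

a_key = 0x1E                                     # make scan code of the A key
print(hex(break_code(a_key)))
print(serial_frame(a_key))
```

Each byte thus costs 11 bit times on the wire: one start bit, eight data bits, one parity bit and one stop bit.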

Logical Operation Keyboards designed for the original PC supported only unidirectional communication; that is, all communication was directed from the keyboard to the PC. Subsequent keyboards are more intelligent and partake in bidirectional communication. Thus, a keyboard can accept commands from the keyboard controller. This allows the user to adjust certain features of the keyboard, such as its repeat rate and delay, usually through the operating system (such as Microsoft Windows).

The Keyboard Buffer When a key is pressed, the keyboard processor sends the appropriate scan code to the keyboard controller, where it can be read through I/O port number 96 (= 60h). The keyboard controller issues a hardware interrupt on IRQ1. The microprocessor then executes the BIOS interrupt 9 service routine. This routine figures out the corresponding ASCII code for the key, if the key is an ordinary character key (letter, digit, punctuation mark, etc.). It then places the scan code and the ASCII code in a special location in the BIOS data area (in low memory) called the keyboard buffer. The rules for what gets placed in the keyboard buffer when a non-ASCII key is pressed are a bit involved, and are not important for our discussion.

The keyboard buffer has room for 16 keystrokes. If you overrun the buffer by typing too fast for the application that is using the keystrokes, you will hear a warning beep and some keystrokes will be lost, since they never make it into the buffer. (There are methods for enlarging the keyboard buffer to hold more keystrokes, chiefly by relocating it in a different section of memory.)

An application can retrieve keystrokes from the keyboard buffer by using a different BIOS service routine, namely, interrupt 16h. In short, interrupt 9 is used to fill the keyboard buffer and interrupt 16h is used to read the buffer. When the key is released, a similar procedure takes place, resulting in an interrupt 9. However, the interrupt 9 service routine ignores the event, which is why nothing special happens (usually) when a key is released.
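The division of labor between the two interrupts can be modeled as a small producer and consumer pair in Python. This is a sketch only: the class and method names are invented, a plain queue stands in for the BIOS's buffer, and the capacity of 16 keystrokes is taken from the text. Each entry is stored as an ASCII byte followed by a scan code byte, and the read routine returns the pair packed the way interrupt 16h reports it, with the scan code in the high byte (AH) and the ASCII code in the low byte (AL).

```python
from collections import deque

class KeyboardBuffer:
    CAPACITY = 16                       # keystrokes, per the text

    def __init__(self):
        self.entries = deque()
        self.lost = 0

    def int09_key_pressed(self, scan_code, ascii_code):
        """Producer side: store the keystroke, or drop it if the buffer is full."""
        if len(self.entries) >= self.CAPACITY:
            self.lost += 1              # the real BIOS would beep here
            return False
        self.entries.append((ascii_code, scan_code))
        return True

    def int16_read_key(self):
        """Consumer side: return AX (AH = scan code, AL = ASCII), or None if empty.

        The real service waits for a keystroke instead of returning None.
        """
        if not self.entries:
            return None
        ascii_code, scan_code = self.entries.popleft()
        return (scan_code << 8) | ascii_code

    def raw_bytes(self):
        """The buffer contents as they would appear in a memory dump."""
        return bytes(b for entry in self.entries for b in entry)

buf = KeyboardBuffer()
buf.int09_key_pressed(0x1E, 0x61)       # the A key, lower case 'a'
print(hex(buf.int16_read_key()))
```

Because every keystroke occupies two bytes, the typed characters show up in every other position when the buffer is dumped.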


Experiment-Looking at the Keyboard Buffer

In case you missed it in Chapter 3, let's take a look at the keyboard buffer.

1. Save all current work and start debug, as described in the appendix (Trying the Experiments).

    C:\WIN95>debug ↵

2. Next, let us fill the keyboard buffer with junk. Press and hold down the x key for a second, to generate several x's. Then hit the Enter key. You should get the following:

    -xxxxxxxxxxxxxxxxxxxxxxxxxx ↵
     ^ Error

3. Now do a dump of memory at the address of the keyboard buffer.

    -d 0:41e ↵

You should see the following. The keyboard buffer begins with the last two bytes of the first row.

    0000:0410                                            78 2D                 x-
    0000:0420  0D 1C 64 20 20 39 30 0B-3A 27 34 05 31 02 65 12   ..d  90.:'4.1.e.
    0000:0430  0D 1C 78 2D 78 2D 78 2D-78 2D 78 2D 78 2D 81 00   ..x-x-x-x-x-x-..
    0000:0440  B7 00 20 00 00 00 00 00-00 03 50 00 00 10 00 00   .. .......P.....
    0000:0450  00 18 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
    0000:0460  0E 00 00 04 03 29 30 C2-11 45 87 FF 23 0F 00 00   .....)0..E..#...
    0000:0470  00 00 00 00 00 01 00 00-14 14 14 28 01 01 01 01   ...........(....
    0000:0480  1E 00 3E 00 18 10 00 60-F9 11 0B 03 00 00 00 77   ..>....`.......w
    0000:0490  80 07 07 00 00 00 10 12-A0 00 40 00 B0 F0         ..........@...

If you look closely at every other symbol in the keyboard buffer on the far right, you can see the x's. You can also make out the very last keystrokes we entered in order to perform the memory dump, which also ended up in the keyboard buffer. (The keystrokes occupy every other character because, as described above, each keystroke is stored as an ASCII code together with a scan code.)

4. Now enter a different string of letters, say by holding down the z key.

    -zzzzzzzzzzzzzzzzzzzzzzzz ↵
     ^ Error

and do another dump of memory.

    -d 0:41e ↵


The results are similar, but with z's instead of x's.

    0000:0410                                            7A 2C                 z,
    0000:0420  7A 2C 0D 1C 64 20 20 39-30 0B 3A 27 34 05 31 02   z,..d  90.:'4.1.
    0000:0430  65 12 0D 1C 7A 2C 7A 2C-7A 2C 7A 2C 7A 2C 81 00   e...z,z,z,z,z,..
    0000:0440  86 00 20 00 00 00 00 00-00 03 50 00 00 10 00 00   .. .......P.....
    0000:0450  00 18 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
    0000:0460  0E 00 00 04 03 29 30 C2-11 45 87 FF 55 E0 00 00   .....)0..E..U...
    0000:0470  00 00 00 00 00 01 00 00-14 14 14 28 01 01 01 01   ...........(....
    0000:0480  1E 00 3E 00 18 10 00 60-F9 11 0B 03 00 00 00 77   ..>....`.......w
    0000:0490  80 07 07 00 00 00 10 12-A0 00 40 00 B0 F0         ..........@...

A few other data values in the BIOS data area are also visible in this dump:

    Number of video columns (50h = 80 columns): the byte 50 at address 0000:044A
    Current cursor position (column 0, row 18h = 24): the bytes 00 18 at address 0000:0450
    Number of hard drives installed: the byte 01 at address 0000:0475

5. Quit debug using the q command.

    -q ↵

End of Experiment

Experiment-Trying the Interrupt 16h Keyboard Service Routine

Service number 10h of interrupt 16h is used to read a character from the keyboard buffer. According to the documentation, this service routine will wait for a keystroke to appear in the keyboard buffer, and then place the scan code of the key in the AH register and the ASCII code (if there is one) in the AL register. Let's give it a try.

1. Save all current work and start debug, as described in the appendix (Trying the Experiments).

    C:\WIN95>debug ↵

2. Assemble the two-line program shown below.

    -a 100 ↵
    116B:0100 mov ah,10 ↵
    116B:0102 int 16 ↵
    116B:0104 ↵

3. Run the program.

    -g=100 104 ↵

The program is now waiting for a keystroke. Strike the A key. You should see something like the following:

    AX=1E61  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
    DS=116B  ES=116B  SS=116B  CS=116B  IP=0104   NV UP EI PL NZ NA PO NC
    116B:0104 10CD          ADC     CH,CL

Observe that the AH register contains 1Eh, which is the scan code for the A key, and the AL register contains 61h (= 97 decimal), which is the ASCII code for the lower case "a".

4. Run the program again.

    -g=100 104 ↵

Hold down the Shift key and strike the A key. You should see something like the following:

    AX=1E41  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
    DS=116B  ES=116B  SS=116B  CS=116B  IP=0104   NV UP EI PL NZ NA PO NC
    116B:0104 10CD          ADC     CH,CL

Now the AH register still contains 1Eh, which is the scan code for the A key, but the AL register contains 41h (= 65 decimal), which is the ASCII code for the upper case "A".

5. Run the program once more.

    -g=100 104 ↵

Strike the ESC key. You should see something similar to the following:

    AX=011B  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
    DS=116B  ES=116B  SS=116B  CS=116B  IP=0104   NV UP EI PL NZ NA PO NC
    116B:0104 10CD          ADC     CH,CL

The AH register contains 01h, which is the scan code for the Escape key, and the AL register contains 1Bh (= 27 decimal), which is the ASCII code for ESC.

6. Quit debug.

    -q ↵

End of Experiment

The Keyboard Status Bytes The interrupt 9 keyboard service routine is fairly complicated, since it needs to keep track of a number of different cases. For instance, the BIOS checks to see if the key pressed is a shift key (Ctrl, Shift, or Alt keys) or a toggle key (such as Num Lock or Caps Lock). This information is recorded in a two-byte section of the BIOS data area at addresses 417h and 418h, as shown in Figure 9.3. These are referred to as the keyboard status bytes. This information, being in RAM, is easily accessible by all applications, and also easily changed by the programmer, as the next experiment shows.

Figure 9.3 - Keyboard status bytes

(Figure: the bit layout of the two keyboard status bytes, at address 1047 (= 417h) and address 1048 (= 418h).)
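Each toggle key corresponds to one bit of the first status byte, so the byte can be read and built with bit masks. The Python sketch below covers only the three toggle bits whose values are used in the experiment that follows (10h, 20h and 40h); the remaining bits are left out, and the names are invented for illustration.

```python
# Bit masks in the first keyboard status byte (address 417h).
TOGGLES = {
    "Scroll Lock": 0x10,
    "Num Lock": 0x20,
    "Caps Lock": 0x40,
}

def active_toggles(status_byte):
    """List the toggle keys whose bits are set in the status byte."""
    return [name for name, mask in TOGGLES.items() if status_byte & mask]

def with_toggle(status_byte, name, on=True):
    """Return the status byte with one toggle bit switched on or off."""
    mask = TOGGLES[name]
    return status_byte | mask if on else status_byte & ~mask & 0xFF

both = TOGGLES["Caps Lock"] | TOGGLES["Num Lock"]   # same as adding 40h + 20h
print(hex(both), active_toggles(both))
```

Combining masks with OR gives the same result as adding the values, as long as each bit is included only once.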

Experiment-Playing with the Keyboard Status Bytes


In this experiment, we examine the keyboard status bytes.

Debug Version

1. Save all current work and start debug, as described in the appendix (Trying the Experiments).

    C:\WIN95>debug ↵

2. The e command is used to enter a value into memory. We will enter values at the location of the first keyboard status byte (at 417h). First enter 0 at address 417h as follows:

    -e 0:417 ↵

Debug will respond by displaying the address (in segmented form) along with the current value, followed by a period. You enter the new value, which in this case is 0.

    0000:0417  20.0 ↵

Note that the status lights on your keyboard should all be off. Now repeat the process, entering values as shown below and observing the change in the keyboard status lights. Referring to Figure 9.3, to turn on only the CAPS LOCK key, we want the first keyboard status byte to contain the binary number 0100 0000, which is 40h. Thus, we place 40h into the first keyboard status byte. We then place 20h into the byte to turn on only the NUM LOCK key, and finally 10h to turn on only the SCROLL LOCK key.

    -e 0:417 ↵
    0000:0417  00.40 ↵
    -e 0:417 ↵
    0000:0417  40.20 ↵
    -e 0:417 ↵
    0000:0417  20.10 ↵

3. Quit debug.

    -q ↵


QuickBasic Version

1. Save any current work and start QBASIC. Enter the following program and save it.

    ' To demonstrate the keyboard status byte
    CLS
    ' Segment address of keyboard status byte
    DEF SEG = &H40
    ' Put 0 in the keyboard status byte
    POKE &H17, 0       ' Nothing on
    STOP
    ' Put 40h in the keyboard status byte
    POKE &H17, &H40    ' Only Caps Lock on
    STOP
    ' Put 20h in the keyboard status byte
    POKE &H17, &H20    ' Only Num Lock on
    STOP
    ' Put 10h in the keyboard status byte
    POKE &H17, &H10    ' Only Scroll Lock on
    END

The POKE command puts a byte into memory at the specified location, and is the equivalent of debug's e command. First, we put 0 in the keyboard status byte, which should turn off all toggle and shift keys. Referring to Figure 9.3, to turn on only the CAPS LOCK key, we want the first keyboard status byte to contain the binary number 0100 0000, which is 40h. Thus, we next poke 40h into the first keyboard status byte. We then poke 20h into the byte to turn on only the NUM LOCK key, and finally 10h to turn on only the SCROLL LOCK key.

2. Run the program by hitting the F5 function key. The program will stop several times. Each time it stops, check the keyboard status lights on your keyboard. You should see them change as the value in the keyboard status byte is changed. You can continue the program by hitting the F5 function key.


3. If you want to turn on more than one key at a time, just add the corresponding values. Thus, a value of 40h + 20h = 60h will turn on both CAPS LOCK and NUM LOCK.

End of Experiment

Special Keystroke Combinations It is worth mentioning that the keyboard service routine for interrupt 9 ignores certain keystroke combinations (for some reason), so if a programmer wants to be able to use those combinations, he or she may need to replace the original BIOS routine with another routine (that is, trap interrupt 9). The BIOS keyboard routine recognizes certain special keystroke combinations. Note, however, that Microsoft Windows often uses these keystroke combinations for its own purposes, or ignores them altogether. Here are some examples.

Ctrl+Alt+Del Under DOS, this key combination causes what is referred to as a warm reboot. This will perform a "minor" reboot of the computer, but, as discussed earlier in the book, will not run the POST. This feature is quite useful when a program locks up the computer, but will only work if the BIOS keyboard routines are still working. If not, a cold reboot, initiated by the reset button on the PC, may be necessary. This resets all values in memory. Windows 95 uses this key combination in a similar manner but is thoughtful enough to provide a warning message before rebooting the system.

Ctrl+Break This key combination clears the keyboard buffer and issues an interrupt 27 (= 1Bh). This is often used to stop the execution of a program. (It also works in some Windows applications.)

PrintScreen and Alt+PrintScreen The BIOS issues an interrupt 5 when the PrintScreen key is struck. An application can trap this interrupt to perform whatever operation is desired. Under Windows, for instance, a PrintScreen copies the screen to the Clipboard. When Alt+PrintScreen is struck, the active window is copied to the Clipboard.


Pause or Ctrl+Num Lock Under DOS, this will pause operation until another key is pressed. Windows applications seem, in general, to ignore this keystroke.

Alt+Number from Keypad When you hold down the Alt key and type an ASCII value from the numeric keypad, the BIOS inserts the number into the keyboard buffer as if a key with that ASCII value were pressed (whether or not there actually is such a key). The same feature works under Windows, with a difference. Under Windows, you can hold down the Alt key, then type 0 followed by a three-digit number, to get high-order ANSI characters.

10. Mice

Mice have been around for some time, but did not become popular until Microsoft Windows did. These days, it is hard (if not impossible) to live without a mouse. Mice come in several varieties, characterized by such things as the number of buttons on the mouse, whether it is opto-mechanical or optical, how it communicates with the PC (protocol), how it connects to the PC (serial, bus, PS/2 port) and its resolution. Figure 10.1 shows a typical two-button mouse. These days, mice come in two-, three-, four- or even five-button configurations, sometimes with additional appendages.

Figure 10.1 - A two-button mouse

Note that any additional buttons beyond the traditional two, while often quite useful, do not have a "standard" role to play in application software. Put another way, Windows applications do not generally set aside specific tasks for these buttons, as they do for the first two buttons. Hence, the mouse driver must take care of assigning tasks to these buttons, hopefully allowing some options for the user. For example, a third button might be programmed to be equivalent to a double-click of the left button, or to be equivalent to scrolling the active window. Mice also come in a variety of shapes, and the choice of shape and button count is a matter of personal taste.

Physical Operation As far as the physical operation of a mouse is concerned, mice come in two varieties: optomechanical and optical. In either case, the internal operation of a mouse is fairly simple. In the optomechanical variety (by far the most common) pictured in Figure 10.2, a hard rubber ball that protrudes through the bottom of the mouse shell is rolled along a surface (any surface with enough traction will do). The motion of the ball causes two cylindrical rollers, set at right angles to each other, to rotate. A small disk, called an index wheel, is attached to one end of each roller. This index wheel has holes in it (or is spoked) to allow light from an LED on one side of the wheel to shine through at periodic intervals as the wheel spins. The light is detected on the other side of the wheel by a photoelectric cell, which transforms the light into an electric current.

Figure 10.2 - An optomechanical mouse

The information gained from the motion of the ball thus consists of two numbers: how many horizontal ticks (current changes) and how many vertical ticks have occurred in a given period of time. Optical mice produce the same data (number of horizontal and vertical ticks) but do so differently. In an optical mouse, a light beam shines onto the surface below the mouse. An optical mouse must be placed on a special pad that has a mesh of colored lines. An optical sensor in the mouse detects the motion of the light beam relative to the grid.

The disadvantage of an optical mouse is that it must be used on a special pad. Also, optical mice are a bit more expensive than optomechanical mice. On the other hand, an optical mouse does not have moving parts; in particular, it does not have a rolling ball that gathers dust and lifts it into the mechanism. (For this reason, mechanical mice should be cleaned regularly.) Moreover, optical mice are more accurate, making it easier to move the mouse pointer to a precise location (pixel) on the screen. Mechanical mice are more prone to "skip" pixels. This accuracy is important for high-level computer-aided design (CAD) users, for instance.

Mouse Protocols Currently, there are four standard protocols for mouse-to-PC communication, corresponding (naturally) to four major mouse manufacturers: Microsoft, Logitech, Mouse Systems Corporation (MSC) and IBM (called the PS/2 protocol). In general, these protocols all supply roughly the same information, which amounts to the number of mouse ticks in each direction (horizontal and vertical) since its last report, as well as whether or not various buttons are currently pressed. The difference is in the format in which the information is delivered. A mouse sends data to the PC in a format called a mouse data packet. The Microsoft protocol is a two-button protocol that uses a 3-byte data packet. The Microsoft mouse data packet is shown in Figure 10.3. The x-value is made up of the bits labeled x7 (leftmost) through x0 (rightmost) and similarly for the y-value. Note that the bits in these values are scattered (for some reason) across more than one byte. These values give the number of ticks in the x and y directions since the last time a packet was sent to the PC. A negative x-value means the mouse moved left, and a negative y-value means the mouse moved down. The leftmost bit in each data packet byte is not used. Finally, the bits Lt and Rt are 1 if the corresponding button (left or right) is pressed; otherwise they are 0.
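The packet layout just described can be decoded in a few lines. The following Python sketch is illustrative only. The function names are hypothetical, and it assumes that bytes two and three carry the low six bits of the x-value and y-value respectively (x5 through x0 and y5 through y0), and that each value is an 8-bit two's-complement number, as the negative tick counts suggest.

```python
def to_signed8(value):
    """Interpret an 8-bit value as a two's-complement signed number."""
    return value - 256 if value & 0x80 else value


def decode_ms_packet(b1, b2, b3):
    """Decode one 3-byte Microsoft-protocol mouse packet (hypothetical helper).

    Assumed layout (the leftmost bit of every byte is unused):
      byte one:   - 1 Lt Rt y7 y6 x7 x6
      byte two:   - 0 x5 x4 x3 x2 x1 x0   (assumption, not shown in the text)
      byte three: - 0 y5 y4 y3 y2 y1 y0   (assumption, not shown in the text)
    """
    left = bool(b1 & 0x20)                    # Lt bit
    right = bool(b1 & 0x10)                   # Rt bit
    x = ((b1 & 0x03) << 6) | (b2 & 0x3F)      # x7, x6 from byte one, rest from byte two
    y = ((b1 & 0x0C) << 4) | (b3 & 0x3F)      # y7, y6 from byte one, rest from byte three
    return {"left": left, "right": right,
            "dx": to_signed8(x), "dy": to_signed8(y)}


if __name__ == "__main__":
    # Left button down, 5 ticks in x, none in y.
    print(decode_ms_packet(0x60, 0x05, 0x00))
```

The scattered high bits are the only subtle part: they must be shifted back into positions 7 and 6 before the value is treated as a signed count.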

Figure 10.3 - Microsoft mouse data packet (Byte One, from leftmost to rightmost bit: unused, 1, Lt, Rt, y7, y6, x7, x6, where Lt = Left Button and Rt = Right Button; Bytes Two and Three carry the remaining bits of the x-value and y-value)

The MSC protocol is a 3-button protocol that uses a 5-byte data packet, and the Logitech protocol is also a 3-button protocol, but it uses a 4-byte data packet. The PS/2 protocol handles both 2 and 3 buttons, using a 3-byte data packet. The 5-button mouse from Mouse Systems Corporation uses an 8-byte packet. So much for computer standards! Fortunately, it is not important, from a user's perspective, to understand the mouse protocols, since essentially all mice these days will run under Microsoft Windows. However, one issue that must be dealt with is how the mouse is connected to the PC. We discuss this issue next.

Mouse Interfaces There are several possibilities for the physical interface (connection) between the mouse and the PC. A serial mouse connects to a serial port. The arrival of a data packet from the mouse signals an interrupt on whichever IRQ line is used by that port. The mouse driver can then process the packet and deal with special features such as orientation, acceleration and swapping mouse button functions (for left-handed folks). A bus mouse comes with a controller card that fits into an expansion slot, but is otherwise essentially a serial mouse. Thus, a bus mouse requires the use of an expansion slot, as well as an IRQ line. The real choice between these two options is whether to tie up a serial port or an expansion slot.

Many modern PCs now come with a dedicated mouse port, in the form of a PS/2 port, that is connected to the keyboard/mouse controller on the motherboard. This port is capable of accepting certain types of mice without using either an expansion slot or a serial port. Moreover, the mouse uses IRQ12, thus not taking up one of the IRQ lines usually used by serial devices.

Some mice can only be plugged into a serial port, some can only be plugged into a PS/2 port, and some can be plugged into either a PS/2 port or a serial port. In the latter case, the mouse controller can sense, through the power pin in the connector, when it is connected to a PS/2 port, in which case it uses the PS/2 protocol. If it is connected to a serial port, it will use one of the other protocols. In shopping for a mouse, it is important to read the documentation (or box) carefully to see how the mouse can (and cannot) be connected to the PC.

Resolution Another characteristic of a mouse is its resolution, which refers to how many ticks (or dots) a mouse can register per inch of travel. Common resolutions are 300 or 400 dpi (dots per inch). Note that a mouse with a finer resolution simply reports a higher tick count to the PC than a mouse with a lower resolution-for the same distance traveled. This has the effect of moving the mouse pointer over a greater distance, resulting in a faster moving pointer. Thus, a finer resolution does not mean the mouse pointer moves in smaller increments (as you might at first suspect). Rather, it means that the mouse pointer travels a greater distance, relative to the motion of the mouse itself. Note also that the driver can do what it wants with the tick counts received from the mouse, so a higher resolution can simply be ignored if desired.
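The arithmetic behind this point is simple enough to sketch. In the Python fragment below, the function names and the one-pixel-per-tick driver setting are illustrative assumptions; as noted above, a real driver is free to scale the tick counts however it likes.

```python
def ticks_reported(distance_inches, dpi):
    """Ticks a mouse of the given resolution reports for a given travel distance."""
    return round(distance_inches * dpi)


def pointer_travel_pixels(distance_inches, dpi, pixels_per_tick=1.0):
    """Pointer movement, assuming the driver moves a fixed number of pixels per tick.

    pixels_per_tick is a hypothetical driver setting, not part of any protocol.
    """
    return ticks_reported(distance_inches, dpi) * pixels_per_tick


if __name__ == "__main__":
    for dpi in (300, 400):
        print(dpi, "dpi:", pointer_travel_pixels(2, dpi), "pixels for 2 inches of travel")
```

With the same driver setting, the 400 dpi mouse moves the pointer one third farther than the 300 dpi mouse for the same hand motion; halving pixels_per_tick would trade that speed for finer positioning.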

11. Display Monitors

The display monitor (or simply, monitor) is the main output device of a PC. (See Figure 11.1.) As we have seen, a monitor is usually connected to a display adaptor card or video adaptor card that fits into an expansion slot in the motherboard, although some motherboards have built-in video support. In this chapter, we will discuss the construction and operation of the monitor, turning to video adaptor cards in the next chapter.

Figure 11.1 - A typical display monitor

Monitor Features Let us begin by discussing some of the overall features in a display monitor.

Display Size Monitors come in different sizes, usually described by giving the diagonal tube measurement. Typical sizes are 14, 15, 17, 20 and 21 inches. It is important to note that the actual viewing area is often smaller than these numbers indicate, so care should be taken in comparing monitors. For instance, a 15-inch monitor might have an active display area of about 13.8 inches diagonally and a 17-inch monitor might have a viewing area of about 15.8 inches diagonally. A 21-inch monitor is very useful for graphics-oriented computing, but such monitors tend to be quite expensive and very bulky and heavy.

Tube Shape Monitor screens come in a variety of shapes. The flat square tube is slightly curved and recesses at each of the four corners. The vertically flat tube, first used by Sony in its Trinitron tube, is curved only in the horizontal direction, like a portion of a cylinder. This tube shape is now used by other manufacturers as well. The vertically flat tubes are used in the so-called aperture grill design described below. These tubes tend to produce brighter, more saturated colors, but the images are not as precise because of the asymmetry of the tube. The choice is definitely a matter of personal taste.

Power Management Monitors consume a lot of power (about 100-200 watts) and create a lot of heat. Most newer monitors support a feature called power management, in which the monitor will, after a user-selectable length of time, reduce its power consumption. The federal government's EPA Energy Star standard requires a reduction of power consumption to less than 30 watts during idle times. The specification from TCO (Sweden's Confederation of Professional Employees) requires a second level of power reduction, to less than 8 watts, when the monitor is in its lowest power state. Current power-saving monitors use VESA's (Video Electronics Standards Association) Display Power Management Signaling, or DPMS, to communicate with a compatible adaptor card to signal power reduction.

Radiation Emissions Computer monitors emit both magnetic and electric field radiation, in the so-called extremely low frequency, or ELF, range (5 Hz-2 KHz) and the very low frequency, or VLF, range (2 KHz-400 KHz). There is much uncertainty as to whether or not these emissions are harmful, but the prudent computer user would rather avoid them if possible. Both the TCO and SWEDAC (the Swedish Board for Technical Accreditation, previously known as MPR, or the National Board for Measurement and Testing), specify recommended levels of field emissions. A prudent buyer will look for a monitor that conforms to their latest recommendations. The current recommendations are MPR II or MPR 1990:8 (for August 1990) and TCO '92.

Monitor Controls Some monitors provide a wider range of controls than others. In addition to the obvious size and position controls, there may be controls for such things as color temperature (which refers to the warmth of colors), color matching (which attempts to match printed color output) and special display geometry controls, used to correct the problems illustrated in Figure 11.2.

Figure 11.2 - Geometric monitor distortions (pin cushion balance, trapezoidal, upper corner distortion, parallelogram)

Anti-Glare Coating Most monitor screens are coated with a silica coating in order to reduce glare. The purpose of this coating is to provide a jagged surface that diffuses incoming light in various directions. More expensive anti-glare treatments use multiple layers consisting of different compounds. As with other aspects of a monitor, the quality of anti-glare treatments is a matter of personal judgment.

Connector Types There are two principal ways to connect a monitor to a video adaptor card, as shown in Figure 11.3. The standard 15-pin D-shell connector is most common, but high-performance monitors often have separate (between 3 and 5) BNC connectors that are capable of transferring data more quickly than the standard D-shell, allowing for extremely high resolutions, with lots of colors. These connectors require an adaptor card that supports them as well. (Incidentally, there does not seem to be agreement in the literature on what the letters BNC stand for; one possibility is Bayonet Nut Connector. This is a common type of connector for coaxial cable. Your television may be connected to its cable in this manner.)

Figure 11.3 - Types of video connectors (a 15-pin D-shell video connector, with its vertical sync and horizontal sync pins marked; and BNC connectors for horizontal sync, vertical sync, red signal, green signal and blue signal)

Display Data Channel A recent development in monitor technology is the addition of the display data channel, or DDC, the current incarnation of which is DDC2B. This standard allows bidirectional communication between the monitor and the adaptor card. In this way, the adaptor can query the monitor to determine which resolutions and scan rates it supports, thus enabling it to automatically send an optimal signal to the monitor. The DDC standard is also referred to as ACCESS.Bus, since it includes a method for connecting compliant devices, such as certain keyboards and mice, directly to the monitor, instead of to the system unit. At this time, very few devices are ACCESS.Bus compliant. Moreover, an alternative technology, called the universal serial bus, or USB, has recently made its appearance.

Monitor Construction

Shadow Mask Monitors The main component of a monitor is the cathode ray tube, or CRT, also called the picture tube. (See Figure 11.4.) This is a glass tube, small on one end and large and relatively flat on the other end, that is partially evacuated and filled with an inert gas at low pressure. The small end of the tube contains the electron guns, each of which consists of a negatively charged cathode that emits an electron beam toward a positively charged anode (also part of the gun). The beam continues past the anode and is directed by electromagnetic deflectors in both the vertical and horizontal directions, striking the inside of the tube at the other end, where it causes a phosphor coating to glow, thus producing an image.

Figure 11.4 - A shadow mask CRT (showing the screen and the shadow mask)

The most common form of color CRT has a design in which the inside of the screen is covered with small trios of dots, called dot trios, dot triads or color triads, as shown in Figure 11.5.

Figure 11.5 - Dot Triads (showing the dot pitch)

The dots in a triad are made up of phosphors, that is, material that glows when struck by electrons. One dot in each trio glows red, one glows green and one glows blue. (You can see the phosphors in a monitor yourself by displaying a solid white image and looking at the screen through a magnifying glass.) The CRT contains three electron gun/deflector assemblies, one for each color. The phosphors in a dot trio can be illuminated at different intensities, by varying the intensity of the corresponding electron beams, to produce different colors. (The phosphors in a triad are so close together that, to the eye, they appear as one dot.) The mixing of red, green and blue of different intensities to form various colors is referred to as the RGB color model. When each beam is at full intensity, the resulting mixture of red, blue and green results in a white image.

To keep each electron beam from partially "spilling over" to a neighboring dot of a different color in the triad (thereby increasing that color's intensity), there is a metal screen, called a shadow mask, sitting close behind the glass screen. This metal screen has holes corresponding to the position of each triad and has the effect of protecting each dot in the triad from the other two electron beams. The operation of the shadow mask is pictured in Figure 11.6. Figure 11.7 shows a construction known as the inline gun tube, where the electron guns lie along the same horizontal line. This requires a different arrangement of color phosphors, but the effect is the same. (Inline guns allow a bit more precision.)

Figure 11.6 - Shadow mask operation (showing the electron guns, shadow mask, phosphor dot triad and screen)

Figure 11.7 - Inline guns (showing the guns, shadow mask, phosphor dot triad and screen)

Of course, the three electron guns must be aligned properly so that their beams strike the screen in precisely the right place. As you might imagine, this requires very precise positioning if it is to be effective over the entire screen. If the alignment is a bit off, the result is that white dots appear color-separated, an effect known as misconvergence. Shadow masks absorb a lot of heat from the electron beams. This heat can cause the mask to distort, thus moving the holes in the mask and affecting the picture quality. Higher quality shadow masks are constructed of a substance called Invar, a metal alloy that resists distortion due to heat.

The smaller the dots in a triad, the less grainy the images will appear. Rather than quote the size of the dots, display manufacturers quote the dot pitch, which is the distance between (and not the size of) adjacent dots of the same color. (See Figure 11.5.) Monitors with a small dot pitch have small-sized dots and thus can produce less grainy pictures. Dot pitches range from about 0.42 mm to 0.22 mm. One should look very carefully at a monitor whose dot pitch is higher than about 0.28 mm, however. (Incidentally, a typical television has a dot pitch of about 0.65 mm and a resolution of about 320 x 525.)

A visible "dot" on the screen is referred to as a pixel, which is an abbreviation of picture element. Depending upon the resolution of the current image, a single pixel may be made up of several dot triads. Figure 11.8 illustrates the effect of pixel size on an image.

Figure 11.8 - The effect of video resolution (the letter A at a coarse resolution of 19x21 pixels per square inch and at a fine resolution of 75x66 pixels per square inch)

Aperture Grill Monitors An alternative to the shadow mask technology was invented by Sony Corporation, and used in their Trinitron tubes. Since its invention, other companies have begun using the technology and are now making aperture grill monitors. In an aperture grill monitor, the inside of the CRT screen is painted with vertical stripes of phosphors, in alternating colors, as shown in Figure 11.9.

Figure 11.9 - Color slots (showing the stripe pitch)

In place of a shadow mask, aperture grill monitors have a grid of vertical wires, with slots in between, called the aperture grill. The wires in the grill shield the colored stripes from unwanted overspill. The distance between the slots in the aperture grill corresponding to different colors is called the slot pitch, and is slightly smaller than the stripe pitch, which is the distance between the centers of two stripes of the same color. For instance, a slot pitch of 0.29 mm might be equivalent to a stripe pitch of 0.30 mm.

Comparing the Two The dot pitch of dot triad CRTs and the stripe pitch of aperture grill CRTs are not directly comparable. However, a stripe pitch of 0.25 mm is roughly equivalent to a dot pitch of 0.27 mm. In general, the shadow mask technology provides for a more dimensionally accurate image, generally preferred by users doing such things as drafting or computer-aided design. On the other hand, the aperture grill technology provides brighter images, with richer, more saturated colors. Note that aperture grills in a CRT are held taut by two thin horizontal wires. The shadow of these wires can be seen very faintly across the screen, especially when the image is a light color. You might want to check this out before buying an aperture grill monitor.

Display Resolution As mentioned earlier, images are made up of individual units called pixels, which is a contraction of picture element. Pixels may consist of several dot triads.

The resolution of a screen image is given by a pair of numbers. The first number is the number of pixels that are displayed on each horizontal line of the screen and the second number is the number of horizontal lines of pixels displayed on the screen. The most common display resolutions are listed below. A resolution of 640x480 is called VGA resolution and all higher resolutions are collectively referred to as super VGA resolutions (or SVGA). Earlier PCs used resolutions ranging from 320x200 to 640x350, but these are now obsolete. (Incidentally, there is some confusion over the term super VGA resolution. Officially, it refers to any resolution above 640x480, but some use the term to mean 800x600 specifically.)

1. 640x480 (VGA resolution)
2. 800x600
3. 1024x768
4. 1280x1024
5. 1600x1200

It is important to note that each monitor has a maximum resolution. Less expensive monitors have maximum resolutions of 1024x768 or 1280x1024. Higher-end monitors have a maximum resolution of 1600x1200 (or even higher).

Higher Resolution Is Not Always Better It might seem at first that a higher resolution is always an advantage, since images are sharper. However, the issue is not quite that simple. Consider the situation of a Windows-based application. When a programmer writes a Windows application, he or she adds various controls to the application, such as text boxes, menus, command buttons, icons and so on. The size of these controls is often specified, by the programmer, in pixels. Consider, for example, a button that is 32 pixels across. When the display is set for a resolution of 640x480, this button will fill 1/20 of the screen (horizontally). This may seem the perfect size on a 14-inch monitor, but grossly too large on a 21-inch monitor. Moreover, the resolution is not affected by monitor size, and so the dots that are required to supply exactly 640 dots horizontally across the screen are much larger on a 21-inch monitor. Thus, the button will look grainy on the larger monitor.

On the other hand, if the resolution is set for 1280x1024, then the button will only measure 1/40 of the horizontal screen size. This will probably appear much too small on a 14-inch monitor, and thus very hard to select with the mouse pointer. Also the icon or label will probably be too small to read. But on a 21-inch monitor, the button will have a pleasing size, without appearing too grainy. In summary, it is too simplistic to say that higher resolution is always better. The fact is that a compromise must be reached between monitor size (or expense) and resolution. Larger monitors can comfortably accommodate higher resolution images, thus allowing more readable information to be placed on the screen at one time.
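The button example can be put into rough numbers. The Python sketch below uses hypothetical function names and two simplifying assumptions that are not from the text: the screen has a 4:3 shape (so its width is four fifths of its diagonal), and the nominal diagonal is used as if it were the full viewing area, which, as noted earlier, overstates the real size somewhat.

```python
def button_fraction(button_pixels, horizontal_resolution):
    """Fraction of the screen width occupied by a control of the given pixel width."""
    return button_pixels / horizontal_resolution


def button_width_inches(button_pixels, horizontal_resolution, diagonal_inches):
    """Approximate physical width of the control.

    Assumes a 4:3 screen (3-4-5 triangle), so width = 4/5 of the diagonal,
    and treats the nominal diagonal as the viewable diagonal.
    """
    screen_width = diagonal_inches * 4 / 5
    return screen_width * button_fraction(button_pixels, horizontal_resolution)


if __name__ == "__main__":
    for diagonal in (14, 21):
        for width in (640, 1280):
            size = button_width_inches(32, width, diagonal)
            print(f"{diagonal}-inch monitor, {width} pixels across: {size:.2f} inches")
```

Under these assumptions the same 32-pixel button is roughly half an inch wide on a 14-inch monitor at 640x480, but only about a quarter of an inch wide at 1280x1024, which is the effect the discussion above describes.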

How Images Are Displayed: Raster Scanning The electron beams that sweep out an image on the monitor begin at the upper left corner of the monitor and move horizontally across each row of dot triads on the screen. The video card sends signals to the monitor to control the movement and intensity of the beams. For instance, the video card signals the monitor at the end of a row by sending a horizontal synchronization signal, or horizontal sync, to the monitor. The number of horizontal sweeps the beams make per second is called the horizontal scan rate, or horizontal scanning frequency, and is usually measured in kilohertz (for example, 30 KHz is 30,000 scans per second). When the beams reach the end of a row, they momentarily shut off and move to the beginning of the next row. The process of returning the electron beams from the end of one line to the beginning of the next line is called horizontal retrace. The video card signals the monitor when all rows have been scanned, by sending a vertical sync signal. The beams then shut off again and return to the upper left corner of the screen, a process called vertical retrace. The number of times per second that the beams scan the entire screen is called the vertical scan rate, or vertical refresh rate, or just the refresh rate. It is also called the frame rate, since it measures how often the beams completely repaint the entire screen (or frame). Higher frame rates lead to more stable-looking images and are less tiring on the eyes. The process we have described is called raster scanning and is illustrated in Figure 11.10.
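Since every frame consists of one horizontal sweep per row, plus some time lost to vertical retrace, the horizontal and vertical figures are tied together. The following Python sketch estimates the horizontal scanning frequency needed for a given number of rows and a given frame rate. The function name is hypothetical, and the 5 percent allowance for vertical retrace is purely an illustrative assumption, not a figure from the text.

```python
def horizontal_scan_rate_khz(rows, frames_per_second, retrace_overhead=0.05):
    """Estimate the horizontal scanning frequency in kilohertz.

    rows              -- horizontal lines of pixels in the image
    frames_per_second -- how many times per second the whole screen is repainted
    retrace_overhead  -- assumed extra fraction of line times spent in vertical retrace
    """
    line_times_per_frame = rows * (1 + retrace_overhead)
    return line_times_per_frame * frames_per_second / 1000.0


if __name__ == "__main__":
    for rows in (480, 600, 768, 1024):
        rate = horizontal_scan_rate_khz(rows, 72)
        print(f"{rows} rows at 72 frames per second: about {rate:.1f} KHz")
```

The estimate shows why higher resolutions and higher frame rates both demand a monitor with a higher maximum horizontal scanning frequency.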

debug

2. Do a memory dump at address 400h.

-d 40:0

On my PC, which has two physical serial and two physical parallel ports, the results are

-d 40:0
(dump listing, in which the original marks the entries for COM1, COM2, LPT1 and LPT2)

Note that, on a PC, the memory bytes that make up a 16-bit word are stored in reverse order. Thus, the LPT1 port address is 0378h.
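The byte reversal is easy to check for yourself. The short Python sketch below (the function name is hypothetical) rebuilds a 16-bit word from two bytes as they would appear in a memory dump; the byte pair 78 03 is simply the pair implied by the LPT1 address 0378h.

```python
def word_from_dump(first_byte, second_byte):
    """Combine two bytes, in the order a memory dump shows them, into a 16-bit word.

    On a PC the low-order byte is stored first, so the second byte is the high byte.
    """
    return int.from_bytes(bytes([first_byte, second_byte]), byteorder="little")


if __name__ == "__main__":
    address = word_from_dump(0x78, 0x03)
    print(f"{address:04X}h")
```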

End of Experiment

Parallel Interface Standards Until fairly recently, the parallel port was used more or less exclusively for printing. In fact, Microsoft Windows still refers to a parallel port as a printer port. For printing purposes, the relatively slow data transfer rate of a standard parallel port (up to 150 KB/sec) is reasonably sufficient. Recently, however, a number of new devices, including CD-ROM drives, tape drives and removable hard and floppy drives, have been designed to use a parallel interface. Since the relatively slow speed of the original parallel interface is a serious bottleneck to these devices, changes were needed. This led to the appearance of parallel ports advertised as being "bidirectional" and "enhanced." Unfortunately, these terms are highly ambiguous. In 1994, the IEEE officially released a number of standards (or protocols) for parallel port operation, under the name IEEE 1284, (or, if you prefer, the full name IEEE Std. 1284-1994 Standard Signaling Method for a Bi-directional Parallel Peripheral Interface for Personal Computers). These standards allow for considerably faster data transfer rates (up to 2 MB/sec), as well as sanctioned bidirectional data transfers. Equally important, they remove the ambiguity by defining precise terminology for parallel port operation. Thus, while the term "enhanced parallel port" is still ambiguous (and not to be trusted), the term "IEEE 1284 enhanced parallel port" has a very specific meaning. Let us briefly outline the main features of each standard.

Standard Parallel Port (SPP) Standard Parallel Port (SPP) mode, also known as Compatibility mode or Centronics mode, is the original, and still most common, protocol used to transfer data to a peripheral (usually a printer). In SPP mode, the CPU communicates with three I/O ports (at consecutive addresses) that make up the parallel port interface. For instance, if LPT1 is assigned the base I/O port address of 378h (as it usually is), then the three I/O ports used by LPT1 are 378h, 379h and 37Ah. Figure 17.2 shows the three registers at these addresses. The base port address (378h) is the address of the data register; the next address (379h) is that of the status register and the third address (37Ah) is for the control register.

Figure 17.2 - The Standard Parallel Port Registers (data register at 378h for LPT1, status register at 379h for LPT1 and control register at 37Ah for LPT1, with bits numbered 7 through 0)

We have also shown the so-called pin assignments, primarily to emphasize the direct assignment of printer function to cable wire. For instance, the BUSY bit in the status register is connected to wire number 11 in the parallel cable. Thus, the peripheral asserts the signal on this wire when it is busy, and the CPU can read this register (through port 379h) to see if the peripheral is busy. Simple, huh? To send data to a peripheral in compatibility mode, the CPU places data in the data register, checks the status register for a BUSY condition and, if clear, asserts the STROBE line, which has the effect of sending the data to the peripheral. Finally, the STROBE must be de-asserted. If the IRQ bit in the control register is set, then the parallel interface will inform the CPU when the printer has acknowledged receipt of the character. This is usually done on IRQ7 for LPT1 and IRQ5 for LPT2. Once the IRQ is received, the CPU can prepare another character. This explains why a parallel port requires an IRQ line. On the other hand, use of an IRQ line can be avoided by simply testing the ACK bit of the status register directly (in which case the IRQ bit in the control register is not set).
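The four register accesses just described can be traced in a small simulation. The Python sketch below touches no real hardware: SimulatedParallelPort and send_byte are hypothetical names, the bit positions chosen for BUSY and STROBE are assumptions made for illustration, and the sketch ignores both the timed pauses and the signal inversions that real parallel port hardware involves.

```python
class SimulatedParallelPort:
    """A stand-in for the data, status and control registers of an SPP port."""

    BUSY = 0x80    # assumed position of the BUSY bit in the status register
    STROBE = 0x01  # assumed position of the STROBE bit in the control register

    def __init__(self):
        self.data = 0
        self.status = 0      # the simulated peripheral is never busy
        self.control = 0
        self.received = []   # bytes the simulated peripheral has accepted
        self.cycles = 0      # number of register accesses made by the CPU

    def write_data(self, value):
        self.cycles += 1
        self.data = value & 0xFF

    def read_status(self):
        self.cycles += 1
        return self.status

    def write_control(self, value):
        self.cycles += 1
        strobe_asserted = (value & self.STROBE) and not (self.control & self.STROBE)
        self.control = value
        if strobe_asserted:
            # The peripheral takes the byte when STROBE is asserted.
            self.received.append(self.data)


def send_byte(port, value):
    """Send one byte using the compatibility-mode sequence from the text."""
    port.write_data(value)                            # 1. place data in the data register
    while port.read_status() & port.BUSY:             # 2. check for BUSY
        pass
    port.write_control(port.control | port.STROBE)    # 3. assert STROBE
    port.write_control(port.control & ~port.STROBE)   # 4. de-assert STROBE


if __name__ == "__main__":
    port = SimulatedParallelPort()
    for byte in b"Hi":
        send_byte(port, byte)
    print(port.received, port.cycles)
```

Counting the accesses in the simulation makes the cost visible: each byte takes four register cycles even when the peripheral is never busy.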

The only complication in the simple procedure outlined above is one of timing. Each signal must be timed very carefully (with pauses of an appropriate length between steps) in order for the process to work properly. Note also that a total of 4 register cycles (place data in data register, check for BUSY, assert STROBE, de-assert STROBE) are required in order to send a single byte of data to the peripheral. This, together with the aforementioned timing constraints, accounts for the relative slowness of standard parallel ports (up to 150 KB/sec, but less in practice). We should also mention that many parallel interfaces have implemented a nonstandard form of compatibility mode that uses a FIFO data buffer. (FIFO stands for first-in first-out.) This mode is then referred to as Fast Centronics or Parallel Port FIFO Mode. With this mode, the hardware takes care of the strobe cycles, thus improving data transfer rates up to about 500 KB/sec. This mode, however, is not defined in the IEEE 1284 standard.

Nibble Mode Standard compatibility mode is a unidirectional mode. That is, all data travels from the PC to the peripheral. Only status information is returned to the PC. The PC does have the ability to read the data register, but it will just see the last byte it placed in that register. The so-called nibble mode (a nibble is 4 bits, or one half of a byte) is a standard that provides communication from the peripheral to the PC. Thus, when it is combined with compatibility mode, it provides bidirectional communication. The operation of nibble mode is quite simple. The standard parallel interface provides 5 status lines (Figure 17.2) that provide status information to the PC. Using these lines, a peripheral can send a byte of data by sending 2 nibbles in two successive data transfer cycles. One hitch is that the ACK (acknowledge) line cannot be used to carry part of this data, since this would confuse the PC into thinking that the peripheral is acknowledging receipt of data the PC did not send! Hence, the 4 bits of the data nibble are not grouped together in the status byte. This means that additional processing time is required to extract the nibble of data. In addition, this reverse channel communication requires several additional steps on the part of the participants, resulting in a maximum data transfer rate from peripheral to PC in the neighborhood of 50 KB/sec. This is not a problem with printers, for instance, but is very hard on parallel port storage devices. Its main advantage is that it works on standard parallel ports.
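The extra processing needed to extract a nibble can be seen in a short sketch. In the Python fragment below, the function names are hypothetical and the bit positions are assumptions made for illustration: the five status lines are taken to occupy bits 3 through 7 of the status byte, with ACK in bit 6, and the low-order nibble is taken to arrive first. The sketch also ignores any signal inversion performed by real hardware.

```python
def nibble_from_status(status):
    """Gather the four data bits from a status byte, skipping the ACK bit (bit 6).

    Assumed mapping: status bits 3, 4, 5 hold nibble bits 0, 1, 2,
    and status bit 7 holds nibble bit 3.
    """
    low_three = (status >> 3) & 0x07   # status bits 3-5 become nibble bits 0-2
    top_bit = (status >> 4) & 0x08     # status bit 7 becomes nibble bit 3
    return low_three | top_bit


def byte_from_two_reads(first_status, second_status):
    """Rebuild one byte from two successive status reads (low nibble first, by assumption)."""
    low = nibble_from_status(first_status)
    high = nibble_from_status(second_status)
    return (high << 4) | low


if __name__ == "__main__":
    # Two status reads carrying the nibbles 5h and Ah, respectively.
    print(f"{byte_from_two_reads(0x28, 0x90):02X}h")
```

Because the four data bits straddle the ACK bit, two shift-and-mask steps are needed per read, and two reads are needed per byte, which is part of why this mode is slow.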

Byte Mode Many standard parallel port interfaces are capable of disabling the function of the data lines that lead to the data register, thereby allowing those data lines to be used in bidirectional communication. Because a full byte is then transferred in each cycle rather than a nibble, this cuts the reverse direction transfer time roughly in half. This type of port is sometimes referred to as an enhanced bidirectional port, which certainly causes confusion with the Enhanced Parallel Port (EPP), which we discuss next.

Enhanced Parallel Port (EPP) The Enhanced Parallel Port (EPP) protocol was originally developed by Intel, Xircom and Zenith Data Systems, as a high performance parallel interface that would be compatible with the standard parallel interface. Unfortunately, since many EPP-type parallel interfaces were implemented before the 1284 standard became official, there is some variation in the design of this interface. The EPP protocol is the first protocol to make explicit provision for parallel storage devices (and not just for printers). In fact, the EPP standard provides not only for data writes and reads, but also for address and control writes and reads. Moreover, the EPP protocol requires only a single instruction to transfer data, and can thus achieve transfer rates in the range of 500 KB to 2 MB per second, which is in the same speed range as the ISA bus, to which the interface is probably connected. To accomplish these enhancements to parallel port transfers, the EPP standard adds an additional 5 registers to the 3 registers used by the standard parallel interface. Thus, a total of eight I/O port addresses are devoted to the EPP interface. For instance, if the base port address is 378h, then the port will use port addresses 378h-37Fh. One EPP register is an address register and one is a data register. The other registers are available for 16-bit and 32-bit data transfers, but need not be used in this way under the standard. It is worth mentioning, however, that SCSI bus speeds far exceed EPP (or ISA bus) speeds. Thus, when considering a storage device that is available in both SCSI and parallel interfaces, it is important to keep in mind that the SCSI device will, in general, be considerably faster.
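The address arithmetic is a simple count of eight consecutive ports starting at the base address. The Python sketch below (with a hypothetical function name) lists them for a base address of 378h; the last of the eight is 37Fh.

```python
def epp_port_addresses(base):
    """The eight consecutive I/O port addresses used by an EPP interface (3 SPP + 5 EPP)."""
    return [base + offset for offset in range(8)]


if __name__ == "__main__":
    print(" ".join(f"{address:03X}h" for address in epp_port_addresses(0x378)))
```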

Extended Capability Port (ECP)

The Extended Capability Port (ECP) standard was first proposed by Microsoft and Hewlett-Packard. Like the EPP protocol, ECP provides for high performance, bidirectional communication. The ECP protocol also provides for a form of data compression, known as Run-Length-Encoding (RLE), with compression ratios as high as 64 to 1, FIFO data buffering, and DMA as well as programmed I/O. As we have seen, the DMA and PIO modes are used by enhanced IDE disk drives, for instance.

In addition, with ECP's multiple communication channels, an ECP compliant interface can handle multiple logical devices within a single physical device, such as, for example, a FAX/printer/modem combination. In this case, the modem could be receiving data while the printer is preparing an image for printing.

The ECP protocol uses a total of 6 registers. As we have seen, the I/O port address space is 64 KB in size, with addresses from 0 to FFFFh. However, since the original PC used only the first 1024 port addresses (from 0 to 3FFh), newer PCs have followed suit. Thus, a whole range of port addresses are generally unused. The ECP protocol uses, for example, not only the standard parallel port addresses 378h, 379h and 37Ah, but also the addresses obtained by adding 1024 (= 400h) to each of these addresses, to get 778h, 779h and 77Ah. One of these registers (called the ECR register) is used to set the mode of operation to any of the following: SPP, byte mode, fast Centronics, EPP or ECP. Thus, ECP is completely backward-compatible.
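The ECP address scheme is simple hexadecimal arithmetic, which the following Python sketch reproduces. The function name `ecp_addresses` is hypothetical; the base address 378h and the 400h offset are the ones given in the text.

```python
# Sketch: derive the six ECP register addresses from a parallel port base address.
# ecp_addresses is an illustrative name, not a real API.
IO_SPACE_SIZE = 64 * 1024   # the full I/O port address space, 0 to FFFFh
ORIGINAL_PC_PORTS = 0x400   # the original PC used only ports 0 to 3FFh (1024 ports)


def ecp_addresses(base):
    """Return (standard, extended) address lists for an ECP port."""
    standard = [base, base + 1, base + 2]
    extended = [addr + ORIGINAL_PC_PORTS for addr in standard]
    return standard, extended


standard, extended = ecp_addresses(0x378)
print("standard:", [f"{a:03X}h" for a in standard])
print("extended:", [f"{a:03X}h" for a in extended])
print(f"highest port address: {IO_SPACE_SIZE - 1:04X}h")
```

The extended addresses fall just above the 1024-port range used by the original PC, which is why they are normally free.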

Using a Nonstandard Parallel Port Mode

It is important to emphasize that, in order to use a nonstandard parallel port mode, not only must the parallel port and peripheral device support this mode, but a driver is needed for the mode. Fortunately, Windows 95 has at least partial support (drivers) for ECP mode. As time goes on, the various nonstandard parallel port modes should become much more common.

18. Printers

Printer technology has evolved dramatically since the early PC days. Printers can generally be divided into the following four groups.

• Daisy wheel
• Dot matrix
• Ink jet
• Laser

Daisy Wheel Printers

In the early days of the PC, there was only one way to get print quality that rivaled that of a typewriter. This was to get a "typewriter" that was controlled by the PC. The so-called daisy wheel printers worked on exactly the same principle as a typewriter with respect to putting ink on paper. Instead of the typical round print ball of a typewriter, daisy wheel printers generally used a flat, spoked wheel (resembling a daisy?) that had each letter carved into the head, as shown in Figure 18.1. (Typical wheels had upwards of 100 characters.)

Figure 18.1 - A daisy wheel print head


As shown in Figure 18.2, the daisy wheel would spin in order to position the correct symbol between the ribbon and the print hammer. The hammer would then strike the daisy wheel, forcing the character into the ribbon and against the paper.

[Diagram labels: Paper, Ribbon, Hammer, Platen, Daisy Wheel]

Figure 18.2 - Daisy wheel printer operation

Daisy wheel printers have been made essentially obsolete by vastly improved print technologies, so let us quickly summarize the properties of daisy wheel printers and then speak no more about them.

• Speed: Daisy wheel printers were very slow. Low-end daisy wheel printers were unidirectional, printing in only one direction of printhead travel, at a rate in the range of 10-40 characters per second, or cps. High-end daisy wheel printers were bidirectional and could print up to about 100 cps. In any case, daisy wheel printer prices were in the thousands of dollars in the early 1980s.

• Noise: Daisy wheel printers were quite noisy.

• Print Quality: The print quality of a daisy wheel printer was equivalent to that of a good typewriter. In particular, the print quality was entirely suitable for a business letter.

• Flexibility: Although there were a relatively large number of different print wheels available, it was necessary to pause the printer to change wheels if you needed a special symbol on a different wheel. Also, there was no way to print graphics using a daisy wheel printer. (You could print subscripts and superscripts with a daisy wheel printer, but only somewhat problematically.)

Because of the manner in which the print wheel would impact against the ribbon, daisy wheel printers fall under the category of impact printers. In addition, since a complete character was formed upon each impact, daisy wheel printers are also called character impact printers.

Dot Matrix Printers

The only other printer choice in the early PC days was the dot matrix printer, which still survives modestly today, due primarily to the fact that dot matrix printers are extremely durable and that, being impact printers, they are capable of printing multipart forms. Instead of using print heads with fully formed characters, the impact end of a dot matrix print head consists of a vertical row (or rows) of small pins. Low-end dot matrix printers have 9-pin heads, whereas higher-end printers have print heads with 18 or 24 pins. Figure 18.3 shows the head-on view of a dot matrix print head assembly.

Figure 18.3 - Dot matrix print heads (9-pin and 24-pin)

Figure 18.4 shows an exaggerated view of a character formed from a dot matrix print head. Note that, even with a 9-pin print head, not all of the 9 pins are used for each character, since some pins are used for descenders and others for ascenders.
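The way a single column of pins builds up a character as the head moves across the paper can be sketched in Python. The 5-column glyph below is invented for illustration (real printer fonts differ), and reserving the bottom two pins for descenders is an assumption, not a documented layout.

```python
# Sketch: which pins of a 9-pin head fire at each head position to form a "T".
# GLYPH_T is a hypothetical bitmap; "#" marks a dot, rows run from pin 1 (top) to pin 9.
GLYPH_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    ".....",  # assumed to be reserved for descenders (as in "g" or "y")
    ".....",
]


def pins_per_column(glyph):
    """For each head position (column), list the 1-based pin numbers that fire."""
    columns = len(glyph[0])
    return [
        [row + 1 for row, line in enumerate(glyph) if line[col] == "#"]
        for col in range(columns)
    ]


firing = pins_per_column(GLYPH_T)
for position, pins in enumerate(firing):
    print(f"head position {position}: fire pins {pins}")
```

Only the middle position fires most of the pins; at the other positions just the top pin is used, and the two lowest pins never fire for this character.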


Figure 18.4 - A "T" formed with a dot matrix print head

Figure 18.5 shows the print head layout for a dot matrix printer. When the print head is placed in position, only those pins that are required at that particular point in the character (or image) will strike the ribbon and hence leave an ink spot on the paper. This is generally done using an electromagnet to force the pin against the ribbon. A spring then retrieves the pin as soon as the electromagnet is turned off.

[Diagram labels: Paper, Print Head]

Figure 18.5 - Dot matrix printer operation

The dot matrix print head assembly is a very simple and reliable mechanism. The properties of dot matrix printers can be described as follows.

• Speed: Low-end dot matrix printers print at about 200 cps in draft mode (lower quality output) and about 50 cps in letter quality mode. High-end dot matrix printers print at about 600 cps in draft mode and 300 cps in letter quality mode.

• Noise: Dot matrix printers are very loud in comparison with ink jet or laser printers.

• Print Quality: The print quality of a 9-pin dot matrix printer is not good. The letter quality dot matrix printers have decent print quality. You can print a passable business letter on such a printer, but you cannot hide the fact that it is printed on a dot matrix printer.

• Flexibility: Since dot matrix printers can print individual dots on the paper (called all points addressable printing), they can print graphics.

• Paper Handling: Dot matrix printers can generally handle a variety of paper types and sizes, including envelopes and, in some cases, card stock. Rather than feed each sheet of paper into the printer individually, cut sheet feeders can automatically feed sheets of paper from a bin. Also, dot matrix printers excel at handling continuous forms (sheets of paper attached end to end with perforations). Such paper can be advanced through the printer using friction feed. However, more reliable positioning is afforded by a tractor feed device that grabs small holes on the side of each sheet of paper, as shown in Figure 18.6.

Figure 18.6 - Tractor feed paper

• Other Issues: Dot matrix printers are generally very reliable, with print head lifetimes in the neighborhood of 100-200 million characters, or perhaps as much as 100,000 pages. (Also, print heads can often be replaced by the user.) Dot matrix print heads do get extremely hot, however. Power consumption is relatively modest, on the order of 100 watts during printing, equivalent to the power consumption of an average light bulb. When not printing, dot matrix printers sit quietly, using no power.
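The speed and lifetime figures above can be related to one another with a little arithmetic. The Python sketch below assumes 2,000 characters per page, a value chosen only because it makes 200 million characters correspond to 100,000 pages; real pages vary widely, and the function names are illustrative.

```python
# Sketch: relate characters per second to time per page and head lifetime.
# CHARS_PER_PAGE is an assumption (200 million characters / 100,000 pages).
CHARS_PER_PAGE = 2000


def seconds_per_page(cps, chars_per_page=CHARS_PER_PAGE):
    """Time to print one page at a given characters-per-second rate."""
    return chars_per_page / cps


def pages_in_lifetime(lifetime_chars, chars_per_page=CHARS_PER_PAGE):
    """Number of pages a print head can produce before wearing out."""
    return lifetime_chars // chars_per_page


for label, cps in [("low-end draft", 200), ("low-end letter quality", 50),
                   ("high-end draft", 600), ("high-end letter quality", 300)]:
    print(f"{label}: {seconds_per_page(cps):.1f} seconds per page")

print(f"head lifetime: {pages_in_lifetime(200_000_000):,} pages")
```

These times ignore carriage returns and paper feeding, so actual throughput will be somewhat lower.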

Ink Jet Printers

Ink jet printers first appeared on the PC market in 1984, when Hewlett-Packard introduced the ThinkJet (short for Thermal Ink Jet) printer. As the name suggests, an ink jet printer literally sprays small dots of ink upon the paper. This is a nonimpact approach, in the sense that no moving mechanical parts strike a surface. As a result, ink jet printers are considerably quieter than impact printers, although you can hear the movement of the print head back and forth over the paper.


Ink Jet Operation

Figure 18.7 shows a typical ink jet cartridge. The business end of the cartridge has tiny holes through which the ink is forced out.

[Diagram label: Nozzles]

Figure 18.7 - An ink jet cartridge

Figure 18.8 shows a close-up of a cartridge nozzle. There are two general approaches used to force an ink drop through the nozzle opening.

[Diagram labels: Nozzle, Ink drop, Piezoelectric crystal or heater element]

Figure 18.8 - Close up of a nozzle

In one case, a special crystal called a piezoelectric crystal is used. A piezoelectric crystal has the property that an electric current will cause the crystal to bend. The crystal (one per nozzle) is placed in such a position within the nozzle that its bending contracts the nozzle and forces ink through the hole. Thus, each rapid pulse of current produces an ink drop. (As an aside, it is precisely the opposite piezoelectric effect that is used in a phonograph cartridge. In particular, the crystal in a phonograph cartridge is bent by the grooves in a phonograph record, and thus produces an electric current that is amplified by the record player.)

Alternatively, in the so-called thermal ink jet (also called bubble jet) approach, a heating element is used instead of a piezoelectric crystal. The heating element is heated quickly, causing the ink to expand and form a bubble at the nozzle opening. When the bubble bursts, it fires an ink drop upon the paper. One advantage of the thermal approach is that the ink is heated, and therefore dries faster.

Some current ink jet cartridges have as many as 300 nozzles, each of which is capable of firing up to 12,000 ink drops per second. Nozzles are arranged in a rectangular array, so that more than one nozzle bears on the same horizontal line on the paper. Thus, more than one horizontal location can be printed without moving the print head.
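To get a feel for these numbers, the peak drop rate of a whole cartridge is just the product of the two figures quoted above. The short Python sketch below does the multiplication; the constant and function names are illustrative only.

```python
# Sketch: peak ink drop rate for a cartridge, using the figures from the text.
NOZZLES = 300                         # nozzles per cartridge
DROPS_PER_NOZZLE_PER_SECOND = 12_000  # maximum firing rate of one nozzle


def peak_drops_per_second(nozzles=NOZZLES, rate=DROPS_PER_NOZZLE_PER_SECOND):
    """Upper bound on drops fired per second if every nozzle fires at full rate."""
    return nozzles * rate


print(f"{peak_drops_per_second():,} drops per second")
```

This is an upper bound, reached only if every nozzle fires continuously at its maximum rate, which ordinary text pages never require.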

Ink Jet Characteristics

Color Printing

While there are black-and-white ink jet printers, the trend is definitely moving toward color ink jets. An ink jet printer forms various colors by combining inks of three particular colors (cyan, yellow and magenta) in various proportions. This is called the CYM color model. (Note that color monitors use an RGB color model, which increases the difficulty in getting the colors on a monitor to match the colors printed by a color printer.) When black is added to the CYM model, it is denoted by CYMK. There are three distinct cartridge configurations used in today's ink jet printers.

• Ink jets that use a four-cartridge configuration accommodate four separate cartridges at the same time, one for each color in the CYMK model.

• Ink jets that use a dual-cartridge configuration accommodate two cartridges at the same time: one black and one tricolor cartridge, with three separate chambers holding the CYM colors.

• Some smaller (or portable) ink jet printers accommodate only one cartridge at a time. With a single-cartridge configuration, the user must make a choice: install a black-only cartridge (forgoing color) or a CYM cartridge, in which case black is simulated by combining all three colors. However, the resulting composite black often does not look as "black" as print from a true black cartridge.
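The relationship between the monitor's RGB model and the printer's CYM and CYMK models can be sketched with the simplest textbook conversion, in which each ink is the complement of one monitor primary. This is an idealized model only: real printer drivers apply color correction that is far more involved, and the function names here are made up for illustration.

```python
# Sketch: naive RGB -> CYM -> CYMK conversion with components from 0.0 to 1.0.
# This idealized model ignores real ink and paper behavior.
def rgb_to_cym(r, g, b):
    """Each ink absorbs one monitor primary: cyan/red, magenta/green, yellow/blue."""
    return {"cyan": 1.0 - r, "yellow": 1.0 - b, "magenta": 1.0 - g}


def cym_to_cymk(cym):
    """Move the part common to all three inks into a separate black component."""
    k = min(cym.values())
    if k == 1.0:  # pure black: use only the black ink
        return {"cyan": 0.0, "yellow": 0.0, "magenta": 0.0, "black": 1.0}
    result = {name: (value - k) / (1.0 - k) for name, value in cym.items()}
    result["black"] = k
    return result


composite_black = rgb_to_cym(0.0, 0.0, 0.0)
print("black on a CYM cartridge:", composite_black)
print("black on a CYMK printer: ", cym_to_cymk(composite_black))
print("pure red:", cym_to_cymk(rgb_to_cym(1.0, 0.0, 0.0)))
```

In this model a CYM-only cartridge must lay down all three inks at full strength to print black, which is the composite black described above, whereas a printer with a black cartridge uses black ink alone.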

The four-cartridge configuration tends to be used in high-end ink jets. It may be the most cost-effective for the end user, since individual colors can be replaced without wasting ink of any other color. (Nevertheless, studies have shown that the CYM cartridges tend to run low at about the same rate, probably due to the fact that very little printing is done in any of the pure CYM colors.) The four-cartridge model also allows somewhat faster color printing, since the tricolor cartridge must split its nozzles among the three CYM colors, thus giving one-third as many nozzles to each color. Fewer nozzles mean slower printing. Finally, the four-cartridge model allows larger color ink volume. To illustrate, according to Hewlett-Packard, the HP 800 series ink jets, which use a dual-cartridge configuration, will print an estimated 461 pages of color (with a 15% coverage), whereas the HP 1600 series, which uses a four-cartridge design, will print an estimated 1600 color pages.

The dual-cartridge configuration is quite popular in midrange printers, since it is more economical to manufacture and is more suited to smaller printers. The disadvantages are that the tricolor cartridge accommodates much less ink of each individual color, and uses fewer nozzles of each color. Nonetheless, it may be a worthwhile compromise over the more expensive and larger four-cartridge design.

The disadvantages of the single-cartridge design are obvious. However, it does have the advantage of saving space, which is useful for portable printers.

Printer Resolution

The resolution of a printer is a measure of print quality, usually expressed by giving a number of dots per inch, or dpi, that the printer can print. Print resolutions are often different for black-and-white printing than for color printing. A typical low-end ink jet printer may print at a resolution of 600 dpi x 300 dpi in high-quality black-and-white mode, and 300 dpi x 300 dpi in high-quality color mode. Printers generally also have lower-resolution modes, for more rapid printing of draft material. Higher-end ink jets may have resolutions in the neighborhood of 600 dpi x 600 dpi for black-and-white and 600 dpi x 300 dpi for color.

Print Quality

The question of actual ink jet print quality is a subject for debate. Some say that ink jet print quality can approach laser printer print quality. Others say it is usually inferior, having a somewhat smeared or faint appearance. One thing is certain: ink jet print quality varies considerably, not just from printer to printer, but also from paper to paper. One of the problems with ink jet printers is that both ink drying time and ink absorption rate are critical to print quality. If the ink is absorbed by the paper too readily, the effect may be light or faint print quality. If it is absorbed too poorly, the effect may be a bumpy surface and smeared print. Also, ink that dries too slowly may be prone to smudging.

All in all, to get print quality approaching that of a laser printer, some experimenting is generally necessary to get a good match between the printer and the paper. Moreover, to duplicate some of the very impressive full-color photographic images that are sometimes seen in advertisements, it may be necessary to use special high-gloss, high-cost paper (ranging from 50 cents to 1 dollar per sheet), which is not something one would care to do in large quantity on a modest budget.

Printer Speed

Print speed for ink jet printers and laser printers is generally measured in pages per minute (ppm). However, when this number drops below 1 ppm (as it sometimes does for color printing), some printer manufacturers cleverly switch to minutes per page (mpp). Thus, a printer that prints at 1/2 ppm is printing at 2 mpp. (Why do you suppose that printer manufacturers do this?)

Printer speed is not a simple issue, since it varies considerably with the complexity of the page. Thus, pages with graphics tend to print much more slowly than pages with no graphics. Print speeds for ink jet printers are generally different for black-and-white printing than for color printing. (Surprisingly, this may even be true for plain text printed in a pure cartridge color, such as magenta.) On the slower end, we find high-quality print speeds of about 2 ppm in black and 1/4 ppm (or 4 mpp) for color. On the faster end, we find high-quality print speeds of about 8 ppm in black and 1 ppm for color. These speeds also do not take into account the time it takes for paper to be fed into position from the paper tray. In addition, print speeds also depend upon software considerations, especially printer drivers.

All in all, it is very difficult to judge print speed using just the manufacturer's specifications. Often, the only way to get a really accurate picture of the print speed of a printer is to clock it with a stopwatch, under actual working conditions. (One should also be very skeptical about the results of in-store test page demonstrations.)

Paper Handling Issues

Ink jet printers vary as to the quantity of paper they can hold, as well as in their ability to handle oversized or undersized papers, envelopes, transparencies, card stock and the like. These issues are worth checking into before purchase. Some recent model ink jet printers can even print banners. In particular, these printers can accept a special fanfold paper whose length is that of 8 individual sheets of letter-sized paper. These printers (and their drivers) have been specifically designed to deal with the problems encountered at the paper folds.


Print Areas

It is important to note that every ink jet (and laser) printer has a maximum print area, which is usually less than the full size of the paper. Thus, for example, a printer generally cannot print a bleed off the edge of the paper. This is due largely to a matter of self-preservation on the part of the printer, since spraying ink inside the printer (but not on the paper) could lead to major problems. If you have special needs with regard to printing close to the edge of a sheet of paper, you should take a careful look at the printer's documentation (before buying) to determine the maximum print area.

PostScript

Another issue that is worth considering with regard to printer selection is the availability of PostScript. PostScript is a highly flexible and very powerful printer language that is especially useful when printing high-quality graphics. One never knows whether, down the road, it may become desirable to add PostScript capability to a printer. However, not all printers support the PostScript option.

Printer Drivers

As we have alluded to earlier, the printer's software driver can have a very profound effect on printer performance. This is not just a speed issue, but also involves the proper functioning of the printer itself. Many times I have encountered bugs in printer drivers that would not allow a certain feature of a printer to work at all. Such problems put the user at the mercy of the printer manufacturer, to first admit that there is a problem, and then to fix the problem.

Ink Jet Durability

Ink jet printers are generally fairly durable. However, the cartridges may be a different story. Once the seal is broken on an ink jet cartridge, various chemical reactions are set in motion that begin the cartridge's ineluctable slide to ruin, primarily by clogging of the nozzles. The general problem is that the ink jet ink must be designed for two contradictory goals: quick drying once the ink reaches the paper and slow drying so that the ink does not dry whilst in the cartridge nozzle. Some ink jets have the ability to clean the cartridges with use, or at least to protect them during nonuse, but this may only delay the problem, especially if the printer is not used regularly. It might not be a bad idea to figure in a bit of extra expense for an occasional clogged cartridge that must be replaced before its time.

Also, the total lifetime of an ink jet cartridge can be surprisingly low. For instance, the specifications on at least one ink jet printer from a well-known company say that its cartridges have lifetimes on the order of 1000 pages of black (5% coverage) and 200 pages of color (15% coverage). If you are concerned about cost per page, it might be worth comparing these statistics (along with cartridge prices) to those of a laser printer.

As to overall durability, the documentation for one midrange ink jet printer says that the printer has a 60,000 page lifetime, with a 20,000 page MTBF (mean time between failures). Furthermore, the duty cycle of this printer is 1000 pages of black-and-white and 160 pages of color per month. On the other hand, the same company's midrange laser printers have duty cycles ranging from 12,000 to 35,000 pages per month. In short, for frequent medium- to high-volume printing, or for infrequent printing, laser printers are probably a better choice. Ink jet printers are best suited to regular, low-volume printing.
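A cost-per-page comparison of the kind suggested above takes only a division. In the Python sketch below, the page yields (1000 black, 200 color) and the 60,000 page lifetime come from the text, while the cartridge prices are purely hypothetical placeholders that you would replace with actual prices.

```python
# Sketch: ink cost per page and time to reach the rated printer lifetime.
# The prices below are hypothetical placeholders, not real product prices.
BLACK_CARTRIDGE_PRICE = 30.00   # assumed price in dollars
COLOR_CARTRIDGE_PRICE = 35.00   # assumed price in dollars


def cost_per_page(cartridge_price, pages_per_cartridge):
    """Ink cost of one page, ignoring paper and wasted (clogged) cartridges."""
    return cartridge_price / pages_per_cartridge


def months_to_lifetime(lifetime_pages, pages_per_month):
    """Months of printing at the duty cycle before the rated lifetime is reached."""
    return lifetime_pages / pages_per_month


print(f"black: ${cost_per_page(BLACK_CARTRIDGE_PRICE, 1000):.3f} per page")
print(f"color: ${cost_per_page(COLOR_CARTRIDGE_PRICE, 200):.3f} per page")
print(f"lifetime at 1000 pages/month: {months_to_lifetime(60_000, 1000):.0f} months")
```

The same two functions can be applied to a laser printer's toner cartridge price and page yield to make the comparison.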

Power Consumption

On the whole, ink jet printers compare favorably in terms of power consumption, using on the order of 5 watts when idle and 15-50 watts when printing. On the other hand, a laser printer may use several hundred watts when printing.

Laser Printers

The first laser printer for the PC market was introduced in 1984 by Hewlett-Packard, and called the LaserJet. Today, laser printers are the printers of choice for general, nonportable, black-and-white printing, especially since the price of personal laser printers has become much more competitive in recent years.

Laser Printer Control Languages

One way to control a laser printer (or indeed any printer) is to send it ASCII characters. The following experiment provides an example.

Experiment - Sending characters to a printer

1. Save your current work and start QuickBasic.

2. Enter the line

    LPRINT CHR$(69); CHR$(12)

which will send an upper case E (ASCII 69) followed by a form feed control character (ASCII 12) to the printer.

3. Run the program. It should cause the printer to print an E and eject the page.

End of Experiment

The problem with sending only ASCII characters is that we have very limited control over the printer. We cannot change fonts or line spacing, or draw circles, for instance, just by sending ASCII characters. Thus, printers that are capable of complex printing tasks (and laser printers certainly fall into this category) use one or more printer languages. For example, one universally popular printer language is PostScript.

It is the job of the Raster Image Processor, or RIP (which is part of the printer's electronics), to accept the printer language commands from the PC and determine the individual print dots that are required to produce the image on paper. That is, the RIP constructs a bitmapped image of the page in the printer's memory.

To illustrate printer languages in a bit more depth, consider the case of Hewlett-Packard laser printers. These printers generally accept four types of printer commands:

• ASCII control codes
• PCL commands
• HP-GL commands
• PJL commands

We have already discussed ASCII control codes (in Chapter 2). PCL (Printer Control Language) is a printer language used to control most aspects of the printer, other than those used for drawing geometric shapes, that is, the so-called vector graphics, which are controlled by HP-GL (Hewlett-Packard Graphics Language). PJL stands for Printer Job Language, and is used to control the print jobs themselves (as opposed to the actual dots on the paper). For example, PJL can change the printer's control panel settings, modify the messages that the printer displays on its LED message display and request information from the printer about such things as the printer's configuration and job status.


The PCL Printer Language

The latest version of PCL is PCL 6, although PCL 5 is still the most commonly used at the time of this writing. All PCL printer commands begin with the escape character, which has ASCII code 27. For this reason, they are also called escape sequences. For example, the escape sequence

    ESC & l # D

where # is a number and the character after the ampersand is a lower case L, sets the line spacing. Here is a QuickBasic experiment that you can perform if you have a printer that understands PCL.

Experiment - Using the PCL printer language

1. Save your current work and start QuickBasic.

2. Enter the following program and save it (if desired). Note that the letter after the ampersand (&), appearing in two places, is a lower case L.

    'first set line spacing to 1 line per inch
    LPRINT CHR$(27); "&l1D"
    'print two A's on different lines
    LPRINT "A"
    LPRINT "A"
    'then set line spacing to 12 lines per inch
    LPRINT CHR$(27); "&l12D"
    'print 12 A's on 12 different lines
    FOR i = 1 TO 12
        LPRINT "A"
    NEXT i
    'reset the printer to its default values
    LPRINT CHR$(27); "E"

3. Run the program. You should get a page with 14 A's on 14 different lines, showing how the line spacing commands have worked.

End of Experiment
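For readers without QuickBasic, the same stream of bytes can be assembled in Python. The sketch below only builds the bytes in memory; how they reach the printer (for example, by writing them to a printer device or port) depends on the operating system and is not shown. The helper names `pcl_line_spacing` and `build_job` are hypothetical, and unlike LPRINT this version does not add a carriage return and line feed after each escape sequence.

```python
# Sketch: build the PCL line-spacing job from the experiment as raw bytes.
# pcl_line_spacing and build_job are illustrative names, not a real printer API.
import io

ESC = b"\x1b"            # the escape character, ASCII 27
PCL_RESET = ESC + b"E"   # resets the printer to its default values


def pcl_line_spacing(lines_per_inch):
    """Return the escape sequence ESC & l # D (the letter is a lower case L)."""
    return ESC + b"&l" + str(lines_per_inch).encode("ascii") + b"D"


def build_job():
    """Assemble the whole print job in memory and return it as bytes."""
    out = io.BytesIO()
    out.write(pcl_line_spacing(1))    # 1 line per inch
    out.write(b"A\r\nA\r\n")          # two A's on different lines
    out.write(pcl_line_spacing(12))   # 12 lines per inch
    out.write(b"A\r\n" * 12)          # twelve A's on twelve lines
    out.write(PCL_RESET)
    return out.getvalue()


job = build_job()
print(len(job), "bytes in the print job")
```

Building the job in memory first makes it easy to inspect the escape sequences before anything is sent to a printer.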


The HP-GL Printer Language

HP-GL (Hewlett-Packard Graphics Language) is designed to print vector graphics. The following experiment will illustrate.

Experiment - Using the HP-GL graphics language

1. Save your current work and start QuickBasic.

2. Enter the following program and save it (if desired). Note that the character after the percent sign (%), appearing in two places, is the digit zero.

    'reset printer
    LPRINT CHR$(27); "E"
    'enter HP-GL mode
    LPRINT CHR$(27); "%0B"
    'initialize HP-GL
    LPRINT "IN;"
    'select pen number 1
    LPRINT "SP1;"
    'move pen to location (2400,2500)
    LPRINT "PA2400,2500;"
    'draw circle with radius 500 plotter units
    LPRINT "CI500;"
    'return to PCL mode
    LPRINT CHR$(27); "%0A"
    'reset printer
    LPRINT CHR$(27); "E"

3. Run the program. You should get a page with a circle on it.

End of Experiment

Laser Print Engine Operation

The print engine of a laser printer is the component that is responsible for actually forming the printed image on paper, using the bitmapped image in the printer's memory. The operation of a laser print engine is quite different from the previous types of printers. In simple terms, a laser printer composes a page in a temporary "electrostatic" state on a metal drum before transferring it to paper. Even then, the initial transfer to paper is only temporary, and must be "fused" onto the paper. The process used by a laser printer is referred to as an electrophotographic process, or EP process.


Most (but by no means all) laser printers use a print engine made by Canon. For example, the original HP LaserJet and Apple LaserWriter printers use the Canon CX print engine; the HP LaserJet II and III and the Apple LaserWriter II use the Canon SX engine; the HP LaserJet IIP and IIIP use a Canon LX print engine; the HP LaserJet IV and Apple LaserWriter Pro 600 use the Canon EX engine and the HP LaserJet 4L, 4P and 5P use the Canon PX engine. Canon engines have an all-in-one design, which essentially means that most of the perishable parts of the engine, in particular, the photosensitive drum that is the heart of the printing system, lie in the cartridge, and are thus replaced whenever the toner runs out and the cartridge is replaced.

While there are many laser print engine designs, their basic operation is the same. Before describing this operation, let us note the following:

• A special photosensitive (light sensitive) drum, made of aluminum and coated with an organic compound, is the heart of the print engine. It is where the image is first constructed electrostatically. This drum is referred to as the EP drum, or sometimes the OPC drum (which stands for organic photoconductive drum).

• The toner material is actually a fine powder of plastic resin and organic compounds bonded to iron particles, and can be electrostatically charged, which will cause it to cling to (or be repelled from) an electrostatically charged surface.

Figure 18.9 shows the basic operating components of a laser printer engine. Note that the exact design of the cartridge varies from engine to engine. We have included some measurements from the Canon EX engine to get an idea of the relative sizes of some of the components.


[Diagram labels: Charge roller (1.5 in), Laser beam, Doctor blade, Cleaning pad, Heater, Paper, Paper feed rollers, Pressure roller, Fusion rollers, Static eliminator, Transfer roller]

Figure 18.9 - Laser printer engine improvements

We can now describe the laser print process.

• Clean Drum: Before a new page can be composed on the EP drum, the drum must be cleaned of all clinging toner particles. A cleaning blade is used for this purpose.

• Discharge Drum: On early print engines, the EP drum was also erased of all electrostatic charges using an erase light (not shown). However, newer engines that use charge rollers do not require an erase light.

• Charge Drum: The next step is to charge the EP drum to an electric potential of -600 volts, across its entire surface. In older cartridges, this is done by applying a very large negative charge (about -6000 volts) to a small wire, called the primary corona wire, that rests near the rotating drum. However, the high voltage required ionizes the surrounding air, producing a large amount of ozone. Newer cartridge designs use a sponge-rubber charge roller in place of a primary corona. Because the charge roller actually touches the EP drum, it does not need as high a charge (about -1000 volts) as a primary corona, thus eliminating much of the ozone production. One disadvantage to the charge roller is that it can transfer dust and other impurities to the very sensitive drum.

18. Printers

315



Write Image-The next step is to "write" the dots that make up the image onto the drum. This is done by focusing a laser beam at each dot to be written, thereby discharging the drum (making the charge less negative, to about -100 volts) at those locations. The laser beam is redirected by a series of mirrors that can change position, and the laser beam itself can tum on and off about 30,000 times per second. This process is called write-black writing since each discharged dot will be printed as black. (In write-white writing, the areas not to be printed are discharged.) The laser beam is guided to various locations on the drum by moving mirrors. The sensitivity of the moving laser beam determines the resolution of the printer. For instance, an engine capable of 600 dpi x 600 dpi resolution has a laser that can be moved horizontally over the drum in increments of Ij600th of an inch, and the drum can be rotated l/600th of an inch.



Develop Image-At this point, the image has been electrostatically composed on the drum and is ready for developing. For this, toner is applied to the drum using a sponge-rubber developer roller. The developer roller contains a permanent magnet and thus picks up the toner particles, which are given a negative charge of about the same magnitude as the drum (-600 volts). As the transfer roller comes in contact with the drum, places where the drum has been discharged will attract the toner particles away from the developer roller. Thus, the drum is coated with toner precisely in those locations that represent image dots.



Place Image on Paper-In order to transfer the toner from the drum to the paper, the paper is given a positive charge using a transfer roller. This causes the paper to attract the toner particles away from the drum. Early print engines used a transfer corona wire rather than a transfer roller.



Discharge Paper-Once the toner has been attracted to the paper, the paper is electrically discharged using a static eliminator. This is important in order to prevent the positively charged paper from being attracted to the negatively charged drum, and to keep the positively charged paper from repelling other positively charged sheets of paper.



Fusing-The final step in the printing process is to fuse the toner to the paper. This is done by passing the paper between two fusing rollers. The top roller heats the paper to about 300 °F, thus melting the toner, and the bottom roller squeezes the paper against the top roller.

Laser Printer Characteristics

Print Speed

Laser printers have a wide range of print speeds. The slower (and less expensive) personal laser printers generally print about 4-6 pages per minute (ppm), midrange lasers print about 8-12 ppm and high-volume networking lasers print up to about 24 ppm. As with all other printers, this speed is a theoretical one, and not generally obtainable under average printing conditions.

Noise

Laser printers are certainly quieter than impact printers, and are on a par with ink jet printers. Their engines and fans do make a noticeable hum during printing. Different laser printers behave differently when powered on but not printing. Laser printers that remain in an "active" state between print jobs have a constantly running fan that makes some noise, and they also produce considerable heat and use significant power while in this state. Of course, the user can turn the printer off between print jobs, but this is not always practical, especially since each time the printer is turned on, it cycles through a warm-up and self-test period, which can last in the neighborhood of 30-90 seconds, depending upon the amount of memory in the printer. On the other hand, some newer laser printers can go into a "sleep" condition that uses only a few watts of power, yet the printer will respond quickly to a new print job. In fact, at least one printer company has gone so far as to produce a printer with no on-off switch, a questionable extreme, since there are times when a printer may need to be reset by manually turning it off and then on again.

Print Quality

Laser printers produce the highest quality print of all PC printers. Low-end printers have a resolution of 300 dpi x 300 dpi, whereas most high-end printers have a resolution of 600 dpi x 600 dpi. There are specialty printers that can produce higher resolution images (up to 1200 dpi x 1200 dpi), but 600 dpi x 600 dpi should be more than sufficient for normal printing demands. (In many cases, even 300 dpi x 300 dpi is more than sufficient.) Note also that laser printers can, in general, produce good quality output on ordinary paper. To enhance image quality even further, many laser printers employ a resolution enhancement algorithm (sometimes called RES) that uses smaller than normal dots to "fill in" jagged edges in an image. This tends to increase the apparent resolution significantly.

Paper Handling

Laser printers have relatively versatile paper handling capabilities. Most can handle envelopes and legal size paper, although this may require using either the limited manual feed bins or purchasing a separate paper tray for each type of nonletter-sized paper. In addition, high-volume laser printers can generally accommodate far more sheets of paper in their paper trays than ink jet printers.

Other Considerations

For the printing of complicated graphics images, additional memory beyond the standard amount may be necessary. Generally, it is relatively easy to install such memory, in the form of SIMMs. Also, many, but not all, laser printers allow a PostScript upgrade which, incidentally, may also require additional memory. We mentioned earlier that laser printers are far more durable than ink jet printers. On the other hand, most laser printers are larger and heavier than ink jet printers. Also, color laser printers tend to be quite expensive, and are probably only justified if relatively large quantities of high-quality color printing are necessary.
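To get a feel for why graphics printing can demand extra printer memory, here is a rough back-of-the-envelope sketch in Python. It assumes that the printer holds an uncompressed bitmap of a full letter-size page (8.5 x 11 inches) at one bit per dot. These are our assumptions for illustration only; real printers have unprintable margins and may use compression, so actual requirements can be lower.

```python
def page_bitmap_bytes(width_in, height_in, dpi, bits_per_dot=1):
    """Bytes needed for one uncompressed full-page bitmap (assumed model)."""
    dots_across = round(width_in * dpi)
    dots_down = round(height_in * dpi)
    total_bits = dots_across * dots_down * bits_per_dot
    return (total_bits + 7) // 8  # round up to whole bytes


# Letter-size page at the resolutions mentioned in the text.
for dpi in (300, 600, 1200):
    size = page_bitmap_bytes(8.5, 11, dpi)
    print(f"{dpi} dpi: {size:,} bytes")
```

Under these assumptions, a 300 dpi page needs 1,051,875 bytes and a 600 dpi page needs 4,207,500 bytes. Doubling the resolution quadruples the memory, since the number of dots grows in both directions.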

19. Asynchronous and Synchronous Transmission

Whereas parallel communication sends more than 1 bit at a time, serial communication sends only 1 bit at a time. Thus, only one data line is needed for serial communication. It might seem at first that parallel transmission is always to be preferred over serial transmission, since sending 8 bits (say) at a time is more efficient than sending only 1 bit at a time. However, life is not quite that simple. First, parallel transmission does have certain technical problems not shared by serial transmission, one of which is skewing, which occurs when the 8 bits traveling in parallel do not arrive at the receiving point at the same time. Also, parallel transmission is more expensive, requiring 8 data lines, rather than only one. In any case, serial transmission is used in a variety of ways in PC systems. For instance, the keyboard and mouse communicate serially with the PC, as does a modem. There are also serial printers, although these have faded from popularity. We have already discussed the fact that the latest trend in SCSI interface technology is toward serial data transmission. In addition, a new technology, called the Universal Serial Bus (USB), is making its appearance. This is a serial PC bus that allows the user to chain serial devices, such as the monitor, keyboard, mouse, modem, audio speakers, printers and game-playing devices, directly to a USB interface. This frees the user from connecting each device separately to the system unit.

Asynchronous and Synchronous Transmission

There are two common methods used to transmit the bits in a serial bit stream. These are known as asynchronous data transfer and synchronous data transfer. Asynchronous transmission is used primarily when the characters that are to be transmitted are created at unpredictable times, or cannot be buffered (collected) before transmission. The prime example of this is a keyboard, which produces characters whenever the user happens to strike a key.

With respect to modems, asynchronous communication is used only in very slow modems. Today's high-speed modems actually convert data that has been prepared for asynchronous transmission, by the typical serial port commonly found in a PC, into a form suitable for synchronous transmission. Once received, the data is then reshaped into the asynchronous format, for delivery to the receiving PC's serial port.

Before discussing the details of asynchronous and synchronous transmission, we should note that the terms asynchronous and synchronous can be very misleading. In particular, synchronous operation should not be confused with synchronized operation. The receiver and sender are not synchronized in the sense that the receiver responds after a fixed time period to the events of the sender. After all, there is no way to predict the exact time it takes for the data to reach the receiver, so it cannot possibly work in sync with the sender. In another sense, both synchronous and asynchronous transmissions are synchronized, because in both cases clocks on each end of the transmission line run at the same rate, although not in step with each other.

Asynchronous Data Transfer

Character data that are to be transmitted asynchronously are formed into a small packet, as shown in Figure 19.1. Let us refer to this as an asynchronous data packet (ADP). (Everyone else makes up terms, why shouldn't we?) The first bit in the ADP is a start bit, followed by the data bits, then possibly a parity check bit and finally one or more stop bits. In an ADP, the number of data bits is most commonly either 7 or 8 and the number of stop bits is either one or two. For instance, a keyboard uses 8 data bits, an odd parity bit and 1 stop bit. Thus, a keyboard ADP is 11 bits long, with an 8-bit character.

[Figure: an asynchronous data packet on the line, showing the marking state, the transition to the start bit that signals the start of a new character, the character bits, the parity bit, and the stop bit(s) that return the line to the marking state]

Figure 19.1 - Asynchronous data packet (ADP)

Parity Check Bit

Let us briefly recall the concept of a parity check bit. To each binary data string, we associate an extra bit, called a parity check bit. In an even parity scheme, the parity bit is chosen so that there is an even number of 1's in the bits (data and parity). Thus, for instance, if the data is 110100, then the parity bit is a 1, to bring the total number of 1's to an even number (4). In an odd parity scheme, the parity bit is chosen to make the total number of 1's odd. Thus, in the previous example, the parity bit would be a 0.

The purpose of a parity check is to detect when an odd number of errors has occurred. (Of course, we are most interested in the case when a single error has occurred, since that is by far the most common case.) For if an odd number of errors occurs, then the number of 1's will change from even to odd, or vice versa. In either case, the parity check bit will be incorrect. Thus, by looking at this bit, the receiver can detect the errors. Of course, the choice of even or odd parity schemes makes no difference, as long as both parties agree to use the same scheme!
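The parity rule just described is easy to try out. The following Python sketch is ours, not part of any standard library; the function names `parity_bit` and `parity_ok` are made up for illustration. It computes the parity bit for a string of data bits and checks a received string.

```python
def parity_bit(data_bits, scheme="even"):
    """Return the parity bit (0 or 1) for a string of '0'/'1' characters."""
    ones = data_bits.count("1")
    if scheme == "even":
        return ones % 2          # make the total number of 1's even
    if scheme == "odd":
        return (ones + 1) % 2    # make the total number of 1's odd
    raise ValueError("scheme must be 'even' or 'odd'")


def parity_ok(bits_with_parity, scheme="even"):
    """Check data plus parity bit as seen by the receiver."""
    ones = bits_with_parity.count("1")
    return ones % 2 == (0 if scheme == "even" else 1)


print(parity_bit("110100", "even"))   # the example from the text
print(parity_bit("110100", "odd"))
print(parity_ok("1101001", "even"))   # data 110100 followed by parity bit 1
print(parity_ok("0101001", "even"))   # one bit flipped in transit
print(parity_ok("0001001", "even"))   # two bits flipped in transit
```

The last call shows the limitation mentioned above: two errors leave the count of 1's even, so the check passes even though the data is wrong.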

Asynchronous Synchronization

The purpose of the start bit in an ADP is to signal the start of the ADP to the receiving device. In asynchronous transmission, when the communication line is idle, the sending device sends a constant stream of 1's. This is referred to as the marking state. (In communication lingo, a 1 is referred to as a mark and a 0 is referred to as a space.) Thus, the start bit is always a 0. It signals the receiver to start its own clock and read the data bits and parity bit (if present). The trailing stop bits, which are always equal to 1, are then read, and serve to ensure that the line will return to the marking state, ready for the next ADP.

Thus, we see that, in asynchronous transmission, the sender must know how to signal the beginning and ending of each character. The receiver must know how these things are done as well. This allows the sender and receiver to remain in so-called character synchronization. In addition, the sender and receiver must agree upon the transmission bit rate, that is, the number of bits per second sent over the line. Hence, they are also in bit synchronization. Finally, the sender and receiver must agree upon the number of data bits per ADP, whether or not parity is used and if so, which type, and how many stop bits are used. This information is referred to as the communication parameters. In the case of a keyboard, this information is built into the components. However, for asynchronous modems, the communication parameters must be set through the communication software.
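To make the structure of an ADP concrete, here is a small Python sketch that builds and parses one. The function names are hypothetical, and we assume the usual convention that the least significant data bit is sent first, which the discussion above does not state. The defaults match the keyboard format (8 data bits, odd parity, 1 stop bit).

```python
def build_adp(value, data_bits=8, parity="odd", stop_bits=1):
    """Return the ADP as a list of bits, in the order sent on the line."""
    if not 0 <= value < (1 << data_bits):
        raise ValueError("value does not fit in the data bits")
    # Assumption: least significant bit first.
    data = [(value >> i) & 1 for i in range(data_bits)]
    frame = [0] + data                      # the start bit is always 0
    if parity in ("even", "odd"):
        ones = sum(data)
        frame.append(ones % 2 if parity == "even" else (ones + 1) % 2)
    elif parity is not None:
        raise ValueError("parity must be 'even', 'odd' or None")
    return frame + [1] * stop_bits          # stop bits are always 1


def parse_adp(frame, data_bits=8, parity="odd", stop_bits=1):
    """Recover the character; both ends must use the same parameters."""
    expected = 1 + data_bits + (1 if parity else 0) + stop_bits
    if len(frame) != expected or frame[0] != 0:
        raise ValueError("bad start bit or frame length")
    data = frame[1:1 + data_bits]
    if parity:
        ones = sum(data) + frame[1 + data_bits]
        if ones % 2 != (0 if parity == "even" else 1):
            raise ValueError("parity error")
    if any(bit != 1 for bit in frame[-stop_bits:]):
        raise ValueError("framing error: stop bit is not 1")
    return sum(bit << i for i, bit in enumerate(data))


frame = build_adp(0x41)          # the character 'A'
print(len(frame), frame)
print(hex(parse_adp(frame)))
```

With the keyboard parameters, the frame is 11 bits long, as noted earlier. If the receiver is given different communication parameters than the sender, `parse_adp` will generally reject the frame or return the wrong character, which is why both ends must agree.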

Synchronous Data Transmission

When the data is available in large chunks, as is the case when sending a file over a communications line, then high-speed synchronous transmission is possible. This requires that the sending modem collect, or buffer, the data being transmitted. The buffered data is prepared for synchronous transmission by collecting it into blocks, or frames, as shown in Figure 19.2. Each frame begins with one or more start of frame characters and ends with one or more end of frame characters. To indicate the time between frames, one of two strategies is used. One strategy is to fill the time between frames with idle characters. The other strategy is to use one or more sync characters to signal the beginning and ending of a frame, as pictured in Figure 19.2.

Figure 19.2 - A synchronous data frame

One major difference between asynchronous transmission and synchronous transmission is that, in synchronous transmission, the receiver must remain in bit synchronization (that is, maintain the same bit-per-second rate) during the entire message, not just during the receipt of a single character. This is generally accomplished in one of two ways. Before discussing these methods, we should emphasize that the sole purpose of bit synchronization is to allow the receiver to determine when each "bit" begins and when it ends, so that the receiver does not miss any data bits, or read any bits twice. While the sender's and receiver's clocks may be running at the same rate, they are not synchronized, in the sense that the leading edge of each clock cycle on the receiver's clock bears no relationship with the leading edge of a clock cycle in the sender's clock. This is illustrated in Figure 19.3.

[Figure: the sender's clock and the receiver's clock run at the same rate, but their leading edges do not line up]

Figure 19.3 - Clocks not in synchronization

In one method for keeping the receiver's clock in sync with the incoming data bits, the sender's clock signal is actually embedded into the data stream and then extracted by the receiver. In the second method, the bit transitions in the data itself are used to keep the receiver's clock in synchronization with the incoming bits. This may require some adjustment to the bit stream, because a long string of consecutive 0's (or 1's) does not provide enough transitions to maintain synchronization. Without going into too much detail about these two approaches to synchronous bit synchronization, Figure 19.4 illustrates one method of encoding the sender's clock signal into the data stream. This method is called bipolar encoding.

Figure 19.4 - Bipolar clock encoding

As you can see, each bit cell is one clock cycle long. The data is encoded during the first half of each clock cycle-a 0 bit sends the encoded signal low and a 1 bit sends it high. The encoded signal always returns to 0 voltage for the second half of the clock cycle. Thus, it is a simple matter to electronically extract the clock signal from the encoded signal.
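The encoding just described can be modeled in a few lines of Python. This is only a sketch under simplifying assumptions of our own: each bit cell is represented by two samples (one per half clock cycle), and the signal levels are idealized as +1, 0 and -1.

```python
def bipolar_encode(bits):
    """Encode each bit as (level, 0): +1 for a 1 bit, -1 for a 0 bit."""
    signal = []
    for bit in bits:
        signal.append(1 if bit else -1)  # first half of the clock cycle
        signal.append(0)                 # second half: return to 0 voltage
    return signal


def recover_clock(signal):
    """The clock is high whenever the encoded signal is nonzero."""
    return [abs(level) for level in signal]


def bipolar_decode(signal):
    """Read the data from the first half of each bit cell."""
    return [1 if level > 0 else 0 for level in signal[0::2]]


data = [1, 0, 0, 1, 1]
encoded = bipolar_encode(data)
print(encoded)
print(recover_clock(encoded))
print(bipolar_decode(encoded))
```

Note that the recovered clock alternates regularly no matter what the data is, even for a long run of 0's or 1's. That is what makes it simple to extract the clock from the encoded signal.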

20. The Serial Interface

We mentioned in the previous chapter that asynchronous transmission is used by modems only when the bit rate is low. However, PC serial ports produce data for asynchronous transmission only, by grouping the data into asynchronous data packets. (Synchronous serial interfaces are available, but they are not in common use.) To send the data synchronously, a high-speed modem must first convert the asynchronous data packets coming from the PC's serial port into data frames, by stripping out the start, parity and stop bits, and reframing the data. On the receiving end, the modem must then reformat the data into asynchronous form, for receipt by the remote PC's serial port. All this trouble is worth it, since synchronous data frames have a much higher data-to-nondata ratio than asynchronous data packets. With this in mind, let us turn to the operation of a typical PC serial port.
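A little arithmetic shows why the conversion is worth the trouble. In the Python sketch below, the asynchronous figures follow directly from the ADP format of the previous chapter. The synchronous frame size (1024 data bytes with 6 bytes of framing overhead) is a made-up example of ours, not a figure from any particular modem protocol.

```python
def adp_efficiency(data_bits=8, parity=False, stop_bits=1):
    """Fraction of the bits in an ADP that are actual data."""
    total = 1 + data_bits + (1 if parity else 0) + stop_bits  # 1 start bit
    return data_bits / total


def frame_efficiency(payload_bytes, overhead_bytes):
    """Fraction of a synchronous frame that is actual data."""
    return payload_bytes / (payload_bytes + overhead_bytes)


print(f"ADP, 8 data bits, no parity, 1 stop bit: {adp_efficiency():.1%}")
print(f"ADP, 8 data bits, parity, 1 stop bit:    {adp_efficiency(parity=True):.1%}")
# Hypothetical frame: 1024 data bytes plus 6 bytes of framing overhead.
print(f"Synchronous frame (hypothetical):         {frame_efficiency(1024, 6):.1%}")
```

At best, 8 out of every 10 bits in an ADP are data. A synchronous frame spreads its fixed overhead over many characters, so the larger the frame, the closer the ratio gets to 100%.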

Serial Port Address Assignments

As we have mentioned before, the PC can support up to 4 serial ports, which the BIOS denotes by COM1-COM4. The base addresses of these ports are determined during the startup procedure and stored in the BIOS data area, starting at address 400h:

400: COM1 port address
402: COM2 port address
404: COM3 port address
406: COM4 port address
408: LPT1 port address
40A: LPT2 port address
40C: LPT3 port address

The BIOS uses the same procedure for assigning COM ports as it does for LPT ports. Namely, it scans the serial port addresses 3F8h, 2F8h, 3E8h and 2E8h, in this order, looking for physical ports. These addresses are then placed in the BIOS data area.

Experiment-Looking at Serial Port Assignments

To check out your serial (and parallel) port assignments, do the following:

1. Save all current work and start debug, as described in the appendix (Trying the Experiments).

C:\WIN95>debug

2. Do a memory dump at address 400h.

On my PC, which has two physical serial and two physical parallel ports, the results are

-d 40:0
0040:0000  F8 03 ...   ........x.x.....

(The first two words of the dump hold the COM1 and COM2 addresses; the fifth and sixth words hold the LPT1 and LPT2 addresses.)

Note that, on a PC, the memory bytes that make up a 16-bit word are stored in reverse order (low byte first). Thus, the COM1 port address is 03F8h.
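If you would rather not do the byte swapping by hand, the following Python sketch decodes the first 14 bytes of such a dump. The byte values shown are illustrative only, chosen to be consistent with a machine that has COM1, COM2, LPT1 and LPT2; they are not a verbatim copy of the dump above. We also assume the usual convention that a zero word means that no port was found.

```python
import struct

# Illustrative bytes for offsets 0-13 of the BIOS data area (an assumption,
# not the author's actual dump).
bda = bytes([0xF8, 0x03, 0xF8, 0x02, 0x00, 0x00, 0x00, 0x00,
             0x78, 0x03, 0x78, 0x02, 0x00, 0x00])

NAMES = ["COM1", "COM2", "COM3", "COM4", "LPT1", "LPT2", "LPT3"]


def decode_port_table(raw):
    """Unpack seven 16-bit words stored low byte first (little-endian)."""
    words = struct.unpack("<7H", raw[:14])
    return dict(zip(NAMES, words))


for name, address in decode_port_table(bda).items():
    status = f"{address:04X}h" if address else "not installed"
    print(f"{name}: {status}")
```

The `<` in the format string tells `struct.unpack` that the low byte comes first, which is exactly the "reverse order" noted above.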

End of Experiment

The Serial Interface

Figure 20.1 shows a serial interface and its relationship to the host PC and a serial device, such as a modem. Although the most commonly used serial device is a modem, there are other serial devices, such as printers, mice and scanners. Let us consider the case of a modem.

[Figure: the CPU sends a parallel data stream to the UART in the serial interface of the PC (the Data Terminal Equipment, or DTE); the UART exchanges a serial data stream and RS-232 control signals, through DB 9-pin or DB 25-pin connectors and a serial cable, with the modem (the Data Communication Equipment, or DCE)]

Figure 20.1 - A serial interface

As you can see from the figure, there are three distinct communication paths involved in the transmission process. First, data must travel from the CPU to the serial interface within the PC. Since this data is traveling along a PC bus, the data travels in parallel. The data is received (8 bits at a time) by the so-called UART chip. The primary purpose of this chip is to convert the parallel data stream into a serial stream (and vice versa), for transmission to the modem. The UART chip communicates with the modem using a protocol known as Recommended Standard 232, or RS-232 (now called EIA/TIA-232). The modem then converts the serial digital data to an analog signal, for transmission over a telephone line. Of course, the reverse process takes place at the receiving end of the transmission.

The UART

The term UART stands for Universal Asynchronous Receiver/Transmitter. This chip forms the major portion of a serial interface. For data transmission, the UART receives the parallel data from the PC bus, reformats it into asynchronous data packets (ADPs) by adding start, parity and stop bits, and sends the resulting serial bit stream to the modem. On the receiving end, the UART chip does just the opposite.

There have been several generations of UART chips since the original PC. The latest version is the UART 16550. Unlike its predecessor chip, the 16450, the 16550 provides two 16-byte data buffers (one for input and one for output), which help avoid data overflow when the chip is receiving or sending data. Since the first bits that are placed into a buffer are the first bits taken out, the buffers are referred to as FIFO buffers, which stands for first in, first out buffers.

A More Detailed Look at the UART 16550

Figure 20.2 shows a simplified block diagram of a UART 16550.

[Figure: block diagram with an internal data bus connecting the 8-bit data path to and from the CPU, the 16-byte receive FIFO buffer with receive timing and control (DATA IN), the 16-byte transmit FIFO buffer with transmit timing and control (DATA OUT), the line control register, the select and control logic (enable, reset, read, write, DMA), the modem control logic with its control and status registers (RTS, CTS, DTR, DSR, DCD, RI, OUT1, OUT2), and the interrupt logic and registers (INTR)]

Figure 20.2 - A simplified block diagram of a 16550 UART chip

As you can see from the figure, the UART 16550 chip is fairly complex. In fact, it has no fewer than 11 registers (not all shown) for communication and control. These registers use consecutive port addresses, beginning at the base address given in the BIOS data area.

Data Flow

The UART sends and receives data from the CPU one byte at a time. On the other side of the chip, data is sent and received one bit at a time. The value of the UART's line control register determines the number of data bits, parity bits and stop bits in the transmission. In particular, the register has the following fields, listed from the more significant bits to the less significant bits:

Even/Odd Parity | Enable Parity | Number of Stop Bits | Number of Data Bits

The number of data bits per ADP is given by the two least significant bits in the line control register, as follows:

00: 5 data bits
01: 6 data bits
10: 7 data bits
11: 8 data bits

(Hence, these are the only choices for the number of data bits.) If the Number of Stop Bits flag is set to 0, then 1 stop bit is used. If the flag is set to 1, then when transmitting 5 data bits per ADP, the number of stop bits sent is 1.5; otherwise 2 stop bits are sent. The Enable Parity flag is set to 1 to enable parity on sending and receiving. The Even/Odd Parity flag is set to 1 for even parity, and 0 for odd parity.

The UART Bit Rate

Special timing circuitry is used to pace the serial data output (and can also pace the input). In particular, the UART receives a clock signal from the PC at a frequency of 1.8432 MHz. This frequency is divided by 16, to get 115,200 Hz. Thus, 115,200 bps is the maximum bit rate at which the UART can transmit serial data to the modem. We should note that the BIOS serial port services may (depending on the BIOS) allow a maximum speed setting of only 19,200 bits per second. Thus, to achieve a higher rate, the UART registers need to be manipulated directly. Also, the BIOS services themselves add overhead to the manipulation of the UART, possibly limiting the bit rate.

The Role of CPU Interrupts

The UART also has interrupt circuitry. Depending upon the setting of the Interrupt Enable Register, an interrupt to the CPU can be triggered by various events, as shown in the following list.

• Error receiving data (for example, buffer overflow or parity error).

• Data has been received.

• Receive buffer has reached the so-called trigger level, which can be set by the programmer to 1, 4, 8 or 14 bytes. This is used to inform the CPU that a specified number of bytes are in the receive buffer.

• Timeout. This means that no bytes have been placed in, or removed from, the receive buffer within the last 4 character times, but there is at least one byte in this buffer.

• The UART is ready for the next byte from the CPU.

• The modem is ready to send the next bit of data, or there is a ring signal on the telephone line, or a data carrier signal. We will discuss these signals a bit later.

Of course, in order to use the interrupt features of the UART, each serial port needs an IRQ line. (A UART chip will still function without an interrupt line, but its communication with the PC is seriously hampered.) It is most common to assign IRQ4 to COM1 and IRQ3 to COM2. When more than two serial ports are available, it may be possible to double up on IRQ lines, but not if the two devices ever need the line at the same time. The only thing you can do is try it to see if it works. Good luck. Incidentally, while the UART 16550 is capable of using DMA for speedier data transfers, sources tell me that DMA has not actually been implemented for UART chips.
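Returning to the line control register and the bit rate described above, the following Python sketch assembles a line control value from the communication parameters and works out how far the 115,200 Hz rate must be divided down to obtain a slower bit rate. The bit positions follow the field order given earlier (data bits in bits 0-1, stop bits in bit 2, parity enable in bit 3, even parity in bit 4). The idea of an integer divisor is our addition: the discussion above mentions only the maximum rate, and the function names are hypothetical.

```python
UART_CLOCK_HZ = 1_843_200
MAX_BPS = UART_CLOCK_HZ // 16     # 115,200, as computed in the text


def line_control_value(data_bits=8, stop_bits=1, parity=None):
    """Build the low five bits of the line control register."""
    if data_bits not in (5, 6, 7, 8):
        raise ValueError("data_bits must be 5, 6, 7 or 8")
    if stop_bits not in (1, 2):
        raise ValueError("stop_bits must be 1 or 2")
    value = data_bits - 5         # 00, 01, 10, 11 for 5, 6, 7, 8 data bits
    if stop_bits == 2:
        value |= 0x04             # Number of Stop Bits flag
    if parity is not None:
        value |= 0x08             # Enable Parity flag
        if parity == "even":
            value |= 0x10         # Even/Odd Parity flag: 1 means even
        elif parity != "odd":
            raise ValueError("parity must be 'even', 'odd' or None")
    return value


def divisor_for(bps):
    """Integer by which the 115,200 Hz rate is divided (an assumption)."""
    divisor, remainder = divmod(MAX_BPS, bps)
    if divisor == 0 or remainder:
        raise ValueError("rate is not an exact fraction of 115,200 bps")
    return divisor


print(f"{line_control_value(8, 1, None):02X}h")     # 8 data bits, no parity
print(f"{line_control_value(7, 1, 'even'):02X}h")   # 7 data bits, even parity
print(divisor_for(19200), divisor_for(9600))
```

For example, 8 data bits with no parity and 1 stop bit gives 03h, while 7 data bits with even parity and 1 stop bit gives 1Ah.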

Super I/O Chips

As is the common theme in modern PC design, these days virtually all new motherboards incorporate various I/O functions (including the UART 16550 functions) in a single chip, called a Super I/O chip. The concept of a Super I/O chip was conceived by National Semiconductor Corporation in 1990, and became widely used in 1992. Typically, a Super I/O chip for a desktop PC supports the following features:

• two serial interfaces,

• one parallel interface,

• a floppy disk controller,

• possibly a standard IDE controller.

In addition, one of the two UARTs on a Super I/O chip may support Fast IR, which is a protocol developed by the Infrared Data Association (IrDA). This organization has developed standards for infrared (wireless) communication between devices. Fast IR is capable of 4 Mbits/second data transfer rates, provided that a suitable clock is used to pace the UART function. In addition, the Fast IR UART chip uses 32-byte FIFO buffers, instead of the standard 16-byte buffers.

The EIA/TIA-232 Protocol

The EIA/TIA-232 standard, which was previously referred to as the Recommended Standard 232, or RS-232, is a standard developed by the Telecommunications Industry Association (TIA), an offshoot of the Electronic Industries Association (EIA), for communication between the computer's serial port (UART) and a modem. More generally, the standard describes communication between data terminal equipment (DTE) and data communication equipment (DCE). (See Figure 20.1.) The standard specifies both the physical connectors and the communication protocol.

The RS-232-C standard was released in 1969, and was the current version during the first 7 years of the PC revolution. In 1987, the EIA released a new version, called EIA-232-D. The main improvements in the new version relate to testing issues, and will not concern us here. In 1991, the TIA released the latest version of the standard, called EIA/TIA-232-E.

The International Telecommunications Union (ITU) has also released standards that are very similar to EIA/TIA-232, called V.24 and V.28. The division of the ITU responsible for standards development was called the CCITT and is now referred to as the ITU-T. Thus, outside of the United States, you may see these terms.

PC-to-Modem Communication

The EIA/TIA-232 standard specifies the use of a 25-pin D-shell connector for the modem (DCE) and serial port (DTE). Alternatively, many PCs use a 9-pin D-shell connector (sometimes called an AT style serial connector). Note that the 25-pin connector is the same one that is used for a parallel interface, but the PC has a female connector for its parallel interface and a male connector for its serial interface. (Of course, the connectors on the cables have the opposite gender and the cables themselves are wired differently.) The reason that a 9-pin connector will work with a serial interface is that only 9 pins are needed. The matchup between pins, the signal names and abbreviations and the direction of signal travel are shown in Table 20.1.

Before looking at the table, we should mention that the world of PC communications is quite complex and very few statements can be made that apply to all situations. Thus, the use of EIA/TIA-232 signals has not only evolved with changing technology, but its manner of use is also tied to the particular communications software in use. Thus, we can speak only in general terms.

Table 20.1 - EIA/TIA-232 Communication Protocol

DB-9 Pin | DB-25 Pin | Signal Name | Signal Abbreviation | Direction | Purpose
1 | 8 | Carrier Detect | CD, DCD, RLSD or CS | Modem→PC | Modem signals PC that it is receiving carrier signal from remote modem
2 | 3 | Received Data | RD or RxD | Modem→PC | Data transfer from modem to PC
3 | 2 | Transmit Data | TD, SD or TxD | PC→Modem | Data transfer from PC to modem
4 | 20 | Data Terminal Ready | DTR, DTE or TR | PC→Modem | PC signals modem that its serial interface is operational
5 | 7 | Ground | | |
6 | 6 | Dataset Ready | DSR or DCE | Modem→PC | Modem signals PC that modem is operational
7 | 4 | Request to Send | RTS | PC→Modem | PC requests permission of modem to send data to modem
8 | 5 | Clear to Send | CTS | Modem→PC | Modem signals PC it is ready to receive data from PC (in response to RTS)
9 | 22 | Ring Indicator | RI | Modem→PC | Modem signals PC that a ring is detected

Let us make a few comments about the signals in Table 20.1.

• Before any communication can take place between PC and modem, the PC asserts its DTR (Data Terminal Ready) line and the modem asserts its DSR (Dataset Ready) line. The use of these signals varies depending upon the particular communications software being used. DTR is generally asserted at power on. Depending upon communication software requirements, the modem can be set to either ignore DTR or hang up the line when DTR is deasserted, in response to a command from the software. As to DSR, it is usually just asserted at power on and then ignored. However, sometimes it is asserted only when the modem has detected a carrier signal from a remote modem.

• The signals RTS (Request to Send) and CTS (Clear to Send), as described in the table, are for half-duplex communication (one direction at a time), where the PC must ask the modem if it is able to accept data, and vice versa. Since full-duplex communication (both directions simultaneously) is now the standard, these signals are now generally used instead for hardware flow control. In other words, both signals are normally high, but can be temporarily sent low when the corresponding device is unable to accept more data. Thus, the RTS (Request to Send) line is normally asserted, but is deasserted if, for example, the CPU is otherwise occupied and cannot take data from the UART. Similarly, the CTS (Clear to Send) line is normally asserted, but is deasserted if, for example, the modem encounters a data error problem that slows down its ability to accept data.

• An incoming call causes the modem to assert the RI (Ring Indicator) line to signal the PC. This line can be used, when the communication software is set to autoanswer, to notify the PC to signal the modem to answer the phone. The remote modem (the one that initiated the call) sends a carrier signal (discussed later) over the line. When the modem detects this carrier signal, it signals the PC by asserting CD (Carrier Detect), and asserting DSR (Dataset Ready) if this is not already asserted. The PC now knows that there is a modem making contact on the other end of the phone line.

Flow Control

As mentioned, the current full-duplex use of the CTS/RTS lines just described is referred to as hardware flow control. This is a form of local flow control, between the modem and its corresponding PC. End-to-end flow control refers to controlling the flow of data between modems, and is controlled by the error-correction features of the modem, or by the communication software. Flow control is sometimes referred to as handshaking (although this also refers to the process of determining a mutually agreeable set of parameters between the two communicating modems).

Flow control is important for many reasons. For example, modern high-speed modems perform error detection procedures on incoming data. For this purpose, the receiving modem accepts incoming data into its buffer and checks for errors. If none are found, it can acknowledge the receipt of correct data. Otherwise, it must request retransmission. Accordingly, the sending modem needs to retain the data in its buffer, in case a retransmission is necessary. During that time, the sending modem may not be able to accept further data from its PC. As another example, the sending modem generally performs data compression on the data coming from its PC. This process can take varying amounts of time, perhaps requiring a temporary halt to the data flow.

To use hardware flow control, your communication software needs to support it (and it must be enabled). An alternative to hardware flow control is software flow control, of which there are two common types. When XON/XOFF flow control is enabled, the modem sends one of two ASCII characters to the PC to turn data transmission on or off. The XOFF character turns data transmission off and the XON character turns it back on. XOFF is the ASCII control character with ASCII value 19, and is called DC3 or Device Control 3. XON is the ASCII control character with ASCII code 17 and is called DC1 or Device Control 1. The XON character can be entered at the keyboard using Ctrl-Q and the XOFF character can be entered using Ctrl-S. In order to use XON/XOFF flow control, the communication software must support this protocol (and it must be enabled). Note that the PC may also send flow control commands to the modem, to temporarily halt the data flow in that direction.

There is a major problem with XON/XOFF, however. Whenever either of these control codes appears in the normal data stream being received by the modem over the phone line (or if it is being echoed), it will trigger the flow control! Thus, XON/XOFF is best used when transmitting pure ASCII text, with no embedded control codes.

An alternative to XON/XOFF is enquire/acknowledge (ENQ/ACK) flow control, used by certain Hewlett-Packard computers. In this case, an enquire signal is sent to enquire whether it is okay to send data. Upon receiving an acknowledge signal, a block of data is transmitted.
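The XON/XOFF mechanism, and its weakness, can be illustrated with a short Python sketch. The function names are ours, and the model is deliberately simplified: it only tracks whether the sender is currently allowed to transmit after each byte it receives.

```python
XON = 0x11    # DC1, ASCII 17, Ctrl-Q
XOFF = 0x13   # DC3, ASCII 19, Ctrl-S


def sender_states(received):
    """Return whether sending is allowed after each received byte."""
    sending = True
    states = []
    for byte in received:
        if byte == XOFF:
            sending = False      # the other side asks us to pause
        elif byte == XON:
            sending = True       # the other side is ready again
        states.append(sending)
    return states


def contains_flow_control_bytes(data):
    """True if ordinary data would be mistaken for XON or XOFF."""
    return any(byte in (XON, XOFF) for byte in data)


print(sender_states(bytes([0x41, XOFF, 0x42, XON])))
print(contains_flow_control_bytes(b"HELLO"))               # plain ASCII text
print(contains_flow_control_bytes(bytes([0x48, 0x13])))    # binary data
```

The last line shows the problem described above: a binary file is free to contain the byte value 19, and a receiver using XON/XOFF cannot tell it apart from a genuine request to stop sending.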

21. Modems

Let us now turn to the role of the modem in serial communication. Figure 21.1 shows a typical situation for the PC user.

Figure 21.1 - A typical modem connection (PC: Data Terminal Equipment, DTE; modem: Data Communication Equipment, DCE)

In this figure, the PC plays the role of the Data Terminal Equipment (DTE) and the modem plays the role of the Data Communication Equipment (DCE). The terms DTE and DCE apply to other types of hardware, but we will concentrate on PCs and modems in our discussion. As you can see from the figure, a modem must deal with data from two sides: the PC side and the remote modem side. Moreover, data must flow in both directions: into the modem and out of the modem.

Digital Versus Analog Signals

Generally speaking, there are two fundamental ways in which digital information (bits) can be transmitted over a communication line. In digital transmission, a signal, or absence thereof, represents a 1 or a 0. For instance, over a fiber optic cable, a pulse of light can be turned on or off to represent a 1 or a 0. This switching happens (more or less) instantaneously and, for this reason, is called a digital signal. Figure 21.2 shows such a digital signal, also called a square
wave. The vertical portions of the graph represent the transitions from 0 to 1, or vice versa, which may occur only at regularly spaced time intervals. The horizontal portions of the graph represent a constant signal. Each horizontal segment representing a single bit is called a bit cell.

Figure 21.2 - A digital signal (square wave)

In digital communication, a traditional type modem is not necessary. The equipment on the receiving end senses the state of the line at preset time intervals and records a 0 or a 1. Of course, noise and other distortions on the line play a role in making this more complicated and less than 100% reliable.

On the other hand, in the PC world, and in particular over the common phone lines that comprise the Public Switched Telephone Network (PSTN), data is carried over lines using an electrical signal that has a continuously varying voltage. It is just not possible to make an electrical signal change from, say, +5 volts to -5 volts instantaneously. This is an analog signal and provides analog transmission. (Analog signals can also be created by using a light wave whose frequency, that is, color, varies continuously, rather than being turned on or off.)

Unfortunately, confusion often arises because an analog signal that is designed to carry, and is interpreted as carrying, digital data (that is, only two states), is often referred to as a digital signal. Put another way, when discussing the most common form of communication either within a PC or over a phone line, the signal is always analog, but the circuitry that interprets the signal does so digitally, that is, it interprets the signal as carrying only two states, representing 0 and 1.

Figure 21.3 - Approximating a digital signal with an analog signal

In order to make the job of the circuitry easier, we would like, in principle, to make an analog signal look as much like a digital signal as possible. Surprisingly, this is not hard to do. Figure 21.3 shows a number of pure analog signals that were derived mathematically (by yours truly, using a little elementary calculus) from the digital signal in Figure 21.2, and then plotted using Microsoft Excel. (We will define the term pure a bit later). At the bottom of the figure, these pure analog signals have been combined into a single analog signal, which can be transmitted over an analog line. Notice how closely this analog signal begins to approximate the digital signal. Remarkable, isn't it? There is a problem, however. While this analog signal may be suitable for transmission within the short confines of a PC bus, or even over certain cables, the signal has too high a frequency (varies too often) to be sent over a phone line. Thus, a slightly different approach is necessary. Namely, the digital signal is first used to change, or modulate, a pure analog signal, whose frequency is "acceptable" to the phone line. This signal is called a carrier signal. On the receiving end, the analog signal must be demodulated in order to retrieve the digital data. This is the main job of a modem.

Modems

The term modem is a contraction of modulation/demodulation. As just mentioned, the main purpose of a modem is to convert the digital data stream coming from the serial interface (that is, the string of 0s and 1s) into an analog signal that can be sent over the phone lines. This is done by modulating an analog carrier wave. Of course, the reverse demodulation process must be performed by the modem on the receiving end.

It may help to clarify the role of a modem by thinking of the preparation of the signal to transmit as a two-step process. First, the modem uses the digital signal from the PC to modify (or modulate) an analog carrier wave. As we will soon see, the result is a "mixed" signal that is neither all digital nor all analog. The modem then filters the signal to produce an analog approximation to the mixed signal. This analog approximation is sent over the phone line.

Modern modems do more than modulate and demodulate signals, however. Today's intelligent modems are capable of accepting commands from the user to perform various functions, such as automatically answering the phone after a certain number of rings. Also, in an effort to increase the effective speed of transmission of data, modems employ a variety of data compression schemes. Modems also add error detection information to the data stream. We have
already mentioned that high-speed modems convert asynchronous data to synchronous form before transmission, and do the reverse on the other end.

In summary then, a modem serves the following main functions:

• Modulates a carrier wave in order to encode digital data into an analog form for transmission over a phone line.

• Demodulates an analog signal to extract the digital data.

• Employs data compression techniques to raise the effective rate of data transmission.

• Adds error-detecting information to the signal and checks for errors on the receiving end of the transmission. If errors are found, the modem requests retransmission of the data.

• Accepts commands from the user (or communications software) to adjust settings and perform additional functions.

Duplex and Echoplex

When two distant modems are linked, it is usually over a normal telephone line, that is, over the PSTN. At least part of this connection is a two-wire connection, where one wire is for data and the other is for ground. Thus, data flows over only one wire. Therefore, the simplest way for two modems to communicate is to take turns. When communication can take place in both directions, but only in one direction at a time, it is referred to as half-duplex communication. On the other hand, when communication can take place in both directions at the same time, it is called full-duplex communication.

Most PC modems use full-duplex communication. To accomplish this over a single line, slow-speed modems may use two carrier waves of different frequencies. Each one is modulated in order to carry data. On the other hand, high-speed modems may use a single carrier wave. Each modem is able to filter out its own effect on the carrier, thus leaving only the other modem's signal. This process is sometimes called echo cancellation.

Unfortunately, there is considerable confusion in the use of the terms full-duplex and half-duplex. This is primarily related to the issue of whether or how a signal might be echoed to the sending PC's monitor, a procedure referred to as echoplex. In full-duplex echoplex, the characters sent from the sender's PC are echoed back from the receiver, and then displayed on the sender's monitor. In
half-duplex echoplex, characters are displayed directly on the sender's monitor before they are sent, rather than being echoed back. (This is local echoing.) Unfortunately, modem documentation often does not use the term echoplex, referring to these modes simply as full-duplex and half-duplex.

Amplitude, Frequency and Phase

To understand how modems modulate carrier signals, we need to consider a few simple terms related to waves. Figure 21.4 shows two waves of different amplitudes. The amplitude of a wave is its height, and is a measure of the strength of the wave, that is, the signal strength, as measured in volts.

Figure 21.4 - Amplitude change

Figure 21.5 shows two waves of different wavelengths. The wavelength of a wave is the length of one complete cycle.

Figure 21.5 - Wavelength change

The frequency of a wave is the reciprocal of the wavelength:

\[ \text{Frequency} = \frac{1}{\text{Wavelength}} \]

Thus, the top wave in Figure 21.5 has a frequency of one cycle per second (or 1 Hz), whereas the bottom wave has a frequency of 2 cycles per second (or 2 Hz). Figure 21.6 shows several waves that have been phase shifted.

Figure 21.6 - Phase shifting (original wave, followed by waves with 90°, 180° and 270° phase shifts)

The waves are identical in amplitude and frequency, but "start" at a different location. For instance, the second wave is shifted left one-fourth of a complete cycle. This is a 90 degree phase shift. Shifting another 90 degrees, we get a total phase shift (relative to the first wave) of 180 degrees. The final wave is shifted 270 degrees relative to the first wave. We will refer to a wave that has constant amplitude, frequency and phase as a pure wave. Thus each individual wave in Figures 21.4-21.6 is a pure wave.

Modulation Types

There are several ways in which a carrier wave can be modulated using a digital signal. Figure 21.7 illustrates amplitude modulation (AM), frequency modulation (FM) and phase modulation (PM). As you may know, AM and FM are used to modulate a carrier wave to carry radio signals. The AM frequency range (or passband) is about 540-1600 kHz whereas the FM passband is 88-108 MHz. Each station is given a carrier frequency in this range, which is what you "tune into" on your radio. (The "music" then modulates this carrier wave.)

Figure 21.7 - Types of modulation (panels: digital data, digital signal, amplitude modulation, frequency modulation, phase modulation)

In simple modulation, at each interval, the signal either changes state or remains the same, giving two possibilities that can thereby encode the value of a single bit. In this context, amplitude modulation is called amplitude-shift keying (ASK), frequency modulation is called frequency-shift keying (FSK) and phase modulation is called phase-shift keying (PSK).
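The three keying schemes can be sketched numerically as follows. This is a simplified illustration, not a description of any real modem: it assumes that each bit value selects a fixed state of the carrier (rather than a change of state), and the function name `modulate`, the 2 Hz carrier, the half-amplitude level for ASK and the doubled frequency for FSK are all arbitrary choices made for the example.

```python
import math


def modulate(bits, scheme, carrier_hz=2.0, samples_per_bit=100):
    """Return samples of a sine carrier modulated by `bits`.

    Each bit lasts one unit of time. Depending on `scheme`, a 1 bit changes
    the amplitude (ASK), the frequency (FSK) or the phase (PSK) of the
    carrier. All parameter values here are illustrative assumptions.
    """
    samples = []
    for index, bit in enumerate(bits):
        amplitude, frequency, phase = 0.5, carrier_hz, 0.0
        if scheme == "ASK":
            amplitude = 1.0 if bit else 0.5
        elif scheme == "FSK":
            amplitude = 1.0
            frequency = 2 * carrier_hz if bit else carrier_hz
        elif scheme == "PSK":
            amplitude = 1.0
            phase = math.pi if bit else 0.0  # a 180 degree phase shift for a 1
        else:
            raise ValueError("scheme must be 'ASK', 'FSK' or 'PSK'")
        for k in range(samples_per_bit):
            t = index + k / samples_per_bit
            samples.append(amplitude * math.sin(2 * math.pi * frequency * t + phase))
    return samples
```

For example, `modulate([0, 1, 1, 0], "PSK")` yields 400 samples in which the carrier is inverted during the two middle bit cells. The abrupt jumps at the cell boundaries are the instantaneous changes that a real modem must smooth out before transmission.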

Multibit Modulation

As a practical matter, analog signals can change state (amplitude, frequency or phase) only a limited number of times per second. The signal change rate of a signal is called its baud rate. The term is named after J.M. Emile Baudot, 1845-1903, a French telegraph operator who developed a five-bit-per-character code for telegraphs. The absolute maximum baud rate of a telephone line is somewhere in the neighborhood of 5000-6000 baud, with practical limits closer to 3000 baud. Hence, to transfer data bits at a higher rate than 3000 bps, it is necessary to encode more than one bit in each signal change. This is referred to as multibit modulation or group coding.

For instance, Figure 21.8 shows a multibit phase modulation, where each signal change can be any one of four phase changes: 0°, 90°, 180° or 270°. Thus, it is referred to as quadrature phase-shift keying (QPSK). Because there are four possible phase changes, each signal change can represent 2 bits of information.

Figure 21.8 - QPSK modulation (bit pairs 00, 01, 10 and 11; phase changes 0°, 90°, 180° and 270°)

Further phase angles can be used to encode more bits per signal change, but doing so increases the potential for error in determining the amount of phase shift. Thus, it is common to combine phase-shift keying with amplitude modulation. This is known as quadrature amplitude modulation (QAM). One possibility is to allow phase shifts in multiples of 45°, giving the 8 phase angles 0°, 45°, 90°, 135°, 180°, 225°, 270° and 315°.

In addition, 4 amplitude levels could be used, denoted by A1, A2, A3 and A4. It is customary to indicate all of the 8 x 4 = 32 possibilities in a graph, as shown on the left side of Figure 21.9. Each dot represents a particular amplitude and phase angle. The difficulty is that the dots are fairly close together, which means that two phase angle-amplitude settings may be confused with each other. To avoid this problem, it is customary to eliminate dots in such a way that the amplitudes associated with consecutive phase changes (for example, 45° and 90°) are different. This gives the graph on the right side, called a 16-point quadrature amplitude modulation constellation. Since there are 16 possible phase angle-amplitude pairs, each signal change can hold 4 bits of binary data.

Figure 21.9 - A quadrature amplitude modulation constellation

In this QAM example, the bit rate is 4 times the baud rate, which would allow a 2400 baud modem to transmit at a bit rate of 9600 bps. There are even more sophisticated approaches to encoding bits into signal changes. Another scheme in use is to encode each consecutive sequence of 4 data bits by adding a fifth bit. This, in effect, separates the nibbles (4-bit words) before encoding into signal changes. The corresponding 5-bit sequence can then be represented by one of the 32 dots in the constellation on the left in Figure 21.9. This is an example of trellis-coded modulation. Without going into the details, we simply mention that schemes such as these enable modern modems to transmit at up to 28,800 bps (9 bits per signal change at 3200 baud is 28,800 bps).
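The arithmetic behind multibit modulation can be sketched in a few lines. The function names below are hypothetical, and the particular assignment of bit pairs to QPSK phase changes (00 to 0°, 01 to 90°, 10 to 180°, 11 to 270°) is an assumption made for illustration; real standards define their own mappings.

```python
import math


def bits_per_signal_change(states):
    """Number of whole bits that one signal change with `states` distinct values can carry."""
    return int(math.log2(states))


def bit_rate(baud, states):
    """Bit rate in bps for a given baud rate and number of distinct signal states."""
    return baud * bits_per_signal_change(states)


# Assumed mapping of bit pairs to phase changes, in degrees (illustration only).
QPSK_PHASE_CHANGE = {"00": 0, "01": 90, "10": 180, "11": 270}


def qpsk_phase_changes(bits):
    """Translate a string of 0s and 1s into a list of phase changes, 2 bits at a time."""
    if len(bits) % 2 != 0:
        raise ValueError("QPSK encodes bits in pairs")
    return [QPSK_PHASE_CHANGE[bits[i:i + 2]] for i in range(0, len(bits), 2)]
```

With these definitions, `bit_rate(2400, 16)` gives 9600 for the 16-point constellation, and `bit_rate(3200, 512)` gives 28800, since 9 bits per signal change require 2^9 = 512 distinguishable states.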

Wave Harmonics

You may have noticed that the modulated waves in Figure 21.7 have instantaneous changes, either in amplitude, frequency or phase. Thus, they are not analog signals and cannot be transmitted over an analog line. (Neither are they pure digital signals.) Moreover, even if they were analog, their frequencies would not be suitable for transmission over a phone line. Let us deal with the first issue first.

Happily, there is a mathematical theory that says that any wave (digital, analog or neither) can be approximated by a sum of pure analog waves of varying amplitudes, frequencies and phases. (This is called the Fourier series of the wave.) Moreover, each wave in the sum has a larger frequency than the previous wave. These waves are called the harmonics of the original wave. The frequency of the first harmonic is called the fundamental frequency. The term overtone is also used for the harmonics after the fundamental. Thus, the second harmonic is the first overtone. Incidentally, it is the presence of overtones that gives a sound its "richness" and distinguishes the difference between, say, a violin and a piano, both playing the same note.

Figure 21.3 shows the first few harmonics for a purely digital signal. The principle is the same for the signal obtained by modulating an analog carrier signal using a digital signal, as shown in Figure 21.7. According to the mathematical theory, the more harmonics we take, the better is the approximation to the original signal by the resulting analog signal. However, we have a problem with the limited capabilities of phone lines.
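The claim that more harmonics give a better approximation can be checked numerically. The sketch below is a simplified case and not the calculation behind Figure 21.3: it assumes an ideal square wave that alternates regularly between +1 and -1, for which the Fourier series is known to contain only odd multiples of the fundamental frequency. The function names are hypothetical.

```python
import math


def square_wave(t, period=1.0):
    """Ideal square wave: +1 during the first half of each period, -1 during the second."""
    return 1.0 if (t % period) < period / 2 else -1.0


def harmonic_sum(t, harmonics, period=1.0):
    """Sum of the first `harmonics` nonzero terms of the Fourier series of square_wave."""
    total = 0.0
    for n in range(harmonics):
        k = 2 * n + 1  # this particular wave contains only odd multiples of the fundamental
        total += math.sin(2 * math.pi * k * t / period) / k
    return 4 / math.pi * total


def mean_squared_error(harmonics, samples=1000):
    """Average squared difference between the square wave and its approximation over one period."""
    points = [(i + 0.5) / samples for i in range(samples)]
    return sum((square_wave(t) - harmonic_sum(t, harmonics)) ** 2 for t in points) / samples
```

Comparing `mean_squared_error(1)`, `mean_squared_error(5)` and `mean_squared_error(25)` shows the error shrinking as harmonics are added. Each additional term has a higher frequency than the one before, which is what eventually collides with the limited passband of a phone line.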

Telephone Bandwidth

The human ear is capable of distinguishing (sound) waves whose frequency ranges roughly from 20 Hz to 20,000 Hz. This is called the passband of the human ear. The low-frequency waves are heard as low-pitched tones and the high-frequency waves are heard as high-pitched tones. Thus, the bandwidth of the human ear, which is the total width of frequencies it can hear, is 20,000 - 20 = 19,980 Hz. (Note that, as we get older, we lose a great deal of our hearing range, especially in the higher frequencies.)

A telephone line, on the other hand, is capable of transmitting waves in a passband of only about 400-3400 Hz, with a total bandwidth of only 3000 Hz. The reason for this restriction is economic: it is much cheaper to design a system with a limited frequency range. Moreover, telephone lines were intended
only for voice communication; they were never intended for transmitting concert music, or digital data! Thus, in approximating a modulated carrier wave with an analog signal, we can use only those harmonics whose frequencies do not push the modulated carrier wave outside of the passband of the line.

Common practice is to start with a carrier wave whose frequency is somewhere near the middle of the passband of the phone line, say about 2000 Hz. The digital signal is then used to modulate the carrier wave. The resulting nonanalog signal is then approximated using harmonics whose frequencies fall within the passband of the phone line. In fact, modems have filters that deliberately filter out high or low frequency signals, in an effort to provide consistency to the resulting signal. (Phone lines often vary quite substantially in their passbands, and these filters prevent unexpected signals from coming through the line.)

Of course, the idea is that there are enough harmonics passing over the line to reconstruct the original signal to a sufficient degree of accuracy. This is really not that hard, since the modem interprets the analog signal digitally, that is, it only needs to distinguish between two different states, 0 and 1. If the digital data happened to be decimal (10 states), rather than binary, we would need to distinguish between 10 different states, and would thus need a much more accurate reproduction of the signal, that is, a line with a much higher bandwidth. (At the current state of technology, we would also be in big trouble.)

Distortion

In addition to the problems associated with a limited bandwidth, a signal traveling over a telephone line is subject to various other forms of distortion, as shown in Figure 21.10. We have illustrated the effect on the original digital data. Of course, it is the modulated carrier (or rather its analog approximation) that suffers the distortion over the phone lines.

Figure 21.10 - Signal Distortion (panels: digital data, digital signal, bandwidth limitation, attenuation, delay distortion, line noise, received signal)

Attenuation tends to reduce the overall amplitude of the signal. (Actually, attenuation is more severe at higher frequencies.) Line noise can also be a major problem, and is the main reason why modems often transmit at lower speeds than they are physically capable of transmitting. As you can see from Figure 21.10, the various forms of distortion can produce a signal that can be easily misinterpreted (as the circle in the figure indicates). This is why errors in transmission are not uncommon.

Error Detection

We have mentioned that one of the main functions of a modem is to deal with error detection and correction. There are two approaches one can take to the problem of errors in transmission. One is to "encode" the original data in such a
way that errors can be not only detected, but also corrected, on the receiving end. The subject of so-called error-correcting codes is fairly complex and it would not be appropriate for this book. The other approach is to encode the data for error detection only, using an error-detecting code. If an error is detected, the receiver can request that the data be retransmitted.

In either case, additional information must be included along with the original data. This additional information is referred to as redundancy. The redundant data serves only for error detection or correction, and does not contribute to the original message. Thus, the more redundancy, the lower the rate of transmission of the original data. As you might expect, there is a compromise involved here.

On the surface, error correction would seem to be preferable to error detection, since it can be done at the receiving end, without requiring retransmission. Remember that if retransmission is a possibility, then the sending modem must keep a copy of the data, until it is assured that retransmission is not required. This has a tendency to slow down transmission rates. On the other hand, error-correcting redundancy is generally larger than error-detecting redundancy. Hence, error-correction has a greater negative effect on transmission rate than error-detection. Error-correction also requires more sophisticated logic on both ends to encode and decode the data. All in all, the current trend is to employ error-detection, and request retransmission if errors occur.

Block Parity Checking

We have already discussed the parity check scheme for error detection. As we have remarked, this method is capable of detecting any single error, or indeed any odd number of errors, because any such errors will change the parity (evenness or oddness) of the data. An improvement on parity checking can be obtained fairly easily.
For example, during synchronous transmission of frames whose size is 64 bits, each frame can be arranged, at least logically, into a square of size 8 by 8, as shown in Figure 21.11.

Figure 21.11 - Block parity checking (an 8 by 8 block of data bits, with a row parity bit at the end of each row and a column parity bit at the foot of each column)

A parity check bit is then included for each row and each column. The parity check bits for the rows are called transverse (or row) parity bits and the parity check bits for the columns are called longitudinal (or column) parity bits. Now, if two errors should occur in any given row, then the row parity bit for that row will still miss the errors. However, as long as each of the two corresponding columns ends up with an odd number of errors in total (for instance, no errors other than these two), the two errors will be caught by the column parity bits. This affords significantly better error-detection than just the row parity bits of ordinary parity checking.

A more sophisticated technique used in communications, and also in hard disk data storage, is called the cyclic redundancy check, or CRC. This is an example of a so-called cyclic error-correcting code. The subject of cyclic error-correcting codes is very involved, and we will not discuss it here.
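The row and column checks can be sketched as follows. This is a minimal illustration that assumes even parity (each parity bit makes the total number of 1s even). The function names and the sample 8 by 8 frame are hypothetical and are not the data of Figure 21.11.

```python
def parity_bit(bits):
    """Even-parity bit: 1 if `bits` contains an odd number of 1s, otherwise 0."""
    return sum(bits) % 2


def block_parity(block):
    """Return (row parity bits, column parity bits) for a rectangular block of bits."""
    rows = [parity_bit(row) for row in block]
    columns = [parity_bit(column) for column in zip(*block)]
    return rows, columns


def failed_checks(block, row_parity, column_parity):
    """Return the indexes of rows and columns whose recomputed parity disagrees."""
    rows, columns = block_parity(block)
    bad_rows = [i for i, (new, old) in enumerate(zip(rows, row_parity)) if new != old]
    bad_columns = [j for j, (new, old) in enumerate(zip(columns, column_parity)) if new != old]
    return bad_rows, bad_columns


# An arbitrary 64-bit frame arranged as 8 rows of 8 bits (made-up sample data).
frame = [[(3 * r + c * c) % 2 for c in range(8)] for r in range(8)]
sent_rows, sent_columns = block_parity(frame)

# Two errors in the same row: invisible to the row check, caught by the column checks.
damaged = [row[:] for row in frame]
damaged[2][1] ^= 1
damaged[2][5] ^= 1
```

Here `failed_checks(damaged, sent_rows, sent_columns)` reports no failing rows but flags columns 1 and 5, which is the situation described above. A single flipped bit would instead flag exactly one row and one column.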

Data Compression

Another main function of modems is to perform data compression. This should not be confused with the data compression that is often done by software programs such as PKZIP, before the data ever reaches the serial port, or by a file transfer protocol, such as ZMODEM. In fact, there can sometimes be ironic conflicts between software data compression and hardware (modem) data compression. These can lead to a situation where a file that has been compressed by a software utility actually gets longer when it is "compressed" again by the modem. The problem stems from the fact that some compression schemes use probabilistic methods that work best when the data in a file has a certain typical pattern. However, software-compressed files generally don't have such typical patterns.

Run Length Encoding

One of the simplest approaches to data compression is run length encoding (RLE), not to be confused with run length limited (RLL) data encoding, as discussed in an appendix. The idea behind RLE is very simple. A string of identical bytes is encoded by giving the number of bytes in the string. In particular, one technique proceeds as follows. Suppose that a sequence of identical bytes occurs in a message and that the sequence has length at least 3 and at most 250. Then this sequence is replaced by a sequence of just 3 bytes, followed by a fourth byte that gives the total number of bytes in the original sequence. Thus, for instance, a sequence of 15 A's would be encoded as 3 A's, followed by a fourth byte that has the numerical value 15. When the receiver sees three identical bytes in a row, it knows that the fourth byte is the number of bytes in the sequence.
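The technique just described can be sketched directly. One point is not covered by the description and is an assumption here: runs longer than 250 bytes are split into pieces of at most 250 bytes, each encoded separately. The function names are hypothetical.

```python
def rle_encode(data: bytes) -> bytes:
    """Replace each run of 3 to 250 identical bytes by 3 copies plus a count byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        # Find the end of the run of identical bytes that starts at position i.
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        run = j - i
        # Assumption: runs longer than 250 are split into pieces of at most 250.
        while run >= 3:
            piece = min(run, 250)
            out += bytes([data[i]]) * 3
            out.append(piece)
            run -= piece
        # Whatever is left (0, 1 or 2 bytes) is copied unchanged.
        out += bytes([data[i]]) * run
        i = j
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Undo rle_encode: after three identical bytes, the fourth byte is the run length."""
    out = bytearray()
    i = 0
    while i < len(data):
        if i + 3 < len(data) and data[i] == data[i + 1] == data[i + 2]:
            out += bytes([data[i]]) * data[i + 3]
            i += 4
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

As in the text, `rle_encode(b"A" * 15)` produces the three bytes `AAA` followed by a byte with value 15. Note that a run of exactly 3 bytes grows to 4 bytes, so this scheme pays off only for longer runs.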

Huffman Encoding

The most famous method of data compression is called Huffman encoding. This method of encoding is, in some sense, the best that can be achieved. However, the phrase "in some sense" refers to the situation where the relative frequencies (or probabilities) of each possible symbol in the data stream are known in advance. Since this is usually not the case, improvements can be made by using a dynamic (or adaptive) form of Huffman encoding.

As an illustration of Huffman encoding, consider Table 21.1. Studies have been made to determine the relative frequencies of the various letters occurring in typical English language prose. These probabilities are shown in Table 21.1. Since there are 26 letters and a space character, a so-called block code, where all codewords are the same length, would require binary codewords of length 5 (2^5 = 32 > 27, but 2^4 = 16 < 27). One possibility is shown in Table 21.1.

Now, by a relatively simple procedure, a Huffman encoding can be constructed, based on the given probabilities, as shown in the last column. Note that a Huffman code is a variable-length code, meaning that the codewords have different lengths. The point is that symbols with higher probabilities can be given codewords of shorter length. In fact, a simple calculation shows that the average codeword length of this Huffman encoding is only about 4.1 bits. Thus, by using a Huffman encoding, we can compress the data to 4.1/5 = 82% of the required block code size.

As mentioned, the Huffman encoding depends upon the probabilities, and these may not be known ahead of time. In adaptive Huffman encoding, the
probabilities are adjusted after each symbol is decoded. We will not discuss the details of Huffman encoding in this book, however.

Table 21.1 - Huffman Encoding for English Text

Symbol   Probability  Block Code  Huffman Code
(Space)  0.1859       00000       111
E        0.1031       00001       010
T        0.0796       00010       1101
A        0.0642       00011       1011
O        0.0632       00100       1001
I        0.0575       00101       0111
N        0.0574       00110       0110
S        0.0514       00111       0011
R        0.0484       01000       0010
H        0.0467       01001       0001
L        0.0321       01010       10101
D        0.0317       01011       10100
U        0.0228       01100       00001
C        0.0218       01101       00000
F        0.0208       01110       110011
M        0.0198       01111       110010
W        0.0175       10000       110001
Y        0.0164       10001       100011
P        0.0152       10010       100010
G        0.0152       10011       100001
B        0.0127       10100       100000
V        0.0083       10101       1100000
K        0.0049       10110       11000011
X        0.0013       10111       1100001011
Q        0.0008       11000       1100001010
J        0.0008       11001       1100001001
Z        0.0005       11010       1100001000
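The "simple calculation" of the average codeword length can be reproduced from Table 21.1, and the same table can be used to encode and decode text. The sketch below does not construct a Huffman code; it only uses the codewords listed in the table. The function names are hypothetical, and the input text is assumed to contain only letters and spaces.

```python
# Data copied from Table 21.1: space first, then letters by decreasing probability.
SYMBOLS = " ETAOINSRHLDUCFMWYPGBVKXQJZ"
PROBABILITIES = [
    0.1859, 0.1031, 0.0796, 0.0642, 0.0632, 0.0575, 0.0574, 0.0514, 0.0484,
    0.0467, 0.0321, 0.0317, 0.0228, 0.0218, 0.0208, 0.0198, 0.0175, 0.0164,
    0.0152, 0.0152, 0.0127, 0.0083, 0.0049, 0.0013, 0.0008, 0.0008, 0.0005,
]
HUFFMAN_CODES = (
    "111 010 1101 1011 1001 0111 0110 0011 0010 0001 10101 10100 00001 00000 "
    "110011 110010 110001 100011 100010 100001 100000 1100000 11000011 "
    "1100001011 1100001010 1100001001 1100001000"
).split()
CODE_TABLE = dict(zip(SYMBOLS, HUFFMAN_CODES))


def average_length():
    """Expected codeword length in bits, weighted by the symbol probabilities."""
    return sum(p * len(code) for p, code in zip(PROBABILITIES, HUFFMAN_CODES))


def is_prefix_free(codes):
    """True if no codeword is the beginning of another (needed for unambiguous decoding)."""
    ordered = sorted(codes)
    return all(not later.startswith(earlier) for earlier, later in zip(ordered, ordered[1:]))


def encode(text):
    """Encode letters and spaces as a string of 0s and 1s."""
    return "".join(CODE_TABLE[ch] for ch in text.upper())


def decode(bits):
    """Decode by collecting bits until they match a complete codeword."""
    reverse = {code: symbol for symbol, code in CODE_TABLE.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            symbols.append(reverse[current])
            current = ""
    return "".join(symbols)
```

For example, `encode("THE")` concatenates the codewords 1101, 0001 and 010, using 11 bits where the block code would need 15. The decoder needs no separators because no codeword is a prefix of another.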

Measuring Data Transmission Rates

The issue of serial transmission speed can be quite confusing, perhaps partly because of a forced effort to summarize the entire process with a single number. To understand what is involved, we simply follow the data path.

Let us assume that the CPU is fast enough (that is, not too busy elsewhere) to keep the UART operating without delay. Then the UART will output a serial stream at a certain UART output bit rate. As we have seen, the maximum bit rate for a UART is 115,200 bps, when using the traditional PC clock. Keep in mind, however, that this includes not just the data from the CPU, but also the start bit, parity bit (if required) and one or more stop bits. Thus, even at the maximum rate, with 1 start bit, 1 parity bit, 1 stop bit and 8 data bits, the UART output data bit rate is (8/11) x 115,200, or about 83,782 bps.

The modem will then generally apply data compression to the data, effectively raising the (original) data bit rate. It then adds additional error-detecting or correcting bits to the data, which effectively lowers the (original) data bit rate. Finally, the data is encoded into the signal changes of the analog signal. As we have seen, each signal change may represent several bits of information. Thus, the modem output bit rate may be a multiple of the modem's current baud rate.

So we see that the original data from the CPU undergoes several expansions and contractions before it is sent out on the telephone line. Note that it may also be compressed by data compression software, before it ever reaches the UART. This makes determining the actual rate at which the original data is transmitted essentially impossible.

We should also inject a note of caution. The terms bit rate and baud rate are bandied about in conversation and in modem documentation (and in other books), sometimes with little regard for preciseness. This is aggravated by the fact that early modems did not use multibit encoding, and so the modem output bit rates and baud rates were the same.
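The UART framing overhead mentioned above is a simple proportion, sketched here with a hypothetical helper function. It covers only the start, parity and stop bits and deliberately ignores compression and error-control overhead, which, as noted, cannot be summarized in a single number.

```python
def uart_data_bit_rate(line_bps, data_bits=8, start_bits=1, parity_bits=1, stop_bits=1):
    """Rate at which actual data bits leave the UART, given the framing bits per character."""
    frame_bits = start_bits + data_bits + parity_bits + stop_bits
    return line_bps * data_bits / frame_bits
```

With the default framing of 11 bits per character, `uart_data_bit_rate(115200)` is about 83,782 bps. Dropping the parity bit (10 bits per character) raises it to 92,160 bps.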

Restrictions on Bit Rate

We have already seen that the limited bandwidth of the phone line has a limiting effect on the quality of the analog approximation to the original digital data signal. To reiterate, we approximate the digital signal by some of its harmonics. Since successive harmonics have higher and higher frequencies, we are limited in the number of harmonics that can be used, since high-frequency harmonics will modulate the carrier outside of the passband of the phone line.

In addition, higher bit rates imply higher frequency harmonics. Thus, if the desired bit rate is too high, then all the harmonics of its digital signal may have too high a frequency to be used. This would mean that the signal would be completely filtered out by the modem. In other words, there is a limit to the bit rate possible over a phone line (or indeed any line).

There is, in fact, a formula that relates the maximum bit rate of a channel to the bandwidth and the number of bits per signal change. It is

\[ \text{max bit rate} = 2 \times \text{bandwidth} \times \text{number of bits per signal change} \]

This formula was first derived by Nyquist, and bears his name. For example, with 4 bits per signal change, and a telephone bandwidth of 3000 Hz, the maximum bit rate is 2 x 3000 x 4 = 24,000 bps. In practice, however, we can seldom reach this theoretical maximum (which assumes, for instance, that there is no noise on the line).
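Nyquist's formula is easy to evaluate for other combinations of bandwidth and bits per signal change. The helper names below are hypothetical, and the passband figures are the approximate ones quoted earlier for a telephone line.

```python
def bandwidth_hz(passband_low_hz, passband_high_hz):
    """Width of a passband in Hz."""
    return passband_high_hz - passband_low_hz


def nyquist_max_bit_rate(bandwidth, bits_per_signal_change):
    """Theoretical maximum bit rate in bps on a noiseless channel (Nyquist's formula)."""
    return 2 * bandwidth * bits_per_signal_change
```

For a 400-3400 Hz passband, `nyquist_max_bit_rate(bandwidth_hz(400, 3400), 4)` gives the 24,000 bps of the example, while a single bit per signal change gives only 6000 bps.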

ITU Modem Standards

We have mentioned that the International Telecommunication Union (ITU), formerly called the CCITT (Consultative Committee for International Telephone and Telegraph), is a standards-setting organization for telecommunications. The standards released by the ITU bear names that begin with a "V.". These standards specify such things as baud rate, modulation type, error-detection/correction methods and data compression methods. (Some standards specify only one of these items, others specify more than one.) Let us briefly comment on some of the more recent standards. Note that a single modem may support more than one standard.

V.32 The V.32 standard is relatively old, but still in common use. This standard specifies a modified quadrature amplitude modulation (QAM) technique and is designed for full-duplex 9600 bps asynchronous transmission over ordinary telephone lines. The carrier signal used by V.32 has a frequency of 1800 Hz and a modulation rate of 2400 baud. Since the V.32 QAM modulation encodes 4 bits per signal change, we arrive at the 9600 bps bit transfer rate. Some V.32 implementations support trellis-coded modulation. V.32 creates two channels of communication, sharing roughly the same bandwidth, using echo cancellation. Some implementations of V.32 in external modems support synchronous transmission, converting asynchronous data before transmission.


V.32bis The V.32bis standard is very similar to V.32, but specifies a bit transfer rate of 14,400 bps. V.32bis supports both asynchronous and synchronous transmission.

V.34 This standard, once unofficially known as V.fast, provides full-duplex, 28,800 bps transfer rates. To achieve this rate, each signal change can encode 9 bits of data, operating at a baud rate of 3200 signal changes per second. Both asynchronous and synchronous transmission are supported. It is backward compatible with V.32bis and V.32. The V.34 standard provides line probing, that is, it can choose a carrier frequency between 2400 and 3429 Hz, based upon current phone line conditions. Moreover, V.34 can change the carrier frequency during a transmission. V.34 also allows the receiving modem to inform the sending modem about line distortions.
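The bit rates quoted for these standards follow directly from multiplying the baud rate by the number of bits encoded per signal change. The following sketch simply repeats that arithmetic for the two standards whose baud rate and bits per signal change are given above; the table and function names are illustrative.

```python
# (baud rate in signal changes per second, bits per signal change),
# using the figures quoted in the text for each standard.
STANDARDS = {
    "V.32": (2400, 4),
    "V.34": (3200, 9),
}


def bit_rate(baud, bits_per_signal_change):
    """Bit rate in bps is signal changes per second times bits per change."""
    return baud * bits_per_signal_change


for name, (baud, bits) in STANDARDS.items():
    print(name, bit_rate(baud, bits), "bps")
```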

V.42 This is a standard that specifies one of two types of error detection and correction (through retransmission).

V.42bis This is a data compression standard that requires the V.42 error-detecting standard. The compression algorithm can compress data by as much as 8 to 1 under certain conditions.

MNP Modem Standards In addition to the ITU standards, a modem manufacturer named Microcom developed a series of standards whose names begin with MNP. For instance, MNP4 is an error-detecting/correcting protocol with some data compression and MNP5 is a data compression protocol that can compress data up to 50%. MNP7 is a Huffman encoding data compression protocol.


Modes of Operation In general, a modem can operate in one of two operating modes. In command mode (also called terminal mode), all data coming from the PC is interpreted as modem commands for the local modem (discussed soon). Thus, data are not transmitted. In communications mode or data transmission mode, all data flowing to the modem from the PC, with the exception of one string (so the modem can be switched back to command mode when necessary), are modulated and sent out over the phone line. Switching from communications mode to command mode is usually accomplished by sending the string "+++", consisting of three plus signs. However, this must be preceded and followed by a one-second guard period of silence, the theory being that this would never happen within a real data stream.

Actually, the modes of a modem can be broken down further into four modes. When the modem is first powered on, it is in local command mode. The term local refers to the fact that no connection has been made to a remote modem. When a connection is first made, the modem switches to handshaking mode. Handshaking is the process that takes place when a modem examines signals from a distant modem, in an effort to negotiate a common baud rate, along with other communication parameters. The two modems must agree on such things as error-detection method and data compression method. To transfer data over the connecting line, the modem must be in data transmission mode (communications mode). Finally, if the modem receives the +++ signal and returns to command mode, it is said to be in on-line command mode. Thus, the difference between on-line command mode and local command mode is that, in the former case, a connection has been established with a remote modem. Unfortunately, on-line command mode is also called off-line command mode. And you thought PC terminology was confusing!
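The four modes and the guarded "+++" escape can be pictured as a small state machine. The sketch below is only a model of the description above, not of any real modem's firmware; the class name, method names and the way silence is passed in are all assumptions made for illustration.

```python
GUARD_SECONDS = 1.0   # guard period of silence described in the text
ESCAPE = "+++"


class ModemModes:
    """Toy model of the four modem modes (hypothetical, for illustration)."""

    def __init__(self):
        self.mode = "local command"          # state at power-on

    def connect(self):
        self.mode = "handshaking"            # a connection has just been made

    def handshake_done(self):
        self.mode = "data transmission"      # parameters have been agreed

    def receive(self, data, silence_before, silence_after):
        """Switch to on-line command mode only for a properly guarded +++."""
        guarded = (silence_before >= GUARD_SECONDS
                   and silence_after >= GUARD_SECONDS)
        if self.mode == "data transmission" and data == ESCAPE and guarded:
            self.mode = "on-line command"
        return self.mode


modem = ModemModes()
modem.connect()
modem.handshake_done()
print(modem.receive("+++", 0.2, 1.5))   # no guard before: treated as data
print(modem.receive("+++", 1.0, 1.0))   # guarded: back to command mode
```

The guard periods are what let three plus signs appear inside ordinary data without dropping the modem out of data transmission mode.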

The Modem Command Set Most modems use a set of commands similar to that first developed by the Hayes Computer Corporation. These are referred to as Hayes commands. Hayes has extended the original set of commands. However, not all Hayes-compatible modems support the extended Hayes command set. Also, some modem manufacturers have added additional commands, or use the existing commands in slightly different ways. The important thing is that your communication software (which usually issues the commands) is compatible with your modem.


Each Hayes command begins with the attention characters AT. For this reason, the Hayes command set is also referred to as the AT command set. (There was a time when all Hayes commands had to be in upper case, but now many modems accept lower case characters as well.) To get a feel for the Hayes command set, here are some selected AT commands. (Don't take these values for granted on all modems. They are just to give you an idea of what modem commands can do.) Note also that several commands can be combined on a single line, following a single AT.

A: Answer phone
A/: Repeat previous command
A>: Repeat dialing
DTnumber: Dial phone using tone dialing
En: Echo off (n=0) or on (n=1)
Fn: Duplex full (n=1) or half (n=0)
Hn: On hook (hang up) n=0; off hook (pick up receiver) n=1
Ln: Loudness of speaker: 0 lowest, 3 highest
Mn: Speaker condition: n=0 off; n=1 on while dialing, then off when carrier is detected; n=2 always on; n=3 on after dialing until carrier is detected
Sn=x: Set the value of modem register n to x
Sn?: Display contents of register n
Vn: Verbose mode (n=1) or nonverbose mode (n=0)
Z: Reset modem
&Cn: Carrier detect handling: 0 = CD always on, 1 = CD reflects carrier detect by modem
&Dn: DTR handling: 0 = ignore DTR
&Hn: Hardware flow control
&Mn: Asynchronous/synchronous mode
&Nn: Set connection rate
&Rn: RTS/CTS control
&Tn: For modem testing

Modem Registers It is possible to access the registers on a modem, called the S-registers because they are accessed using the ATS command, in order to control certain aspects of modem operation. Modern modems often have several dozen registers. As with commands, there is no complete agreement on what each modem register should do, so you need to consult your modem's documentation. Note also that some registers take numerical values, whereas others are bit-sensitive, that is, each bit has a different function. Here are some examples.

S0=x: Answer phone on ring number x
S3=x: Character x used as carriage return
S6=x: Wait x seconds for dial tone
S7=x: Wait x seconds for carrier
S8=x: A comma in the command string pauses x seconds (used for pauses during dialing)

Result Codes When the modem needs to pass information to the PC user, it issues a result code. Result codes have a numerical designation as well as a verbal description. When the modem is set to verbose mode (with the command ATV1), it returns the verbal descriptions. Here are a few of the standard result codes.

0: OK
1: CONNECT
2: RING
3: NO CARRIER
4: ERROR
5: CONNECT 1200
6: NO DIAL TONE
7: BUSY
8: NO ANSWER
9: CONNECT 2400
10: RINGING
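Communications software typically keeps a lookup table of this kind so that it can interpret either form of result code. The following is a minimal sketch using only the codes listed above; the function name and the `verbose` flag are hypothetical, and real modems may number some of these codes differently.

```python
# Result codes exactly as listed in the text; other modems may differ.
RESULT_CODES = {
    0: "OK",
    1: "CONNECT",
    2: "RING",
    3: "NO CARRIER",
    4: "ERROR",
    5: "CONNECT 1200",
    6: "NO DIAL TONE",
    7: "BUSY",
    8: "NO ANSWER",
    9: "CONNECT 2400",
    10: "RINGING",
}


def format_result(code, verbose=True):
    """Return the text a modem would send in verbose or nonverbose mode."""
    if code not in RESULT_CODES:
        raise ValueError(f"unknown result code: {code}")
    return RESULT_CODES[code] if verbose else str(code)


print(format_result(3))                  # verbose form, as with ATV1
print(format_result(3, verbose=False))   # numeric form, as with ATV0
```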

Initialization Strings When a modem is first contacted by your communications software, the software generally sends an initialization string to the modem. For example, here is what I see on my monitor when I start my communications software.

AT
OK
AT&F&C1&D2X7&H1&R2&K3&B1&A3
OK
ATM1L2S7=60S11=55S0=0
OK

The third line is a long string of commands, described as follows. Note that when a parameter is missing, it is assumed to be 0. Thus, for example, ATE is the same as ATE0.

&F: Load factory configuration
&C1: Modem controls CD
&D2: DTE controls DTR
X7: Set which result codes to display
&H1: Hardware flow control on
&R2: Received data to PC only on RTS
&K3: MNP5 compression disabled
&B1: Fixed serial port rate
&A3: Error-control and data compression enabled (V.42bis)

There follow some speaker and register settings:

M1: Speaker on until connect
L2: Medium loudness
S7=60: Wait 60 seconds for carrier signal
S11=55: Spacing for tone dialing
S0=0: Number of rings on which to answer. S0=0 means auto answer is disabled.

Of course, your modem's initialization string may vary considerably from this one, even if you have the same modem but use different communications software.
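To see how several commands share a single AT prefix, it can help to split an initialization string into its individual commands. The sketch below is a simplified illustration, not a full AT parser: it handles only letter commands, &-commands and Sn=x register settings of the kind shown above, and it applies the rule that a missing parameter is taken to be 0. Dial strings and Sn? queries are deliberately left out, and the function name is hypothetical.

```python
import re

# Either an S-register assignment (S7=60) or an optional '&', one letter,
# and an optional numeric parameter (&C1, X7, E).
_TOKEN = re.compile(r"(S\d+)=(\d+)|(&?[A-Z])(\d*)")


def split_at_commands(line):
    """Split an AT command line into (command, value) pairs."""
    text = line.strip().upper()
    if not text.startswith("AT"):
        raise ValueError("command line must start with AT")
    pos, commands = 2, []
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse {text[pos:]!r}")
        if match.group(1):                      # S-register assignment
            commands.append((match.group(1), int(match.group(2))))
        else:                                   # missing parameter means 0
            commands.append((match.group(3), int(match.group(4) or 0)))
        pos = match.end()
    return commands


print(split_at_commands("AT&F&C1&D2X7&H1&R2&K3&B1&A3"))
print(split_at_commands("ATM1L2S7=60S11=55S0=0"))
```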

22. Optical Storage

CD-ROM Drives These days, almost all new PCs have CD-ROM drives. Not only are these drives important for multimedia applications, but more and more nonmultimedia applications are available on CD-ROM and some are available only on CD-ROM. Installing an application from CD-ROM is far easier than it is from floppy disks. As an example, I know from personal experience that installing Microsoft Office (version 4) from its 31 diskettes can take more than an hour of constant attention to shuffling diskettes, whereas CD-ROM installation takes just a few minutes and requires very little user intervention.

CD-ROM Read Mechanism The operation of a CD-ROM is actually quite elegant. Figure 22.1 shows the main components involved. As we have seen, data are placed on a hard disk in concentric circles called tracks, which are further divided into sectors. Also, the disk spins at a constant rotational speed. On the other hand, a CD-ROM disk has a single track that spirals out from the center of the disk toward the outer perimeter. The track is divided into sectors, each of which has equal length. Thus, in order to ensure that the read mechanism takes the same length of time to traverse each sector, the drive motor must vary the rotational speed of the disk, slowing down in rotations per minute (rpm) when the read mechanism is near the outside of the disk. Thus, the read mechanism retains a constant linear velocity over the track, as opposed to the constant rotational velocity of a hard disk. (In fact, a CD-ROM spins about 2.65 times as fast when the read mechanism is on the inside track, as compared with the outside track.)
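The relationship between constant linear velocity and spin rate can be sketched with a little arithmetic: the track length passing under the read mechanism per revolution is the circumference at that radius, so the required rpm is inversely proportional to the radius. The linear velocity and the two radii below are assumed, illustrative values that do not come from the text, so the ratio this sketch prints is not meant to reproduce the 2.65 figure above.

```python
import math


def rpm_for_linear_velocity(linear_velocity_m_s, radius_m):
    """Spin rate needed so the track passes the head at the given speed."""
    revolutions_per_second = linear_velocity_m_s / (2 * math.pi * radius_m)
    return revolutions_per_second * 60


# Assumed values for illustration only (not taken from the text).
VELOCITY = 1.3          # metres per second
INNER_RADIUS = 0.025    # metres
OUTER_RADIUS = 0.058    # metres

inner_rpm = rpm_for_linear_velocity(VELOCITY, INNER_RADIUS)
outer_rpm = rpm_for_linear_velocity(VELOCITY, OUTER_RADIUS)
print(round(inner_rpm), round(outer_rpm), inner_rpm / outer_rpm)
```

Whatever velocity is chosen, the ratio of the two spin rates equals the ratio of the outer radius to the inner radius.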

Figure 22.1 - CD-ROM drive mechanism (labeled parts: disk, motor, laser diode, laser beam, current)

CD-ROM Disk Construction A CD-ROM disk itself has several layers, as shown in Figure 22.2. The bottom layer of the disk (the disk substrate) consists of a plastic substance, usually polycarbonate. The top surface of this layer contains pits and lands, which hold the binary data. Next comes a very thin coating (between 0.05 and 0.15 microns) of a reflective metal, such as aluminum, silver or gold, that evenly coats the pits and lands. (A micron is one-millionth of a meter, or about 1/25,000 of an inch.) On top of this metal coating is a protective lacquer coating, which is between 10 and 30 microns thick. Finally, a label may be printed on top of this coating.

Figure 22.2 - Laser light reflecting off CD-ROM disk (layers shown, top to bottom: label, 5 µm; protective lacquer coating, 10-30 µm; reflective metal pit/land coating, 50-100 nm; disk substrate, 1.2 mm; the laser beam strikes the pits and lands from below)

As you can see from Figure 22.2, the laser beam strikes the CD-ROM from below, so that, from its perspective, a pit is actually a bump.

Laser Light To understand how the pits and lands are relevant, we need to understand a bit about light, in particular, laser light. (Our discussion will, as you might imagine, be a bit oversimplified.) Like the electrical signals we discussed in the chapter on modems, light waves have a frequency, amplitude and phase. As you may know, the frequency of a visible light wave is seen as color. Lower-frequency light waves (about 4 × 10^8 MHz) appear as red light. As the frequency increases, the light changes color to orange, yellow, green, blue and finally violet (at about 7 × 10^8 MHz). Figure 22.3 shows two light waves of different frequencies.

Figure 22.3 - Light waves of different frequencies


Also, two light waves can be in phase with each other, or be out of phase, as shown in Figure 22.4, where waves A and B are in phase, and waves B and C are out of phase (all three waves have the same frequency).

Figure 22.4 - Light waves in phase and out of phase (waves A, B and C)

Ordinary white light, say from a flashlight or from the sun, is composed of light waves of varying frequencies and phases. On the other hand, laser light is highly monochromatic, meaning that it consists of light waves of a single frequency, and also highly coherent, meaning that the light waves all have the same phase.

The CD-ROM Read Operation Figure 22.5 shows a close-up of a laser beam striking part of the CD-ROM track. Note that the laser beam is about 1.7 microns in diameter, considerably wider than the track itself (about 0.5 microns).

Figure 22.5 - Close-up of CD-ROM track (labels: pit, land; dimensions shown: 1.6 µm, 0.5 µm; laser beam, 1.7 µm)

When a light wave from the laser beam strikes the metal coating, it is reflected back, eventually to a photoelectric cell, where it produces a current. However, a light wave that strikes a pit has less distance to travel than one that strikes a land. (Remember that the laser beam hits the track from below; see Figure 22.2.) In fact, the depth of a pit is about one-quarter of a wavelength of laser light. Hence, a light wave that strikes a pit travels one-half of a wavelength less in distance. Figure 22.6 illustrates two light waves, one striking a pit and the other striking a land.

Figure 22.6 - Destructive interference of light waves

As you can see, one wave is shifted one-half wavelength with respect to the other. Accordingly, because these waves have the same frequency, they will cancel each other out! This is called destructive interference. Now, looking at Figure 22.5, we can see that, as the disk spins, the proportion of light that strikes a pit varies. Thus, the proportion of light that undergoes destructive interference varies, from a minimum of about 10% up to a maximum of about 75%. Put another way, the amount of light that eventually reaches the photoelectric cell varies between 10 and 75% of the light that is initially emitted from the laser. This produces a varying current in the photoelectric cell. This current variation allows the CD-ROM drive circuitry to determine the beginning and ending of each pit; that is, to read the disk. The information can then be translated into digital information. (The binary data is encoded on the disk in a rather complicated way, involving sophisticated error correction and other matters. It is not correct to say that a land is a 1 and a pit is a 0.)

It should also be clear that the destructive interference effect relies upon the fact that all of the light coming from the laser diode has the same wavelength and the same phase. In fact, this is precisely why a laser beam is used!
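The cancellation can be checked numerically. The sketch below adds two sine waves of equal amplitude and frequency, one delayed by a given fraction of a wavelength, and reports the largest value the sum ever reaches. It is an idealized model (perfectly equal amplitudes, a single frequency), and the function and constant names are illustrative.

```python
import math


def peak_of_sum(path_difference_wavelengths, samples=360):
    """Largest |amplitude| of two unit waves offset by the given path difference."""
    phase_shift = 2 * math.pi * path_difference_wavelengths
    return max(
        abs(math.sin(t) + math.sin(t + phase_shift))
        for t in (2 * math.pi * k / samples for k in range(samples))
    )


PIT_DEPTH_WAVELENGTHS = 0.25                         # one-quarter wavelength
round_trip_difference = 2 * PIT_DEPTH_WAVELENGTHS    # down and back: one-half

print(peak_of_sum(0.0))                     # in phase: the waves reinforce
print(peak_of_sum(round_trip_difference))   # half a wavelength apart: they cancel
```

Because the light travels to the surface and back, a quarter-wavelength step in depth produces the half-wavelength shift needed for cancellation.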


CD-ROM Characteristics

Disk Capacity Originally, CD-ROMs were used strictly for storing audio data, in which case it made sense to describe the capacity in terms of playing times. There are two standard playing times for a CD: 60 minutes and 74 minutes. The latter time is obtained by using the outer 5 mm of the CD track, where it is more difficult to control defects arising in the manufacturing process. (The 74-minute format was decided upon jointly by the developers of the CD standard, Sony Corporation and Philips Corporation, in order to hold the complete Ninth Symphony of Ludwig van Beethoven.) In any case, there are four possible capacities, depending upon whether or not the full 74-minute capacity is used and whether or not error correction is employed. A 60-minute format allows 270,000 CD-ROM sectors (not the same as hard disk sectors) and a 74-minute format allows 333,000 sectors. The use of error-correcting code allows 2048 bytes of data per sector, as opposed to 2336 bytes of data when no error correction is used. Arithmetic now shows that, for example, a 60-minute error-correcting CD holds

270,000 × 2048 = 552,960,000 bytes ≈ 527 MB

of data and a 74-minute error-correcting CD holds

333,000 × 2048 = 681,984,000 bytes ≈ 650 MB

of data.

Data Transfer Rates Early CD-ROM drives had a data transfer rate, also called throughput, of 150 KB/sec, which is very slow compared to hard disk drives (with throughputs in the megabytes per second range), but about 2.4 times as fast as the 62.5 KB/sec throughput for floppy drives. Such drives are called single-speed drives. To achieve higher data transfer rates, manufacturers built their drives to spin faster. Double-speed drives have throughputs of 300 KB/sec. There followed triple-speed drives (450 KB/sec throughput), quad-speed drives (600 KB/sec throughput), six-speed drives (900 KB/sec throughput), eight-speed drives (1200 KB/sec throughput), ten-speed (1500 KB/sec throughput) and twelve-speed (1800 KB/sec throughput) drives.
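The capacity and throughput figures above are all simple products, which the following sketch reproduces. The sector counts, bytes per sector and the 150 KB/sec single-speed rate are taken from the text; the names are illustrative, and a megabyte is taken here as 1,048,576 bytes, which is what the 527 MB and 650 MB figures imply.

```python
SECTORS = {60: 270_000, 74: 333_000}            # sectors per playing time (minutes)
BYTES_PER_SECTOR = {True: 2048, False: 2336}    # keyed by "error correction used?"
SINGLE_SPEED_KB_PER_SEC = 150


def capacity_bytes(minutes, error_correction):
    """Data capacity of a disk in bytes for one of the four formats."""
    return SECTORS[minutes] * BYTES_PER_SECTOR[error_correction]


def throughput_kb_per_sec(speed_multiple):
    """Throughput of an n-speed drive, as a multiple of the single-speed rate."""
    return SINGLE_SPEED_KB_PER_SEC * speed_multiple


for minutes in (60, 74):
    for error_correction in (True, False):
        size = capacity_bytes(minutes, error_correction)
        print(minutes, error_correction, size, round(size / 2**20), "MB")

print(throughput_kb_per_sec(12), "KB/sec")
```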


Access Time Unfortunately, methods used to determine access times can vary, making comparisons difficult. One of the complications involves the fact that, unlike with hard disks, the access time should include not only the time it takes for the read mechanism to reach a given point on the data track, but also the time it takes for the disk to reach the correct spin rate for that portion of the track. The so-called 1/3-stroke method proceeds as follows. Starting from random points on the disk, the read mechanism is moved a distance equal to 1/3 the radius of the disk. The time it takes to move the mechanism is added to the time it takes for the disk to reach the correct operating speed. This experiment is repeated many times and an average is taken. This average is the access time. Access times have steadily decreased since the advent of CD-ROM drives, from about 450 ms in the early days to about 150 ms today. More recently, some manufacturers have begun to use a more-or-less random point-to-point move, rather than a 1/3-radius length move. Of course, access times computed in this way cannot be directly compared to the 1/3-stroke method access times. Note also that a CD-ROM's seek time does not include the time it takes for the drive to reach operating speed, and is thus less than its access time.

CD-ROM Drive Buffers Most CD-ROM drives have built-in data buffers (data caches). The size of the buffer varies from about 64 KB up to about 1 MB. Of course, the size of the buffer is not the only factor in determining cache efficiency (hit rate). The other factor is the algorithm used in implementing the cache. We discuss cache algorithms in an appendix.

CD-ROM Caddies A CD-ROM caddie is a small plastic case that holds a single CD-ROM. It has a sliding metal door (similar to those on a 3.5 inch floppy diskette) that allows access to the CD-ROM.

CD-ROM drives divide into two categories: those that require a CD-ROM to be placed into a caddie and those that use a tray to hold the bare CD-ROM. There are rather obvious advantages to both systems. On the one hand, caddies protect the CD-ROM from direct handling, which is important, since scratches can render a CD-ROM useless. On the other hand, disk caddies cost money, increasing the cost of operation of the CD-ROM. Also, if you do not choose to purchase a caddie for each CD-ROM (an expensive proposition), you will still need to handle the disks when shuffling them in and out of the caddies (which, incidentally, is a two-handed operation). We also note that some tray drives can only operate in a horizontal position, whereas others have special fingers that allow the CD-ROM to spin in a vertical orientation. All in all, the choice between the two systems is one of personal taste.

CD-R Drives It is clear from the previous description that the data on a conventional CD-ROM is permanently etched into the disk, and cannot be erased. Thus, ordinary CD-ROM drives are read-only devices. On the other hand, recent developments have led to drives that can write data to a blank CD-ROM. A CD-R (short for CD-Recordable) disk uses a layer of gold covered by a layer of dye, in place of the reflective aluminum layer of an ordinary read-only CD-ROM disk. The dye is translucent, allowing the gold to reflect the laser beam, just as a CD-ROM land would do. To write to a CD-R disk, a laser beam burns a spot in the dye layer, thereby reducing the reflectivity at that spot. Thus, when reading, the laser beam is not reflected back to the photoelectric cell when it strikes a burnt spot. This simulates the effect of destructive interference. Of course, this type of writing is permanent, so a disk can only be written to once. Because of this, it is referred to as Write Once Read Many, or WORM, technology. Sometime in the near future, we should begin to see an erasable CD-ROM technology, called CD-Erasable, or CD-E. The basic idea behind CD-E is that the reflectivity of the disk media can be changed, depending upon the temperature to which the media is heated. Thus, by heating a particular spot on the disk to the right temperature, that spot can be returned to its original state of reflectivity, thereby erasing any previous changes.

Magneto-Optical Drives Until CD-E becomes a reality, CD-ROM type technology will be confined to archiving or distributing files. On the other hand, magneto-optical drives (MO drives) use disks that can be written to repeatedly, just like floppy disks. Like a hard disk, an MO disk stores data magnetically. However, the MO disk medium differs from a hard disk medium in that it requires a much stronger magnetic field in order to change the direction of a magnetic domain on the disk. In other words, MO disks have a much higher magnetic coercivity than hard disks. One of the advantages of this higher coercivity is that an MO disk is very stable. Magnetic media (hard disks, floppy disks, MO disks and tapes) tend to slowly demagnetize on their own. An MO disk will deteriorate more slowly than a hard disk, for instance. Manufacturers' specifications indicate that the data life of an MO disk is on the order of 30 years, whereas floppy or removable hard disks may have data lives closer to 3 to 5 years. (Of course, the 30 year lifetime is just a theoretical specification, since MO drives have not even been in existence for 30 years.) The high coercivity of an MO disk makes it more resistant to stray magnetic fields as well.

Writing to an MO Disk As you might expect, there is a downside to the high coercivity of an MO disk. Namely, it takes a stronger magnetic field to change the direction of magnetism, that is, to record data. A stronger magnetic field requires a larger write head, which in turn means larger domains (recall from the chapter on hard disks that a domain is the smallest area that can be magnetically charged). This reduces the capacity of the disk. MO drives deal with this problem in a very clever way, using optical technology. In particular, writing to an MO cartridge is first done by heating a portion of the disk with a laser beam. When the disk medium reaches a temperature known as its Curie temperature, it loses its magnetic coercivity and can then be written to using a (smaller) magnetic write head. (The Curie temperature for MO disks is about 150 degrees Celsius.) A further advantage to this procedure is that the laser beam can heat a very small portion of the disk, so that even if the magnetic write head attempts to magnetize a larger portion of the disk, it will be successful only at the spot heated sufficiently by the laser beam. Thus, writing can be done using smaller domains than would be possible without the laser.
Nonetheless, because the write head must be farther from the disk (the write head assembly is larger and bulkier than that of a hard drive, and does not float above the disk as in a hard disk), the magnetic field for writing must be significantly stronger than in a hard disk. Hence, using traditional MO technology, the magnetic write head cannot change direction rapidly enough to write the data in one pass. Instead, the writing of data is a two-step process. First, all domains are aligned in one direction (thus erasing any previous data). Then, on a second pass, the write head can reorient the appropriate domains, to write the data. A new write technology, referred to as Direct Overwrite, or DOW, is just beginning to make its appearance in the PC market. DOW allows data to be written in a single pass. One approach to DOW is to modulate the intensity of a laser beam in order to erase and write at the same time. This is referred to as the Light Intensity Modulation Method, or LIMM.


Reading from an MO Disk Data is read from an MO disk using a laser beam (weaker than that used for writing), rather than a magnetic read head. This is a bit tricky, and requires that the laser beam be polarized. (This term means that all of the light waves are aligned in the same direction.) When the polarized light hits the magnetic domains on the disk, the direction of polarization is altered, according to the direction of the magnetic field of the domain. In this way, the differences in the magnetic orientation of the domains can be detected, and the data read.

MO Drive Characteristics As we discussed in a previous chapter, several companies are currently making MO drives. The smaller form factor 3.5 inch drives use MO disks with a capacity of 230-640 MB. Average access times run in the neighborhood of 30-40 ms. The read data transfer rates are in the range of 1-3 MB per second, whereas writing is roughly 3 to 5 times slower than reading. The larger 5.25 inch form factor drives can store considerably more data. One company is reportedly about to release a 5.25 inch MO drive with a 4.6-GB capacity.

Appendix

A1. Trying the Experiments

The experiments that are sprinkled throughout the book give you a chance to do a little poking around for yourself. You may find this to be an enlightening and even empowering experience. If you are a little hesitant about proceeding, be assured that all experiments are short and I will lead you through the steps one-by-one. All you really need to do is follow the instructions. Note that, even though these experiments involve delving into the inner workings of the PC, you cannot do any real harm (if you don't stray too far from the instructions). However, if you make a mistake, you may cause a system crash, which means that the PC will not respond to your keyboard input, and you will need to either hit the reset button or turn the power off. THEREFORE, BEFORE PROCEEDING WITH ANY EXPERIMENT, MAKE SURE YOU HAVE SAVED ALL OF YOUR WORK. AFTER A SYSTEM CRASH, YOU WILL LOSE ANY UNSAVED WORK. If you would like to try the experiments, you need to know a little bit about two simple programs that come with DOS or Windows. First, we need to review a few topics that are discussed in the book.

Hexadecimal Numbers All numbers must be entered in hexadecimal format, but with no trailing h. Since we will only deal with very small numbers, this shouldn't pose a problem.


Registers Inside the microprocessor, we find the registers, which are small "scratch pad" memory locations to hold the data and addresses that are used in calculations. We will have use for the registers shown in Figure A1.1.

Figure A1.1 - Some of the registers in a CPU (each short block is 8 bits long)

AX: AH | AL
BX: BH | BL
CX: CH | CL
DX: DH | DL

Segmented Addressing We will have some occasion to enter addresses. These must be entered in segment:offset format. If you have not read the section on segmented addressing in the appendix on real and protected modes, you may wish to do so now. If not, you can still follow the experiments, but you may not understand the addressing scheme.

Debug Debug is a very old, and very primitive, program that is capable of doing some very low-level things that no other program can do, which is why we will use it in most of the experiments. While debug is relatively easy to use, it is a prime example of an unfriendly program. By this we mean, for instance, that the commands are entered using just a single letter, there is no help option (to speak of) and that mistakes usually require retyping.


Our main use for debug is to (1) take a look at the contents of memory in various locations and (2) write some very short assembly language programs to manipulate some of the hardware. If you are doing the experiments at the DOS prompt (not in a DOS box within Windows), you should find a copy of DEBUG.EXE in your DOS directory. If you are using Windows 95, the correct version of DEBUG.EXE is located in the Windows 95\Command subdirectory. All the debug experiments have been tested in a Windows 95 DOS box, and seem to work just fine. Note, however, that PC systems do vary in configuration, and you are the only one who can test the experiments on your own system. After starting debug, you will get the hyphen (-) prompt, as shown below.

C:\WIN95>debug↵
-

(My copy of debug is in the directory called C:\WIN95. Yours may be in a different location.) Here are some debug commands: q, r, d, e, a, u, I and g. (See what I mean about debug being unfriendly?) Let us discuss the meaning of these commands, and give some examples. You can either read through this short appendix now, or else simply use it as a reference whenever you need it. Note that the notation ↵ is used to denote hitting the ENTER key.

The q command (Quit debug): To exit debug, type q at the debug prompt and then hit the ENTER key.

C:\WIN95>debug↵
-q↵
C:\WIN95>

The r command (Display register contents): Issuing the r command will cause debug to display the contents of the microprocessor's registers, as in the following example:

-r
AX=0F0B  BX=0000  CX=0107  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=116B  ES=116B  SS=116B  CS=116B  IP=0108   NV UP EI PL NZ NA PO NC

The first row contains the AX, BX, CX and DX registers, in which we are most interested. Remember that the AX register, for instance, is composed of two bytes. Thus, in the example above, the AH register contains 0Fh and the AL register contains 0Bh. (This is not little endian.)
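Splitting a 16-bit register value into its high and low bytes is a matter of a shift and a mask. The following sketch applies this to the AX value from the register display above; the function name is illustrative.

```python
def split_register(value16):
    """Return (high byte, low byte) of a 16-bit register value."""
    high = (value16 >> 8) & 0xFF   # upper 8 bits, e.g. AH
    low = value16 & 0xFF           # lower 8 bits, e.g. AL
    return high, low


ah, al = split_register(0x0F0B)    # AX=0F0B from the r command output
print(f"AH={ah:02X}h AL={al:02X}h")
```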

The d command (Dump memory): This command allows you to take a peek at the contents of memory. For instance, issuing the command

-d c000:0000↵

on my PC causes the following to appear.

-d c000:0000
C000:0000  55 AA 40 EB 7B 35 A6 04-00 06 40 1A 3E 43 A2 1A   U.@.{5....@.>C..
C000:0010  AC 1A A5 74 00 00 00 00-60 00 00 00 00 20 49 42   ...t....`.... IB
C000:0020  4D 20 43 4F 4D 50 41 54-49 42 4C 45 20 4D 41 54   M COMPATIBLE MAT
C000:0030  52 4F 58 20 56 43 2D 30-30 34 20 56 47 41 2F 45   ROX VC-004 VGA/E
C000:0040  47 41 20 42 49 4F 53 20-28 56 33 2E 35 62 29 2C   GA BIOS (V3.5b),
C000:0050  20 50 43 49 00 87 DB 87-DB 87 DB 87 DB 87 DB 90    PCI............
C000:0060  50 43 49 52 2B 10 10 0D-00 00 00 00 12 03 00 00   PCIR+...........
C000:0070  00 00 00 00 00 80 00 00-49 4D 50 2B 00 FF FF FF   ........IMP+....

The lefthand column contains memory addresses in segment:offset form, the middle group contains the hex values in those addresses and the righthand group contains the ASCII equivalents of the hex values, provided those ASCII equivalents correspond to ordinary characters (otherwise, there is a period). This is useful for poking around in memory, looking for such things as copyright messages! (What you do with these copyright messages is your own business.) A little practice reading bytes might be useful at this time. For example, to find the byte at address C000:003D, we count from left to right in the row headed by C000:0030:

0,1,2,3,4,5,6,7,8,9,A,B,C,D

Thus, the byte in question is the third byte from the right end (value 41h). Note that debug does thoughtfully provide a small hyphen half-way through the row. The byte immediately following the hyphen is byte number 8.
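The counting procedure just described can be sketched in a few lines: take the row's starting offset, subtract it from the offset you want, and use the result as an index into the sixteen hex values. The function names are illustrative. The second function uses the standard real-mode rule that the physical address is the segment times 16 plus the offset, which is covered in the appendix on real and protected modes rather than here.

```python
def byte_at(dump_row, offset):
    """Pick one byte out of a single row of debug's d command output."""
    address, rest = dump_row.split(None, 1)
    row_offset = int(address.split(":")[1], 16)
    # The first 16 tokens are the hex bytes; the hyphen is only a separator.
    hex_bytes = rest.replace("-", " ").split()[:16]
    return int(hex_bytes[offset - row_offset], 16)


def physical_address(segment, offset):
    """Real-mode physical address for a segment:offset pair."""
    return (segment << 4) + offset


row = "C000:0030  52 4F 58 20 56 43 2D 30-30 34 20 56 47 41 2F 45"
print(f"{byte_at(row, 0x3D):02X}h")
print(f"{physical_address(0xC000, 0x003D):05X}h")
```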

The e command (Enter data into memory): This command is used to enter data into memory. For example, to enter five E's in memory starting at address 10B0:0100, we would write

    -e 10b0:0100 45,45,45,45,45

Note that the ASCII value of the character E is 69, which equals 45h. Debug expects the hex number 45, without the trailing h.

The a command (Enter assembly language instructions): The a command is used to enter assembly language instructions (called assembling a program). For instance, here we enter four assembly language instructions, starting at offset 100.

    -a 100↵
    116B:0100 mov ah,1↵
    116B:0102 mov ch,1↵
    116B:0104 mov cl,7↵
    116B:0106 int 10↵
    116B:0108 ↵

The first line (a 100) starts the assembly procedure at (offset) address 100. Debug thoughtfully displays the addresses (such as 116B:0100 above) as we enter each instruction. Hitting the ENTER key on a line without entering an instruction stops the process.

The u command (Unassemble): Unassembly refers to the process of turning machine language back into assembly language. For instance, after entering the four instructions above, entering

    -u 116B:0100↵

will produce the result

    116B:0100 B401    MOV AH,01
    116B:0102 B501    MOV CH,01
    116B:0104 B107    MOV CL,07
    116B:0106 CD10    INT 10

(with some additional lines as well). Note that we are getting back the instructions that we entered in the previous example. After the address on each line comes the actual machine language version of the instruction (in hex rather than binary), followed by the assembly language version.

Note that this command is quite powerful since, for instance, debug can unassemble any program in memory, including the system BIOS, or any commercial program. However, it takes great experience to be able to make sense out of unassembled code. There are no comments to help piece together the context of the individual instructions. Nevertheless, it is a useful tool for checking that we have entered the correct instructions through the "a" command.
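To take some of the mystery out of unassembly, here is a toy Python sketch of the idea. It is in no way how debug itself is written, and it understands only the two instruction forms that appear in our example: opcodes B0h through B7h (move an immediate byte into an 8-bit register) and opcode CDh (software interrupt). The names REG8 and unassemble are our own.

```python
# 8-bit registers in the order selected by opcodes B0h..B7h.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def unassemble(code, offset=0x100):
    """Turn two-byte instructions back into text, one line per instruction."""
    lines = []
    i = 0
    while i + 1 < len(code):
        opcode, operand = code[i], code[i + 1]
        if 0xB0 <= opcode <= 0xB7:
            text = f"MOV {REG8[opcode - 0xB0]},{operand:02X}"
        elif opcode == 0xCD:
            text = f"INT {operand:02X}"
        else:
            raise ValueError(f"opcode {opcode:02X} is outside this sketch")
        lines.append(f"{offset + i:04X} {opcode:02X}{operand:02X} {text}")
        i += 2
    return lines


# The eight bytes that debug stored for our four instructions.
program = bytes.fromhex("B401B501B107CD10")
for line in unassemble(program):
    print(line)
```

Running the sketch prints the same four instructions that the u command displayed, which is a useful reminder that the machine language bytes carry all of the information in the assembly language text.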

The l command (Load a file or disk sector): The load command (this is a lower case "el") has the following format, and is used to load disk sectors into memory:

    l [address] [drive] [sector] [number]

where

    [address] = memory address to put data
    [drive]   = drive number (0 = A drive, 1 = B drive, 2 = C drive, etc.)
    [sector]  = sector number
    [number]  = number of sectors to load

The g command (Go - execute instructions): After entering the instructions described under the "a" command, if we issue the "g" command exactly as follows

    -g=116B:0100 116B:0108↵

debug will execute the four instructions we entered. When done, it will display the microprocessor's registers. Debug also displays the next memory location after our four instructions, thinking it is another instruction, but it is probably just garbage. Here is the display for our example.

    AX=0F0B  BX=0000  CX=0107  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
    DS=116B  ES=116B  SS=116B  CS=116B  IP=0108   NV UP EI PL NZ NA PO NC
    116B:0108 67            DB      67

QBASIC

We will do a few experiments with Microsoft QuickBasic (QBASIC.EXE), which is a much higher level program than debug (although programmers still find it very primitive). If you have DOS 6 (or earlier) on your PC, then you will have a copy of QBASIC.EXE in your DOS directory. However, if you bought your PC with Windows 95 installed, then you will not have a copy of QuickBasic. Unfortunately, Microsoft decided to remove it from the version of DOS that comes with Windows 95, which is too bad, since it is occasionally still useful. QuickBasic is reasonably friendly, with menus and a modest form of help, so we don't need to discuss the program in this appendix.

A2. Cache Designs

In this appendix, we describe in detail the three popular cache designs. I will assume that you have read the discussion of caches in the chapter on microprocessors.

Direct-Mapped Caches

Figure A2.1 shows the idea behind a direct-mapped cache. In this design, main memory is divided into blocks, each containing a fixed number of bytes (4 bytes in the figure). The cache is divided into lines, each of which holds the same number of bytes as a block of memory (also 4 bytes in the figure). Thus, a memory block fits perfectly into a cache line. We have shaded the blocks and cache lines using 4 distinct shades to assist in the discussion.

Each cache line is allowed to hold a copy of only certain memory blocks. In the figure, cache line 0, shown in white, can hold any of the memory blocks that are also in white. More generally, each cache line of a given shade can hold only those blocks of the same shade. Associated with each cache line is a tag, whose purpose is to identify the block that is currently residing in that cache line.

Now, if the microprocessor requests a specific byte of memory, the cache controller can determine, using the address of that byte, which block of memory contains the byte, and hence the block's shading (in Figure A2.1). In other words, it can determine which cache line is permitted to hold that block. It then checks the tag for that line to see if the correct block is currently in that cache line. If so (a cache hit), it can return the data to the microprocessor. If not (a cache miss), then the current contents of the cache line are replaced by the block that does contain the requested byte of data.

Figure A2.1 - A direct-mapped cache design (memory blocks 0 through 11 and cache lines 0 through 3, shaded to show which blocks each line may hold)

Thus, a direct-mapped cache is really just several (in this case 4) caches working independently on distinct portions of memory. The difficulty with a direct-mapped cache design can be illustrated as follows. If, for some reason, the microprocessor makes a large number of requests mostly from the white blocks in memory, then the cache controller will need to make a large number of replacements in the white cache line, whereas the other lines in the cache will go relatively unused. This imbalance is not an efficient use of cache memory.
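The hit and miss logic of a direct-mapped cache can be sketched in a few lines of Python. This is only an illustration, not a description of any real cache controller. We assume that a block's "shade" is its block number modulo the number of cache lines (so blocks 0, 4 and 8 share line 0), which is the usual arrangement but is our assumption about the figure, and for readability we store the whole block number as the tag.

```python
BLOCK_SIZE = 4   # bytes per memory block, as in Figure A2.1
NUM_LINES = 4    # cache lines, one per shade


class DirectMappedCache:
    def __init__(self):
        # tags[line] is the number of the block now held in that line (None = empty).
        self.tags = [None] * NUM_LINES

    def access(self, address):
        block = address // BLOCK_SIZE     # which memory block holds this byte
        line = block % NUM_LINES          # assumed shade: the only line allowed to hold it
        if self.tags[line] == block:
            return "hit"
        self.tags[line] = block           # replace whatever the line held before
        return "miss"


cache = DirectMappedCache()
results = [cache.access(address) for address in (0, 1, 16, 0)]
print(results)   # ['miss', 'hit', 'miss', 'miss']
```

Addresses 0 and 16 lie in blocks 0 and 4, which compete for the same cache line, so alternating between them produces a miss every time even though the other three lines sit empty. This is exactly the imbalance described above.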

Associative Caches

We may avoid the aforementioned problem with direct-mapped caches by simply allowing any block of memory to be placed in any cache line. Such caches are referred to as fully associative caches. The tags in a fully associative cache hold the number of the block that is currently contained in the corresponding cache line. The difficulty with this approach is that, every time a memory request is made, the cache controller must check each line of the cache to see if the correct memory block is available in the cache.

Moreover, additional problems are created, such as which cache line to replace when a new memory block must be brought into a full cache. There are four common strategies in use for this decision.

• Remove the least recently used (LRU) data.

• Remove the least frequently used (LFU) data.

• First-in-first-out (FIFO), which essentially means that the memory block that has been in the cache (used or unused) the longest gets removed first.

• Random choice.

The most effective technique is probably LRU, but even random selection seems not to lag too far behind in performance.
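As an illustration of the LRU strategy, here is a minimal Python sketch of a fully associative cache. The class name and the use of an ordered dictionary are our own choices for the sketch. A hardware cache would compare all tags at once rather than search a dictionary.

```python
from collections import OrderedDict


class FullyAssociativeLRU:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        # Keys are block numbers; the least recently used block comes first.
        self.lines = OrderedDict()

    def access(self, block):
        if block in self.lines:
            self.lines.move_to_end(block)      # mark as most recently used
            return "hit"
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)     # evict the least recently used block
        self.lines[block] = True
        return "miss"


cache = FullyAssociativeLRU(num_lines=2)
results = [cache.access(block) for block in (0, 4, 0, 8, 4)]
print(results)   # ['miss', 'miss', 'hit', 'miss', 'miss']
```

When block 8 arrives, the cache is full and block 4 is evicted, because block 0 was used more recently. Swapping the eviction line for a random choice or a first-in-first-out queue would give the other strategies in the list above.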

Set Associative Caches

A compromise can be reached between these two extremes, as shown in Figure A2.2.

Figure A2.2 - A 2-way set associative cache (memory blocks 0 through 11; Way 1 and Way 2, each with tags and cache lines 0 through 3)

In an n-way set associative cache, the cache is split into n equal-sized pieces, called ways. Figure A2.2 illustrates a 2-way set associative cache. In such a design, each white memory block can be placed in the white cache lines, in any of the ways. When a memory access is requested, the cache controller uses the address of the requested data to determine which block contains the data and also the shade of that block. It then looks, in each way, at the cache lines of that shade. If it finds a tag that corresponds to the correct block, a cache hit has occurred. If not, we have a cache miss.

The set associative cache design is a compromise between the two previous designs, in the sense that a given memory block has more than one allowable cache line and yet the cache controller needs only to check the cache lines of a certain shade, rather than all cache lines. Performance of a set associative cache varies with the number of ways. The four-way set associative cache seems to be the most widely used in the PC world at present.
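The compromise can be seen in a short Python sketch of a 2-way set associative cache. As before, we assume that a block's shade is its block number modulo the number of lines per way, and we assume LRU replacement among the lines of one shade. The text above does not say which replacement rule a given cache uses, so that part is purely illustrative.

```python
NUM_LINES = 4   # cache lines per way (one per shade)
NUM_WAYS = 2    # a 2-way set associative cache, as in Figure A2.2


class SetAssociativeCache:
    def __init__(self):
        # sets[shade] lists the blocks held in the lines of that shade,
        # across all ways, least recently used first.
        self.sets = [[] for _ in range(NUM_LINES)]

    def access(self, block):
        held = self.sets[block % NUM_LINES]   # only lines of this shade are checked
        if block in held:
            held.remove(block)
            held.append(block)                # now the most recently used
            return "hit"
        if len(held) == NUM_WAYS:
            held.pop(0)                       # assumed policy: evict the LRU block
        held.append(block)
        return "miss"


cache = SetAssociativeCache()
results = [cache.access(block) for block in (0, 4, 0, 4, 8, 0)]
print(results)   # ['miss', 'miss', 'hit', 'hit', 'miss', 'miss']
```

Blocks 0 and 4 have the same shade, yet they can now live in the cache together, one in each way. Only when a third block of that shade (block 8) arrives does one of them have to go.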

A3. SIMM Chip Counts

As promised in the chapter on memory, we now discuss the details of how the chips on a SIMM can be organized. Memory SIMMs are packaged in a variety of ways. The individual bit-memory units inside a memory chip can be thought of as arranged in a row/column format, as shown in Figure A3.1. This figure also illustrates some of the important numbers associated with a SIMM.

Figure A3.1 (a 72-pin SIMM with chip count = 8, chip width = 4 bits and SIMM data width = 32 bits)

Data Chip Count: The number of data chips on a SIMM, not including a parity check chip (if present) or any other support chips. Popular SIMMs have a data chip count of 2 or 8 (with one additional chip if the SIMM supports parity checking).

Data Chip Width: This is the number of bits that are retrieved with each access of the data chip. You can think of this as the number of data wires leading from the chip. Some chips are 1-bit wide, which means that a single bit is retrieved per access. Other chip widths are common, as we will see.

Data Chip Capacity: This is the number of bits that a data chip can hold. It is usually measured in megabits, or Mbits. (A megabit is $2^{20}$ = 1,048,576 bits.)

Data Chip Depth: This is the chip capacity divided by its width.

SIMM Data Width: This is the number of bits that are retrieved from the SIMM in a single data access. Since all data chips on a given SIMM must have the same chip width, we have the following simple relationship

    (1)  SIMM Data Width = Data Chip Count x Data Chip Width

Note that the SIMM data width must match the data bus width of the computer's memory (local) bus, since data is requested in this width by the CPU. This is why the common data widths are 8 bits (for older PCs and peripheral devices) and 32 bits (for newer PCs). Note that a modern Pentium PC has a 64-bit wide local data bus. For this reason, SIMMs must be installed in pairs in these PCs. (Some older Pentiums have 32-bit data buses, however.)

SIMM Pin Count: This is the number of pin connectors on the bottom of the SIMM. These are the connectors that connect to the bus. Since a SIMM needs more than just data pins, the pin count is larger than the data width. In fact, SIMMs with an 8-bit data width have 30 pins and SIMMs with a 32-bit data width have 72 pins.

SIMM Capacity: This is the total number of bytes that the SIMM can hold. Note that

    (2)  SIMM capacity in bits = SIMM Data Width x Chip depth

SIMM capacity is usually quoted in megabytes, not in megabits, so some care must be taken when using this formula. The two relationships given above can be used to determine possible SIMM organizations. Here are some examples.

Example 1

Suppose we want to buy 8 MB of SIMMs for a 486-based PC. The data bus is 32 bits wide, so we need 72-pin SIMMs. The SIMM width is thus 32 bits and the SIMM capacity is 8 MB. Equation (2) gives (since 8 MB = 64 Mbits)

    64 Mbits = 32 bits x Chip depth

or, since 64 Mbits = $2^6 \times 2^{20}$ bits = $2^{26}$ bits, and 32 = $2^5$, we get

    Chip depth = $2^{26} / 2^5 = 2^{21}$ = 2 Meg

(Note that chip depth is not measured in bits or bytes. It is the number of rows in the chip and therefore does not have a unit of measurement. The chip depth above is 2 Meg = $2 \times 2^{20}$ = 2,097,152, which is the number of rows in the chip.)

In advertisements for computer memory, a SIMM is often denoted by an expression of the form

    (Chip depth x SIMM Data width) - (Pin Count)

Hence, this type of SIMM is a 2 x 32 - 72 SIMM. This makes it easy to compute the SIMM's capacity: 2 x 32 Mbits = 64 Mbits, or 2 x 4 = 8 MB.

If the computer requires parity checking, then an additional chip will be required. This chip will need to accommodate each of the 4 bytes in the SIMM's data width and so an additional 4 bits are added to the SIMM width, for parity checking purposes. This is usually denoted in the designation by writing 2 x 36 - 72.

To determine the possible chip counts, we use relationship (1), which gives

    32 = Chip Count x Chip Width

This allows for some possibilities. For instance, the SIMM could be made with 2 data chips, each 16 bits wide, in which case each chip would have a capacity of

    Chip capacity = Chip width x Chip depth = 16 bits x $2^{21}$ = $2^{25}$ bits = 32 Mbits

On the other hand, we could also make a SIMM with 8 chips, each having width 4 bits. In this case, the chip capacity is

    Chip capacity = Chip width x Chip depth = 4 bits x $2^{21}$ = $2^{23}$ bits = 8 Mbits

Of course, if the SIMM has a parity check chip, then the chip counts are 3 and 9, respectively.

Example 2

Suppose we want to add 32 MB to a Pentium with a 64-bit data bus. Since we must add 72-pin SIMMs in pairs to accommodate the 64-bit wide bus, each SIMM has a capacity of 16 MB. Equation (2) gives

    128 Mbits = 32 bits x Chip depth

or

    Chip depth = $2^{27} / 2^5 = 2^{22}$ = 4 Meg

Thus, we need two 4 x 32 - 72 SIMMs (no parity case) or two 4 x 36 - 72 SIMMs (with parity). The possible chip counts are the same as in the previous example.
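The arithmetic in the two examples is easy to mechanize. The following Python sketch applies relationships (1) and (2) to a 72-pin SIMM (32-bit data width) and lists the chip organizations considered above. The function name and the choice of candidate chip widths (4 and 16 bits) are ours, taken from the examples rather than from any manufacturer's catalog.

```python
MEG = 2**20   # 1 Meg = 1,048,576


def simm_organizations(capacity_mb, simm_width=32, chip_widths=(4, 16)):
    """Return (chip depth in Meg, [(chip count, chip width, Mbits per chip), ...])."""
    capacity_bits = capacity_mb * 8 * MEG
    chip_depth = capacity_bits // simm_width           # relationship (2)
    options = []
    for chip_width in chip_widths:
        chip_count = simm_width // chip_width           # relationship (1)
        chip_capacity_mbits = chip_width * chip_depth // MEG
        options.append((chip_count, chip_width, chip_capacity_mbits))
    return chip_depth // MEG, options


for capacity in (8, 16):
    depth, options = simm_organizations(capacity)
    print(f"{capacity} MB SIMM: {depth} x 32 - 72")
    for count, width, chip_mbits in options:
        print(f"  {count} chips, each {width} bits wide, {chip_mbits} Mbits per chip")
```

For the 8 MB SIMM of Example 1, the sketch reports a depth of 2 Meg with either 8 chips of 8 Mbits or 2 chips of 32 Mbits. For the 16 MB SIMMs of Example 2, it reports a depth of 4 Meg.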

A4. How Memory Works

The story of how a memory chip actually works is a fascinating one. Figure A4.1 shows a somewhat simplified internal view of a DRAM chip. DRAM chips are built around two very important devices: a capacitor, which is a device that holds an electric charge, and a transistor, which is a switch. We have denoted a capacitor in Figure A4.1 by a large C with a circle around it, and a transistor by a small light switch.

The main portion of a DRAM chip is a rectangular array of individual memory units (shown as gray squares), each consisting of one capacitor and one transistor. A memory unit stores a single bit, determined by whether the capacitor is charged (indicating a 1) or empty (indicating a 0).

The array of memory units is arranged in rows and columns, as shown in Figure A4.1. For each row of memory units there is a line ("wire") called a word line. The word lines are labeled WL1, WL2, ... in the figure. For each column of the array, there is a pair of lines, called bit lines, which are labeled BL1a, BL1b; BL2a, BL2b, and so on. Note that, in each memory unit, the capacitor is connected, through the transistor, to the word line and to exactly one of the bit lines (the "b" lines in the figure).

Figure A4.1 - A DRAM chip (showing the precharge unit, the array of memory units in rows 1 to n and columns 1 to n, the word lines WL1 to WLn and the bit line pairs BL1a/BL1b to BLna/BLnb)

Let us assume that our chip has a capacity of 4 megabits. For such a chip, one possible arrangement of memory cells is in an array with 2048 (= $2^{11}$) rows and 2048 (= $2^{11}$) columns. This provides a total of $2^{11} \cdot 2^{11} = 2^{22}$ bits = 4 Mbits, as required. (The dimensions of memory unit arrays vary among chips, and need not be square.)

The purpose of the address buffer is to receive the address of a memory unit. The address buffer for the chip is 11 bits wide (rather than 22 bits wide) because addresses are received by the chip in two parts: first the address of the row containing the memory unit and then the address of the column. The two addresses, of course, uniquely specify the memory unit. This row/column technique implies that the chip needs only 11 address pins, rather than 22, and can therefore be smaller and cheaper.

All memory read and write operations are initiated by a memory controller that is located on the same bus as the memory. The controller intercepts requests from the CPU and determines which memory units on which chips should be read from, or written to.

Let us examine how a memory read operation is performed. The memory controller receives a read request from the CPU, along with a memory address, on the address bus. The memory controller determines which chips hold the requested data and identifies which memory units on these chips need accessing. Assume for the moment that each chip has a width of one bit, hence only one memory unit on each chip needs to be read. We can now concentrate on a single chip.

Here is the step-by-step process for reading a data bit from a memory unit. The precharge unit first applies an equal voltage to each of the bit lines in the entire array. In particular, the bit lines get exactly half the rated voltage of the chip. Thus, in a 3.3 volt chip, each bit line gets charged to 1.65 volts. Now we are ready for the read process.

1. (Compute row and column addresses) The memory controller computes the row and column addresses of the memory unit to be read. In our example, each of these is an 11-bit string, which is why the address buffer is 11 bits wide.

2. (Signal the sending of the row address to the chip and send the address) The memory controller activates the signal called RAS (see Figure A4.1) to inform the chip that it is about to receive the row address of the desired memory unit. (RAS stands for row address strobe.) It then sends that row address to the address buffer, which passes it on to the row decoder.

3. (Activate the appropriate row and thus alter the bit line pairs) The row decoder determines which row has this row address and activates the word line (WL) for that row. Let us assume, for the sake of discussion, that the activated line is WL2. This causes each of the switches (transistors) in row 2 to close, which then allows current to flow from the capacitors in row 2 to the bit lines to which they are connected (if the capacitor is charged), or from those bit lines to the capacitors (if the capacitor is discharged).

4. (Interpret the bit line pairs' voltage differences) Consider any column, say column 1. If the capacitor in column 1 is discharged (indicating a 0), then the current will flow from bit line BL1b to the capacitor, reducing the voltage in BL1b to a level lower than that of its partner BL1a. On the other hand, if the capacitor is charged (indicating a 1), then current will flow from the capacitor to BL1b, raising the voltage to a level higher than that of its partner BL1a. Thus, the bit value is indicated by which bit line (BL1a or BL1b) has the higher voltage!

5. (Amplify the voltage difference) The actual voltage difference in the bit lines of a column is very slight, so a special unit called the amplifier amplifies this difference. It is important to note that the entire row containing the memory unit in question is read into the amplifier-I/O gate unit (and the signal is amplified).

6. (Signal the sending of the column address to the chip and send the address) The memory controller now activates the CAS (column address strobe) line, telling the chip to expect next the column address of the memory unit in question. The controller then sends the column address. It is accepted by the address buffer and placed in the column decoder, which determines the correct column to retrieve from the amplifier-I/O gate unit (which contains the entire memory row). This value is then output through the data output buffer.

If a memory chip has width greater than one, then each access of the chip must produce more than one bit as output. This is accomplished by stacking memory unit arrays, as shown in Figure A4.2. Each row/column address selects a memory unit from each array, thus selecting as many bits as there are arrays in the chip.
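The two-part addressing in the steps above can be modeled with a small Python sketch. It ignores all of the electrical detail (precharging, amplification) and simply shows a 22-bit memory unit address being split into an 11-bit row address and an 11-bit column address, with the whole row latched on RAS and a single bit selected on CAS. We assume that the upper 11 bits form the row address and the lower 11 bits the column address. The text does not say how a particular memory controller divides the address, and the class and method names are ours.

```python
ROW_BITS = 11
COLUMN_BITS = 11


def split_address(unit_address):
    """Split a 22-bit memory unit address into (row, column)."""
    # Assumption: upper 11 bits select the row, lower 11 bits the column.
    row = unit_address >> COLUMN_BITS
    column = unit_address & ((1 << COLUMN_BITS) - 1)
    return row, column


class DramChip:
    """A 4 Mbit, 1-bit wide chip modeled as 2048 rows of 2048 bits."""

    def __init__(self):
        self.array = [bytearray(1 << COLUMN_BITS) for _ in range(1 << ROW_BITS)]
        self.row_latch = None

    def ras(self, row):
        # The entire row is read into the amplifier-I/O gate unit.
        self.row_latch = bytearray(self.array[row])

    def cas(self, column):
        # The column decoder picks one bit out of the latched row.
        return self.row_latch[column]


chip = DramChip()
chip.array[5][7] = 1                                # store a 1 at row 5, column 7
row, column = split_address((5 << COLUMN_BITS) | 7)
chip.ras(row)
bit = chip.cas(column)
print(row, column, bit)   # 5 7 1
```

Notice that the same 11-bit quantity is handed to the chip twice, once with RAS and once with CAS, which is why only 11 address pins are needed.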

Figure A4.2 - A chip width of 4

More on DRAM Refresh

Recall that DRAM chips require refreshing, because the capacitors "leak" their charge. Fortunately, the read process actually provides a perfect opportunity to refresh the row that contains the memory cell to be read. After the data bit is output, the entire row, which has been amplified in the amplifier-I/O gate, can be read back to its original location, thereby refreshing the row. Thus, every time a bit is read from a row, the entire row gets refreshed! Despite this, a more regular method is required to ensure constant refreshing. The simplest approach is to include circuitry to periodically read every row in every chip. Moreover, this can be done without activating the CAS line, thus skipping the actual selection of a column and thereby saving time.

Memory Cycle Time

We have discussed the access time of a memory chip in the chapter on memory. This is the time between applying the row address to the address buffer and the time the data is actually output. Although this time is commonly quoted in advertisements for memory, it is not the whole story when it comes to memory performance. There is considerable time involved in the precharge process, activating the RAS and CAS lines, decoding the row and column addresses, refreshing the row and then discharging the RAS and CAS lines. Thus, the cycle time, which is the shortest time between consecutive memory reads, may be approximately twice the access time.

Improvements on Ordinary DRAM

Let us briefly discuss some improvements that have been made over ordinary DRAM.

Fast Page Mode (FPM)

One of the problems with the preceding process for reading data from memory is that, each time a bit is read, the entire row/column decoding process takes place. However, it is often the case that successive data reads (or writes) take place from the same row of a chip. In the so-called fast page mode DRAM, or FPM DRAM, the RAS signal is not deactivated after each read. Then, if the memory controller sees that the next access is in the same row, it can bypass the row address, and simply send the column address to the chip. By quickly deactivating and reactivating the CAS line, the memory controller tells the chip to expect a column address and to use the same row address as before. This saves inputting and decoding the row address, as well as the time required to precharge the column bit lines. This cuts the access time for subsequent same-row reads to about 50% and cycle times to about 70% of normal. Of course, it does not apply to the first read in a given row.

Extended Data Out (EDO)

A newer approach to improving performance is used in extended data out RAM, or EDO RAM. This is a slight variation on fast page mode, and may improve memory performance significantly. In particular, the CAS line does not need to be deactivated and then reactivated for each column read from a given row. Instead, the new column address is enough to produce a new output from that row. This saves a bit more time in the read process. EDO RAM has become common in many high-end PCs, mainly because EDO chips can be manufactured with only very slight adjustments to non-EDO manufacturing procedures, and are thus very cost-effective.

Synchronous DRAM (SDRAM)

In general, the term synchronous refers to operations that take place at regular intervals, such as at the beginning of each tick of a clock.
Asynchronous operations are not paced by a regularly occurring event, such as a clock tick. (The terms have a somewhat different meaning when applied to communications, as discussed in the chapter on asynchronous and synchronous transmission.)

Ordinary DRAM operates asynchronously. When one event, such as the decoding of a row address, is completed, the next event (decoding the column address) begins, without waiting for a new clock cycle. In a synchronous DRAM, or SDRAM, all events are paced by a clock. Synchronous DRAM is generally faster than ordinary (asynchronous) DRAM.

It might seem at first that asynchronous operations should be faster than synchronous operations, since the events do not need to wait for the beginning of a new clock cycle. However, there are some other factors to consider. First, asynchronous operations may require more overhead, because there must be a way to tell when an event is finished, so the next event can start. Consider the following simple example, just to illustrate the point.

Suppose I want to send you two messages. I know (but you do not) that each message takes exactly 59 seconds to transmit. Suppose we both have access to a clock that ticks once every minute. Now, if I send you the messages synchronously (with the clock), then you will get both messages after 1 minute 59 seconds, with a 1 second delay after the first message. On the other hand, to send the messages asynchronously, one right after the other, I must include a signal that separates the two messages, so that you will know when the first message ends and the second one begins. But if that extra signal takes 3 seconds to transmit, the total transmission time for both messages (and the separator signal) is 59 + 3 + 59 seconds = 2 minutes and 1 second! Thus, in this case, synchronous transmission is faster.

Second, consider an asynchronous request of the CPU for data from memory. If the transmissions are asynchronous, the CPU cannot know exactly when the data will be returned, since it depends on how long memory takes to process the request. Hence, the CPU must wait for the returned data, or deal with an interrupt. On the other hand, with synchronous transmissions, the CPU can do other work in between clock ticks. (To draw another analogy, if I need to greet the postman the moment he or she arrives at the door, I can get much more other work done if I know that the postman comes exactly on the hour, and at no other time.)
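The two-message example is simple enough to check by hand, but a short Python sketch makes it easy to experiment with other numbers. The function names are our own, and the model is only the toy one from the example: each synchronous message starts on a clock tick, and each pair of asynchronous messages is separated by a fixed-length signal.

```python
import math


def synchronous_total(message_seconds, count, tick_seconds=60):
    """Seconds until the last message arrives when each starts on a clock tick."""
    ticks_per_message = math.ceil(message_seconds / tick_seconds)
    return (count - 1) * ticks_per_message * tick_seconds + message_seconds


def asynchronous_total(message_seconds, count, separator_seconds=3):
    """Seconds for back-to-back messages with a separator between each pair."""
    return count * message_seconds + (count - 1) * separator_seconds


sync_time = synchronous_total(59, 2)
async_time = asynchronous_total(59, 2)
print(sync_time, async_time)   # 119 121
```

Try a message length of 31 seconds instead of 59. The synchronous total becomes 91 seconds against 65 for the asynchronous case, which shows that the advantage depends on how well the work fits the clock.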
Burst Modes

In general, the term burst transfer, or just burst, refers to the situation where a single address is sent, but data from more than that one address is retrieved. In Burst EDO DRAM, which seems to be the next big wave in memory chip technology, not only is the requested bit from a given row placed in a 4-bit data buffer, but so are the 3 following bits (if there are 3 following bits in that row). From the data buffer, the bits can be output very rapidly, without the need for further row/column manipulations.

According to Micron Technology, a leading memory chip manufacturer, burst EDO DRAM doubles the performance of FPM DRAM and improves performance 40% over ordinary EDO memory. In fact, it is designed to eliminate microprocessor wait states in the 66 MHz local buses currently used in Pentium-based PCs. In other words, burst EDO memory does not place a performance drag on the Pentium processor, over and above the drag placed on it by the bus.

A5. Real and Protected Modes

There are two main aspects to the differences between real mode and protected mode. One is a physical difference in how memory is addressed. The other is a logical difference: protected mode is designed to protect one program from another, so that one program cannot access the memory reserved for another program. Let us begin our explanation with a brief look at how the 8088 processor addresses a memory location.

Segmented Addressing

In all microprocessors, addresses and data are stored in memory and brought into the microprocessor's registers for use. In a 16-bit microprocessor, such as the 8088, there is a bit of a problem, however. Namely, a 16-bit register can hold only $2^{16}$ = 65,536 distinct addresses, far short of the $2^{20}$ potential linear addresses for the 8088. The solution to the problem of how to express a linear memory address in the 8088 is to use two 16-bit registers for each linear address, in a manner referred to as segmented addressing, also known as real mode addressing.

The concept of segmented addressing is really quite simple. A segmented address consists of two parts, the segment (also called the paragraph) and the offset. Each of these is a 16-bit word, and will thus fit nicely into a 16-bit register. The formula for the linear address is simply

    linear address = (16 x segment value) + offset value

Segmented addresses are usually written in the form segment:offset, and are thus said to be in segment:offset format. As an example, the segmented address 5010:123 (in decimal) is the same as the linear address

    linear address = (16 x 5010) + 123 = 80283

Computing linear addresses is actually simpler when all numbers are expressed in hex. In hexadecimal notation, multiplying by $16_{10}$ is done simply by placing a 0 on the right end of the number, just as, in decimal notation, multiplying by 10 has this effect. Figure A5.1 illustrates the process of address resolution of a segmented address (1234:5678) into a linear address, when all numbers are expressed in hex notation.

Figure A5.1 - Real mode address resolution

It might help allay any confusion to note that segmented addressing is just a special kind of notation for memory addresses. A memory byte has only one address; it is a question of how to express that address using only 16-bit registers. Segmented addressing was invented precisely because the 8088's registers are only 16 bits long. Had they been 20 bits long, segmented addressing might never have been invented, and many a programmer wishes this had been the case! Segmented addressing allows the 8088 to "reach" its entire 1 MB (= $2^{20}$ bytes) of addresses using 16-bit registers.

The 8088, and (for compatibility reasons) all subsequent microprocessors, have 4 special registers, named the code segment register (CS), data segment register (DS), extra segment register (ES) and stack segment register (SS). The purpose of these registers is to hold the segment portions of an address. The offset portions of the address are placed in other registers.
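The real mode formula is a one-liner in most programming languages. Here is a small Python sketch (the function name is ours) that applies it to the decimal example 5010:123 and to the hex example 1234:5678. Shifting left by 4 bits is the binary way of multiplying by 16, that is, of placing a 0 on the right end of the hex number.

```python
def linear_address(segment, offset):
    """Real mode address resolution: (16 x segment) + offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset


print(linear_address(5010, 123))                 # 80283
print(f"{linear_address(0x1234, 0x5678):X}")     # 179B8
```

In hex, the second calculation is just 12340 + 5678 = 179B8.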

Memory Segments

If we fix a segment value, say 1111h, and let the offset value vary, we get a series of consecutive memory addresses, beginning at

    1111:0 = 11110 (linear)

and ending at

    1111:FFFF = 11110 + FFFF = 2110F (linear)

These memory locations consist of 10000h (= 65,536 = 64 K) consecutive bytes, and are referred to as a memory segment (or just a segment), as shown in Figure A5.2. Put another way, all addresses with a fixed segment portion make up a 64 KB memory segment.

Figure A5.2 - A memory segment (the 65,536-byte segment starting at 1111:0, or 11110 linear, and ending at 1111:FFFF, or 2110F linear)

Since adding 1 to a segment value has the effect of adding 16 (= 10h) to the linear address, we see that two consecutive segments start only 16 bytes apart. Thus, they overlap dramatically, as shown in Figure A5.3.

Figure A5.3 - Adjacent segments (the next segment starts 16 bytes after the previous one)

You can thus see why a single linear address has many different segment:offset formats. For instance, (in decimal) the segmented address 1:100 has linear address

    linear address = 16 x 1 + 100 = 116

and the segmented address 2:84 also has linear address

    linear address = 16 x 2 + 84 = 116

The important thing is to note that a segmented address corresponds to only one linear address.

Protected Mode Addressing

When the 80286 processor was designed, it was given 24 address lines and thus an address space of $2^{24}$ bytes = 16 MB. However, the register sizes were not changed. Thus, the real mode segmented addressing scheme would no longer reach the entire address space. It could only reach the first 1 MB (plus a tiny bit more) of 16 MB!

In order to preserve backward compatibility, the designers of the 80286 decided that the processor should work in two modes: real mode and protected mode. In real mode, addressing is done exactly as above, in which case the 80286 can only access 1 MB of memory. In this mode, the 80286 is just like an overgrown 8088. It is faster, but in other respects essentially the same. When an 80286-based PC (or indeed any PC) is first turned on, it starts in real mode.

In protected mode, the offset portion of a segmented address is the same as in real mode, but the segment portion is different. Without going into some rather complex details, as illustrated in Figure A5.4, the value in the segment register is used to look up an address in a special table, called a (segment) descriptor table. The entry in this table gives, among other things, the beginning address of the segment, also called the base address. This value gets added to the offset to get the linear address. Since the base address in the descriptor table is 32 bits long, there are many more possible segment values. Simply put, since the segment register is too small to contain a large enough number, why not just let the segment value point to a larger number?
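The table lookup can be pictured with a few lines of Python. This sketch deliberately leaves out the "rather complex details": it treats the segment register value directly as a key into the table and stores only a base address in each entry. The table contents are invented purely for illustration and do not come from any real system.

```python
# Hypothetical descriptor table: segment register value -> base address.
DESCRIPTOR_TABLE = {
    0x1233: 0x00100000,
    0x1234: 0x00200000,
}


def protected_linear_address(segment, offset):
    """Look up the segment's base address and add the offset to it."""
    base = DESCRIPTOR_TABLE[segment]
    return base + offset


print(f"{protected_linear_address(0x1234, 0x5678):08X}")   # 00205678
```

Compare this with real mode, where the same segment value 1234 could only ever mean the fixed base address 12340. Here the operating system is free to place the segment anywhere by changing the table entry.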

[Diagram: the segment value selects an entry in the descriptor table; the base address stored there is added to the offset to give the linear address.]

Figure A5.4 - Protected mode address resolution

Segment Size and a Flat Memory Model

When doing assembly language programming in real mode, the programmer can (with a few exceptions) manipulate the segment registers directly. However, the nature of assembly language programming makes it inconvenient to change segment register values, much more so than to change an offset value. Thus, whenever possible, the programmer tries to work with a fixed segment register value; that is, within a single 64 KB memory segment, at least for each of the four segment registers (thus having access to 4 segments simultaneously). However, these days, a 64 KB chunk of memory is mighty small.

Relief came with the 80386 (and following) processors. In particular, while the segment portion remained 16 bits, the offsets were put into 32-bit registers (such as ESI and EDI). This increased the segment size to 2^32 bytes = 4 GB, or about 4 billion bytes. Note that this happens to be the size of the entire address space for an 80386 or later processor. This means that a single segment can encompass all possible memory! The programmer can simply set the segment registers (CS, DS, ES and SS) to 0 and use only the offset portion of the address.

While there are benefits to using more than one segment in protected mode, which are mostly related to memory protection (memory segments may be marked read-only, for instance, to prevent errant programs from writing to that portion of memory), sticking with a single memory segment does simplify programming considerably, as well as improve performance. This is probably why Microsoft Windows itself does just this. When a program fixes all segment registers at 0 and uses only the offset, thus turning all available memory into one large segment, it is said to follow a flat memory model. Windows uses a flat memory model.

To avoid any possible confusion, we should mention that protected mode segments do have a size. They need not all be 4 GB in size. Thus, in a segmented memory model, as opposed to a flat memory model, there may be several memory segments of varying sizes.

Real Mode versus Protected Mode

Real mode is characterized by the fact that all addressing is done in the segmented format. Thus, while in real mode, a processor (be it an 8088 or a Pentium Pro) can access only 1 MB of address space. As mentioned before, it functions like an overgrown 8088.

Protected mode is characterized by the fact that the processor (80286 and above) uses protected mode addressing. This is simply a matter of interpreting the segment value as an index into a descriptor table, which gives the base (or beginning) address of the segment.

Protected mode also carries with it some built-in protections, as the name would suggest. We have seen that the lower areas of conventional memory are occupied by the interrupt vector table, the BIOS data area and a portion of the DOS system files. Now, in real mode, an assembly language programmer has complete access to these areas of memory, and can therefore easily corrupt the important data that is stored there. This will generally bring the PC to a complete halt. To recover from such a system crash, it is usually necessary to hit the reset button on the PC, thus causing it to reboot. In short, we can say that real mode operation provides no protection for the operating system from an errant program (or an errant programmer).

In a multitasking environment, such as Microsoft Windows, several programs may be running at one time, and it is important that one program not be allowed to corrupt other programs, or the operating system. One of the strengths of protected mode is that it attempts to provide this protection.

Protected mode operation has several protection mechanisms. For instance, you will recall that the segment value points to a location in a descriptor table, wherein lies the segment's base address. That location in the descriptor table is referred to as a segment descriptor. The segment descriptor contains other information as well. For instance, it has something called a privilege level that controls access to the segment. The segment descriptor also contains a segment type field. The type field can be used to mark a segment as, for instance, read-only, which means no program can write data to that segment, an obvious benefit for a segment containing sensitive operating system data.


Paging

Unfortunately, the protection mechanisms that we discussed in the previous paragraph are of little use when there is only one segment. We cannot mark it read-only, for instance, since then no program could ever write to memory. However, beginning with the 80386, Intel added a new feature called paging. Paging places another layer between a segmented address and the actual address in memory. When paging is activated, the linear address does not represent an actual physical address of a byte in a real-life memory chip. Thus, we have a two-step process that might be described by

segment:offset → linear address → physical address

The translation from linear address to physical address is done in a manner similar in principle to the translation from segmented address to linear address, as shown in Figure A5.5.

[Diagram: a 32-bit linear address split into three fields: bits 31-22 (index into Page Directory), bits 21-12 (index into Page Table) and bits 11-0 (offset into Page Frame). The Page Directory entry points to a Page Table, the Page Table entry points to a 4 KB Page Frame, and the offset selects the physical address within that frame.]

Figure A5.5 - Linear to physical address resolution

The linear address is divided into three parts (instead of the two parts in a segmented address). The first part points to an entry in the so-called page directory, which is kept by the operating system. Note that there is one page directory for each running application. The page directory entry in turn points to one of many page tables. The second part of the linear address is used as an index into that page table. The entry in that page table points at last to a page frame. Page frames are 4 KB in size and (generally) lie in physical memory. The third and final part of the linear address is an offset (index) into the page frame, thus giving an actual physical address.

One immediate advantage to this scheme is that a page frame is very small (4 KB) and thus easily kept in memory. It is much easier to find 4 KB of consecutive physical memory than, say, 64 KB. However, the main advantages are threefold.

• One application's memory can be protected from another application.

• Memory can be shared among applications.

• Each application can think it has the full 4 GB address space to itself! That is, each application has a 4 GB virtual address space.

Let us examine these in more detail.

The Virtual Memory Manager

We should first remark that the entire memory management process, which is quite complex, is managed by a part of the operating system called the virtual memory manager, or VMM. For instance, a VMM is a major part of Microsoft Windows. Windows 95 and Windows NT actually handle memory management in a somewhat different way. However, we will not go into the differences in this book.

Memory Protection and Memory Sharing

Now, let us consider a simple example. Consider two Windows-based applications; call them App1 and App2. The VMM allows each application to refer to any linear address from 0 to 4 GB minus 1. Thus, each application thinks it has the full 4 GB memory address space to itself.

As to the protection of memory, suppose that App1 and App2 both refer, in their programming, to the same linear address. For App1, the linear to physical address resolution uses App1's page directory, and the address resolution for App2 uses App2's page directory. Thus, while the indices into the page directories are the same in both cases, these indices are indices into different tables. Thus, the VMM can see to it that the entries in these directory tables point to different page tables. Once again, the indices into the page tables are the same, but the page tables themselves are different. Thus, the VMM can see to it that different page tables point to different page frames in physical memory. In this way, the same linear address in two different applications refers to two different physical addresses!
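The linear to physical translation, and the way two page directories keep two applications apart, can be sketched as follows. This is a toy model under stated assumptions: nested Python dictionaries stand in for the page directory and page tables, page table entries hold a bare frame number, and the frame numbers 7 and 9 are made up for illustration.

```python
PAGE_SIZE = 4096  # page frames are 4 KB


def split_linear(addr):
    """Split a 32-bit linear address into its three parts."""
    dir_index = (addr >> 22) & 0x3FF     # bits 31-22: index into page directory
    table_index = (addr >> 12) & 0x3FF   # bits 21-12: index into page table
    offset = addr & 0xFFF                # bits 11-0: offset into page frame
    return dir_index, table_index, offset


def translate(page_directory, addr):
    """Resolve a linear address using one application's page directory."""
    dir_index, table_index, offset = split_linear(addr)
    page_table = page_directory[dir_index]    # directory entry -> page table
    frame_number = page_table[table_index]    # table entry -> page frame
    return frame_number * PAGE_SIZE + offset


# Hypothetical directories: the same indices lead to different page frames.
app1_directory = {1: {2: 7}}
app2_directory = {1: {2: 9}}

linear = (1 << 22) | (2 << 12) | 0x34          # the same linear address in both apps
print(hex(translate(app1_directory, linear)))  # 0x7034
print(hex(translate(app2_directory, linear)))  # 0x9034
```

Sharing memory, as described in the text, would amount to both directories holding the same frame number (or the same page table) for some entry.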


To emphasize the point, neither application even realizes that there are other applications, since any address it uses, throughout the entire 4 GB address space of the microprocessor, is translated into a physical address that has been set aside by the VMM for that application alone.

By the same token, the VMM can arrange it so that a particular linear address used by App1 and a particular linear address used by App2 (be they the same or not) both point to the same physical address. (One way to do this would be to have both page directory entries point to the same page table, and thus to the same page frames.) In this way, applications can share common memory. This memory can be marked read-only, if desired, as it would be if it contained operating system files, for instance. It can also be used to set up a line of communication between two applications: one application writes to this area and the other application reads from this area.

Virtual Memory

But what happens if a PC has 16 MB of physical RAM and one application decides to create 100 MB of data? This data will not fit into physical memory. The solution is quite simple and quite elegant. Let us describe a fictitious scenario to illustrate how this problem is handled.

Since a page frame has size 4 KB, a 16 MB physical memory consists of 16 MB / 4 KB = 4096 pages. Suppose that an application decides to create 4097 pages of data. The first 4096 pages could (theoretically) fit into memory. In order to make room for the last page, one of the current pages can be swapped to hard disk. Roughly speaking, this is done as follows. Each page table entry has a special bit called the present bit. This bit is set to 1 if the page frame is actually in physical memory. If not, the present bit is set to 0. When the present bit is set to 1, the page table entry contains the address of a page frame in physical memory (and the offset gives the actual byte).
On the other hand, when the present bit is set to 0, the rest of the page table entry can be used in any way by the VMM, since it no longer points to a page frame in physical memory. Thus, the VMM can use this space to identify where, on the hard disk swap file, it placed the page frame. Now, in a real life situation, each application may have some of its page frames in physical memory and some on hard disk, in so-called virtual memory. When a running application references an address, the VMM determines the corresponding page frame. If the frame is in physical memory, a page hit has occurred and the page is said to be valid. If not, a page fault occurs and the page is said to be invalid. In this case, the VMM uses the page table entry to retrieve that invalid page from virtual memory back into physical memory, for subsequent access, as shown in Figure A5.6.
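The fictitious 4097-page scenario can be played out with a toy model. The class below is hypothetical and heavily simplified: it swaps out the oldest page first (the text does not say which page a VMM picks), it counts the first touch of a page as a page fault, and it records the swap location as a simple tuple.

```python
class ToyVMM:
    """Hypothetical, heavily simplified virtual memory manager."""

    def __init__(self, num_frames):
        self.num_frames = num_frames  # page frames available in physical RAM
        self.page_table = {}          # page -> {"present": 0 or 1, "where": ...}
        self.resident = []            # pages currently in RAM, oldest first
        self.faults = 0

    def access(self, page):
        entry = self.page_table.get(page)
        if entry is not None and entry["present"] == 1:
            return "hit"              # present bit is 1: frame is in physical memory
        self.faults += 1              # present bit is 0 (or page never touched)
        if len(self.resident) == self.num_frames:
            # RAM is full: swap the oldest page out (oldest-first is an assumption).
            victim = self.resident.pop(0)
            frame = self.page_table[victim]["where"]
            # With present = 0, the rest of the entry records the swap-file slot.
            self.page_table[victim] = {"present": 0, "where": ("swap", victim)}
        else:
            frame = len(self.resident)
        self.page_table[page] = {"present": 1, "where": frame}
        self.resident.append(page)
        return "fault"


vmm = ToyVMM(num_frames=4096)         # 16 MB of RAM / 4 KB per frame
for page in range(4097):              # the application touches 4097 pages
    vmm.access(page)

print(vmm.page_table[0])              # {'present': 0, 'where': ('swap', 0)}
print(vmm.access(4096))               # hit
print(vmm.access(0))                  # fault
```

Touching page 0 again forces another page out, which is exactly the swapping back and forth that the text describes.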

[Diagram: the 4 GB virtual address spaces of App1 and App2, each divided into 4 KB pages marked Valid or Invalid, mapped onto a physical address space (perhaps 16 MB) whose page frames are marked Free, Shared, Used by App 1 or Used by App 2.]

Figure A5.6 - Virtual memory

Using this page swapping technique, referred to simply as paging, an application can use more memory than actually exists physically. As indicated, Windows uses a hard disk swap file for paging. Under Windows 3.1, the user can set the size of this file, using the control panel, thus determining the amount of virtual memory. Windows 95 and Windows NT will dynamically manage the amount of virtual memory on your system, expanding and contracting the size of the swap file. (You can fix the size of the swap file in Windows 95 using the System icon under the Control Panel, but this is not recommended.)

A6. Sector Translation

In this appendix, we describe the process of sector translation. There are actually two different schemes for sector translation.

Enhanced CHS Addressing: An Example First

In this scheme, the drive's BIOS translates a CHS address into a so-called enhanced CHS, or ECHS, address. Let us illustrate with a simple picture before generalizing. Consider Figure A6.1. For the sake of simplicity, imagine that the platters are only one-sided and that the actual physical disk has only 2 heads and 4 cylinders. On the other hand, imagine that the BIOS can recognize more heads, but only 2 cylinders.

[Diagram: the actual disk (4 cylinders, 2 heads), with each track labeled cylinder/head, shown above the BIOS's view of the same disk (2 cylinders, 4 heads) after sector translation.]

Figure A6.1 - Example of sector translation

Note that we have numbered each track with its cylinder number followed by its head number. The bottom portion of the figure (below the horizontal line) shows the logical view of the BIOS, after sector translation. The idea is simply to take every other track (shown in broken lines) from the original heads and "move" them to new imaginary (logical) heads. Thus, the logical addresses use half as many cylinders, but twice as many heads. We have not included sector numbers because these do not change when translating between the two types of addresses. The BIOS's version of the CHS address is called an enhanced CHS address, or ECHS.

One key point to observe is that the order in which data is read or written remains the same in either addressing scheme. To explain this, note that the order in which data is read from a disk proceeds by starting at the beginning of a cylinder (at head 0), reading the entire track on that head, then proceeding to the next track on the same cylinder (thus a new head). In other words, an entire cylinder is read, going through each side of each platter, before changing cylinders. This order makes perfectly good sense when you think about the fact that each side has its own read/write head, and it is much quicker to electronically switch heads than to physically move the entire head assembly to a new cylinder.

Thus, in the physical CH(S) addressing, the tracks (actually all sectors on the track) would be read in the following order:

0/0, 0/1, 1/0, 1/1, 2/0, 2/1, 3/0, 3/1

Now, in the logical view, this would be

0/0, 0/1, 0/2, 0/3, 1/0, 1/1, 1/2, 1/3

But if you trace over the figure, following each of these sequences of cylinder/head numbers, you will see that you are tracing over the tracks (and hence also the sectors) in exactly the same order!

Enhanced CHS Addressing: The Details

For those who are mathematically inclined, here are the details of ECHS addressing. Suppose a drive has DrvH heads and DrvC cylinders. Suppose also that the BIOS does not support this many cylinders, but it does support more heads. A number N (which must be a power of 2) is chosen so that the BIOS does support DrvC/N cylinders, and will also support DrvH*N heads. In loose terms, we are "shifting" the number of cylinders down by a factor of N, for which we must shift the number of heads up by a factor of N. For a given enhanced CH address EC/EH (remember the sector number does not change), the physical CH address C/H is computed from the formulas

C = EC*N + Int(EH/DrvH)
H = Rem(EH/DrvH)

where Int means take only the integer portion of the quotient and Rem means take only the remainder.


Let us try this out on our example, where N = 2 and DrvH = 2. For ECH address 1/2, we have EC = 1 and EH = 2 and so

C = 1*2 + Int(2/2) = 2 + 1 = 3
H = Rem(2/2) = 0

and so the CH address is 3/0. The figure will bear this out. As another example, the ECH address 0/3 is equivalent to the CH address with

C = 0*2 + Int(3/2) = 1
H = Rem(3/2) = 1

that is, 1/1. Again, the figure bears this out.
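The two formulas translate directly into code. In the sketch below, the function and parameter names are hypothetical, and the reverse conversion chs_to_echs is not given in the text; it is simply what inverting the two formulas yields.

```python
def echs_to_chs(ec, eh, n, drv_heads):
    """Physical C/H from enhanced EC/EH, using the two formulas from the text."""
    c = ec * n + eh // drv_heads   # C = EC*N + Int(EH/DrvH)
    h = eh % drv_heads             # H = Rem(EH/DrvH)
    return c, h


def chs_to_echs(c, h, n, drv_heads):
    """Inverse mapping (derived here, not stated in the text)."""
    ec = c // n
    eh = (c % n) * drv_heads + h
    return ec, eh


# The two worked examples, with N = 2 and DrvH = 2.
print(echs_to_chs(1, 2, 2, 2))   # (3, 0)
print(echs_to_chs(0, 3, 2, 2))   # (1, 1)

# Walking the logical addresses in reading order visits the physical tracks
# in their own reading order.
print([echs_to_chs(ec, eh, 2, 2) for ec in range(2) for eh in range(4)])
# [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]
```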

Logical Block Addressing (LBA)

The ECHS scheme is a bit complex, wouldn't you say? A much simpler scheme is to just label the sectors consecutively, in the order they would be read, starting with 0. This is the logical block address, or LBA. To illustrate, consider again Figure A6.1 and assume that there are 4 sectors per track. The conversion from CHS to LBA is simple:

0/0/1 → 0
0/0/2 → 1
0/0/3 → 2
0/0/4 → 3
0/1/1 → 4
0/1/2 → 5
0/1/3 → 6
0/1/4 → 7
1/0/1 → 8
1/0/2 → 9
1/0/3 → 10
1/0/4 → 11
1/1/1 → 12
1/1/2 → 13
1/1/3 → 14
1/1/4 → 15

and so on. As to the formula, for a given CHS address, the LBA is

LBA = C*(no. of sectors/cylinder) + H*(no. of sectors/track) + S - 1

For instance, in our example, the number of sectors per cylinder is 8 and the number of sectors per track is 4. Hence, the formula is

LBA = C*8 + H*4 + S - 1

For instance, CHS address 1/1/2 corresponds to LBA

1*8 + 1*4 + 2 - 1 = 13.
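As a quick check of the formula, here is a small sketch. The function names are hypothetical, and the reverse conversion lba_to_chs is not in the text; it follows from dividing out the same two quantities.

```python
def chs_to_lba(c, h, s, heads, sectors_per_track):
    """LBA = C*(sectors/cylinder) + H*(sectors/track) + S - 1."""
    sectors_per_cylinder = heads * sectors_per_track
    return c * sectors_per_cylinder + h * sectors_per_track + s - 1


def lba_to_chs(lba, heads, sectors_per_track):
    """Inverse of chs_to_lba (derived here, not stated in the text)."""
    c, rest = divmod(lba, heads * sectors_per_track)
    h, s_index = divmod(rest, sectors_per_track)
    return c, h, s_index + 1      # sector numbers start at 1


# The example disk: 2 heads, 4 sectors per track, hence 8 sectors per cylinder.
print(chs_to_lba(1, 1, 2, heads=2, sectors_per_track=4))   # 13
print(lba_to_chs(13, heads=2, sectors_per_track=4))        # (1, 1, 2)
```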

A7. How Data Is Encoded on a Disk

In this appendix, we look more closely at how the data is encoded on a hard disk. This material supplements the chapters on hard disks. It might seem as though we could simply magnetize a given domain in one direction to indicate a 0, and in the other direction to indicate a 1, as shown in Figure A7.1. (We will assume a horizontal recording scheme, rather than a vertical one. For this discussion, it makes no difference.)

[Diagram: a stretch of disk track whose domains are magnetized in one direction for a 0 and in the opposite direction for a 1.]

Figure A7.1

There is a problem with this, however. A long string of 0s would mean a long stretch of track with no flux transitions. Hence, no current would be generated while the head traveled over this portion of the track. Unfortunately, there is an inherent variation in the rotational speed of the disk. Thus, when attempting to read a long stretch of 0s, it may take a slightly different amount of time to go from one 0 to the next than it took when writing those same 0s. Since there is no signal being generated, there is no way for the drive electronics to observe this slight variation from 0 to 0, and so it will accumulate over the entire stretch of 0s. This may cause the head to fall far enough out of synchronization with the data to produce an incorrect reading (or writing).

We can put this more concretely with an exaggerated example. Suppose it took exactly 1 second to write each of one thousand 0s on a track. While reading these same 0s, the rotational speed is such that it takes 1.01 seconds to get from 0 to 0. Then, in the same one thousand seconds it took to write the data, the head will only read 1000/1.01 ≈ 990 of these 0s. Hence, ten 0s are lost. Thus, it is clear that time measurement alone is not enough to keep the head synchronized.

On the other hand, if there were a pulse of current at each 0, then the drive electronics could use this pulse to recalibrate, taking into account the change in rotational speed. It follows that we cannot allow the absence of flux transitions for any significant period of time.

Frequency Modulation (FM)

The most naive approach to resolving the synchronization problem, and the approach that was used in early drives, is simply to intersperse a transition between every data bit. In particular, we do the following:

• Intersperse between each data bit a special clock bit. For example, consider the data 11001010. We use the symbol C to denote a clock bit, giving C1C1C0C0C1C0C1C0

• Encode this on the disk track so that each 1 and each clock bit produce a current pulse (that is, a flux transition). Thus, a clock bit can be either a 0 or a 1.

This procedure guarantees that any two consecutive 0s are broken up by a current pulse, called a clock pulse, or clock transition. This certainly prevents long sequences of 0s. Figure A7.2 shows the current pulse graph, corresponding flux transition graph and the disk track encoding for the example given above.
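The FM rule is simple enough to sketch in a few lines. The function names are hypothetical, and writing a pulse as "1" and the absence of a pulse as "0" is a notation chosen here for illustration.

```python
def fm_interleave(bits):
    """Put a clock bit (written C) in front of every data bit."""
    return "".join("C" + b for b in bits)


def fm_pulses(bits):
    """One character per domain length: '1' = current pulse, '0' = no pulse.

    Every clock bit produces a pulse; a data bit produces one only if it is a 1.
    """
    return "".join("1" + b for b in bits)


data = "11001010"
print(fm_interleave(data))    # C1C1C0C0C1C0C1C0
print(fm_pulses(data))        # 1111101011101110
print(len(fm_pulses(data)))   # 16
```

No two pulses are ever more than two domain lengths apart, which is the point of the scheme, but the 8 data bits occupy 16 domain lengths of track.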

[Diagram: current pulses, flux transitions and disk encoding for the FM-encoded example, with clock pulses and data bits marked and one domain length indicated.]

Figure A7.2 - FM encoding

The encoding shown in Figure A7.2 is referred to as frequency modulation. This terminology stems from the fact that the frequency of current pulses (or flux transitions) varies over time, and it is this variation that encodes the data. Note that consecutive pulses are either one or two domain lengths apart. Moreover, each bit of data requires two domain lengths of disk track for encoding. Thus, our 8-bit example requires 16 domain lengths of track.

Modified Frequency Modulation (MFM)

The FM method of data encoding is very wasteful and it is easy to do much better. All we need to do is realize that the only place a clock transition is really needed is between pairs of consecutive 0s. In this case, our example becomes

110C01010

This is certainly more economical, and we still have plenty of transitions for synchronization. The problem comes with reading the sequence of clock pulses to recover the data. For instance, if we use the same scheme of magnetizing a single domain between adjacent transitions, the current pulse graph for writing the data (and clock pulse) would appear as follows.

[Diagram: current pulses for the sequence 1 1 0 C 0 1 0 1 0.]

However, upon reading these pulses, we cannot tell the difference between the sequence 110C01010 (as written) and the sequence 110101010 (with the clock bit replaced by a 1). This ambiguity means that the data are useless! This problem is easily solved, however, by first noting that there are two domain lengths surrounding each clock current pulse. As long as the drive electronics can tell the difference between stretches of disk track equal in length to 1, 1½ and 2 domain lengths (and this is not a problem for today's drives), we can shrink these two-domain-length stretches down to 1½ domain lengths, as shown in Figure A7.3.
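The MFM rule, and the three pulse spacings it produces, can be sketched as follows. The function names are hypothetical, and the pulse placement used in mfm_pulse_positions (a data 1 pulses at the middle of its bit cell, a clock pulses at the boundary between two 0 cells) is an assumption adopted here because it reproduces the 1, 1½ and 2 domain-length spacings described in the text.

```python
def mfm_insert_clocks(bits):
    """Insert a clock bit (written C) only between two consecutive 0s."""
    out = []
    previous = None
    for b in bits:
        if previous == "0" and b == "0":
            out.append("C")
        out.append(b)
        previous = b
    return "".join(out)


def mfm_pulse_positions(bits):
    """Pulse positions in domain lengths, one domain length per data bit (assumed layout)."""
    positions = []
    for i, b in enumerate(bits):
        if b == "1":
            positions.append(i + 0.5)     # data pulse at the cell centre
        elif i > 0 and bits[i - 1] == "0":
            positions.append(float(i))    # clock pulse at the cell boundary
    return positions


data = "11001010"
print(mfm_insert_clocks(data))            # 110C01010
pulses = mfm_pulse_positions(data)
print([b - a for a, b in zip(pulses, pulses[1:])])   # [1.0, 1.5, 1.5, 2.0]
```

The two 1.5 spacings are the stretches on either side of the clock pulse; the 2.0 spacing is an ordinary 1, 0, 1 pattern, so the reader can now tell the two cases apart.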

[Diagram: current pulses, flux transitions and disk encoding for the MFM-encoded example, with one domain length indicated.]

Figure A7.3 - MFM encoding

As you can see from the figure, each bit of data now requires only one domain length of disk track, reducing the total for the 8 bits to 8 domain lengths, a 50% improvement over FM encoding. This encoding scheme is referred to as modified frequency modulation, or MFM. Floppy disks use MFM encoding of data. (It might be useful to point out that, while we are dealing with fractions of a domain length, this happens only when the total magnetized length is at least one domain length. In other words, while the drive cannot magnetize a section of track whose length is only one-half of a domain length, it can magnetize a section of track whose length is one and one-half domain lengths.)

Run Length Limited (RLL)

A further improvement is possible by taking a different approach, called run length limited, or RLL encoding. The idea is to replace the bits to encode by a sequence of T's and N's, where a T represents a transition and an N represents a non-transition. In one common form of RLL, called 2,7 RLL, any such sequence of T's and N's has at least two but at most seven N's between every pair of T's. The former requirement is designed to limit the number of transitions, for efficient packing of the data, and the latter requirement is designed to avoid excessively long periods with no transitions (which is bad for synchronization).

The actual encoding is done by using the RLL decision tree in Figure A7.4. We start on the far left and follow the branches (lines) until we arrive at a sequence of N's and T's. Let us refer to the sequences at the ends of the branches as RLL codewords. The path to follow to encode the bit string 0010 is shown in bold in the figure. The resulting RLL codeword is NNTNNTNN.

[Diagram: decision tree branching on successive data bits (0 or 1); its leaves are the RLL codewords TNNN, NTNN, NNTNNN, TNNTNN, NNNNTNNN, NNTNNTNN and NNNTNN.]

Figure A7.4 - RLL decision tree

In this way, a string of bits is encoded as a string of one or more RLL codewords. For instance, our example bits 11001010 are encoded as follows:

11    0010      10
TNNN  NNTNNTNN  NTNN


Note that each RLL codeword satisfies the two requirements mentioned earlier. In addition, each RLL codeword ends in at least two N's. Hence, when two or more codewords are placed end-to-end, every pair of T's will still be separated by at least two N's. Also, no codeword ends with more than three N's nor begins with more than four N's, so any string of codewords has no more than 7 consecutive N's.

Before drawing any graphs, we must also note that the original data can always be unambiguously recovered from its RLL encoding as a string of T's and N's. The reason is that no RLL codeword is a prefix (beginning) of another RLL codeword. To illustrate what can go wrong, suppose we change the last codeword in the decision tree from NNNTNN to TNNNNTNN. Now we have the problem that the data strings 000 and 1110 both encode as the RLL string TNNNNTNN. Thus, when reading this RLL string, we don't know how to recover the original data! The reason for this problem is that the RLL codeword TNNNNTNN begins with another RLL codeword (TNNN). So, once we have read TNNN, we (or rather, the drive electronics) don't know whether to decode this as 11 or to wait for further T's and N's before decoding. In any case, it is easy to check that no RLL codeword is a prefix of another RLL codeword and so decoding (that is, reading) will always proceed unambiguously.

As to the actual encoding on disk, we know that every transition (T) is followed by at least 2 non-transitions (N's). Hence, we can always devote a single domain to each sequence of three T's and/or N's. This gives the graphs shown in Figure A7.5.
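The encoding and the prefix-free decoding can be tried out with a small table-driven sketch. Four of the table entries (11, 10, 000 and 0010) are confirmed by the examples in the text; the data bits assigned to the other three codewords are an assumption, taken from the commonly published 2,7 RLL table, and the function names are hypothetical.

```python
RLL_2_7 = {
    # Confirmed by the worked examples in the text:
    "11": "TNNN",
    "10": "NTNN",
    "000": "NNNTNN",
    "0010": "NNTNNTNN",
    # Assumed from the commonly published 2,7 RLL table:
    "010": "TNNTNN",
    "011": "NNTNNN",
    "0011": "NNNNTNNN",
}


def rll_encode(bits):
    """Split the data into table entries and return the list of codewords."""
    codewords = []
    i = 0
    while i < len(bits):
        for length in (2, 3, 4):
            chunk = bits[i:i + length]
            if chunk in RLL_2_7:
                codewords.append(RLL_2_7[chunk])
                i += length
                break
        else:
            raise ValueError("trailing bits do not complete a codeword: " + bits[i:])
    return codewords


def rll_decode(symbols):
    """Read T's and N's one at a time; emit data as soon as a codeword is complete."""
    reverse = {code: data for data, code in RLL_2_7.items()}
    data, current = [], ""
    for symbol in symbols:
        current += symbol
        if current in reverse:        # safe because no codeword prefixes another
            data.append(reverse[current])
            current = ""
    if current:
        raise ValueError("incomplete codeword: " + current)
    return "".join(data)


encoded = rll_encode("11001010")
print(encoded)                        # ['TNNN', 'NNTNNTNN', 'NTNN']
print(rll_decode("".join(encoded)))   # 11001010
```

The joined string has 16 symbols for 8 data bits, and every run of N's between two T's has length between 2 and 7.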

[Diagram: current pulses, flux transitions and disk encoding for the RLL sequence TNNN NNTNNTNN NTNN, with one domain length indicated.]

Figure A7.5 - RLL encoding

As you can see, the data are even more tightly packed than in MFM encoding. In fact, the decision tree shows that any string of data will be encoded as a string of T's and N's that is twice as long as the data. However, every three of the T's and N's fit in one domain length of track. Hence, each bit requires 2/3 of a domain length (double, then divide by 3). For instance, our 8-bit example requires 16/3 domain lengths under RLL, rather than the 8 domain lengths required by MFM. This is a 33% savings in space.

We should point out that RLL can put a strain on the timing between controller and drive. The two must work together quite closely in order to keep the head in synchronization over a stretch of 7 consecutive non-transitions. This cooperation is possible in EIDE and SCSI systems, since the controller and drive are combined into a single unit, which means that they are manufactured by the same company to work together closely.

A8. Intel Microprocessor Quick Reference Guide

The following Intel Microprocessor Quick Reference Guide is reprinted by permission of Intel Corporation, from the Intel web site located at www.intel.com. The performance information below refers to SPECint92 and SPECfp92. SPEC refers to Standard Performance Evaluation Corporation, a nonprofit organization established to provide a consistent standard for benchmarking computers. The SPECint92 benchmark measures integer arithmetic performance and SPECfp92 measures floating point (noninteger) math performance.

Intel Microprocessor Evolution

4004
Introduction date: November 15, 1971
Clock speed: 108 kilohertz (0.06 MIPS)
Number of transistors: 2,300 (10 microns)
Bus width: 4 bits
Addressable memory: 640 bytes
Typical use: Busicom calculator
First microcomputer chip, arithmetic manipulation

8008
Introduction date: April 1972 (developed in tandem with 4004)
Clock speed: 200 kilohertz (0.06 MIPS)
Number of transistors: 3,500 (10 microns)
Bus width: 8 bits
Addressable memory: 16 Kbytes
Typical use: Dumb terminals, general calculators, bottling machines
Data/character manipulation

8080
Introduction date: April 1974
Clock speed: 2 MHz (0.64 MIPS)
Number of transistors: 6,000 (6 microns)
Bus width: 8 bits
Addressable memory: 64 Kbytes
Typical use: Traffic light controller, Altair computer (first PC)
Ten times the performance of the 8008
Required six support chips versus 20 for the 8008

8085
Introduction date: March 1976
Clock speed: 5 MHz (0.37 MIPS)
Number of transistors: 6,500 (3 microns)
Bus width: 8 bits
Typical use: Toledo scale. From weight and price computed cost.
High level of integration, operating for the first time on a single 5 volt power supply (from 12 volts previously)

8086
Introduction date: June 8, 1978
Clock speeds: 5 MHz (0.33 MIPS); 8 MHz (0.66 MIPS); 10 MHz (0.75 MIPS)
Number of transistors: 29,000 (3 microns)
Bus width: 16 bits
Addressable memory: 1 Megabyte
Typical use: Portable computing
Ten times the performance of the 8080

8088
Introduction date: June 1979
Clock speeds: 5 MHz (0.33 MIPS); 8 MHz (0.75 MIPS)
Internal architecture: 16 bits
External bus width: 8 bits
Number of transistors: 29,000 (3 microns)
Typical use: Standard microprocessor for all IBM PCs and PC clones
Identical to 8086 except for its 8-bit external bus

80186
Introduction date: 1982
Note: Used mostly in controller applications

80286
Introduction date: February 1982
Clock speeds: 6 MHz (0.9 MIPS); 10 MHz (1.5 MIPS); 12 MHz (2.66 MIPS)
Number of transistors: 134,000 (1.5 microns)
Bus width: 16 bits
Addressable memory: 16 megabytes
Virtual memory: 1 gigabyte
Typical use: Standard microprocessor for all PC clones at the time
Three to six times the performance of the 8086
Can scan the Encyclopedia Britannica in 45 seconds

Intel386™ DX CPU
Introduction date: October 17, 1985
Clock speeds:
  16 MHz (5 to 6 MIPS)
  20 MHz introduced February 16, 1987 (6 to 7 MIPS)
  25 MHz introduced April 4, 1988 (8.5 MIPS)
  33 MHz introduced April 10, 1989 (11.4 MIPS, 9.4 SPECint92 on Compaq/i 16K L2)
Number of transistors: 275,000 (1.5 microns, now 1 micron)
Bus width: 32 bits
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Typical use: Desktop computing
Can address enough memory to manage an eight-page history of every person on earth
Can scan the Encyclopedia Britannica in 12.5 seconds


Intel386™ SX CPU
Introduction date: June 16, 1988
Clock speeds:
  16 MHz (2.5 MIPS)
  20 MHz introduced January 25, 1989 (2.5 MIPS)
  25 MHz (2.7 MIPS)
  33 MHz introduced October 26, 1992 (2.9 MIPS)
Number of transistors: 275,000 (1.5 microns, now 1 micron)
Internal architecture: 32 bits
External bus width: 16 bits
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Typical use: Entry-level desktop and portable computing

Intel486™ DX CPU
Introduction date: April 10, 1989
Clock speeds:
  25 MHz (20 MIPS, 16.8 SPECint92, 7.40 SPECfp92)
  33 MHz introduced May 7, 1990 (27 MIPS, 22.4 SPECint92 on Micronics M4P 128K L2)
  50 MHz introduced June 24, 1991 (41 MIPS, 33.4 SPECint92, 14.5 SPECfp92 on Compaq/50L 256K L2)
Number of transistors: 1,200,000 (1 micron, with 50 MHz at .8 micron)
Bus width: 32 bits
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Typical use: Desktop computing and servers
Fifty times the performance of the 8088
Can scan the Encyclopedia Britannica in 3.5 seconds

Intel386™ SL CPU
Introduction date: October 15, 1990
Clock speeds:
  20 MHz (4.21 MIPS)
  25 MHz introduced September 30, 1991 (5.3 MIPS)
Number of transistors: 855,000 (1 micron)
Internal architecture: 32 bits
External bus width: 16 bits
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Typical use: First microprocessor made specifically for portables
Highly integrated; includes cache, bus, and memory controllers

Intel486™ SX CPU
Introduction date: April 22, 1991
Clock speeds:
  16 MHz introduced September 16, 1991 (13 MIPS)
  20 MHz (16.5 MIPS)
  25 MHz introduced September 16, 1991 (20 MIPS, 12 SPECint92)
  33 MHz introduced September 21, 1992 (27 MIPS, 15.86 SPECint92)
Number of transistors: 1,185,000 (1 micron); 900,000 (.8 process)
Bus width: 32 bits
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Typical use: Low-cost entry to Intel486™ CPU desktop computing
Same as Intel486™ DX CPU with no math coprocessor on chip
Upgradable with the Intel OverDrive processor
Becoming standard processor in embedded applications

IntelDX2™ Processor
Introduction date: March 3, 1992
Clock speeds:
  50 MHz (41 MIPS, 29.9 SPECint92, 14.2 SPECfp92 on Micronics M4P 256K L2)
  66 MHz introduced August 10, 1992 (54 MIPS, 39.6 SPECint92, 18.8 SPECfp92 on Micronics M4P 256K L2)
Number of transistors: 1.2 million (.8 micron)
Bus width: 32 bits
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Typical use: High performance, low-cost desktops
Uses "speed doubler" technology, where the microprocessor core runs at twice the speed of the bus
Minimum required performance, lowest-cost desktops

IntelDX4™ Processor
Introduction date: March 7, 1994
Clock speeds:
  75 MHz (53 MIPS, 41.3 SPECint92, 20.1 SPECfp92 on Micronics M4P 256K L2)
  100 MHz (70.7 MIPS, 54.59 SPECint92, 26.91 SPECfp92 on Micronics M4P 256K L2)
Number of transistors: 1.6 million (0.6 micron)
Bus width: 32 bits
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 168 PGA Package, 208 SQFP Package
Die size: 345 square mm
Typical use: High performance entry-level desktops and value notebooks

Intel486™ SL CPU
Introduction date: November 9, 1992
Clock speeds:
  20 MHz (15.4 MIPS)
  25 MHz (19 MIPS)
  33 MHz (25 MIPS)
Addressable memory: 64 MBytes
Addressable virtual memory: 64 terabytes
Process size in microns: 0.8
Number of transistors: 1.4 million (0.8 micron)
Internal data path: 32 bits
External data path: 32 bits
Typical use: First CPU specifically designed for notebook PCs

Pentium® Processor (60 & 66 MHz)
Introduction date: March 22, 1993
Clock speeds:
  60 MHz (100 MIPS, 70.4 SPECint92, 55.1 SPECfp92 on Xpress 256K L2)
  66 MHz (112 MIPS, 77.9 SPECint92, 63.6 SPECfp92 on Xpress 256K L2)
Number of transistors: 3.1 million (.8 micron, BiCMOS)
Bus width: 64 bits (external data bus), 32 bits (address bus) (Note: it is a 32-bit microprocessor)
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 273 (Pin Grid Array Package)
Package dimensions: 2.16" (5.49 cm) x 2.16" (5.49 cm)
Typical use: Desktops

Pentium® Processor (75 MHz)
Introduction date: October 10, 1994
Clock speeds: 75 MHz (126.5 MIPS, 2.31 SPECint95, 2.02 SPECfp95 on Gateway P5 256K L2)
Number of transistors: 3.2 million (.6 micron, BiCMOS)
Bus width: 64 bits (external data bus), 32 bits (address bus) (Note: it is a 32-bit microprocessor)
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 320 Lead Tape Carrier Package (TCP); 296 Staggered Pin Grid Array (SPGA)
Package dimensions: PGA: 1.95" (5 cm) x 1.95" (5 cm); TCP: 0.94" (2.4 cm) x 0.94" (2.4 cm)
Typical use: Desktops and notebooks

Pentium® Processor (90 & 100 MHz)
Introduction date: March 7, 1994
Clock speeds:
  90 MHz (149.8 MIPS, 2.74 SPECint95, 2.39 SPECfp95 on Gateway P5 256K L2)
  100 MHz (166.3 MIPS, 3.30 SPECint95, 2.59 SPECfp95 on Xxpress 1M L2)
Number of transistors: 3.2 million (.6 micron, BiCMOS)
Bus width: 64 bits (external data bus), 32 bits (address bus) (Note: it is a 32-bit microprocessor)
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 296 (Pin Grid Array Package)
Package dimensions: 1.95" (5 cm) x 1.95" (5 cm)
Typical use: Desktops

Pentium® Processor (120 MHz)
Introduction date: March 27, 1995
Clock speeds: 120 MHz (203 MIPS, 3.72 SPECint95, 2.81 SPECfp95 on Xxpress 1MB L2)
Number of transistors: 3.2 million (0.6 and .35 micron, BiCMOS)
Bus width: 64 bits (external data bus), 32 bits (address bus) (Note: it is a 32-bit microprocessor)
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 296 (Pin Grid Array Package)
Package dimensions: 1.95" (5 cm) x 1.95" (5 cm)
Typical use: Desktops and notebooks

Pentium® Processor (133 MHz)
Introduction date: June 1995
Clock speeds: 133 MHz (218.9 MIPS, 4.14 SPECint95, 3.12 SPECfp95 on Xxpress 1MB L2)
Number of transistors: 3.2 million (0.35 micron BiCMOS)
Bus width: 64 bits (external data bus), 32 bits (address bus) (Note: it is a 32-bit microprocessor)
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 296 (Pin Grid Array Package)
Package dimensions: 1.95" (5 cm) x 1.95" (5 cm)
Typical use: High-performance desktops and servers

Pentium® Processor (150 MHz)
Introduction date: January 4, 1996
Clock speeds: 150 MHz (4.27 SPECint95, 3.04 SPECfp95 on Xxpress 1MB L2)
Number of transistors: 3.2 million (0.35 micron BiCMOS)
Bus width: 64 bits (external data bus), 32 bits (address bus) (Note: it is a 32-bit microprocessor)
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 296 (Pin Grid Array Package)
Package dimensions: 1.95" (5 cm) x 1.95" (5 cm)
Typical use: High-performance desktops and servers

Pentium® Processor (166 MHz)
Introduction date: January 4, 1996
Clock speeds: 166 MHz (4.76 SPECint95, 3.37 SPECfp95 on Xxpress 1MB L2)
Number of transistors: 3.2 million (0.35 micron BiCMOS)
Bus width: 64 bits (external data bus), 32 bits (address bus) (Note: it is a 32-bit microprocessor)
Addressable memory: 4 gigabytes
Virtual memory: 64 terabytes
Pin count: 296 (Pin Grid Array Package)
Package dimensions: 1.95" (5 cm) x 1.95" (5 cm)
Typical use: High-performance desktops and servers

Pentium® Pro Processor (150 MHz)
Introduction date: November 1, 1995
Clock speeds: 150 MHz (6.08 SPECint95, 5.42 SPECfp95 on Alder 256K L2)
Number of transistors: 5.5 million (0.6 micron), 256K L2: 15.5 million (0.6 micron)
Bus width: 64 bits front side; 64 bits to L2 cache
Addressable memory: 64 gigabytes
Virtual memory: 64 terabytes
Pin count: 387 (Dual Cavity Pin Grid Array Package)
Package dimensions: 2.46" (6.25 cm) x 2.66" (6.76 cm)
Typical use: High-end desktops, workstations, and servers

Pentium® Pro Processor (166 MHz)
Introduction date: November 1, 1995
Clock speeds: 166 MHz (7.11 SPECint95, 6.21 SPECfp95 on Alder 512K L2)
Number of transistors: 5.5 million (0.35 micron), 512K L2: 31 million (0.35 micron)
Bus width: 64 bits front side; 64 bits to L2 cache
Addressable memory: 64 gigabytes
Virtual memory: 64 terabytes
Pin count: 387 (Dual Cavity Pin Grid Array Package)
Package dimensions: 2.46" (6.25 cm) x 2.66" (6.76 cm)
Typical use: High-end desktops, workstations, and servers

Pentium® Pro Processor (180 MHz)
Introduction date: November 1, 1995
Clock speeds: 180 MHz (7.29 SPECint95, 6.08 SPECfp95 on Alder 256K L2)
Number of transistors: 5.5 million (0.35 micron), 256K L2: 15.5 million (0.6 micron)
Bus width: 64 bits front side; 64 bits to L2 cache
Addressable memory: 64 gigabytes
Virtual memory: 64 terabytes
Pin count: 387 (Dual Cavity Pin Grid Array Package)
Package dimensions: 2.46" (6.25 cm) x 2.66" (6.76 cm)
Typical use: High-end desktops, workstations, and servers

Pentium® Pro Processor (200 MHz)
Introduction date: November 1, 1995
Clock speeds: 200 MHz (8.09 SPECint95, 6.75 SPECfp95 on Alder 256K L2)
Number of transistors: 5.5 million (0.35 micron), 256K L2: 15.5 million (0.6 micron), 512K L2: 31 million (0.35 micron)
Bus width: 64 bits front side; 64 bits to L2 cache
Addressable memory: 64 gigabytes
Virtual memory: 64 terabytes
Pin count: 387 (Dual Cavity Pin Grid Array Package)
Package dimensions: 2.46" (6.25 cm) x 2.66" (6.76 cm)
Typical use: High-end desktops, workstations, and servers

Pentium® II Processor (233 MHz)
Introduction date: May 7, 1997
Clock speeds: 233 MHz (9.49 SPECint95, 6.43 SPECfp95)
Number of transistors: 7.5 million (0.35 micron), 512K L2
Bus width: 64 bit System Bus w/ECC; 64 bit Cache Bus w/opt. ECC
Addressable memory: 64 gigabytes
Virtual memory: 64 terabytes
Single Edge Contact cartridge (S.E.C.), 242 pins
Package dimensions: 5.505" (12.82 cm) x 2.473" (6.28 cm) x 0.647" (1.64 cm)
Typical use: High-end business desktops, workstations, and servers

Pentium® II Processor (266 MHz)
Introduction date: May 7, 1997
Clock speeds: 266 MHz (10.8 SPECint95, 6.89 SPECfp95)
Number of transistors: 7.5 million (0.35 micron), 512K L2
Bus width: 64 bit System Bus w/ECC; 64 bit Cache Bus w/opt. ECC
Addressable memory: 64 gigabytes
Virtual memory: 64 terabytes
Single Edge Contact cartridge (S.E.C.), 242 pins
Package dimensions: 5.505" (12.82 cm) x 2.473" (6.28 cm) x 0.647" (1.64 cm)
Typical use: High-end business desktops, workstations, and servers

Pentium® II Processor (300 MHz)
Introduction date: May 7, 1997
Clock speeds: 300 MHz (11.6 SPECint95, 7.20 SPECfp95)
Number of transistors: 7.5 million (0.35 micron), 512K L2
Bus width: 64 bit System Bus w/ECC; 64 bit Cache Bus w/opt. ECC
Addressable memory: 64 gigabytes
Virtual memory: 64 terabytes
Single Edge Contact cartridge (S.E.C.), 242 pins
Package dimensions: 5.505" (12.82 cm) x 2.473" (6.28 cm) x 0.647" (1.64 cm)
Typical use: High-end business desktops, workstations, and servers
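The "Addressable memory" and "Virtual memory" figures in the entries above are powers of two: an address of n bits can select one of 2^n bytes. The following is a minimal sketch of that arithmetic, not part of Intel's guide. Apart from the 32-bit Pentium address bus, which the entries state, the address widths used here (24, 36, and 46 bits) are assumptions supplied for illustration, and the function name addressable_memory is hypothetical.

```python
# Sketch: memory reachable with a given number of address bits.
# The bit widths in ASSUMED_ADDRESS_BITS are illustrative assumptions,
# except the Pentium's 32-bit address bus, which the guide lists.

UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes"]


def addressable_memory(address_bits: int) -> str:
    """Return 2**address_bits bytes as a human-readable string."""
    size = 2 ** address_bits
    unit = 0
    # Divide by 1024 until the value fits the largest suitable unit.
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size} {UNITS[unit]}"


ASSUMED_ADDRESS_BITS = {
    "80286": 24,
    "Pentium": 32,
    "Pentium Pro": 36,
    "64-terabyte virtual space": 46,
}

for name, bits in ASSUMED_ADDRESS_BITS.items():
    print(f"{name}: {bits} address bits -> {addressable_memory(bits)}")
```

Under these assumptions the results agree with the guide's figures of 16 megabytes, 4 gigabytes, 64 gigabytes, and 64 terabytes.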

Microprocessor Ratings

Intel 8088
  5 MHz: 0.33 MIPS
  8 MHz: 0.75 MIPS

Intel 80286
  8 MHz: 1.2 MIPS
  10 MHz: 1.5 MIPS
  12 MHz: 2.66 MIPS (per Power Meter 1.5 benchmark; assumes 0 wait state memory access)

Intel386™ DX CPU
  16 MHz: 5 to 6 MIPS
  20 MHz: 6 to 7 MIPS
  25 MHz: 8.5 MIPS
  33 MHz: 11.4 MIPS

Intel386™ SX CPU
  16 MHz: 2.5 MIPS
  20 MHz: 4.2 MIPS

Intel386™ SL CPU
  20 MHz: 4.21 MIPS, or 4.3 (UNIX environment, Dhrystone V 1.1)
  25 MHz: 5.3 MIPS

Intel486™ DX CPU
  25 MHz: 20 MIPS
  33 MHz: 27 MIPS, 22.4 SPECint92
  50 MHz: 41 MIPS, 33.4 SPECint92

Intel486™ SX CPU
  16 MHz: 13 MIPS
  20 MHz: 16.5 MIPS
  25 MHz: 20 MIPS, 12 SPECint92
  33 MHz: 27 MIPS, 15.86 SPECint92

IntelDX2™ Processor
  50 MHz: 41 MIPS, 29.9 SPECint92
  66 MHz: 54 MIPS, 39.6 SPECint92

IntelDX4™ Processor
  75 MHz: 53 MIPS, 41.3 SPECint92
  100 MHz: 70.7 MIPS, 54.59 SPECint92

Intel Pentium® Processor
  60 MHz: 100 MIPS, 70.4 SPECint92
  66 MHz: 112 MIPS, 77.9 SPECint92
  75 MHz: 126.5 MIPS, 2.31 SPECint95
  90 MHz: 149.8 MIPS, 2.74 SPECint95
  100 MHz: 166.3 MIPS, 3.30 SPECint95
  120 MHz: 203 MIPS, 3.72 SPECint95
  133 MHz: 218.9 MIPS, 4.14 SPECint95
  150 MHz: 4.27 SPECint95
  166 MHz: 4.76 SPECint95

Intel Pentium® Processor with MMX™ Technology
  166 MHz: 5.59 SPECint95
  200 MHz: 6.41 SPECint95
  233 MHz: 7.12 SPECint95

Intel Pentium® Pro Processor
  150 MHz: 6.08 SPECint95
  166 MHz: 7.11 SPECint95
  180 MHz: 7.29 SPECint95
  200 MHz: 8.09 SPECint95

Intel Pentium® II Processor
  233 MHz: 9.49 SPECint95
  266 MHz: 10.8 SPECint95
  300 MHz: 11.6 SPECint95
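One rough way to read the ratings above is to divide each MIPS figure by its clock speed, which separates gains that come from the processor design from gains that come from a faster clock. The sketch below does this for a handful of entries copied from the list. The function name mips_per_mhz and the choice of entries are illustrative assumptions, and the quotient is only a crude indicator, since the MIPS figures come from different benchmarks and test systems.

```python
# Sketch: MIPS per MHz for selected entries from the ratings list.
# Each tuple is (processor, clock in MHz, rated MIPS), copied from the list;
# the selection of entries is arbitrary.

RATINGS = [
    ("Intel 8088", 5, 0.33),
    ("Intel 80286", 12, 2.66),
    ("Intel386 DX", 33, 11.4),
    ("Intel486 DX", 50, 41),
    ("IntelDX2", 66, 54),
    ("Pentium", 66, 112),
    ("Pentium", 133, 218.9),
]


def mips_per_mhz(mhz: float, mips: float) -> float:
    """Return the rated MIPS delivered per MHz of clock speed."""
    return mips / mhz


for name, mhz, mips in RATINGS:
    print(f"{name} at {mhz} MHz: {mips_per_mhz(mhz, mips):.2f} MIPS per MHz")
```

The same division can be applied to the SPECint figures, but only among entries rated on the same scale (SPECint92 or SPECint95).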

Index

1 1/3-stroke method, 365

3 32-bit color, 196

A A cable, 224 access speed, 144 access time, 39, 244 ACCESS.Bus, 176 accumulator, 128 ACK, 75 Acknowledge, 75 acknowledge signal, 334 active partition, 253, 259 adaptive Huffman encoding, 350 adaptor card, 14 address buffer, 387 address line, 74 address resolution, 394 ADP, 320 Advanced SCSI Programming Interface (ASPI), 228 Advanced Technology (AT), 118 all points addressable, 302

all points addressable (APA) graphics, 195 allocation unit, 273 alpha channel, 196 Alt, 156 alternate key, 156 ALU, 126 American National Standards Institute, 25 American Standard Code for Information Interchange, 25 amplifier, 388 amplifier-I/O gate, 388 amplitude, 340 amplitude modulation (AM), 342 amplitude-shift keying (ASK), 343 analog signal, 336 analog transmission, 336 anode, 177 ANSI, 18, 221 ANSI character code, 25 anti-glare coating, 175 aperture grill, 181 aperture grill monitor, 180 application level, 21 Application level, 19 arbitration, 76 areal density, 237 arithmetic and logical unit (ALU), 126

ASCII character code, 25 ASK, 343 aspect ratio, 188 ASPI, 228 ASPI manager, 229 assembly language, 30, 45 asynchronous, 77, 225, 390 asynchronous data packet (ADP), 320 asynchronous data transfer, 319 AT, 118 AT Attachment interface, 211 AT command set, 356 AT style serial connector, 331 ATA interface, 211 ATA Packet Interface (ATAPI), 216 ATAPI, 216 attention characters, 356 Attenuation, 347 attribute byte, 191 automatic bad-sector remapping, 212 average codeword length, 350 average seek time, 244

B B cable, 224 bad sector byte, 248 bandwidth, 186,201,345 bank switching, 197 base address, 396 basic input/output services, 20 baud rate, 343 Baudot, J.M. Emile, 343 beep code, 66 benchmark program, 40 bidirectional, 101 bidirectional communication, 296 big endian, 38 binary digit, 23 binary string, 23 BIOS data area, 70 BIOS level, 19, 20 BIOS parameter block, 264 BIOS service routines, 49 BIOS services, 49 bipolar encoding, 323 bit, 23 bit cell, 336 bit density, 237 bit line, 386 bit synchronization, 322 BitBlk transfer, 200 bit-block transfer, 200 bit-mapped graphics, 195 bitmapped image, 310 bleed,308 block code, 350 block graphics, 190 block transfer, 79 BNC connectors, 176 boot drive, 67 boot partition, 67 boot process, 64

boot record, 67, 261, 263 boot sector, 67 bootable, 251, 258, 261 bootable diskette, 67 booting, 66 bootstrap program, 67, 261, 264 branch prediction logic, 127 bridge, 14, 74, 83 bubble jet, 304 buffer, 322 Burst EDO DRAM, 391 burst mode, 79 burst transfer, 391 bus, 13, 73 address, 74 control, 74 data, 74 Extended Industry Standard Architecture (EISA), 86 IDE, 74 Industry Standard Architecture (ISA or AT), 84 local, 73 micro channel architecture (MCA), 86 peripheral, 74 Peripheral Component Interconnect (PCI), 87 processor, 73 SCSI, 74 VESA local (VL), 87 width, 74 bus arbiter, 76 bus arbitration, 105 bus controller, 76, 105 Bus grant, 75

bus interface unit (BUI), 126 bus master, 76,105 Bus request, 75 bus slave, 76

C cache, 96 blocks, 377 direct mapped, 136 fully associative, 136 level 2 or L2, 135 lines, 377 set associative, 136 tag, 377 ways, 380 write-back, 136 write-through, 136 cache controller, 135 cache hit, 135 cache memory,12 cache miss, 135 caching controller, 246 caddie, 365 CAM,228 capacitor, 141,385 carrier signal, 333, 338 cascading, 91 cathode, 177 cathode ray tube (CRT), 177 CCITT,18,331,353 CD-E,366 CD-Erasable (CD-E), 366 CD-R,366 CD-Recordable (CD-R), 366 CD-ROM sector, 364 central processor unit, 117 Centronics cable, 291 Centronics mode, 294 character generator, 193

character graphics, 190 character impact printer, 301 character ROM, 193 character synchronization, 322 character-mapped display, 193 charge roller, 314 chipset, 109 CHKDSK, 279 CHS, 243 CHS address, 215 CISC, 130 cleaning blade, 314 clock, 110 Clock, 75 clock bit, 410 clock cycle, 38, 77 clock doubling technology, 119 clock generator, 110 clock pulse, 410 clock signal, 77, 110 clock speed, 117 clock transition, 410 cluster, 273 cluster chain, 274 CMOS memory, 68 CMOS RAM chip, 68 code cache, 126 code segment register (CS), 394 coherent light, 362 cold reboot, 165 color depth, 195 color graphics adaptor, 187 color lookup table, 198 color matching, 175 color temperature, 175 color triad, 177 column decoder, 388

command mode, 355 command overhead, 244 command processor, 65, 68 Common Access Method (CAM), 228 communication parameters, 322 communications mode, 355 compatibility, 17 Compatibility mode, 294 compiler, 46 Complex Instruction Set Computer (CISC), 130 composite black, 305 COMx, 8 configure, 15 Connector Types, 8 constellation, 344 contact recording, 237 contention, 76 continuous forms, 303 control bus, 20 Control cable, 5 control code, 26 control key, 156 control line, 75 control register, 103, 295 Control register, 41 control signal, 20 controller, 96 controller card, 14, 206 controller electronics, 204 cooling fan, 6 counter, 111 CPU, 5, 10, 117 CPU fan, 6 CRC, 349 crossed cluster chains, 279 crosstalk, 291 CRT, 177

Ctrl, 156 Ctrl+Break, 165 Ctrl-Alt-Del, 165 Curie temperature, 367 current pulse graph, 410 cursor control key, 156 cut sheet feeder, 303

cycle, 340 cycle time, 144, 389 cycles per second, 38 cyclic error-correcting code, 349 cyclic redundancy check (CRC),349 cylinder, 214, 242 cylinder skew, 247 cylinder/head/sector, 243 CYM color model, 305 CYMK,305

D DAC, 188, 198 daisy wheel printer, 299 data bit rate, 352 data buffering, 96 data cache, 126 data communication equipment (DCE), 331 Data Communication Equipment (DCE), 335 data compression, 338 data control register, 20 data line, 74 data output buffer, 388 data register, 103, 295 Data register, 41 data segment register (DS),394 data separator, 96 data separator electronics, 204 data synchronizer, 96

data terminal equipment (DTE), 331 Data Terminal Equipment (DTE), 335 data transfer rate, 39 data transmission mode, 355 daughter card, 188 DC1, 334 DC3, 334 DCE, 331, 335 DDC, 176 DDC2B, 176 debug, 37, 370 decode unit, 126 delay rate, 156 demodulated, 338 descriptor table, 396 desktop configuration, 4 destructive interference, 363 developer roller, 315 device BIOS, 55 Device Control 1 (DC1), 334 Device Control 3 (DC3), 334 device driver, 16, 56 device level, 203 device-independent, 219 digital output register, 102, 210 digital signal, 335 digital to analog converter (DAC), 188, 198 digital transmission, 335 direct mapped cache, 377 Direct Overwrite (DOW), 367 dirty memory, 136 Disk Administrator, 252 disk geometry, 214

Disk Operating System, 21 disk track encoding, 410 disk-to-buffer transfer rate, 244 dispatch/execute unit, 132 display adaptor card, 173 display attributes, 191 display data control (DDC), 176 display geometry controls, 175 display monitor, 173 Display Power Management Signaling (DPMS), 174 display system, 187 DMA, 213 bus mastering, 104, 105 third party, 104 DMA channel, 104 DMA conflict, 92 DMA controller, 104 DMA modes, 214 domain, 237, 367 domain length, 237 DOS, 21 DOS file system, 258 DOS functions, 53 DOS operating system level, 19 DOS partition, 252, 258 DOS service routines, 53 dot matrix printer, 301 dot pitch, 179 dot triad, 177 dot trio, 177 dot-clock, 186 dots per inch (dpi), 306 double-density diskette, 282 doubleword, 24

DOW,367 dpi,306 DPMS, 174 draft mode, 302 DRAM,141 drive bay, 4 drive capacity, 244 driver, 16 DTE, 331, 335 dual cartridge configuration, 305 dynamic RAM, 141

E ECC, 145 echo cancellation, 339 echoplex, 339 ECHS, 403 ECP, 297 EDO RAM, 390 EEPROM, 142 EGA, 188 EIA, 18, 331 EIA/TIA-232, 327, 331 EIA/TIA-232-E, 331 EIA-232-D, 331 EIDE, 212 EISA bus, 86 electromagnetic deflectors, 177 electron gun, 177 Electronics Industries Association (EIA), 331 electrophotographic process, 312 ELF, 174 embedded servo, 235 EMM, 152 EMS, 151 EMS memory board, 151 EMS memory manager, 151

EMS page frame, 152 end of frame character, 322 end-to-end flow control, 333 energy star compliant, 174 enhanced bidirectional port, 297 enhanced CHS (ECHS), 403 enhanced graphics adaptor (EGA), 187 enhanced IDE interface (EIDE), 212 Enhanced Parallel Port (EPP), 297 enhancing the BIOS, 61 enquire signal, 334 enquire/acknowledge (ENQ/ACK) flow control, 334 entry point, 59 EP drum, 313 EP process, 312 EPP, 297 EPROM, 142 erase light, 314 error correction, 96 error detection, 96, 338 error-correcting code, 348 error-correcting code (ECC), 145 error-detecting code, 348 escape character, 311 escape sequence, 311 ESDI, 206 even parity check bit, 144 even parity scheme, 321 exa, 31 execution processor, 65 execution unit, 126

expanded memory specification (EMS), 151 expansion board, 14 Expansion cards, 5 expansion slot, 14 extended ASCII code, 25 Extended Capability Port (ECP), 297 extended data out DRAM, 390 extended Hayes command set, 355 extended memory manager (EMM), 152 extended partition, 252 extending the BIOS, 61 external cache, 135 external communication subsystem, 10 external device, 13 extra segment register (ES), 394 extremely low frequency (ELF), 174

F fast ATA, 212, 216 fast ATA-2, 212 Fast ATA-2, 216 Fast Centronics, 296 Fast IR, 330 fast page mode DRAM, 390 fast SCSI, 222 Fast-20, 222 FAT, 261, 273 FAT file system, 252, 258 FAT32, 279 FDISK, 252 feature connector, 188 fetch/decode unit, 132

Fibre Channel (FC), 222 FIFO buffer, 328 file allocation table (FAT), 261, 273 file system, 251, 252, 258, 261 Firewire, 222 First-in-first-out (FIFO), 379 flag, 129 flash BIOS, 55, 64 flash RAM, 143 flat memory model, 398 flat square tube, 174 flip-flop, 141 floating point arithmetic, 118 floating point unit, 127 flux transition, 238 flux transition graph, 410 flux-sensitive heads, 240 formatting, 248 four cartridge configuration, 305 Fourier series, 345 FPM DRAM, 390 fragmented, 273 frame buffer, 189 frame rate, 183 frames, 322 frequency, 341 frequency modulation (FM), 342, 411 frequency-shift keying (FSK), 343 friction feed, 303 FSK, 343 full stroke seek time, 244 full-duplex, 333, 339 full-height, 232 full-height bay, 232 fully associative cache, 378

function key, 156 fundamental frequency, 345 fusing roller, 315

G gig, 31 giga,31 grain size, 237 graphics accelerator, 188, 199 graphics controller, 199 graphics coprocessor, 98 graphics mode, 190 graphics processor, 188, 199 group coding, 343 guard period, 355

H half-duplex, 333, 339 half-height, 232 handshaking, 224, 227, 333,355 handshaking mode, 355 hard-sectored,243 hardware event, 60 hardware flow control, 333 hardware interrupt, 60 hardware level, 19 Hardware level, 19 harmonics, 345 Hayes commands, 355 head,214, 242 head actuator, 233 Head Drift, 241 head switch time, 246 heat sink, 6 Hercules graphics card (HGC),187

hertz, 38 Hewlett Packard Graphics Language (HP-GL), 310,312 hex, 33 hexadecimal, 33 HGC, 187 high color, 196 high memory area (HMA),150 high performance file system (HPFS), 258 High Performance Serial Bus, 222 high-density diskette, 282 High-level formatting, 248,251 high-level language, 62 high-order byte, 37 HIMEM,153 HMA,150 hooking an interrupt, 61 horizontal recording, 238 horizontal retrace, 183 horizontal scan rate, 183 horizontal scanning frequency, 183 horizontal sync, 183 horizontal synchronization signal, 183 host, 203 host adaptor, 96 host adaptor card, 206 host communication electronics, 204 HPFS, 258 HP-GL, 310, 312 Huffman encoding, 350

I I/O, 95

I/O module, 95 I/O port, 100 I/O port conflict, 92 I/O port expansion card, 8 I/O processor, 99 I/O read, 75 I/O subsystem, 12 I/O write, 75 I/O-mapped I/O, 99, 100 iCOMP, 39, 124 IDE interface, 97, 206, 211 IDE peripheral bus, 74 IEEE, 18 IEEE 1284, 294 IEEE 1394, 222 impact printer, 301 index wheel, 168 inductive head, 237 Infrared Data Association (IRED), 330 initialization string, 357 initiator, 225 initiator-target model, 225 inline gun tube, 178 input device, 11 input/output, 95 input/output subsystem, 12 instruction pointer, 42, 129 instruction pool, 132 instruction prefetch unit, 126 Integrated Drive Electronics (IDE), 206 Intel Corporation, 11 Intelligent Device Electronics Interface (IDE), 211 Intelligent Drive Electronics (IDE), 206 interface, 203

device-level, 218 system-level, 218 interlacing, 185 internal device, 13 Internal peripheral devices, 5 International Telecommunications Union (ITU), 331 interrupt, 49, 60 Interrupt acknowledge, 75 interrupt driven, 89 Interrupt Enable Register, 329 interrupt level, 90 interrupt request, 90 Interrupt request, 75 interrupt vector table, 59 Invar, 179 IRED, 330 IRQ conflict, 92 ISA bus, 84 ISO, 18

ITU, 18, 331 ITU-T, 331

K keyboard buffer, 158 keyboard status bytes, 162 kilo, 31 kilobyte, 31

L landing zone, 236 lands, 360 latency time, 244 layers, 198 LBA,406 leading edge, 77 least frequently used (LFU), 136, 379

least recently used (LRU), 136, 379 least significant bit, 23 least significant digit, 23 legacy BIOS, 93 legacy device, 93 length of a string, 23 letter quality mode, 302 level 1 cache, 126 level 2 cache, 126 LFU, 136 Light Intensity Modulation Method (LIMM), 367 LIMM, 367 line control register, 328 line noise, 347 line probing, 354 linear address, 33 linear velocity, 359 little endian, 38 local bus, 17 local command mode, 355 local flow control, 333 logical block address (LBA), 221, 406 logical block addressing, 262 logical drive, 252 logical port, 101 logical sector addressing, 261 logical unit, 219 logical unit number (LUN), 218 logical/physical port pair, 106, 292 lost chain, 279 lost cluster, 279 low-level formatting, 248 Low-level formatting, 251 low-order byte, 37 LPTx, 8

LRU, 136 LUN,218

M machine language, 45 magnetic coercivity, 366 magneto-optical drive (MO drive), 366 magnetoresistive (MR) head technology, 240 main memory, 12 mark, 321 marking state, 321 master boot record (MBR), 253 master drive, 211 MBR, 253 MCA bus, 86 media descriptor byte, 264 media descriptor table, 261, 264 meg, 31 mega, 31 megahertz, 38 memory, 5 address block, 153 chip labeling, 147 conventional, 149 dual ported, 144 dynamic RAM (DRAM), 141 extended, 152 RAM, 141 read-only (ROM), 141 segmented, 150 SIMM, 139 static RAM (SRAM), 141 upper, 149 memory bank, 197

memory banks, 139 memory conflict, 92 memory controller, 387 memory map, 148 Memory read, 75 memory segment, 395 memory subsystem, 12 memory unit, 386 Memory write, 75 memory-mapped I/O, 99 MFM, 206 MHz, 38 microcode, 126 micron, 120, 235 micron technology, 120 microprocessor, 5, 10, 117 microprocessor subsystem, 12 minutes per page (mpp), 307 misconvergence, 179 MMX, 122 MNP, 354 MO drive, 366 modem, 338 modem output bit rate, 352 modified frequency modulation (MFM), 412 modulate, 338 modulation/demodulation, 338 module, 12 monitor, 173 fixed frequency, 186 multi scanning, 186 Multisync, 186 monitor bandwidth, 186 monochromatic light, 362 monochrome display adaptor, 187

most significant bit, 23 most significant digit, 23 motherboard, 5 Motherboard, 81 mouse bus, 170 interface, 170 optical, 168 opto-mechanical, 168 protocol, 169 resolution, 171 serial, 170 mouse controller, 171 mouse data packet, 169 MR,240 multibit modulation, 343 multi-boot configuration, 260 multiple zone recording, 257 Multiple Zone Recording, 243 multiplexing, 76 multitasking, 118

N nanosecond, 88 narrow SCSI, 222 nibble mode, 296 NMI,91 nonmaskable interrupt (NMI),91 NT file system (NTFS), 258 NTFS,258 numeric coprocessor, 118 numeric keypad, 156 Nyquist, 353

O odd parity scheme, 321

off-line command mode, 355 offset, 393 on-line command mode, 355 OPC drum, 313 operating system, 64 operating system level, 21 organic photoconductive drum, 313 oscillator, 110 output device, 11 overtone, 345

P P cable, 224 packed byte, 122 packed data type, 122 page directory, 399 page fault, 401 page frame, 399 page hit, 401 page table, 399 pages per minute (ppm), 307 paging, 193, 197,399, 402 palette, 199 paragraph, 393 parallel cable, 291 parallel device, 8 Parallel device, 291 parallel port, 8, 101 Parallel Port FIFO Mode, 296 parallel processing, 122 parity bit, 349 parity check bit, 157,320, 321 parity check error, 145 parity check scheme, 348 parity checking, 144

parking the heads, 236 partition, 251 Partition Magic, 278 partition sector, 253 partition table, 253 Partitioning, 251 passband, 342, 345 pause key, 166 PCI bus, 87 PCL, 310 peer-to-peer interface, 225 Pentium, 120 Pentium Pro, 121 peripheral device, 12, 13 persistence, 184 peta, 31 phase modulation (PM), 342 phase shift, 341 phase-shift keying (PSK), 343 phosphor, 178 physical address, 33 Physical formatting, 248 physical layer, 203 physical port, 101 PIC, 89 picture element, 181 piezoelectric crystal, 304 PIO, 213 PIO timing modes, 213 pipeline, 127 pipelining, 80 PIT, 110 pits, 360 pixel, 180, 181 PJL, 310 platter, 214, 232 Plug and Play (PnP), 93 PnP, 93 polarized light, 368 polling, 89, 156

port, 8, 20, 100, 101 POST, 66 postscript, 308, 310 power cable, 5 power management, 174 power supply, 5, 12 Power Supply, 7 power-on self test, 66 precharge unit, 387 present bit, 401 primary corona wire, 314 primary partition, 252 print engine, 312 Printer Control Language (PCL), 310 Printer Job Language (PJL), 310 printer language, 310 printer port, 294 printer resolution, 306 PrintScreen, 165 privilege level, 398 programmable interrupt controller, 89 programmable interval timer, 110 programmed I/O (PIO), 213 programming the hardware, 47 programming the registers, 47 PROM, 142 protected mode, 153, 396, 398 protocol, 203, 294 protocol level, 203 pseudo color, 196 PSK, 343 PSTN, 336 Public Switched Telephone Network (PSTN), 336

pure wave, 342

Q Q cable, 224 QAM,343 quadrature amplitude modulation (QAM), 343 quadrature phase-shift keying (QPSK), 343 quadword, 24 QuickBasic, 375

R RAM, 141 burst EDO, 391 EDO, 390 fast page mode DRAM, 390 flash, 143 shadow, 144 synchronous DRAM, 391 VRAM, 144 Windows, 189 RAMDAC, 188 random access memory (RAM), 141 Random Access Memory Digital to Analog Converter (RAMDAC), 188 RAS to CAS time, 88 Raster Image Processor (RIP), 310 raster scanning, 183 read only memory (ROM), 141 read/write head, 214, 233, 237

read/write head assembly, 283 read-only, 101 read-only device, 366 real mode, 153, 396, 398 real mode addressing, 393 real-time clock, 69 reboot cold, 66 power-on, 66 warm, 66 Recommended Standard 232, 327 Recommended Standard 232 (RS-232), 331 Reduced Instruction Set Computer (RISC), 130 redundancy, 348 refresh rate, 183 register, 20, 41, 127 segment, 129 repeat rate, 155 RES, 316 Reset, 75 reset button, 165 resistive heads, 240 resolution, 182 resolution (SVGA), 182 resolution (VGA), 182 resolution enhancement algorithm, 316 resource conflict, 92 result code, 357 retire unit, 132 reverse channel communication, 296 RGB color model, 178 ribbon cable, 5, 14 RIP, 310 RISC, 130 RLE, 298, 350 RLL, 206 RLL codeword, 413

RLL decision tree, 413 RLL encoding, 413 ROM, 141 electrically erasable programmable (EEPROM), 142 erasable programmable (EPROM), 142 programmable (PROM), 142 ROM BIOS, 49, 64,141 root directory, 261, 268 rotational latency time, 244 rotational velocity, 359 rotator actuator, 234 row decoder, 387 RS-232,331 RS-232-C, 331 run length encoding (RLE),350 run length limited (RLL), 413 Run-Length-Encoding (RLE),298

S scan code, 157 SCANDISK, 279 SCSI, 206, 218 active terminator, 225 device class, 220 differential, 223 fast, 222 narrow, 222 passive terminator, 224 single-ended, 223 ultra, 222 wide, 222 SCSI adaptor driver, 228 SCSI bridge, 207, 218 SCSI bus, 74, 207, 218

SCSI bus phase, 225 SCSI device embedded, 218 SCSI host adaptor, 206 SCSI ID number, 218 SCSI interface, 97, 218 SCSI port, 219 SCSI-2, 221 SCSI-3, 222 SDRAM, 391 sector, 214, 241, 242 sector header, 241, 248 sector interleaving, 245 sector trailer, 248 sector translation, 215, 257 seek time, 244 segment, 393, 395 segment descriptor, 398 segment type field, 398 segment offset format, 393 segmented addressing, 393 segmented memory model, 398 sequence byte, 272 sequence number, 272 sequential access, 141 serial communication, 319 serial device, 8 serial port, 8 service number, 49 service routine, 20, 49, 65, 89 servo mechanism, 235 servo platter mechanism, 235 set associative cache, 380 setup program, 69 shadow mask, 178 shadowing ROM, 144, 201

shift key, 162 signal repeater, 223 signal strength, 340 SIMD, 122 SIMM, 139 capacity, 146 chip count, 147 data pins, 146 labeling, 147 packaging, 145 pin count, 146 Single cartridge configuration, 305 single in-line memory module (SIMM), 139 Single Instruction, Multiple Data (SIMD), 122 skew, 222 skewing, 246, 319 slave drive, 211 slot pitch, 181 Small Computer Systems Interface, 206 soft-sectored, 243 software driver, 16 software flow control, 334 software interrupt, 60, 89 space, 321 SPEC, 417 spindle, 283 SPP, 294 square wave, 336 SRAM, 135, 141 S-register, 356 ST506, 206 stack segment register (SS), 394 Standard Parallel Port (SPP), 294 standards, 61 start bit, 157, 320


start of frame character, 322 startup process, 64 Startup Process, 66 static eliminator, 315 static RAM, 135, 141 status register, 103, 295 Status register, 41 stepper motor, 233, 283 stop bit, 157, 320 Storage register, 42 storage subsystem, 10 string, 23 stripe pitch, 181 super VGA resolution (SVGA), 182 super VGA standard, 188 Super I/O chip, 330 superscalar technology, 127 support chips, 5, 109 SVGA, 188 swap file, 401 symmetric multiprocessing (SMP), 133 synchronization, 409 synchronous, 77, 225, 390 synchronous data transfer, 319 synchronous DRAM, 391 system BIOS, 49 system board, 5 system clock, 110 system crash, 154 system timer, 112

T target, 225 Telecommunications Industry Association (TIA), 331 tera, 31 terminal mode, 355 text mode, 190 thermal ink jet, 304 thin-film head, 239 throughput, 364 TIA, 18, 331 timeout, 330 timer, 110 timing diagram, 77 toggle key, 162 tower configuration, 4 tpi, 237 track, 214, 233, 242 track header, 248 track skew, 246 tracks per inch (tpi), 237 track-to-track seek time, 244 tractor feed, 303 trailing edge, 77 transducer, 96 transfer corona wire, 315 transfer roller, 315 transistor, 385 transmission bit rate, 322 trapping an interrupt, 61 trellis-coded modulation, 344 tricolor cartridge, 305 trigger level, 329 true color, 196 two-wire connection, 339 typematic action, 155

U UART, 327 UART output bit rate, 352 UART output data bit rate, 352 Ultra ATA, 217 ultra SCSI, 222


UMB, 150 unicode, 28 unidirectional mode, 296 Universal Asynchronous Receiver/Transmitter (UART), 327 universal serial bus (USB), 176 Universal Serial Bus (USB), 319 upper ASCII characters, 26 upper memory address block, 153 upper memory block (UMB), 150 USB, 176

V V.32, 353 V.32bis, 354 V.34, 354 V.42, 354 V.42bis, 354 V.fast, 354 variable-length code, 350 vector graphics, 310 verbose mode, 357 vertical recording, 239 vertical refresh rate, 183 vertical retrace, 183 vertical scan rate, 183 vertical sync, 183 vertically flat tube, 174 very low frequency (VLF), 174

VESA, 18, 184 VESA local bus, 87 VGA, 184, 188 VGA resolution (VGA), 182 video adaptor, 97 video adaptor card, 173 video controller, 97 video graphics array (VGA), 188 video memory, 189 Video memory, 188 video memory bandwidth, 201 video mode, 190 Virtual 8086 mode, 154 virtual address space, 400 virtual machine, 154 virtual memory, 401 virtual memory management, 154 virtual memory manager (VMM), 400 VL bus, 87 VLF, 174 VMM, 400 voice coil motor, 234 VRAM, 144

W wait state, 78, 135 warm reboot, 165 wavelength, 340 wide SCSI, 222 Winchester technology, 286 Windows accelerator, 199 Windows API, 21, 65 Windows Application Programmer's Interface, 65 Windows Application Programming Interface, 21 Windows RAM (WRAM), 189 word, 24 word length, 30 word line, 386 WORM, 366 WRAM, 189 Write Once Read Multiple (WORM), 366 write protect, 283 write-back caching, 136 write-black writing, 315 write-only, 101 write-through caching, 136 write-white writing, 315 writing to the hardware, 47

X x-line, 156 XON/XOFF flow control, 334

Y y-line, 156