CCNP 01 - Methods

Troubleshooting Methodology

1. Define and State the Problem
2. Gather Facts and Analyze the Problem
3. Review and Evaluate Alternatives and Possibilities
4. Design a Plan of Action (POA)
5. Implement the Solution
6. Evaluate and Observe the Solution (Further Analysis)

Certification Summary

   In most enterprise networks today, many components are attached to existing backbone infrastructures to support the evolving strategies of a corporation. In such a network, traffic is likely to violate the 80/20 rule. The 80/20 rule states that 80 percent of traffic on a given network is local (destined for targets in the same workgroup) and that no more than 20 percent requires internetworking. With mergers and acquisitions common in industry today, it is difficult to plan, deploy, and implement resources local to all users. As a result, the routers, switches, servers, and other nodes on the network are typically overutilized, creating bottlenecks on the LAN segments and the backbone.

   As network administrators manage existing networks and plan and design new ones, they must make strategically sound decisions to accommodate additional users and devices. When new users are added to an existing LAN segment at the corporate office in New Jersey, and they require access to resources at branch locations in Baltimore, Richmond, and Atlanta, options are pretty limited. Deploying the same resources locally for one user may be logical from a design perspective, but it can become quite expensive. This is becoming a major concern for network administrators: the 80/20 rule has of late become the 20/80 rule.

   In today’s world of data communications and networking, complex heterogeneous network environments often complicate your efforts in solving network problems. And as the network evolves and grows, so does your challenge to deliver quality network service. Modern businesses rely on sophisticated networks for their livelihood; they can tolerate neither downtime nor degrading network performance. Yet it remains true that the very network on which the business depends is in most instances ignored until real problems occur, and you are faced with a crisis situation. Suddenly you’re getting calls from the Data Center VP, the head of Operations, the Finance VP—everyone is looking for estimated uptimes and root-cause analyses.

   To handle the occurrence of these problems and other complicated issues, LAN administrators must develop a thorough troubleshooting methodology to help resolve and isolate network problems. You can almost guarantee that there will be problems within your network because of design, installation, and product shortcomings. Troubleshooting is the process of:

Exposing relevant information
Eliminating irrelevant information
Deducing, pinpointing, and isolating the problem area
Devising a corrective action to resolve the problem

In this chapter we will explore some of the symptoms and causes of many problems prevalent in networks today. Often, the length of user downtime and out-of-service delays will depend on the plan of action you implement in your efforts to identify and resolve problematic areas. This chapter also outlines a working model for problem solving, presenting a generic model from which you can plan your own course of action. Finally, the chapter gives an overview of the characteristics of various protocols.

Symptoms, Causes, and Actions

On any given day, your network could be brought to its knees. This is somewhat unlikely to happen all of a sudden. When they’re in trouble, networks are just like sick children. Just as children will tell a parent when they’re not feeling up to par, your network will alert you when it becomes ill. Usually when children are sick you’ll begin to notice symptoms: a loss of appetite, high fever, runny nose, coughing and wheezing, stomach aches, and so on. The symptoms indicate the presence of disorder, and you might see anything from a trace indication foretelling the onset of a problem, to major subjective evidence of a severe physical disturbance. Your network, as well, will show symptoms when it’s ill.

One of the major symptoms commonly found in networks throughout organizations today is congestion. Initially, your network was probably designed to accommodate a reasonably small to medium-sized work group, to allow information sharing. Likewise, many interstate highways and parkways were designed to handle a reasonable size of automobile traffic. However, today we see traffic congestion in networks just as we do on highways. Network administrators and city officials are constantly planning and designing workarounds to help alleviate traffic. How often do you see network administrators implement weekend change-controls to fix an emerging or an existing network problem? How often do you see construction workers drilling and building alternative routes in the wee hours of the night to help accommodate the traffic demands?

Slow Response Time

A significant result of network congestion is slow response time, and this delay is of primary concern to the network administrator. Historically, LANs have been designed around data rates in the millions of bits per second—far in excess of most communications device capabilities. However, technological innovations and advancements in computing and data communications have changed this paradox somewhat. Many devices in today’s network infrastructures have the potential to use the full channel capacity of a typical LAN. When many of these powerful devices share the channel, it’s almost guaranteed that congestion will occur somewhere in your network. Congestion is a statistical phenomenon; it occurs as a function of traffic patterns. Network congestion will be revealed to LAN administrators and users in many ways.

Essentially all LANs have a certain capacity for carrying data. When presented with a short-term overload, the LAN distributes that load over time. Under a casual load, the time it takes a host to submit a frame for transmission over the LAN will be short. When the offered load is heavy, the average delay will increase. Simply stated, it takes longer to send the same amount of data under congestion conditions than it does when the load is casual. Usually this is transparent to users. However, when the delay is noticeable, the solution center or call center (help desk) gets flooded with calls from users complaining about slow response times.

Performance Alerts and Notifications

Many variables of LAN operations can be used to measure and evaluate network performance. Depending on configurations, some variables are measured by standard controllers and host software; others require special network-monitoring equipment, such as protocol analyzers or remote monitors (RMONs). To get a close-up look at the performance of your network, you can use various quantifiable and qualitative metrics to help identify potential problems. Following are some key indicators that can be used.

Channel Utilization

The percentage of time in which the channel is busy (or fully utilized) carrying data is known as the channel utilization. To determine what constitutes reasonable utilization levels, you should take into consideration the number of stations on a LAN segment, application behavior, frame-length distribution, traffic patterns, time of day, and so on. Pay close attention to utilization levels such as the following; they can be helpful in determining excessive load conditions:

Utilization exceeds 10–20% averaged over an 8-hour work day.
Utilization exceeds 20–30% averaged over the heaviest-load hours of the day.
Utilization exceeds 50% averaged over the heaviest-load 15 minutes of the day.
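
The three thresholds above can be checked mechanically against sampled utilization data. The following is a minimal sketch, not a prescribed tool: it assumes one utilization reading (in percent) per minute across an 8-hour work day, treats the "heaviest-load hours" as the single busiest 60-minute window, and uses the upper bound of each range (20%, 30%, 50%) as the limit. The function names and default limits are illustrative assumptions.

```python
from typing import Dict, Sequence


def heaviest_window_average(samples: Sequence[float], width: int) -> float:
    """Return the highest average over any `width` consecutive samples."""
    if not 1 <= width <= len(samples):
        raise ValueError("width must be between 1 and len(samples)")
    window_sum = sum(samples[:width])
    best_sum = window_sum
    for i in range(width, len(samples)):
        # Slide the window one sample to the right.
        window_sum += samples[i] - samples[i - width]
        best_sum = max(best_sum, window_sum)
    return best_sum / width


def utilization_flags(per_minute: Sequence[float],
                      day_limit: float = 20.0,
                      hour_limit: float = 30.0,
                      quarter_hour_limit: float = 50.0) -> Dict[str, bool]:
    """Flag which of the three excessive-load conditions are met.

    The limits are assumptions taken from the upper end of each range
    in the text; tune them for your own environment.
    """
    day_average = sum(per_minute) / len(per_minute)
    return {
        "work_day": day_average > day_limit,
        "busiest_hour": heaviest_window_average(per_minute, 60) > hour_limit,
        "busiest_15_min": heaviest_window_average(per_minute, 15) > quarter_hour_limit,
    }


# Illustrative data: a quiet day with one 15-minute burst at 80%.
samples = [5.0] * 480
samples[100:115] = [80.0] * 15
flags = utilization_flags(samples)
```

With this made-up data only the 15-minute condition is flagged, which matches the point that a short burst is not the same thing as a sustained overload.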

Exam Watch: For very short periods, network utilization may peak toward 100% without noticeable network delays or performance degradation. This is possible during a large FTP file transfer between two high-performance end stations. One thing to keep in mind is that each application environment behaves differently, depending on variables such as bandwidth, routing protocols, and so on.

User Complaints

We’ve all had our share of complaints from end-users. When the phone queues are backed up, it’s a strong possibility that a crisis situation is occurring on your network. Therefore, we consider user dissatisfaction as the ultimate manifestation of LAN congestion. You can collect and compile, store and compare statistics, but none of that matters if users are complaining about poor network performance. Often overlooked, user reaction and feedback are critical metrics in the early steps of evaluating and troubleshooting your network.

One thing to keep in mind is that user complaints do not necessarily mean there is a LAN congestion problem. What users consider slow is not necessarily indicative of poor network performance. Users generally have high expectations for technology and are reluctant to recognize that technology has its shortcomings.

Collision Rates

An increased collision rate on an Ethernet segment can be indicative of the offered load. LAN administrators usually view modest collision rates as normal, and even a high rate does not necessarily indicate a severe problem. On an Ethernet segment, collision information can be used to redistribute the offered load over the available time, thus maximizing application throughput and channel utilization. On a functional Ethernet segment, short-term collision rates can run as high as 20%–25%; anything exceeding 30%, however, should be studied and administered carefully.

Monitoring and worrying about the number of collisions seen on Ethernet networks is a growing preoccupation of network administrators these days. There have been many debates and misconceptions about what an "acceptable" collision rate is, and how to determine when your network is on the verge of collapsing. Even under the most extreme offered loads, collisions use up a very small percentage of available channel capacity. Generally, you can ignore collision statistics as long as users are satisfied with response times and application efficiency. On the other hand, if users begin complaining, consider investigating collision rates as one source of explanation.

Application Performance Degradation

LAN congestion will ultimately affect application throughput. From the user’s perspective, "everything is slow." A daily routine file transfer that normally takes 15 to 20 seconds on average is exceeding one minute. Normal application transactions are behaving sluggishly. If utilization peaks are reached and network loads become burdensome, there is a good chance that an application will be brought to its knees. This may cause a user request to time out, servers to disconnect, and sessions to hang. Usually under these extreme conditions, applications and processes must be restarted and re-initialized to restore functionality.

Within an enterprise network, there are additional symptoms and problems other than network congestion and slow response. However, these two issues are the focus of our discussion here because they are what network administrators must typically troubleshoot on a daily basis—especially in a large distributed environment with diverse application requirements. With these problems at hand, network administrators are looking at other designs to help alleviate and reduce channel loads. Let’s take a look at one of these: the switching approach to networking.

Generic Problem-Solving Model

   As a troubleshooter, you must have a process or procedure defined to help you resolve a particular network problem. When a plumber comes to your house to fix a plumbing problem, he or she follows a standard procedure to identify the root cause. Likewise, an automobile mechanic relies on a method or procedure for identifying problem areas in your vehicle. In a network environment, it is imperative for you to use a systematic approach to troubleshooting. A good start would be to devise a generic problem-solving model to aid your efforts to identify, isolate, and resolve the problem.

   A problem-solving model starts with a broad domain, and as you drill down each phase, the domain narrows as potential signs and symptoms are eliminated. This model flirts with the unknowns and possibilities of a given problem situation. For instance, let’s use a simple mathematics problem as supporting evidence:

   In this problem, you have an unknown (x) with a number of possibilities (not necessarily correct answers). You work to eliminate the obvious wrong answers (process of elimination).

Steps to Solving a Problem

The math problem shown earlier is a simplistic view of problem solving. Although applying a problem-solving model to a network crisis situation is much more difficult, there still is a fundamental approach one must take when presented with a number of unknowns and possibilities.

A generic problem-solving model has at least six phases (shown in Figure 1-1) that need to be successfully satisfied before a problem has been resolved. Each phase represents a process in which a particular action is required. This entire process is not just for troubleshooting internetworking problems but can be used in various problematic domains. From this foundation you can build a problem-solving process to suit your own particular crisis situation. Here are the steps of the problem-solving process:

Define and state the problem.
Gather facts and analyze the problem.
Review and evaluate alternatives and possibilities.
Design a plan of action (POA).
Implement the solution.
Evaluate and observe the solution (further analysis).
Repeat the process (if needed)

1. Define and State the Problem

   A good rule of thumb is to make a list of resources (people, documentation, Web sites, contact information, and the like) that are relevant to the problem you are trying to solve. Utilize these resources to clarify any unfamiliar terms or concepts, and to help identify the problem domain. At this point in the process you are in search of symptoms: evidence that a problem does exist. When analyzing a network problem, you need to make a clear problem statement. This will help you see the big picture and not unduly simplify or complicate the problem domain.

   Try to define the problem in terms of a set of symptoms and potential causes of those symptoms. For example, say a host or multiple hosts on a particular LAN segment are not responding to service requests from clients. One possible cause may be that a host’s or router’s configuration is incorrect, resulting in timeouts for service requests.

   A good rule of thumb is to ask yourself several questions: Which problem should I address? If there are several, how do I choose the most critical one?

2. Gather Facts and Analyze the Problem

   After you have compiled supporting evidence for the existence of the problem and have defined what that problem is, you can focus on analyzing your findings more thoroughly. You are looking for relevant data that might explain why the problem exists and/or how it occurred. Collect information from sources such as router logs, users’ experience, trouble tickets, output from router diagnostics commands, software release notes, protocol analyzer traces, and other troubleshooting commands. This particular phase is aimed at evaluating collected information and data to further isolate possible causes. Ask questions of affected users, network administrators, managers, and other key people.

   Using the facts that are gathered, you can begin to eliminate irrelevant issues and narrow the problem domain. For instance, you can completely eliminate hardware as a contributing factor if findings point to other problematic areas, allowing you to focus and pay more attention to pertinent symptoms and specific problem domains.

3. Review and Evaluate Alternatives and Possibilities

   First and foremost, your problem solving should begin with established criteria for evaluating solutions. A good starting point is to set an objective with your team. Based on your definition of the problem and root-cause analysis, the objective should be the specific goal that an acceptable solution should attain. If the problem domain is too complex to handle with one objective, the alternative is to evaluate proposed solutions and make a list of Musts and Wants. The Musts are the basic items that require a solution. The Wants are qualities that are desirable in any solution; these should be prioritized from most desirable to least desirable.
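
One way to apply Musts and Wants is to discard any proposal that misses a Must and then rank what remains by the Wants it satisfies. The sketch below is only an illustration under stated assumptions: the linear weighting (the most desirable Want scores highest) is one arbitrary choice among many, and the candidate solutions and property names are hypothetical.

```python
from typing import Dict, List, Set


def rank_solutions(candidates: Dict[str, Set[str]],
                   musts: Set[str],
                   wants: List[str]) -> List[str]:
    """Return viable solution names, best first.

    candidates maps a solution name to the properties it satisfies.
    wants is ordered from most desirable to least desirable.
    """
    # Assumed scoring: first Want gets the largest weight, last gets 1.
    weights = {want: len(wants) - i for i, want in enumerate(wants)}

    # A solution that fails any Must is not considered at all.
    viable = {name: props for name, props in candidates.items()
              if musts <= props}

    def score(name: str) -> int:
        return sum(weights[w] for w in wants if w in viable[name])

    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(viable, key=lambda name: (-score(name), name))


# Hypothetical example data.
musts = {"restores connectivity"}
wants = ["no downtime", "low cost", "no new hardware"]
candidates = {
    "correct router config": {"restores connectivity", "no downtime",
                              "low cost", "no new hardware"},
    "replace router": {"restores connectivity", "no downtime"},
    "wait and see": {"low cost"},
}
ranking = rank_solutions(candidates, musts, wants)
```

Here "wait and see" is dropped because it misses the only Must, and the configuration fix ranks ahead of the hardware replacement because it satisfies more of the Wants.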

4. Design a Plan of Action (POA)

   This planning phase is one of the most overlooked steps of the problem-solving model. Designing a plan of action (POA) involves planning and brainstorming. Begin with the most common problem and devise a POA in which only one variable is manipulated.

   What are the various alternatives for solving the problem? Once you’ve established some basis for evaluating the solutions, as described in the preceding Review and Evaluate step, it’s important that you brainstorm the solutions. From the list of possible solutions that emerge from this thinking session, select the one that best fits your needs based on your criteria evaluation.

5. Implement the Solution

   In this step, ask yourself "How do I make sure the solution is implemented correctly and effectively?" Here you’re simply putting the plan into action, performing tasks and testing to ensure that the solution has truly corrected problematic areas.

6. Evaluate and Observe the Solution (Further Analysis)

   How did the solution work? What needs to be changed? In this phase you can continue to collect and evaluate information from router logs, users, output from router diagnostics commands, software release notes, protocol analyzer traces, and other troubleshooting commands.

   Be on guard for intermittent problems, whose symptoms cannot be reproduced by any known procedure. Intermittent problems are very difficult to troubleshoot. Because you don’t know how to reproduce the symptom, it’s difficult to know whether the symptom went away because of a test you performed, or because of random chance.

   If in fact you can’t reproduce the symptom, chances are your only alternative is routine maintenance and guesswork, and you may not be successful in trying to resolve the problem.

Repeat the Process

   If your network’s problem persists, repeat steps 2 through 6. Look again at your facts and try some different solutions.
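
The repeat-until-resolved cycle, combined with the earlier advice to manipulate only one variable per plan of action, can be sketched as a simple loop. This is a toy illustration rather than a real tool: the callbacks, the `config` dictionary, and the simulated duplex fault are all hypothetical stand-ins for whatever changes and checks apply in your environment.

```python
from typing import Callable, Iterable, Optional, Tuple

Change = Tuple[str, object]  # (setting name, new value)


def troubleshoot(changes: Iterable[Change],
                 apply_change: Callable[[Change], None],
                 problem_solved: Callable[[], bool],
                 undo_change: Callable[[Change], None]) -> Optional[Change]:
    """Try one change at a time; return the change that fixed the problem."""
    for change in changes:
        apply_change(change)       # implement the solution
        if problem_solved():       # evaluate and observe
            return change
        undo_change(change)        # back out so only one variable differs
    return None                    # nothing worked: gather more facts


# Hypothetical device settings and a simulated fault (duplex mismatch).
config = {"duplex": "half", "mtu": 1500}
previous = {}


def apply_change(change: Change) -> None:
    key, value = change
    previous[key] = config[key]
    config[key] = value


def undo_change(change: Change) -> None:
    key, _ = change
    config[key] = previous[key]


def problem_solved() -> bool:
    return config["duplex"] == "full"


fix = troubleshoot([("mtu", 1400), ("duplex", "full")],
                   apply_change, problem_solved, undo_change)
```

Backing out each unsuccessful change keeps the network in a known state, so that when the symptom finally disappears you know which single change was responsible.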

Pointers for Problem Solving

   One thing I’m always mindful to avoid is rushing to a solution too soon. It’s best to avoid focusing too much attention on one solution, too early in the troubleshooting process. We’ve all been guilty of this at some point. Thoroughly go through each phase; don’t omit anything. Refrain from acting on the first suggestion of a solution, or before the problem has been adequately defined. Troubleshooting can be like finding a needle in a haystack, so be patient—solutions do not always come on the first go-around.

   Many people have a low tolerance for the uncertainty of problem solving; therefore, they dodge problem-solving activities. This impatient approach leads to what I call "patchwork" or "a quick-fix"; this attitude often seeks to eliminate the problem without any thorough evaluation or analysis. Try to remain open-minded, willing to endure the ambiguity and doubt, so that you can perform a thorough analysis by using every step of the problem-solving model.

Recommended Troubleshooting Procedures

   In today’s world, businesses develop sophisticated enterprise network strategies to gain a competitive advantage in their industry. Because the business relies heavily on its enterprise network, there is no room for ongoing out-of-service delays or poor network performance. Often the small symptoms are ignored, and over time they will flare up into a crisis situation. With every crisis situation you will need a staff of competent individuals who can troubleshoot the problem.

   Before you can effectively troubleshoot, you must be able to determine if a problem does exist. In other words, you must have an idea of what your network looks like before it becomes ill. If you never get a clear picture of a functional network, then it will be difficult to determine the true status of your network if a problem does exist. A parent knows how a child behaves before the child becomes sick. By knowing the child’s behavior patterns, you can detect a problem if that child’s behavior deviates from what is considered the norm. The same applies with your network.

   As a network administrator, you should take the pulse of your network through a regularly scheduled sampling of network traffic. These samples will give you an indication of the state of your network, identifying potential areas of concern, potential equipment failures, and performance loss due to the network’s having outgrown its existing configuration. By taking your network pulse, you can strategically plan maintenance schedules that will help maintain the integrity of your network and fix subtle discrepancies in your network as they emerge.

   You can use a LAN tester or a protocol analyzer to collect network data and store it in the test equipment. Small cable meters and LAN testers can be used to capture and gather readings for a short period of time. Some common devices that can store data for future analysis are Fluke’s LANMeter series, Microtest’s Compas, and Scope’s FrameScope. The large-scale and complex protocol analyzers tend to be PC-equipped. These LAN devices can sample data over a longer period of time, storing megabytes of data. Generally, this information is used to track intermittent network glitches.

   An effective troubleshooter should have a hard copy of the network topology, and should know and understand the host and IP addresses of network devices, routing protocols, circuit identification numbers (WAN links), slot and port numbers, MAC addresses, and so on. A troubleshooter without a network diagram is like a repairperson without a tool belt.

   Consider the following situation: You get a call from the manager of Finance. Users are complaining about network connectivity problems. How do you proceed to troubleshoot the problem? First and foremost you need to have a thorough understanding of the existing network and the protocols directing the flow of network traffic. The following key elements will assist you with your initial troubleshooting. (More tools may be available to you, depending on the problem domain.)

Topology map
Access to workstations, PCs, and host configuration information
IP and MAC addresses and host names (DNS)
Configuration information for the router on the same LAN segment
Ping, Traceroute, Telnet, Netstat, and other commands to help determine device status
LAN test devices and protocol analyzers to monitor and evaluate network traffic

   The Ping application sends packets of information to a specific address or range of addresses and then notifies the sender when a reply is received, indicating size of packets, packet loss, and the amount of time it took.

Exam Watch: After monitoring network traffic, you can use a ping test to actively test your network. These tests will provide you with detailed statistical information (depending on command options used) about network traffic and particular devices.
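
When ping results are collected from many devices, it helps to pull the packet counts and round-trip times out of the summary line automatically. The sketch below assumes the summary format printed by Cisco IOS-style ping output ("Success rate is ... percent (received/sent), round-trip min/avg/max = ... ms"); other platforms print different summaries and would need a different pattern. The sample text is illustrative, not captured from a real device, and the function name is hypothetical.

```python
import re
from typing import Dict, Optional

# Pattern for an IOS-style summary line; the round-trip part is optional
# because it is absent when no replies are received.
SUMMARY = re.compile(
    r"Success rate is (?P<rate>\d+) percent \((?P<ok>\d+)/(?P<sent>\d+)\)"
    r"(?:, round-trip min/avg/max = (?P<min>\d+)/(?P<avg>\d+)/(?P<max>\d+) ms)?"
)


def parse_ping_summary(text: str) -> Optional[Dict[str, object]]:
    """Extract sent/received counts, loss, and RTTs from ping output."""
    match = SUMMARY.search(text)
    if match is None:
        return None
    sent = int(match["sent"])
    received = int(match["ok"])
    result: Dict[str, object] = {
        "sent": sent,
        "received": received,
        "loss_percent": 100.0 * (sent - received) / sent if sent else 0.0,
    }
    if match["avg"] is not None:
        result["rtt_ms"] = (int(match["min"]), int(match["avg"]),
                            int(match["max"]))
    return result


# Illustrative output only (documentation address, made-up timings).
sample = """Sending 5, 100-byte ICMP Echos to 192.0.2.10, timeout is 2 seconds:
!!.!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 1/2/4 ms"""
summary = parse_ping_summary(sample)
```

Storing the parsed values alongside a timestamp makes it easy to compare today's loss and delay against earlier measurements.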

   The Traceroute application provides more descriptive information than Ping. Traceroute returns host and timing information about each "hop" that a data packet takes to get to a specific address. This tool is very helpful when used to determine where packets are being discarded or slowed down.

   Another helpful tool for troubleshooting is SNMP (Simple Network Management Protocol), which is used to monitor routers and other managed devices within a given network.

   The basic rule of troubleshooting is to isolate the problem by a process of elimination. A good starting point would be to identify the scope of the problem:

Are there any common symptoms?
Is the problem only on a single device?
Are several, but not all, devices experiencing the same problem?
What services are being affected?

Documentation Methods

Managing a network environment is not an easy task, becoming gradually more complicated as the network grows and changes. The more detailed your documentation, the easier it is for you to troubleshoot. Documenting your network is an ongoing process. You should update your documentation whenever a node changes, when an IP address is added or removed, when software or hardware is installed and upgraded, when a network fails, and so on.

Collect statistical information to baseline your network, and use that information for documentation. The information can be converted into a spreadsheet or chart for observation and analysis. Accurately documenting your network is critical. Generally, a good rule of thumb is to develop data sheets for your network with the following statistical information:

Utilization levels (peaks, averages)
Frame transmission per second (peak, averages)
Frame size (peak, averages)
Collision rates (peak, averages)
Jabbers, runts, and fragment totals
CRC errors
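
A data sheet like this can be generated from raw samples with a few lines of code. The following is a minimal sketch under stated assumptions: the metric names, the sample values, and the four-column CSV layout are all illustrative, and the readings are assumed to have been collected already by whatever monitoring equipment you use.

```python
import csv
import io
from typing import Dict, List, Sequence

FIELDS = ["metric", "peak", "average", "total"]


def baseline_rows(sampled: Dict[str, Sequence[float]],
                  totals: Dict[str, int]) -> List[Dict[str, object]]:
    """Build data-sheet rows: peak/average for samples, totals for counters."""
    rows: List[Dict[str, object]] = []
    for name, samples in sampled.items():
        rows.append({
            "metric": name,
            "peak": max(samples),
            "average": sum(samples) / len(samples),
            "total": "",
        })
    for name, count in totals.items():
        rows.append({"metric": name, "peak": "", "average": "", "total": count})
    return rows


def to_csv(rows: List[Dict[str, object]]) -> str:
    """Render the rows as CSV text that a spreadsheet can open."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up readings for illustration.
rows = baseline_rows(
    sampled={"utilization_percent": [12, 18, 35, 15],
             "frames_per_second": [400, 650, 900, 450]},
    totals={"crc_errors": 3, "runts": 1},
)
sheet = to_csv(rows)
```

Regenerating the sheet on a regular schedule and keeping the old copies gives you the history needed to tell a normal busy period from a genuine deviation.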

Exam Watch: Baselining your network is critical in determining the health of your network, in addition to identifying the problem areas. Baselining also provides network analysis for problem isolation and resolution.

   CiscoWorks (discussed in Chapter 8) is an effective management tool for proactive monitoring, providing the functionality for effective baselining. If you can effectively manage your network by deploying network management systems to ensure fault, performance, configuration, security, and accounting management, then half of the battle is won. You are then in a better position to anticipate and identify failures and network problems before they affect users, servers, and workstations.

   For documentation in today’s environment, you can build Web sites with on-line documentation on your corporate Intranet. You can develop interactive databases that interface with the front-end of network devices, polling data to populate data records. Posting generic diagrams and system information for internal use only is also a viable method. Since everyone has their own methods for creating documentation, it’s in your best interest to establish a method for consolidating information. This improves data integrity, and ensures that everyone on your team will have access to the same relevant and accurate documentation. Discrepancies in the documentation can result in ineffective troubleshooting and problem solving, thereby prolonging outage situations.

   Cisco’s Web site is a good resource for documentation and troubleshooting tips: http://www.cisco.com. There you’ll find helpful on-line documentation of Cisco products, many troubleshooting tips, and assistance with problem resolution.

On the Job: In the network arena, the elements within the network are constantly changing. Over the course of a year, the network topology could change twice a month or more. Documentation must be updated to reflect the changes in network design, configurations, hardware locations, contact numbers, process owners, and so on. Ultimately, controlling a full-blown crisis situation could depend on how well you keep accurate and precise documentation. Documentation to a troubleshooter is like a blueprint for an interior decorator, or scouting notes for a coach. In essence, documentation provides the framework for effectively building a clear understanding of a given environment. This will help you conceptualize and interpret the big picture.

Protocol Characteristics Review

   You’ll need to be familiar with several protocols in order to deliver effective troubleshooting and problem analysis. The flow of traffic across the network is governed by protocols. They work just like traffic regulations and restrictions (speed limits, traffic lights, one-way zones, and so on) that control and regulate the flow of traffic on highways from one destination to another. This section reviews these protocols, discussing their various characteristics.

Connection-Oriented vs. Connectionless

   Connection-oriented service is comparable to sending a letter that requires a return receipt. It requires adjacent nodes to acknowledge each frame or packet of data received. Connectionless service is comparable to sending a letter by standard, regular mail. There is no guarantee that the letter will be delivered or that it will arrive on time; the service offers best-effort delivery only.

   We can use two transport protocols to make the distinction: TCP and UDP. TCP is a connection-oriented, reliable protocol. It provides a control mechanism to ensure that data is sent and received successfully. For instance, if you send certified mail, it requires a signature for delivery, an acknowledgment indicating that the item was delivered and received. TCP provides this reliability by using sequence numbers and acknowledgments. TCP is also responsible for breaking messages into segments, resending anything that wasn’t received, and re-assembling the messages from the segments at the destination station. In contrast, UDP is connectionless, responsible only for transmitting the message. UDP provides no software checking for segment delivery, and thus is deemed unreliable.
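
To see why sequence numbers matter, consider a toy receiver that puts out-of-order segments back together and reports which ones never arrived so the sender can resend them. This is a deliberately simplified model and not real TCP: it numbers whole segments rather than bytes, assumes the receiver knows how many segments to expect, and ignores windows, timers, and cumulative acknowledgments. The function names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Segment = Tuple[int, str]  # (sequence number, payload)


def split_message(message: str, size: int) -> List[Segment]:
    """Break a message into numbered segments of at most `size` characters."""
    return [(i // size, message[i:i + size])
            for i in range(0, len(message), size)]


def receive(segments: Iterable[Segment],
            expected_count: int) -> Tuple[Optional[str], List[int]]:
    """Reassemble segments that arrived in any order.

    Returns (message, []) when complete, or (None, missing_numbers)
    when some segments still need to be resent.
    """
    buffer = {}
    for seq, payload in segments:
        buffer.setdefault(seq, payload)  # ignore duplicate deliveries
    missing = [seq for seq in range(expected_count) if seq not in buffer]
    if missing:
        return None, missing
    return "".join(buffer[seq] for seq in range(expected_count)), []


sent = split_message("troubleshooting", 4)
# Simulate the network: segment 2 is lost and the rest arrive out of order.
arrived = [sent[3], sent[0], sent[1]]
first_try = receive(arrived, len(sent))
# The sender resends only what was reported missing.
second_try = receive(arrived + [sent[2]], len(sent))
```

A connectionless sender in this model would simply stop after the first delivery attempt, leaving the gap for the application to notice or ignore.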

   A connection-oriented protocol has the following characteristics:

Request for service establishes the circuit (session setup and disconnect, handshakes)
Provides virtual circuits
Provides switched virtual circuits
Provides permanent virtual circuits
Provides sequential delivery of packets and frames
Single path through network for all packets and frames

A connectionless protocol has the following characteristics:

Provides dynamic flow through the network
Provides alternative paths
Receiving nodes must be able to manage nonsequential acknowledgments
There is potential for nonsequential delivery of packets and frames

Figure 1-2 indicates some connection-oriented and connectionless protocols.

Figure 1-2: Connection-oriented and connectionless protocols

Ethernet

   Ethernet is the oldest of the LAN technologies commonly used throughout the business industry today. Regardless of its ancient history, Ethernet is still deployed today in traditional LAN infrastructures.

   Ethernet’s design forces stations to contend against each other for access to the media. This contention-based nature of Ethernet prevents stations from exploiting the full 10 Mbps of available bandwidth. Realistically speaking, each individual station on the shared media receives only a small percentage of the available bandwidth. Each station is capable of receiving all transmissions from all stations, but only in half-duplex mode. This means a station cannot send and receive data at the same time.

   In the Ethernet architecture, when a station desires to transmit, it checks the network to make sure that no other station on that segment is transmitting. If the network is idle, the station proceeds with its transmission. While sending, the station continues to monitor the network to ensure that no other station on the segment is transmitting. Two stations on the same network can still begin transmitting at nearly the same time, which produces a collision (Figure 1-3). Once a collision is detected, the transmitting stations send a special bit pattern called a jam signal to ensure that all stations recognize the collision and defer any pending transmissions. The colliding stations then stop transmitting and wait for a randomly selected interval before they are permitted to retransmit.

Figure 1-3: Collision domain on an Ethernet LAN segment

Ethernet uses an access method called Carrier Sense Multiple Access/Collision Detection (CSMA/CD) to detect the interference caused by simultaneous transmission by two or more stations. CSMA/CD manages collisions; however, it also increases transmission time in two ways:

If two stations transmit simultaneously, the information transmitted will collide at some point; therefore, each station must stop its transmission and retransmit at a later time.
Once a station begins sending information, a shared Ethernet LAN cannot carry any other information until that transmission has completed.

   In essence, although CSMA/CD slows down Ethernet networks, it can effectively manage transmission on the wire. As a network administrator you must weigh the pros and cons and determine the most strategically sound solution for your local networking.
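
The "randomly selected" wait after a collision can be sketched as follows. The text above does not name the algorithm; this sketch assumes the truncated binary exponential backoff used by 10 Mbps IEEE 802.3 Ethernet, with a slot time of 512 bit times (51.2 microseconds). The function and constant names are illustrative only.

```python
import random

SLOT_TIME_US = 51.2   # 512 bit times at 10 Mbps (assumed 10 Mbps Ethernet)
MAX_ATTEMPTS = 16     # after 16 failed attempts the frame is discarded
BACKOFF_LIMIT = 10    # the random range stops doubling after 10 collisions


def backoff_slots(collision_count: int, rng: random.Random) -> int:
    """Return how many slot times to wait after the Nth collision in a row."""
    if not 1 <= collision_count <= MAX_ATTEMPTS:
        raise ValueError("frame is discarded after 16 attempts")
    k = min(collision_count, BACKOFF_LIMIT)
    # Pick an integer uniformly from 0 .. 2**k - 1.
    return rng.randrange(2 ** k)


def backoff_delay_us(collision_count: int, rng: random.Random) -> float:
    """Convert the slot count into a delay in microseconds."""
    return backoff_slots(collision_count, rng) * SLOT_TIME_US


if __name__ == "__main__":
    rng = random.Random()
    for attempt in (1, 2, 3, 10, 16):
        print(attempt, backoff_delay_us(attempt, rng))
```

Because the range doubles with each successive collision, two stations that collided once are less and less likely to pick the same wait time again, which is why a busy segment slows down rather than deadlocks.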

Token Ring

   Token Ring, developed by IBM, is a LAN technology in which all stations are connected in a ring or star topology. Token Ring avoids the problems of contention-based access to the transmission medium by granting every station equal access. Token Ring uses a token-passing process to prevent collisions between two stations that want to send data at the same time. Token passing allows only the station holding the token to access the ring for data transmission. When a station wants to transmit, it waits for the token to arrive. Essentially, to transmit you need to possess the token. When the token arrives, the station creates a data frame and transmits it onto the wire. The other stations then relay the frame around the ring until it reaches its destination.

   In a Token Ring implementation, stations on the ring are connected through a central hub called a Multistation Access Unit (MAU). Often, network administrators will interconnect MAUs to extend Token Ring LAN connectivity across various floors and parts of a building. Usually, a MAU can connect up to eight Token Ring stations.

   The Token Ring operation is quite simple: A station can only transmit when it receives the token frame (though exceptions can be negotiated). A station that receives the token with no data to transmit simply forwards the token to the next station, and this continues until the token reaches a station with data for transmission. The station that seizes the token to transmit data must also alter the token (T) bit of the frame, which marks it as a data frame rather than a free token. A Token Ring implementation also includes a mechanism that gives certain user-designated stations higher priority than other stations on the ring. This allows the stations with the highest priority more frequent access to the network for data transmission.

   To help detect faults on a Token Ring network, an active monitor is assigned the role of ring maintenance. The active monitor discards any frame that circulates around the ring continually. Left on the ring, such a frame would prevent all other stations from transmitting.
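
The token-passing behavior described above can be sketched with a small simulation. This is a simplified model, not a Token Ring implementation: it assumes each station sends at most one frame per token visit and ignores priorities and the active monitor. The name simulate_token_ring and the station labels are made up for the example.

```python
from collections import deque


def simulate_token_ring(queues, max_rotations=10):
    """Pass a token around the ring and record who transmits, in order.

    queues maps a station name to the frames it wants to send; the
    dictionary's insertion order is taken as the ring order.
    """
    stations = list(queues)
    pending = {name: deque(frames) for name, frames in queues.items()}
    log = []
    for _ in range(max_rotations):
        if not any(pending.values()):
            break                       # nothing left to send anywhere
        for name in stations:           # the token visits stations in ring order
            if pending[name]:
                # Station seizes the token and transmits one frame.
                log.append((name, pending[name].popleft()))
            # A station with nothing to send just forwards the token.
    return log


if __name__ == "__main__":
    ring = {"A": ["a1", "a2"], "B": [], "C": ["c1"]}
    print(simulate_token_ring(ring))
    # [('A', 'a1'), ('C', 'c1'), ('A', 'a2')]
```

Station A cannot send its second frame until the token has travelled the whole ring, which is how every station gets equal access and why no collisions occur.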

FDDI

Fiber Distributed Data Interface, commonly known as FDDI, is a standard for data transmission on fiber-optic lines within a LAN. FDDI implementation is based primarily on Token Ring implementation. An FDDI network contains two token rings: a primary ring, and a secondary ring for failure recovery. FDDI provides up to 100 Mbps of capacity, and the secondary ring gives it a high degree of reliability. If the secondary ring is not needed for failure recovery, it can carry data, too, increasing capacity to 200 Mbps. The increased capacity makes FDDI well suited to client/server applications. FDDI can also be deployed for workgroups or at the backbone.

FDDI implementation uses a dual counter-rotating ring topology to provide and maintain the necessary fault recovery and data integrity. FDDI consists of the following entities:

Physical Layer Media Dependent (PMD) defines the physical characteristics of the media interface connectors, cabling, and the services necessary for transmitting signals throughout the network.
Physical Layer Protocol defines the rules for encoding and framing data for transmission, clocking requirements, and line states.
Media Access Control mechanism defines the FDDI timed-token protocol, frame and token construction, and transmission on the FDDI ring, in addition to ring initialization and fault isolation functionality.

On an FDDI network you can have stations (nodes with no master ports) and concentrators (nodes with master ports). FDDI standards define two types of stations: dual attached stations (DAS) and single attached stations (SAS). A DAS connects to both the primary and the secondary rings. An SAS connects to only one ring; it cannot wrap the ring in case of a fault or link failure.

FDDI initializes the ring and transmits data in the following manner:
Connections are made between the stations on the ring and their neighbors.
Stations negotiate the target token rotation time (TTRT), using the claim-token process.
After a node has initialized the ring, the ring begins to operate in a steady state. A steady state simply means that the stations exchange frames using what is called a timed-token protocol (TTP). TTP defines how the TTRT is set, the length of time for which a station can hold the token, and the method by which a station initializes the ring. The ring remains in the steady state until a new claim-token process occurs (for example, when a new station joins the ring).
The stations then pass the token around the FDDI ring.
A station on the ring captures the token when it wants to transmit data, and then transmits the data to its downstream neighbor.
Each station reads and repeats frames as it receives them. If a station detects an error, it sets an error indicator.
A frame circulates on the ring until it reaches the station that first transmitted it. That station then removes the frame.
When the transmitting station has sent all of its frames (or has exceeded the available transmission time), it releases the token back to the ring.
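
Two parts of the sequence above lend themselves to a short sketch: the TTRT negotiation and the limit on transmission time. The sketch assumes the usual timed-token rules, in which the lowest TTRT bid wins the claim process and a station may send ordinary (asynchronous) traffic only for the time by which the token arrived early. Tie-breaking between equal bids and synchronous traffic are left out, and the function names and millisecond values are illustrative.

```python
def negotiate_ttrt(bids_ms):
    """Claim-token sketch: the lowest requested rotation time wins.

    bids_ms maps a station name to the TTRT it requests, in milliseconds.
    """
    if not bids_ms:
        raise ValueError("at least one station must bid")
    return min(bids_ms.values())


def async_holding_time(ttrt_ms, measured_rotation_ms):
    """Time a station may keep the token for asynchronous frames.

    If the token took longer than the TTRT to come around, the ring is
    busy and the station must pass the token on without sending.
    """
    return max(0.0, ttrt_ms - measured_rotation_ms)


if __name__ == "__main__":
    ttrt = negotiate_ttrt({"A": 8.0, "B": 4.0, "C": 16.0})
    print(ttrt)                              # 4.0
    print(async_holding_time(ttrt, 1.5))     # 2.5 (token arrived early)
    print(async_holding_time(ttrt, 6.0))     # 0.0 (token arrived late)
```

When you troubleshoot slow FDDI response times, a token that consistently arrives late is a sign that the ring is carrying close to its negotiated load.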

Transmission Control Protocol/Internet Protocol (TCP/IP)

Early research done by the U.S. Defense Advanced Research Projects Agency (DARPA) in computer interconnectivity gave life to the initial development of the Internet suite of protocols. The TCP/IP suite is governed by a set of standards that specify how computers communicate and the conventions for interconnecting networks and routing traffic. The Department of Defense presented a reference model for this new development, which separates the functions performed by communication protocols into manageable, stackable layers. Each layer controls a certain aspect for data transmission from source to destination. This model is not only designed for protocols to communicate across the network, but can also be used as a platform to develop protocols.

Four layers form this architectural design: the Application layer, Transport layer, Internetwork layer and the Network Interface layer. The Internetwork layer is commonly used to describe TCP/IP. The TCP/IP stack maps closely to the OSI reference model in the lower layers (see Figure 1-4). The information used in TCP/IP is transferred in a sequence of datagrams.

Figure 4: OSI model and TCP/IP internetworking model

Application Layer

   This layer of the architectural model provides the functionality for users and programs. It also provides the services accessed by user applications to communicate over the network. It is the layer in which user-access network processes reside, including all the processes with which users interact directly. One of the most important responsibilities of the Application layer is to manage sessions (connections) between applications.

   Application protocols exist for file transfers, Telnet, World Wide Web, e-mail, remote login, and network management elements (SNMP).

Transport Layer

   The Transport layer provides host-to-host communications. It ensures end-to-end data integrity and provides a reasonably reliable communication service for network elements that want to carry out a two-way communication. This layer also handles flow control, retransmission, and other data management mechanisms. The TCP and UDP protocols reside at the Transport layer.

Internetwork Layer

   These days, the Internetwork layer is often referred to simply by the name of its main protocol, Internet Protocol (IP). This layer provides the basic packet-delivery service for all TCP/IP networks. IP is an unreliable, connectionless, best-effort datagram service that does not guarantee packet delivery. If packets are discarded (for whatever reason), IP itself does not notify the sender or the receiver.

   At this layer, a logical addressing scheme is used for hosts, commonly known as IP addresses. These addresses identify devices connected to the network. Furthermore, other autonomous networks in the world use this address implementation to communicate on the information superhighway. The Internetwork and higher layers use these IP addresses for routing functionality. IP addressing consists of a 4-byte (32-bit) addressing scheme written in dotted decimal notation (as in 141.142.153.25). The entire scheme comprises two parts: a network portion and a host portion (see Figure 1-5). The maximum value for each octet is 255, which is an octet containing all 1s. Within each network, the first address (host portion all 0s) and the last address (host portion all 1s) are reserved for the network and broadcast addresses, respectively.

Figure 5: Internet address format
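
As a quick illustration of the address format, the sketch below takes the example address 141.142.153.25 apart using Python's standard ipaddress module. The 16-bit network portion is an assumption based on the address falling in the conventional Class B range; the function name describe is invented for the example.

```python
import ipaddress


def describe(address: str, prefix_len: int) -> dict:
    """Break an IPv4 address into the pieces discussed in the text."""
    addr = ipaddress.IPv4Address(address)
    # Combining the address with a prefix length yields its network.
    network = ipaddress.IPv4Interface(f"{address}/{prefix_len}").network
    return {
        "as_int": int(addr),                    # the same address as one 32-bit number
        "octets": list(addr.packed),            # four bytes, each 0..255
        "network": str(network.network_address),      # host bits all 0s
        "broadcast": str(network.broadcast_address),  # host bits all 1s
        "usable_hosts": network.num_addresses - 2,    # minus the two reserved
    }


if __name__ == "__main__":
    # Assumption: a 16-bit network portion (the Class B default).
    for key, value in describe("141.142.153.25", 16).items():
        print(key, value)
```

The network and broadcast entries show the two reserved addresses at either end of the range, which is why the usable host count is always two less than the size of the network.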

Figure 1-6 shows the three conventional Classes (A, B, and C), their address ranges, and local numbers; and gives an example of each addressing scheme. In addition to these addresses, there are special Class D addresses used for multicasting, and Class E addresses reserved for experimental use. The following are the totals of host addresses per network for each class:

Class A range: 1.0.0.0 to 126.0.0.0; number of host addresses per network: 16,777,214 (about 16 million)
Class B range: 128.0.0.0 to 191.255.0.0; number of host addresses per network: 65,534 (about 64,000)
Class C range: 192.0.0.0 to 223.255.255.0; number of host addresses per network: 254

Figure 6: Conventional address classes
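
The class of an address follows from its first octet, and the host count follows from the number of host bits. The helper below is a small sketch of that arithmetic; the function names are made up, and addresses beginning with 0 or 127 are simply reported as reserved.

```python
HOST_BITS = {"A": 24, "B": 16, "C": 8}   # bits left for the host portion


def address_class(address: str) -> str:
    """Return the conventional class of a dotted decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if 1 <= first_octet <= 126:
        return "A"
    if 128 <= first_octet <= 191:
        return "B"
    if 192 <= first_octet <= 223:
        return "C"
    if 224 <= first_octet <= 239:
        return "D"            # multicast
    if 240 <= first_octet <= 255:
        return "E"            # experimental
    return "reserved"         # 0 and 127 (loopback)


def hosts_per_network(cls: str) -> int:
    """Usable hosts: 2**host_bits minus the network and broadcast addresses."""
    return 2 ** HOST_BITS[cls] - 2


if __name__ == "__main__":
    for sample in ("10.1.1.1", "141.142.153.25", "200.5.6.7"):
        cls = address_class(sample)
        print(sample, cls, hosts_per_network(cls))
```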

Network Interface Layer

The Network Interface layer is responsible for getting data across the physical network (Ethernet, Token Ring, etc.). IP datagrams are encapsulated into frames at this layer. Mapping of an IP address to a physical address (MAC) is also performed by this TCP/IP layer. Once information is received from the layer above (Internetwork), the Network Interface layer is responsible for framing it and adding the delivery information needed on the local network (frame headers). The protocols at this layer perform three important functions:

They define how to use the network to transmit frames across a physical connection.
They exchange data between the computer and the physical network.
They deliver data between two end-stations on the same LAN segment, by using the physical addresses of the end nodes.

Novell IPX

The Novell NetWare Internetwork Packet Exchange (IPX) protocol was derived from the early Xerox Network Systems (XNS) Internet Transport Protocols. The XNS protocol suite is comparable to the TCP/IP protocol suite. Novell IPX utilizes a connectionless datagram protocol. IPX is NetWare’s network layer service, and provides addressing as well as routing capabilities.

Internetwork Packet Exchange (IPX) transmits each data unit as an independent entity without establishing a logical connection between the two end-stations. This protocol uses the IEEE 802.3 frame format without the LLC layer.
Sequenced Packet Exchange (SPX) protocol is a connection-oriented protocol used for reliable peer-to-peer (client/server) communications, based on the XNS Sequenced Packet Protocol. SPX is reliable, guaranteeing sequenced packet delivery.
NetWare Core Protocol (NCP) is Novell’s upper-layer protocol for connection-oriented transmissions. This protocol provides the framework that facilitates interaction between workstations and servers (client/server relationship).

   The devices on an IPX network use Service Advertising Protocol (SAP) to advertise NetWare services and addresses. SAP advertisements make services available dynamically. By default, SAP broadcasts are propagated every 60 seconds; however, you can lengthen this interval to reduce the amount of SAP broadcast traffic.

   In the IPX world, the node number and the network number define the network address, expressed in the network.node format. The network number is a 4-byte (32-bit) number that identifies a physical network. Each physical chunk of wire is required to have a single network number, which is the same number that you bind when a file server is configured. In an IPX implementation, Novell servers are inherently routers; therefore, the network number is contained and advertised from the file server.

   The node number identifies a node on the network as shown in Figures 1-7 and 1-8. The node number is a 48-bit MAC layer address of the physical interface.

Figure 7: Addressing scheme used in an IPX network

Figure 8: IPX network address, with interface and network number and node
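
The network.node format can be illustrated with a short parser. This sketch assumes the hexadecimal notation commonly seen on Cisco routers, in which the node number is written as three dot-separated groups of four hex digits; the sample address 4a.0000.0c12.3456 and the function names are hypothetical.

```python
def parse_ipx_address(text: str):
    """Split 'network.node' into a 32-bit network and a 48-bit node number."""
    network_hex, _, node_part = text.partition(".")
    node_hex = node_part.replace(".", "")
    if not network_hex or len(node_hex) != 12:
        raise ValueError("expected network.xxxx.xxxx.xxxx")
    network = int(network_hex, 16)
    node = int(node_hex, 16)
    if network >= 2 ** 32:
        raise ValueError("network number must fit in 32 bits")
    return network, node


def format_ipx_address(network: int, node: int) -> str:
    """Render the pair back into network.node notation."""
    node_hex = f"{node:012x}"     # 48 bits = 12 hex digits
    groups = (node_hex[0:4], node_hex[4:8], node_hex[8:12])
    return f"{network:x}." + ".".join(groups)


if __name__ == "__main__":
    net, node = parse_ipx_address("4a.0000.0c12.3456")
    print(hex(net), hex(node))
    print(format_ipx_address(net, node))   # 4a.0000.0c12.3456
```

Because the node number is the interface MAC address, only the network number has to be assigned by hand, which is one reason IPX clients need so little configuration.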

Exam Watch: In the example given in Figure 1-7, a serial interface has a MAC address. In practice, serial interfaces have no MAC address of their own. In an IPX implementation, serial interfaces use the default Novell address, which is usually the address of the first active interface.

AppleTalk

   AppleTalk is considered one of the purest client/server networking systems that rely on nodes (usually Macintosh workstations) sharing network resources (servers, printers, files, and so forth). In most cases, the interaction between devices is transparent to the users, since AppleTalk engages in network interaction and broadcasting functions, permitting communication between client and server. Many network professionals in the industry refer to AppleTalk as a "plug and play" network. An AppleTalk user can plug right into the network without making major configuration changes. The network services on the AppleTalk network are easy to locate.

   The native routing protocol for AppleTalk is the Routing Table Maintenance Protocol (RTMP). Routers on the network use this protocol to exchange and update routing information. RTMP calculates the shortest path to each destination.

   As an AppleTalk network grows, so does the traffic that occurs when hosts attempt to locate other devices on the network, and when routing information is exchanged by the routers on the network. One way to alleviate this enormous traffic overhead is to use Cisco IOS filters, ultimately improving the scalability of the AppleTalk network.

   Devices on the network are deployed in logical groups, commonly known as zones. The zone provides a way to localize and manage broadcast traffic and to create "communities of interest" (see Figure 1-9). Communities of interest are equivalent to segmented users in an Ethernet network. Users are logically grouped to share network resources.

Figure 9: An example of AppleTalk’s Communities of Interest implementation

A nonextended network allows 127 hosts and 127 servers per network (shown in Figure 1-10). Only a single network number is allowed per wire, and only a single zone is allowed per wire. An extended network allows a total of 253 devices per wire (any combination of hosts and servers). A range of network numbers, called the cable range, is allowed per wire.

Figure 10: AppleTalk’s implementation of nonextended and extended networks

Certification Summary

   This chapter presents a realistic framework that can be used to attack most (if not all) network crisis situations. However, if the framework is not utilized properly, quality solutions will not emerge from your troubleshooting efforts. The quality of your solutions depends on your commitment and on the care you put into each problem-solving step.

   By getting a complete and accurate symptom description and reproducing the symptoms, you ensure in most instances that you are fixing the actual problem, or are at least on your way to resolving it.

   Skipping steps could be catastrophic. Stay focused and try to avoid this pitfall. Time is precious and one mistake can often negatively impact the entire troubleshooting process.

   When you are at work, you are usually working on a team with other analysts, LAN administrators, and LAN specialists. Usually troubleshooting becomes a team effort; everyone is contributing expertise and helping to get the problem resolved. However, your certification will be a solo effort; therefore, you can’t rely on anyone but yourself. It is important for you to develop effective troubleshooting skills on your own. It is even more important for you to understand the problem domain and be able to thoroughly document your findings while analyzing and evaluating symptoms.

   Understanding the way in which protocols control the flow of data across the network will provide you with extra ammunition needed for effective troubleshooting. Although each protocol behaves differently, they all are designed to regulate and dictate the flow of traffic from source to destination over some common path.

   Finally, remember that troubleshooting is a set of procedures, mental tools, and attitudes that anyone can master. It’s up to you to develop the skills that are needed to become an effective troubleshooter. Have an idea of what your network looks like and how it behaves when it’s healthy.

Two-Minute Drill

One of the major symptoms commonly found in networks throughout organizations today is congestion.
A significant result of network congestion is slow response time, and this delay is of primary concern to the network administrator.
Many variables of LAN operations can be used to measure and evaluate network performance.
To determine what constitutes reasonable utilization levels, you should take into consideration the number of stations on a LAN segment, application behavior, frame-length distribution, traffic patterns, time of day, and so on.
For short periods, network utilization may peak toward 100% without noticeable network delays or performance degradation. This is possible during a large FTP file transfer between two high-performance end stations. One thing to keep in mind is that each application environment behaves differently, depending on variables such as bandwidth, routing protocols, and so on.
As a troubleshooter, you must have a process or procedure defined to help you resolve a particular network problem.
The steps of the problem-solving process are:
Define and state the problem.
Gather facts and analyze the problem.
Review and evaluate alternatives and possibilities.
Design a plan of action (POA).
Implement the solution.
Evaluate and observe the solution (further analysis).
Repeat the process (if needed).

Troubleshooting can be like finding a needle in a haystack, so be patient—solutions do not always come on the first go-around.
Before you can effectively troubleshoot, you must be able to determine if a problem does exist.
As a network administrator, you should take the pulse of your network through a regularly scheduled sampling of network traffic.
You can use a LAN tester or a protocol analyzer to collect network data and store it in the test equipment.
Some common devices that can store data for future analysis are Fluke’s LANMeter series, Microtest’s Compas, and Scope’s FrameScope.
An effective troubleshooter should have a hardcopy of network topology, and know and understand host and IP addresses of network devices, routing protocols, circuit identification number (WAN links), slot and port numbers, MAC addresses, and so on.
After monitoring network traffic, you can use a ping test to actively test your network. These tests will provide you with detailed statistical information (depending on command options used) about network traffic and particular devices.
The more detailed your documentation, the easier it is for you to troubleshoot.
Documenting your network is an ongoing process.
Baselining your network is critical in determining the health of your network, in addition to identifying the problem areas. Baselining also provides network analysis for problem isolation and resolution.
CiscoWorks (discussed in Chapter 8) is an effective management tool for proactive monitoring, providing the functionality for effective baselining.
You’ll need to be familiar with several protocols in order to deliver effective troubleshooting and problem analysis.
TCP is a connection-oriented, reliable protocol. It provides a control mechanism to ensure that data is sent and received successfully.
UDP is connectionless, responsible only for transmitting the message. UDP provides no software checking for segment delivery, and thus is deemed unreliable.
Ethernet is the oldest of the LAN technologies commonly used throughout the business industry today.
Ethernet uses an access method called Carrier Sense Multiple Access/Collision Detection (CSMA/CD) to detect the interference caused by simultaneous transmission by two or more stations.
Token Ring helps reconcile the contention-based access to the transmission medium by granting every station equal access. Token Ring uses a token-passing process to prevent collision between two stations that want to send data at the same time.
Fiber Distributed Data Interface, commonly known as FDDI, is a standard for data transmission on fiber-optic lines within a LAN.
The TCP/IP model is not only designed for protocols to communicate across the network, but can also be used as a platform to develop protocols.
Novell IPX utilizes a connectionless datagram protocol. IPX is NetWare’s network layer service, and provides addressing as well as routing capabilities.
AppleTalk is considered one of the purest client/server networking systems that rely on nodes (usually Macintosh workstations) sharing network resources (servers, printers, files, and so forth).
