Monitoring Servers Lesson 11
Skills Matrix
Technology Skill | Objective Domain | Objective #
Using the Reliability and Performance Console | Monitor servers for performance evaluation and optimization | 3.2
Using Auditing | Monitor and maintain security and policies | 3.3
Software Logs It is common practice for software products to save information about their ongoing activities to chronological lists called logs. By examining the logs, administrators can track the activity of the software, document errors, and extract analytical information. Logs are traditionally text files, which administrators open in an editor application, but the Windows operating systems have long used a graphical application for this purpose.
Event Viewer The operating system component that generates the Windows logs is called Windows Eventing. The primary function of the Windows Eventing engine, as always, is to record information about system activities as they occur and package that information in individual units called events. The application you use to view the events is called Event Viewer. In Windows Server 2008, Event Viewer takes the form of a Microsoft Management Console (MMC) snap-in.
Event Viewer The Event Viewer snap-in appears in Windows Server 2008 as a separate console, accessible from the Administrative Tools program group, and as part of other consoles, including Server Manager (under the Diagnostics node), and Computer Management (under System Tools). As with all snap-ins, you can also add Event Viewer to a custom MMC console.
Custom View in the Event Viewer Console
Windows Logs When you expand the Windows Logs folder in the Event Viewer console, you see the following logs: – Application – Security – Setup – System – Forwarded Events
Windows Event Logs The Windows event logs contain the following types of events: – Information: An event that describes a change in the state of a component or process as part of a normal operation. – Error: An event that warns of a problem that is not likely to affect the performance of the component or process where the problem occurred, but that could affect the performance of other components or processes on the system. – Warning: An event that warns of a service degradation or an occurrence that can potentially cause a service degradation in the near future, unless you take steps to prevent it. – Critical: An event warning that an incident resulting in a catastrophic loss of functionality or data in a component or process has occurred.
Event Properties Dialog Box
Viewing Applications and Services Logs When you expand the Applications and Services Logs folder in the console, you find additional logs for the various applications and services installed on the computer. Many of the roles and features that you can add to a Windows Server 2008 computer include their own logs that appear in this folder.
Types of Logs The four types of logs that can appear in this folder are as follows: – Admin — Contains events targeted at end users or administrators that indicate a problem and offer a possible solution. – Operational — Contains events that signify a change in the application or service, such as the addition or removal of a printer. – Analytic — Contains a high volume of events tracking application operation activities. – Debug — Contains events used by software developers for troubleshooting purposes.
Types of Logs By default, only the Admin and Operational logs are visible in the Event Viewer console, because these are the logs that can be useful to the average administrator. The Analytic and Debug logs are disabled and hidden, because they typically contain large amounts of information that is of interest only to developers and technicians. To display these log types, select Show Analytic and Debug Logs from the console’s View menu; you can then enable each log from its Properties sheet.
The Analytic and Debug Log Folders
The Properties Sheet for an Analytic Log
Additional Logs The Event Viewer console comes preconfigured with a large collection of additional logs for Windows Server 2008. When you expand the Microsoft and Windows folders in the Applications and Services Logs folder, you see a long list of Windows components. Each of these components has a pathway, called a channel, to its own separate log.
Windows Component Logs
Custom Views Another means of locating and isolating information about specific events is to create custom views. A custom view is essentially a filtered version of a particular log, configured to display only certain events. The Event Viewer console now has a Custom Views folder, in which you can create filtered views and save them for later use.
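The filtering idea behind a custom view can be sketched in a few lines of Python. This is only an illustration of the concept, not how Event Viewer is implemented: the Event class, its field names, and the sample events are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Event:
    # Hypothetical, simplified event record (not the real Windows event schema).
    log: str        # log the event came from, e.g. "System"
    source: str     # component that raised the event
    event_id: int
    level: str      # "Information", "Error", "Warning", or "Critical"

def custom_view(events, log=None, levels=None, sources=None):
    """Return only the events that match every filter that was supplied."""
    return [
        e for e in events
        if (log is None or e.log == log)
        and (levels is None or e.level in levels)
        and (sources is None or e.source in sources)
    ]

# Invented sample data, for illustration only.
sample = [
    Event("System", "Disk", 7, "Error"),
    Event("System", "Service Control Manager", 7036, "Information"),
    Event("Application", "ExampleApp", 1000, "Critical"),
]

# A saved "view" is just a reusable set of filter arguments.
problems = custom_view(sample, levels={"Critical", "Error"})
print([(e.log, e.event_id) for e in problems])
```

Each filter left as None is ignored, which mirrors how a custom view only narrows the log by the criteria you actually select.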
Reliability and Performance Console A computer’s performance level is constantly changing as it performs different combinations of tasks. Monitoring the performance of the various components over a period of time is the only way to get a true picture of the system’s capabilities.
Reliability and Performance Snap-in While the Event Viewer snap-in enables you to review system events that have already occurred, the Reliability and Performance snap-in enables you to view system information on a continuous, real-time basis. Like Event Viewer, Reliability and Performance is an MMC snap-in that you can launch as a separate console from the Administrative Tools program group; view from within another console, such as Server Manager or Computer Management; or add to a custom MMC console.
Resource Overview When you launch the Reliability and Performance Monitor console, you see the Resource Overview screen. This screen contains four real-time line graphs that display information about four of the server’s main hardware components: CPU, disk, network, and memory. Each of the four components also has a separate, expandable section below the graphs, displaying more detailed information in text form, such as the resources being utilized by individual applications and processes.
Performance Monitor Performance Monitor is another tool within the Reliability and Performance Monitor console that displays system performance statistics in real time. The difference between Performance Monitor and Resource Overview is that Performance Monitor can display hundreds of different statistics (called performance counters) and that you can create a customized graph containing any statistics you choose.
The Default Performance Monitor Display
The Performance Monitor Histogram View
Performance Monitor Report View
Adding Counters In the Add Counters dialog box, you have to specify the following four pieces of information to add a counter to the display: – Computer – Performance object – Performance counter – Instance
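The four pieces of information combine into a single counter path of the general form \\Computer\Object(Instance)\Counter. The following Python sketch shows how the parts fit together; the function name and the computer name SERVER1 are made up for the example, and the computer and instance parts are treated as optional because not every counter needs them.

```python
def counter_path(obj, counter, instance=None, computer=None):
    """Join the Add Counters fields into one counter path string."""
    prefix = f"\\\\{computer}" if computer else ""   # \\COMPUTER, if given
    inst = f"({instance})" if instance else ""       # (Instance), if given
    return f"{prefix}\\{obj}{inst}\\{counter}"

# SERVER1 is a hypothetical computer name.
print(counter_path("Processor", "% Processor Time", "_Total", "SERVER1"))
print(counter_path("Memory", "Pages/sec"))
```

The first call produces a path for the _Total instance of the Processor object on a remote computer; the second produces a local path for an object that has no instances.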
The Add Counters Dialog Box
Reliability Monitor Reliability Monitor is a new addition to Windows Server 2008 that automatically tracks events that can have a negative effect on system stability and uses them to calculate a stability index.
The Reliability Monitor Display
Bottleneck A bottleneck occurs when a component is not providing an acceptable level of performance compared with the other components in the system. Bottlenecks can appear for a variety of reasons including: – Increased server load – Hardware failure – Changed server role
Monitoring Processor Performance Processor: % Processor time – Specifies the percentage of time that the processor is busy. – This value should be as low as possible, with anything remaining below 85 percent most of the time being acceptable. – If this value is consistently too high, you should attempt to determine which process is using too much processor time, upgrade the processor, or add another processor, if possible.
Monitoring Processor Performance System: Processor Queue Length – Specifies the number of program threads waiting to be executed by the processor. – This value should be as low as possible, with values less than 10 being acceptable. – If the value is too high, upgrade the processor or add another processor.
Monitoring Processor Performance Server Work Queues: Queue Length – Specifies the number of requests waiting to use a particular processor. – This value should be as low as possible, with values less than 4 being acceptable. – If the value is too high, upgrade the processor or add another processor.
Monitoring Processor Performance Processor: Interrupts/sec – Specifies the number of hardware interrupts the processor is servicing each second. – The value of this counter can vary greatly and is significant only in relation to an established baseline. – A hardware device that is generating too many interrupts can monopolize the processor, preventing it from performing other tasks. – If the value increases precipitously, examine the various other hardware components in the system to determine which one is generating too many interrupts.
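As a rough illustration, the fixed processor guidelines above can be written as a small threshold check in Python. The dictionary keys, function name, and sample readings are invented for the example. Interrupts/sec is left out because it is meaningful only against a baseline, and a single reading says little on its own: what matters is whether a counter stays over its limit most of the time.

```python
# Upper limits from the processor guidelines above; a reading is acceptable
# only while it stays below its limit.
PROCESSOR_LIMITS = {
    r"Processor\% Processor Time": 85,        # percent
    r"System\Processor Queue Length": 10,     # waiting threads
    r"Server Work Queues\Queue Length": 4,    # waiting requests
}

def over_limit(readings, limits=PROCESSOR_LIMITS):
    """Return the readings that are at or above their limit."""
    return {name: value for name, value in readings.items()
            if name in limits and value >= limits[name]}

# Invented sample readings, for illustration only.
sample = {
    r"Processor\% Processor Time": 92,
    r"System\Processor Queue Length": 3,
    r"Server Work Queues\Queue Length": 4,
}
flagged = over_limit(sample)
print(flagged)
```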
Monitoring Memory Performance A memory leak is the result of a program allocating memory for use but not freeing up that memory when it is finished using it. Over time, the computer’s free memory can be totally consumed, degrading performance and ultimately halting the system. Memory leaks can be fast, causing an almost immediate degradation in overall server performance, but they can also be slow and difficult to detect, gradually degrading system performance over a period of days or weeks. In most cases, memory leaks are caused by third-party applications, but operating system leaks are not unprecedented.
Monitoring Memory Performance Memory: Page Faults/Sec – Specifies the number of times per second that the code or data needed for processing is not found in memory. – This value should be as low as possible, with values below 5 being acceptable. – If this value is too high, you should determine whether the system is experiencing an inordinate number of hard faults by examining the Memory: Pages/Sec counter. – If the number of hard page faults is excessive, you should either determine what process is causing the excessive paging or install more random access memory (RAM) in the system.
Monitoring Memory Performance Memory: Pages/Sec – Specifies the number of times per second that required information was not in RAM and had to be accessed from disk or had to be written to disk to make room in RAM. – This value should be as low as possible, with values from 0 to 20 being acceptable. – If the value is too high, you should either determine what process is causing the excessive paging or install more RAM in the system.
Monitoring Memory Performance Memory: Available MBytes – Specifies the amount of available physical memory in megabytes. – This value should be as high as possible and should not fall below 5 percent of the system’s total physical memory, as this might be an indication of a memory leak. – If the value is too low, consider installing additional RAM in the system.
Monitoring Memory Performance Memory: Committed Bytes – Specifies the amount of virtual memory that has space reserved on the disk paging files. – This value should be as low as possible and should always be less than the amount of physical RAM in the computer. – If the value is too high, this could be an indication of a memory leak or the need for additional RAM in the system.
Monitoring Memory Performance Memory: Pool Nonpaged Bytes – Specifies the size of an area in memory used by the operating system for objects that cannot be written to disk. – This value should be a stable number that does not grow without a corresponding growth in server activity. – If the value increases over time, this could be an indication of a memory leak.
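One way to make "grows without a corresponding growth in server activity" concrete is to compare trends. The Python sketch below fits a least-squares slope to equally spaced samples of nonpaged pool size and of some activity measure, then flags the case where the pool grows noticeably faster in relative terms. The function names, the choice of activity measure, the 1 percent margin, and the sample figures are all assumptions made for illustration, not documented guidance.

```python
def slope(samples):
    """Least-squares slope of equally spaced samples (units per interval)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

def relative_growth(samples):
    """Slope expressed as a fraction of the average value (assumed nonzero)."""
    return slope(samples) / (sum(samples) / len(samples))

def looks_like_leak(pool_bytes, activity, margin=0.01):
    """True when pool usage grows faster than activity by more than margin."""
    return relative_growth(pool_bytes) - relative_growth(activity) > margin

# Invented samples: pool size in MB and requests per second, one per interval.
pool = [100, 104, 109, 115, 120]
activity = [50, 48, 51, 49, 50]
print(looks_like_leak(pool, activity))
```

A result of True is only a prompt to investigate further, since legitimate workload changes can also raise pool usage.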
Monitoring Disk Performance PhysicalDisk: Disk Bytes/sec – Specifies the average number of bytes transferred to or from the disk each second. – This value should be equivalent to the levels established in the original baseline readings or higher. – A decrease in this value could indicate a malfunctioning disk that could eventually fail. – If this is the case, consider upgrading the storage subsystem.
Monitoring Disk Performance PhysicalDisk: Avg. Disk Bytes/Transfer – Specifies the average number of bytes transferred during read and write operations. – This value should be equivalent to the levels established in the original baseline readings or higher. – A decrease in this value could indicate a malfunctioning disk that could eventually fail. – If this is the case, consider upgrading the storage subsystem.
Monitoring Disk Performance PhysicalDisk: Current Disk Queue Length – Specifies the number of pending disk read or write requests. – This value should be as low as possible, with values less than 2 being acceptable per disk spindle. – High values for this counter can indicate that the drive is malfunctioning or that it is incapable of keeping up with the activities demanded of it. – If this is the case, consider upgrading the storage subsystem.
Monitoring Disk Performance PhysicalDisk: % Disk Time – Specifies the percentage of time that the disk drive is busy reading or writing. – This value should be as low as possible, with values less than 80 percent being acceptable. – High values for this counter can indicate that the drive is malfunctioning, that it is incapable of keeping up with the activities demanded of it, or that a memory problem is causing excess disk paging. – Check for memory leaks or related problems and, if none are found, consider upgrading the storage subsystem.
Monitoring Disk Performance LogicalDisk: % Free Space – Specifies the percentage of free space on the disk. – This value should be as high as possible, with values greater than 20 percent being acceptable. – If the value is too low, consider adding more disk space.
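The queue length guideline is stated per spindle, so the raw counter value has to be divided by the number of disks behind the volume before it is compared with 2. The following Python sketch does that arithmetic and applies the % Disk Time and % Free Space guidelines as well; the function name, message strings, and sample values are invented for the example.

```python
def disk_warnings(queue_length, spindles, pct_disk_time, pct_free_space):
    """Check three disk readings against the guidelines above."""
    warnings = []
    per_spindle = queue_length / spindles
    if per_spindle >= 2:            # acceptable: less than 2 per spindle
        warnings.append(f"queue length {per_spindle:.1f} per spindle")
    if pct_disk_time >= 80:         # acceptable: less than 80 percent
        warnings.append(f"disk busy {pct_disk_time}% of the time")
    if pct_free_space <= 20:        # acceptable: greater than 20 percent
        warnings.append(f"only {pct_free_space}% free space")
    return warnings

# Invented readings for a hypothetical four-disk array.
print(disk_warnings(queue_length=6, spindles=4,
                    pct_disk_time=85, pct_free_space=12))
```

A queue length of 6 would look alarming for a single drive, but spread over four spindles it is 1.5 per disk and passes, while the other two readings are flagged.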
Monitoring Disk Performance Most storage subsystem problems, when not caused by malfunctioning hardware, are resolvable by upgrading the storage system. These upgrades can include any of the following measures: – Install faster hard disk drives. – Install additional hard disk drives and split your data among them, reducing the I/O burden on each drive. – Replace standalone drives with a RAID (Redundant Array of Independent Disks) array. – Add more disk drives to an existing RAID array.
Monitoring Network Performance Network Interface: Bytes Total/sec – Specifies the number of bytes sent and received per second by the selected network interface adapter. – This value should be equivalent to the levels established in the original baseline readings or higher. – A decrease in this value could indicate malfunctioning network hardware or other network problems.
Monitoring Network Performance Network Interface: Output Queue Length – Specifies the number of packets waiting to be transmitted by the network interface adapter. – This value should be as low as possible, and preferably zero, although values of two or less are acceptable. – If the value is too high, the network interface adapter could be malfunctioning or another network problem might exist.
Monitoring Network Performance Server: Bytes Total/Sec – Specifies the total number of bytes sent and received by the server over all of its network interfaces. – This value should be no more than 50 percent of the total bandwidth capacity of the network interfaces in the server. – If the value is too high, consider migrating some applications to other servers or upgrading to a faster network.
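Applying the 50 percent guideline takes a unit conversion, because the counter reports bytes per second while interface speeds are normally quoted in bits per second. A minimal Python sketch of the arithmetic follows; the function name and the traffic figures are assumptions for illustration, and the calculation ignores full-duplex effects.

```python
def utilization(bytes_total_per_sec, link_speeds_bps):
    """Fraction of the combined interface capacity currently in use."""
    capacity_bps = sum(link_speeds_bps)          # bits per second
    return bytes_total_per_sec * 8 / capacity_bps

# Hypothetical server: two 1 Gbps adapters carrying 150 MB/s in total.
u = utilization(150_000_000, [1_000_000_000, 1_000_000_000])
print(f"{u:.0%}", "over guideline" if u > 0.5 else "within guideline")
```

Here 150 MB/s is 1.2 Gbps, or 60 percent of the 2 Gbps total, so the server exceeds the guideline.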
A Baseline As mentioned earlier, performance bottlenecks can develop over a long period of time, and it can often be difficult to detect them by observing a server’s performance levels at one particular point in time. A baseline is simply a set of readings, captured under normal operating conditions, which you can save and compare to readings taken at a later time. By comparing the baseline readings to the server’s current readings at regular intervals, you can discern trends that might eventually affect the computer’s performance.
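Comparing current readings with a baseline amounts to computing the relative change for each counter and flagging the ones that have drifted. The Python sketch below shows the idea; the function name, the 20 percent tolerance, and the sample values are assumptions chosen for illustration.

```python
def compare_to_baseline(baseline, current, tolerance=0.20):
    """Return counters whose value moved more than tolerance from baseline."""
    drift = {}
    for name, base in baseline.items():
        if name not in current or base == 0:
            continue                      # nothing to compare against
        change = (current[name] - base) / base
        if abs(change) > tolerance:
            drift[name] = round(change, 2)
    return drift

# Invented readings, for illustration only.
baseline = {"Interrupts/sec": 1000,
            "Disk Bytes/sec": 40_000_000,
            "Bytes Total/sec": 5_000_000}
current = {"Interrupts/sec": 1900,
           "Disk Bytes/sec": 38_000_000,
           "Bytes Total/sec": 3_000_000}

print(compare_to_baseline(baseline, current))
```

In this sample, interrupts have risen 90 percent and network throughput has dropped 40 percent, while the 5 percent fall in disk throughput stays inside the tolerance. Running the same comparison at regular intervals is what reveals a trend.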
Data Collector Set To capture counter statistics in the Reliability and Performance Monitor console for later review, you must create a data collector set. A data collector set is a means of gathering, compiling, and storing information from various sources, including performance counters, event traces, and the Windows registry. At their simplest, data collector sets can function as the equivalent of performance logs in earlier Windows versions. You select the counters you want to monitor, and the console records their information for later evaluation.
The Performance Monitor Information Collected Using a Data Collector Set
Auditing Auditing is the process by which administrators can track specific security-related events on a Windows Server 2008 computer. To audit security events, you must enable specific Group Policy settings for a computer. Once you activate these settings, the system tracks the specified activities and records them as events in the Security log, which you can access using the Event Viewer snap-in.
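Once audit events reach the Security log, a common review task is to summarize them. The Python sketch below counts failed logons per account from a list of already-exported events. Event ID 4625 is the ID Windows Server 2008 records for a failed logon; the dictionary field names, the function name, and the sample data are invented for the example and do not reflect the real event format.

```python
from collections import Counter

FAILED_LOGON = 4625   # Security log event ID for a failed logon

def failed_logons_by_account(events):
    """Count failed-logon events per account name."""
    return Counter(e["account"] for e in events
                   if e["event_id"] == FAILED_LOGON)

# Invented sample events with hypothetical field names.
events = [
    {"event_id": 4625, "account": "alice"},
    {"event_id": 4624, "account": "bob"},     # successful logon, ignored
    {"event_id": 4625, "account": "alice"},
    {"event_id": 4625, "account": "bob"},
]

counts = failed_logons_by_account(events)
print(counts.most_common(1))
```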
The Audit Policies Container in a Group Policy Object
The Properties Sheet for an Audit Policy
Group Policy Objects
Summary The primary function of the Windows Eventing engine, as always, is to record information about system activities as they occur and package that information in individual units called events. The application you use to view the events is called Event Viewer.
Summary When you expand the Windows Logs folder in the Event Viewer console, you see the following logs: Application, Security, Setup, System, and Forwarded Events. The Windows event logs contain different types of events, as follows: Information, Error, Warning, and Critical.
Summary There are four types of logs that can appear in the Applications and Services Logs folder, as follows: Admin, Operational, Analytic, and Debug. When you launch the Reliability and Performance Monitor console, you see the Resource Overview screen, which contains four real-time line graphs that display information about four of the server’s main hardware components.
Summary While the Event Viewer snap-in enables you to review system events that have already occurred, the Reliability and Performance snap-in enables you to view system information on a continuous, real-time basis.
Summary Performance Monitor is a tool within the Reliability and Performance Monitor console that displays system performance statistics in real time. The difference between Performance Monitor and Resource Overview is that Performance Monitor can display hundreds of different statistics (called performance counters) and you can create a customized graph containing any statistics you choose.
Summary Reliability Monitor is a new addition to Windows Server 2008 that automatically tracks events that can have a negative effect on system stability and uses them to calculate a stability index.
Summary A bottleneck is a component that is not providing an acceptable level of performance compared with the other components in the system. Auditing is the process by which administrators can track specific security-related events on a Windows Server 2008 computer.