1 Process Lifetime Parent Child. 2 Process Performance Issues " The maximum number of processes allowed is 30,000. " Executables that use shared libraries.


1 Process Lifetime
[Diagram labels: Parent, Child]

2 Process Performance Issues
• The maximum number of processes allowed is 30,000.
• Executables that use shared libraries take fewer system resources (disk space, memory, I/O, and so on).
• Threads are more efficient than multiple processes.
• Zombie processes cause no performance problems.

3 Multithreading
• A thread is a logical sequence of program instructions.
• The kernel is multithreaded.
• Multiple tasks may be running in the kernel simultaneously and independently.
• A user process can have many application threads that execute independently of each other.
• Fewer system resources are used than with multiple processes.
• Special programming techniques are required.

4 Process Thread Examples

5 Performance Issues
Multithreading an application allows it to:
• Be broken into separate tasks that can be scheduled and executed independently.
• Take advantage of multiprocessors with less overhead than multiple processes.
• Share memory without going through the overhead and complexity of IPC mechanisms.
• Use a cleaner programming model for certain types of applications.
• Extend the program more easily.

6 Locking
• Locks are used to synchronize threads by serialization.
• They protect critical data from simultaneous write access.
• Locks must be used when threads share writable data.
• SunOS provides four types of locks.
• Which type is used depends on the requirements.
• A bad locking design can cause performance problems.
• Locking problems usually require significant reprogramming.
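
The serialization described on this slide can be illustrated with a small sketch. It uses Python's threading.Lock as a stand-in, not one of the four SunOS lock types, and the function name count_with_lock is made up for the example. Several threads update one shared counter, and the lock makes each read-modify-write happen one thread at a time.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads, serialized by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # The read-modify-write is the critical section: only one
            # thread at a time may execute it.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(count_with_lock())
```

With the lock in place, the final count equals num_threads times increments. Holding the lock for every single increment is also an example of fine granularity with high contention, which is one of the design trade-offs the slide warns about.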

7 Locking Problems
• Lock contention
• Granularity
• Inappropriate lock type
• Deadlock
• "Lost" locks
• Race conditions
• Incomplete implementation
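
One of the problems listed, deadlock, typically arises when two threads take the same pair of locks in opposite orders. A common remedy is a fixed lock order. The sketch below is an illustration only: the account-transfer scenario and the names transfer and run_opposing_transfers are hypothetical, and Python's threading.Lock stands in for a system lock.

```python
import threading


def transfer(balances, locks, src, dst, amount):
    """Move amount from src to dst, taking the two locks in index order."""
    if src == dst:
        return  # nothing to do, and avoids taking the same lock twice
    # Always lock the lower index first, whatever the transfer direction,
    # so two opposing transfers can never each hold one lock and wait
    # for the other.
    first, second = sorted((src, dst))
    with locks[first]:
        with locks[second]:
            balances[src] -= amount
            balances[dst] += amount


def run_opposing_transfers(rounds=1000):
    """Run two threads that transfer in opposite directions."""
    balances = [100, 100]
    locks = [threading.Lock(), threading.Lock()]

    def repeat(src, dst):
        for _ in range(rounds):
            transfer(balances, locks, src, dst, 1)

    threads = [
        threading.Thread(target=repeat, args=(0, 1)),
        threading.Thread(target=repeat, args=(1, 0)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balances


if __name__ == "__main__":
    print(run_opposing_transfers())
```

If each thread instead locked its own source account first, the two threads could block each other indefinitely.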

8 The lockstat Command
• Lock use in the kernel is identified.
• Unidentified delays may be caused by lock contention.
• Excessive counts may indicate a problem.

    # lockstat lpstat

    Adaptive mutex block: 2 events

    Count indv cuml rcnt     nsec Lock         Caller
    ----------------------------------------------------------------
        1  50%  50% 1.00    87500 0xf5ae43d0   esp_poll_loop+0xcc
        1  50% 100% 1.00   151000 0xf5ae3ab6   esp_poll_loop+0x8c
    ----------------------------------------------------------------

9 The clock Routine
• The clock executes at interrupt level 10.
• Most system timing is run off this clock.
• Each execution of the clock routine is a tick.
• For most processors, there are 100 ticks per second.
• Ticks per second can be set to 1000 for real-time processing.
• This is the limit of normal timing resolution.
• The time-of-day clock will run slow.
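
To make the resolution limit concrete, here is a rough sketch of how a requested timeout maps onto clock ticks. It assumes, as a simplification not stated on the slide, that timeouts are rounded up to a whole number of ticks; the function names are invented for the example.

```python
import math


def ticks_for_timeout(timeout_ms, ticks_per_second=100):
    """Number of whole ticks needed to cover timeout_ms (assumed rounding up)."""
    tick_ms = 1000 / ticks_per_second
    return max(1, math.ceil(timeout_ms / tick_ms))


def effective_timeout_ms(timeout_ms, ticks_per_second=100):
    """Timeout actually delivered once it is expressed in whole ticks."""
    ticks = ticks_for_timeout(timeout_ms, ticks_per_second)
    return ticks * 1000 / ticks_per_second


if __name__ == "__main__":
    # A 3 ms request at 100 ticks per second cannot be shorter than one 10 ms tick.
    print(effective_timeout_ms(3, ticks_per_second=100))
    # At 1000 ticks per second the same request fits in three 1 ms ticks.
    print(effective_timeout_ms(3, ticks_per_second=1000))
```

Under this assumption, a tick rate of 100 gives 10 ms granularity and a tick rate of 1000 gives 1 ms granularity.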

10 Process Monitoring Using ps
• You need to identify active processes before determining:
  • Which process is causing a delay
  • Which resource is bottlenecking the process
• The ps command enables you to check the status of processes.
• The ps command helps determine how to set process priorities.
• The BSD version, /usr/ucb/ps -aux, provides the best performance-related data.

11 Scheduling States

12 Scheduling Classes
Unix provides four scheduling classes by default. These are:
• TS - The timesharing class, for normal user work. Priorities are adjusted based on CPU usage.
• IA - The interactive class, derived from the timesharing class, provides better performance for the task in the active window in OpenWindows or CDE.
• SYS - The system class, also called the kernel priorities, is used for system threads such as the page daemon and clock thread.
• RT - The real-time class has the highest priority in the system, except for interrupt handling; it is even higher than the system class.

13 Dispatch Parameter Table Issues
For time-sharing class processes:
• Reducing time quanta favors interactive processes.
• Raising time quanta favors compute-bound and large processes.
• Using the ts_maxwait and ts_lwait fields controls CPU starvation.
• Slightly raising the values of ts_tqexp causes the priority of compute-bound processes to drop more slowly.
• The table can be changed to fit your workload.
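
The effect of raising ts_tqexp can be sketched with a toy model. Here ts_tqexp is treated as a per-priority lookup giving the new priority of a thread that used up its whole time quantum. The table values, the 60 priority levels, and the function names are illustrative assumptions, not the contents of a real dispatch table.

```python
def build_tqexp_table(drop, levels=60):
    """Toy ts_tqexp column: priority after quantum expiry is current minus drop."""
    return [max(0, pri - drop) for pri in range(levels)]


def expirations_until_floor(ts_tqexp, start_pri=59):
    """Count quantum expirations for a compute-bound thread to reach priority 0."""
    pri = start_pri
    count = 0
    while pri > 0:
        # A compute-bound thread always uses its full quantum, so each
        # step replaces its priority with the ts_tqexp entry for that level.
        pri = ts_tqexp[pri]
        count += 1
    return count


if __name__ == "__main__":
    steep = build_tqexp_table(drop=10)   # lower ts_tqexp values
    gentle = build_tqexp_table(drop=5)   # slightly raised ts_tqexp values
    print(expirations_until_floor(steep), expirations_until_floor(gentle))
```

In this model the table with the raised ts_tqexp values takes twice as many quantum expirations to push a compute-bound thread to the lowest priority.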

14 The dispadmin Command
• Displays or changes scheduler parameters
• Uses options:
  -l - Lists available scheduling classes
  -c class - Specifies the class whose parameters are to be displayed or changed
  -g - Displays configured parameters; provides a simple way of formatting a control file
  -s file - Sets parameters from a file

15 The Interactive Scheduling Class
• Is used to enhance interactive performance
• Is the default scheduling class for processes in Common Desktop Environment and OpenWindows sessions
• Uses most of the time-sharing class facilities
• Boosts the priority of the task in the active window by 10 points
• Resets the priority when the window is no longer the active window
• Does not boost processes changed using nice or other commands

16 Processor Sets
• Allows exclusive use of groups of processors by certain processes
• Also known as CPU fencing
• Is very different from pbind(1M)
• Is managed by the psrset(1M) command
• Is controlled by the root user
• Has system-defined processor sets which can be used by any user
• Forces DR to release bindings if necessary

17 The Run Queue
• A count of kernel threads waiting to run is kept.
• It contains a total of all system dispatch queues.
• It is not scaled by the number of CPUs.
• A depth of 3-5 per CPU is usually OK.
• This depends on the type of work being run.
• There is no way to tell what is waiting or how long it has been waiting.
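
Because the reported run queue is a system-wide total, it has to be divided by the CPU count before it is compared with the 3-5 per CPU guideline. A minimal sketch follows; the function names are made up, and the default threshold of 5 is simply the upper end of the range on the slide.

```python
def run_queue_per_cpu(run_queue_length, num_cpus):
    """Scale the system-wide run queue length by the number of CPUs."""
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    return run_queue_length / num_cpus


def run_queue_ok(run_queue_length, num_cpus, max_per_cpu=5):
    """True if the per-CPU depth is within the guideline."""
    return run_queue_per_cpu(run_queue_length, num_cpus) <= max_per_cpu


if __name__ == "__main__":
    # The same reported depth of 12 is fine on 4 CPUs but high on 2.
    print(run_queue_per_cpu(12, 4), run_queue_ok(12, 4))
    print(run_queue_per_cpu(12, 2), run_queue_ok(12, 2))
```

As the slide notes, the acceptable depth depends on the type of work, so the threshold is a parameter rather than a fixed rule.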

18 CPU Activity
• User - The user process is running.
• System - The kernel is running.
• This includes system thread time and user system calls.
• Wait I/O - The CPUs are idle, but a disk device is active.
• Idle - The system is waiting.
• Some reports add Wait I/O and Idle.
• This is usually reported as Idle time.
• You can check the tool's man page.
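
The difference between tools that report Wait I/O separately and tools that fold it into Idle can be sketched as follows. The function name, the dictionary keys, and the sample tick counts are hypothetical.

```python
def cpu_percentages(user, system, wait_io, idle, fold_wait_into_idle=False):
    """Convert tick counts for the four CPU states into percentages."""
    total = user + system + wait_io + idle
    if total == 0:
        raise ValueError("no ticks recorded")
    pct = {
        "user": 100.0 * user / total,
        "system": 100.0 * system / total,
        "wait_io": 100.0 * wait_io / total,
        "idle": 100.0 * idle / total,
    }
    if fold_wait_into_idle:
        # Some reports add Wait I/O to Idle and show the sum as Idle.
        pct["idle"] += pct.pop("wait_io")
    return pct


if __name__ == "__main__":
    print(cpu_percentages(50, 20, 10, 20))
    print(cpu_percentages(50, 20, 10, 20, fold_wait_into_idle=True))
```

The same sample therefore shows a different idle percentage depending on which convention a tool follows.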

19 CPU Control and Monitoring
Processor control and information are provided by:
• mpstat - Displays CPU usage statistics
• psradm - Enables and disables individual CPUs
• psrinfo - Determines which CPUs are enabled
• prtconf - Shows device configuration
• prtdiag - Prints system configuration and diagnostic information
• Process Manager - Shows current activity
• psrset - Manages processor sets

20 What Is Cache?
• Keeps accessed data near its user
• Must provide high-speed data access
• Holds a small subset of the available data
• Is used by hardware and software
• Has many different types around the system
• Is critical for performance
• Can be managed in many different ways

21 SRAM and DRAM
• Hardware caches are usually SRAM.
• Static RAM does not need a refresh.
• DRAM data fades after about 10 cycles.
• Refresh rewrites data, but this delays access.
• SRAM does not fade, so no refresh is needed.
• SRAM takes four times as many transistors as DRAM.
• SRAM will always be more costly and take more power.

22 CPU Caches
• There are usually two levels of cache in a CPU.
• Level one (internal) - Usually 4-40 Kbytes in size
• Level two (external) - Usually 0.5-8 Mbytes in size
• A level one cache operates at CPU speeds.
• Data must be in the level one cache before the CPU can use it.

23 Cache Hit Rate
• The performance of a cache depends on the hit rate.
• The cache hit rate is how often requested data is available.
• The cache hit rate depends on:
  • Size of the cache
  • Fetch rate
  • Locality of data references
  • Cache structure

24 Cache Hit Rate
• Misses are very expensive.
• For example, take 100 accesses:
• Hit cost is 20 units, miss cost is 600 units.
• At a 100 percent hit rate, cost = 100 x 20 = 2000.
• At a 99 percent hit rate, cost = (99 x 20) + (1 x 600) = 1980 + 600 = 2580.
• A one percent miss rate adds 580 units to the 2000-unit baseline (580 / 2000), a 29 percent degradation.
• The exact numbers depend on the system.
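
The arithmetic on this slide can be reproduced with a short sketch. The costs of 20 and 600 units come from the slide; the function names and the choice of 100 accesses as the unit of comparison are assumptions made for the example.

```python
def access_cost(hits, misses, hit_cost=20, miss_cost=600):
    """Total cost of a run of cache accesses, in the slide's abstract units."""
    return hits * hit_cost + misses * miss_cost


def degradation(hit_rate_percent, hit_cost=20, miss_cost=600):
    """Extra cost relative to a 100 percent hit rate, over 100 accesses."""
    baseline = access_cost(100, 0, hit_cost, miss_cost)
    actual = access_cost(hit_rate_percent, 100 - hit_rate_percent,
                         hit_cost, miss_cost)
    return (actual - baseline) / baseline


if __name__ == "__main__":
    print(access_cost(100, 0))   # all hits
    print(access_cost(99, 1))    # one miss in a hundred
    print(degradation(99))       # fractional slowdown caused by that one miss
```

Trying other hit rates shows how quickly the cost grows: each additional miss per hundred accesses adds another 580 units.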

25 System Cache Hierarchies

26 The Memory Free Queue
• Main memory is a fully associative disk cache managed by the OS.
• Move ins and move outs occur as in any cache.
• For memory, this is called paging.
• Moves involving disk are very expensive.
• A move out followed by a move in is too costly.
• Free memory pages are kept available to avoid the need for move outs before a move in.
• You always need to have "enough" pages on the free queue.

27 The Paging Mechanism
• Every quarter second, a check is made to see if the amount of free memory is less than the quantity specified by the lotsfree parameter.
• If it is, the page daemon is run to replenish the free memory queue.
• Pages are stolen from their users if they have not been used recently.
• Pages are scanned to determine which can be stolen.
• The more pages needed, the faster the scan rate.
• If the page daemon cannot keep up, swapping may occur.
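
The relationship between the free-memory shortfall and the scan rate can be sketched as a simple interpolation. This is an illustration under stated assumptions: the slide only says the rate rises as more pages are needed, so the linear shape, the parameter names slowscan and fastscan, and their values here are assumptions, not documented behavior taken from this presentation.

```python
def scan_rate(free_pages, lotsfree, slowscan=100, fastscan=8192):
    """Pages scanned per second for a given amount of free memory.

    Assumed model: no scanning at or above lotsfree, then a linear rise
    from slowscan (just below lotsfree) to fastscan (no free memory).
    """
    if lotsfree <= 0:
        raise ValueError("lotsfree must be positive")
    free_pages = max(0, free_pages)
    if free_pages >= lotsfree:
        return 0  # enough free memory: the page daemon does not run
    shortfall = (lotsfree - free_pages) / lotsfree  # 0.0 .. 1.0
    return slowscan + shortfall * (fastscan - slowscan)


if __name__ == "__main__":
    for free in (1200, 999, 500, 0):
        print(free, scan_rate(free, lotsfree=1000))
```

Whatever the exact curve on a given system, the point is the same as on the slide: the deeper free memory falls below lotsfree, the harder the page daemon works.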

28 Priority Paging
• Is included with the Solaris 7 OS but is disabled by default
• Requires a kernel patch for the Solaris 2.6 and 2.5.1 OS
• Is activated by setting priority_paging to 1 in /etc/system
• Has a new tuning parameter, cachefree, whose default value is twice lotsfree
• Starts the page daemon running at cachefree
• When enabled, steals only file system pages (data files) until lotsfree is reached
• Is very good for large users of random I/O
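
The two thresholds can be summarized in a small decision sketch. The function name and the returned labels are invented for illustration, and the exact behavior at the boundary values is an assumption; the cachefree default of twice lotsfree is from the slide.

```python
def pages_to_steal(free_pages, lotsfree, priority_paging=True):
    """Which pages the page daemon may steal at a given free-memory level."""
    cachefree = 2 * lotsfree  # default stated on the slide
    if free_pages < lotsfree:
        # Below lotsfree the page daemon steals from any user of memory.
        return "any pages"
    if priority_paging and free_pages < cachefree:
        # Between lotsfree and cachefree only file system pages are taken.
        return "file system pages only"
    return "none"


if __name__ == "__main__":
    for free in (2500, 1500, 800):
        print(free,
              pages_to_steal(free, lotsfree=1000),
              pages_to_steal(free, lotsfree=1000, priority_paging=False))
```

With priority paging disabled, nothing is stolen until free memory drops below lotsfree, at which point any page is a candidate.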

29 Swapping
• Swapping is a last resort (desperation swapping).
• If the page daemon consistently cannot keep up with the demand for memory, memory use must be cut.
• The number of swaps is not important; the fact that there are swaps is.
• A process will be swapped when its last LWP is swapped out.
• Its memory will not be freed until then.
• Do not try to tune swaps; try to eliminate them.

30 "tmpfs"
• It allows the creation of a virtual memory ramdisk such as /tmp.
• It uses virtual memory like any other user.
• It uses real memory and swap space.
• If used heavily, /tmp can fill main memory.
• You can limit the use of /tmp.
• The size option in the vfstab entry is used.
• You must not move it to a hard drive partition.

