Performance tuning is the improvement of system performance. The system is typically a computer application, but the same methods can be applied to economic markets, bureaucracies or other complex systems. The motivation for such activity is called a performance problem, which can be real or anticipated. Most systems will respond to increased load with some degree of decreasing performance. A system's ability to accept higher load is called scalability, and modifying a system to handle a higher load is synonymous with performance tuning.

Systematic tuning follows these steps:

  1. Assess the problem and establish numeric values that categorize acceptable behavior.
  2. Measure the performance of the system before modification.
  3. Identify the part of the system that is critical for improving the performance. This is called the bottleneck.
  4. Modify that part of the system to remove the bottleneck.
  5. Measure the performance of the system after modification.

This is an instance of the measure-evaluate-improve-learn cycle from quality assurance.
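The steps above can be sketched as a small measurement harness. This is only an illustrative sketch: the function names, the workload, and the 0.05 second threshold are all assumptions made for the example and do not come from any particular system.

```python
import time

def measure(func, *args, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

def slow_sum_of_squares(n):
    # Hypothetical "before" version: loops over every value.
    return sum([i * i for i in range(n)])

def fast_sum_of_squares(n):
    # Hypothetical "after" version: closed-form formula for 0^2 + ... + (n-1)^2.
    return (n - 1) * n * (2 * n - 1) // 6

TARGET_SECONDS = 0.05  # step 1: an assumed numeric acceptance threshold

before = measure(slow_sum_of_squares, 200_000)  # step 2: measure before
after = measure(fast_sum_of_squares, 200_000)   # step 5: measure after
print(f"before={before:.6f}s after={after:.6f}s target={TARGET_SECONDS}s")
```

Taking the best of several runs is one common way to reduce noise from other activity on the machine; other summaries, such as the median, are equally defensible.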

A performance problem may be identified by slow or unresponsive systems. This usually occurs because of high system load, which causes some part of the system to reach a limit in its ability to respond. This limit within the system is referred to as a bottleneck.

A handful of techniques are used to improve performance. Among them are code optimization, load balancing, caching strategy, distributed computing, and self-tuning.

Performance analysis

See the main article at Performance analysis

Performance analysis, commonly known as profiling, is the investigation of a program's behavior using information gathered as the program executes. Its goal is to determine which sections of a program to optimize.

A profiler is a performance analysis tool that measures the behavior of a program as it executes, particularly the frequency and duration of function calls. Performance analysis tools existed at least from the early 1970s. Profilers may be classified according to their output types, or their methods for data gathering.
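As one possible illustration, Python's standard library includes the `cProfile` profiler, which records how often and for how long each function runs. The workload functions below (`parse`, `total_length`, `workload`) are hypothetical and exist only to give the profiler something to measure.

```python
import cProfile
import io
import pstats

def parse(n):
    # Hypothetical stage 1: convert numbers to strings.
    return [str(i) for i in range(n)]

def total_length(items):
    # Hypothetical stage 2: add up the string lengths.
    return sum(len(s) for s in items)

def workload():
    return total_length(parse(50_000))

profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Collect the report in memory, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists call counts and times per function, which corresponds to the "frequency and duration of function calls" described above.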

Performance engineering

See the main article at Performance engineering

Performance engineering is the discipline encompassing roles, skills, activities, practices, tools, and deliverables used to meet the non-functional requirements of a designed system, such as increasing business revenue, reducing system failures and project delays, and avoiding unnecessary use of resources or work.

Several common activities have been identified in different methodologies:

  • Identification of critical business processes.
  • Elaboration of the processes in use cases and system volumetrics.
  • System construction, including performance tuning.
  • Deployment of the constructed system.
  • Service management, including activities performed after the system has been deployed.

Code optimization

See the main article at Optimization (computer science).

Enhancing performance by rewriting specific portions of a program to run faster is one form of code optimization. The term code optimization can refer to improving the implementation of a particular algorithm for performing a task (code tuning). It can also refer to using a better algorithm. Examples of code optimization include restructuring the code so that work is done once before a loop rather than inside it, or replacing a call to a simple selection sort with a call to the more complicated quicksort algorithm.
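The first kind of change, moving work out of a loop, might look like the following sketch. Both function names are made up for the illustration; the two versions return the same values, but the second computes the loop-invariant norm only once.

```python
import math

def normalize_slow(values):
    # The norm is recomputed for every element although it never changes.
    return [v / math.sqrt(sum(x * x for x in values)) for v in values]

def normalize_fast(values):
    # Same result, with the invariant computed once before the loop.
    norm = math.sqrt(sum(x * x for x in values))
    return [v / norm for v in values]

print(normalize_fast([3.0, 4.0]))
```

For a list of n values, the slow version performs on the order of n squared multiplications, while the fast version performs on the order of n.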

Caching strategy

Main article: Cache

Caching is a fundamental method of removing performance bottlenecks that are the result of slow access to data. Caching improves performance by retaining frequently used information in high-speed memory, which reduces access time. Caching is an effective way of improving performance in situations where the principle of locality of reference applies.

The methods used to determine which data is stored in progressively faster storage are collectively called caching strategies.
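One widely used strategy is least-recently-used (LRU) eviction. A minimal sketch using Python's `functools.lru_cache` is shown below; the `load_record` function and its counter are hypothetical stand-ins for a slow data source.

```python
from functools import lru_cache

calls = 0  # counts how often the "slow" source is actually consulted

@lru_cache(maxsize=128)
def load_record(key):
    # Hypothetical slow lookup; only runs on a cache miss.
    global calls
    calls += 1
    return (key, key * 2)

for key in [1, 2, 1, 1, 3, 2]:
    load_record(key)

info = load_record.cache_info()
print(calls, info.hits, info.misses)
```

Six requests reach the slow lookup only three times, because repeated keys are served from the cache. This benefit depends on the access pattern showing locality of reference, as noted above.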

Load balancing

A system can consist of independent components, each able to service requests. If all the requests are serviced by one of these components (or a small number of them) while others remain idle, then time is wasted waiting for the busy component to become available. Arranging for all components to be used equally is referred to as load balancing and can improve overall performance.

Load balancing is often used to achieve further gains from a distributed system by intelligently selecting which machine to run an operation on based on how busy all potential candidates are, and how well suited each machine is to the type of operation that needs to be performed.
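A simple way to approximate "pick the least busy candidate" is a greedy least-loaded assignment. The sketch below is one possible policy among many, assuming that each task's cost is known in advance; the function name and data layout are invented for the example.

```python
import heapq

def assign_least_loaded(task_costs, worker_count):
    """Give each task, in order, to the worker with the smallest current load."""
    heap = [(0, w) for w in range(worker_count)]  # (current load, worker id)
    assignments = {w: [] for w in range(worker_count)}
    for task_id, cost in enumerate(task_costs):
        load, worker = heapq.heappop(heap)        # least-loaded worker
        assignments[worker].append(task_id)
        heapq.heappush(heap, (load + cost, worker))
    loads = {w: sum(task_costs[t] for t in tasks)
             for w, tasks in assignments.items()}
    return assignments, loads

assignments, loads = assign_least_loaded([5, 3, 2, 4], worker_count=2)
print(assignments, loads)
```

This policy ignores how well suited each worker is to a given task; a real balancer would usually weigh that as well.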

Distributed computing

Main article: Distributed computing

Distributed computing is used to increase the performance of operations that can be performed in parallel, by concurrently executing multiple operations. Operations may be distributed across multiple processes on a single CPU (taking advantage of multitasking), across multiple CPUs, or across multiple machines. Because operations are executed concurrently, synchronization between processes is essential to ensure correct results.
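The need for synchronization can be illustrated on a small scale. The sketch below uses threads within one process rather than separate processes or machines, which is an assumption made to keep the example self-contained; the `Counter` class and `count_words` function are hypothetical. The lock ensures that concurrent updates to the shared total are not lost.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Counter:
    """A shared total protected by a lock."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, amount):
        with self._lock:  # only one worker may update at a time
            self.value += amount

def count_words(lines, counter):
    counter.add(sum(len(line.split()) for line in lines))

chunks = [["a b c", "d e"], ["f"], ["g h", "i j k l"]]
counter = Counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    # list() forces all results, so any worker exception is raised here.
    list(pool.map(lambda chunk: count_words(chunk, counter), chunks))

print(counter.value)
```

In CPython, threads mainly help with I/O-bound work; CPU-bound work would more typically be spread across processes or machines, where the same synchronization concern applies to any shared state.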

As the trend of increasing the potential for parallel execution on modern CPU architectures continues, the use of distributed systems is essential to achieve performance benefits from the available parallelism. High performance cluster computing is a well known use of distributed systems for performance improvements.

Distributed computing and clustering can negatively impact latency while simultaneously increasing load on shared resources, such as database systems. To minimize latency and avoid bottlenecks, distributed computing can benefit significantly from distributed caches.

Self-tuning

Main article: Self-tuning

A self-tuning system is capable of optimizing its own internal running parameters in order to maximize or minimize an objective function, typically efficiency or error. Self-tuning systems typically exhibit non-linear adaptive control. Self-tuning systems have been a hallmark of the aerospace industry for decades, as this sort of feedback is necessary to generate optimal multi-variable control for nonlinear processes.
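A deliberately simplified sketch of the idea follows: a feedback loop that adjusts one internal parameter (a batch size) based on a measured quantity (latency). It is far simpler than the adaptive controllers used in practice, and the latency model, the target, and all names are assumptions made for illustration.

```python
def tune_batch_size(measure_latency, target, batch_size=1, rounds=20):
    """Grow the batch while latency is under target; halve it when over."""
    for _ in range(rounds):
        latency = measure_latency(batch_size)
        if latency > target and batch_size > 1:
            batch_size = max(1, batch_size // 2)
        elif latency < target:
            batch_size += 1
    return batch_size

def simulated_latency(batch_size):
    # Assumed model: each item in the batch costs 2 ms.
    return 2.0 * batch_size

tuned = tune_batch_size(simulated_latency, target=20.0)
print(tuned)
```

Under this simulated model the loop settles on the largest batch size whose latency does not exceed the target.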

Bottlenecks

In any program performing a task, for any resource, there is a minimum amount of that resource needed to accomplish the task. A bottleneck is any part of the program spending the resource for poor reasons and thus spending more than necessary.

A way to identify bottlenecks is to randomly sample a unit of the resource, such as a point in time, and make a full inquiry into the chain of reasoning of why that unit of resource is being spent. The chain of reasoning will often contain a weak reason. Then more samples are taken. If the weak reason applies in some fraction of the samples, one can safely conclude that it is costing approximately that fraction of the resource. Since the weak reason is identified, doing that part of the task differently can save some of the resource, and remove the bottleneck.

It is often the case that a given chain of reasoning, if it is long enough, contains multiple weak reasons. This can have the effect of multiplying the amount of resource being used, above its optimum.

In the case of single-thread programs, and the resource being time, the chain of reasoning of why a point in time is being spent can be determined from the state of the program and its data at that point in time. A useful part of the state is the call stack. The instruction or system call at the "bottom" of the call stack is the one spending the current time increment, and the "call instructions" above it give steps in the chain of reasons why. A way to identify bottlenecks is to ask which call instructions that appear on a large enough fraction of samples have weak rationale. If an instruction with a weak reason appears on a large enough fraction of samples, it is a true bottleneck.
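The bookkeeping behind this approach can be sketched as follows, assuming the call-stack samples have already been collected by some other means (for example, by pausing the program in a debugger). The sample data and function names are invented for the illustration; each stack is listed from the outermost caller to the instruction currently executing.

```python
from collections import Counter

def call_site_fractions(samples):
    """Return, for each call site, the fraction of samples it appears on."""
    counts = Counter()
    for stack in samples:
        for site in set(stack):  # count a site at most once per sample
            counts[site] += 1
    return {site: n / len(samples) for site, n in counts.items()}

samples = [
    ("main", "load_config", "parse_xml"),
    ("main", "render", "format_date", "strcmp"),
    ("main", "render", "format_date", "strcmp"),
    ("main", "render", "draw"),
]

fractions = call_site_fractions(samples)
for site, fraction in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{site:12s} {fraction:.2f}")
```

In this made-up data, `format_date` appears on half of the samples, so if the reason for calling it is weak, roughly half of the run time could be saved. Whether a reason is weak remains a human judgment that the counting cannot make.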

It is often the case that costly function calls are invisible in the source code because they are inserted by the compiler, so simply reviewing code may not find them.

The number of samples to take depends on how much precision is desired in measuring the cost of the bottlenecks. Profiling programs (performance analysis) can take a large number of samples. Small numbers of samples can suffice for large bottlenecks. It is important to use only samples taken while the program is actually working. Samples taken while the program is waiting for user input, for example, can imply that the human is the bottleneck!
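The trade-off between sample count and precision can be estimated with a rough calculation. The sketch below assumes that samples are independent and that each one either shows the bottleneck or does not (a binomial model); that model is an assumption of this example rather than something stated above.

```python
import math

def fraction_standard_error(fraction, sample_count):
    """Standard error of an estimated fraction under a binomial model."""
    return math.sqrt(fraction * (1.0 - fraction) / sample_count)

for n in (10, 100, 1000):
    print(n, round(fraction_standard_error(0.5, n), 4))
```

Under this model, a bottleneck costing half the time is obvious after about ten samples even though its size is known only roughly, while pinning the figure down to within a few percent takes hundreds of samples.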

Programs often contain multiple bottlenecks, of varying costs. Removal of any bottleneck reduces total expenditure of the resource, with the result that the remaining bottlenecks can take a larger fraction of the remainder, and are thus easier to find on subsequent attempts.
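A small worked example makes the arithmetic concrete. The costs below are invented numbers, in seconds, chosen only to illustrate the effect.

```python
def remaining_fractions(costs, removed):
    """Share of the remaining total taken by each item after one is removed."""
    remaining = {name: c for name, c in costs.items() if name != removed}
    total = sum(remaining.values())
    return {name: c / total for name, c in remaining.items()}

# Hypothetical breakdown of a 100-second run.
costs = {"necessary work": 25.0, "bottleneck A": 50.0, "bottleneck B": 25.0}

share_before = costs["bottleneck B"] / sum(costs.values())
share_after = remaining_fractions(costs, "bottleneck A")["bottleneck B"]
print(share_before, share_after)
```

Bottleneck B costs the same 25 seconds in both cases, but after A is removed it accounts for half of the remaining run instead of a quarter, so it shows up on about twice as many samples.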

Examples of bottlenecks include:

  • A hot spot is a tight inner loop where the program counter spends much of its time. (In older systems, the memory chips would actually have hot spots.) For example, if one often finds at the bottom of the call stack a linear search algorithm instead of binary search, this would be a hot spot bottleneck. However, if another function is called in the search loop, such as string compare, then the string compare function would be found at the bottom of the stack, and the call to it in the loop would be at the next level up. In this case, the bottleneck and hot spot are separated. The bottleneck (i.e. the linear search) is not a hot spot, and the hot spot (i.e. string compare) is not a bottleneck.
  • Data structures that are too general for the problem at hand can also impair performance, by adding extra layers of processing. For example, if a collection of objects remains small, a simple array with linear search could be much faster than something like a "dictionary" class, complete with hash coding. With this kind of bottleneck, the program counter is most often found in "housekeeping" such as dynamic memory allocation/de-allocation as these collections are being constructed and then 'destructed'.
  • A function is written to collect a set of useful information (from a database, for example) and that function is called multiple times, because no attempt has been made to cache the results from a prior call.
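The first example can be sketched as follows, using Python's standard `bisect` module for the binary search. The function names and data are hypothetical. In the linear version, the string comparison inside the loop is where sampled time would be observed, while the choice of linear search is what causes so many comparisons to happen.

```python
from bisect import bisect_left

def linear_index(sorted_names, target):
    # Up to one string comparison per element.
    for i, name in enumerate(sorted_names):
        if name == target:
            return i
    return -1

def binary_index(sorted_names, target):
    # About log2(n) comparisons; requires the list to be sorted.
    i = bisect_left(sorted_names, target)
    if i < len(sorted_names) and sorted_names[i] == target:
        return i
    return -1

names = sorted(f"user{i:05d}" for i in range(1000))
print(linear_index(names, "user00500"), binary_index(names, "user00500"))
```

Both functions return the same index; they differ only in how many comparisons are needed to find it.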

In the process of finding and removing bottlenecks, it is important to prove their existence, such as by sampling, before acting to remove them. There is a strong temptation to guess. Guesses, by definition, are often wrong, and investing time in them is itself a bottleneck.

[1][2]


