- By **ozzy**

## PowerPoint Slideshow about 'Optimizing General Compiler Optimization' - ozzy

Presentation Transcript

Problem: Optimizing optimizations

- A compiler usually has many optimization settings (e.g., peephole, delayed-branch)
- gcc 3.3 has 54 optimization options
- gcc 4 has over 100 possible settings
- Very little is known about how these options affect each other
- Compiler writers typically include switches that bundle together many optimization options
- gcc -O1, -O2, -O3

…but can we do better?

- It is possible to perform better than these predefined optimization settings, but doing so requires extensive knowledge of the code as well as the available optimization options
- How do we define one set of options that would work well with a large variety of programs?

Motivation for this paper

- Since there are too many optimization settings, an exhaustive search would cost too much
- gcc 3: on the order of 2^50 different combinations!
- We want to define a systematic method to find the optimal settings, with a reduced search space
- Ideally, we would like to do this with minimal knowledge of what the options actually will do
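To get a feel for why exhaustive search is ruled out, here is a rough back-of-the-envelope sketch. The one-second cost per compile-and-run is purely an assumption for illustration, and `exhaustive_years` is a hypothetical helper name; neither comes from the slides.

```python
# Rough cost of an exhaustive search over k on/off options.
# Assumption (not from the slides): one compile-and-run takes 1 second.
SECONDS_PER_YEAR = 365 * 24 * 3600

def exhaustive_years(k: int, seconds_per_run: float = 1.0) -> float:
    """Years needed to try all 2**k on/off combinations once."""
    return (2 ** k) * seconds_per_run / SECONDS_PER_YEAR

if __name__ == "__main__":
    for k in (10, 20, 50):
        years = exhaustive_years(k)
        print(f"k={k}: {2 ** k} combinations, about {years:.3g} years")
```

Even under this optimistic assumption, 50 options already put a single benchmark in the range of tens of millions of years.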

Big Idea

- We want to find the biggest subsets of compiler options that positively interact with each other
- Once we obtain these subsets, we will try to combine them together, under the condition that they do not negatively affect each other
- We will select our ultimate optimal compiler setting from the result of these set combinations

Full vs. Fractional Factorial Design

- Full Factorial Design: explores the entire search space, with every possible combination
- Given k on/off options, this requires 2^k experiments, i.e. O(2^k) time
- Fractional Factorial Design: explores a reduced search space, that is representative of the full search space
- This can be done using orthogonal arrays

Orthogonal Arrays

- An Orthogonal Array is a matrix of 0s and 1s.
- The rows represent the experiments to be performed.
- The columns represent the factors (here, compiler options) that the experiment tries to analyze
- Each option is turned on in exactly half of the experiments and off in the other half.
- Among the experiments in which one particular option is turned on, every other option is still turned on in exactly half of them and off in the other half
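The slides do not say how the orthogonal array is constructed. As one possible sketch, a two-level array with these balance properties can be derived from a Sylvester-type Hadamard matrix; the function names below are hypothetical and this construction is a stand-in, not necessarily the one used in the paper.

```python
from collections import Counter
from itertools import combinations

def sylvester_hadamard(m: int) -> list[list[int]]:
    """Return the 2**m x 2**m Sylvester Hadamard matrix with entries +1/-1."""
    h = [[1]]
    for _ in range(m):
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def orthogonal_array(m: int) -> list[list[int]]:
    """Array with 2**m rows (experiments) and 2**m - 1 columns (options)."""
    h = sylvester_hadamard(m)
    # Drop the all-ones first column; map +1 -> 0 (off) and -1 -> 1 (on).
    return [[(1 - x) // 2 for x in row[1:]] for row in h]

def is_balanced(oa: list[list[int]]) -> bool:
    """Check that every pair of columns shows each on/off combination equally often."""
    n, k = len(oa), len(oa[0])
    for i, j in combinations(range(k), 2):
        counts = Counter((row[i], row[j]) for row in oa)
        if any(counts[(a, b)] != n // 4 for a in (0, 1) for b in (0, 1)):
            return False
    return True

if __name__ == "__main__":
    oa = orthogonal_array(3)  # 8 experiments covering 7 options
    for row in oa:
        print(row)
    print("balanced:", is_balanced(oa))
```

With `m = 3`, 7 options are covered by 8 experiments instead of the 2^7 = 128 a full factorial design would need.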

Algorithm – Step 1

- Finding maximum subsets of positively interacting options
- Step 1.1: Find a set of options that give the best overall improvement
- For each single optimization option i, compute the average speedup over all the experiments in the (reduced) search space in which i is turned on
- Select the M options with the highest average improvement
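Step 1.1 can be sketched as follows, assuming the experiments are given as rows of 0/1 values and each row has one measured speedup. The function names and any data passed in are hypothetical, chosen only for illustration.

```python
def average_speedup_when_on(design, speedups):
    """For each option (column), average the speedups of the rows where it is on."""
    k = len(design[0])
    averages = []
    for i in range(k):
        on = [s for row, s in zip(design, speedups) if row[i] == 1]
        # An option that is never on cannot be ranked; push it to the bottom.
        averages.append(sum(on) / len(on) if on else float("-inf"))
    return averages

def top_m_options(design, speedups, m):
    """Return the indices of the m options with the highest average speedup."""
    averages = average_speedup_when_on(design, speedups)
    ranked = sorted(range(len(averages)), key=lambda i: averages[i], reverse=True)
    return ranked[:m]

if __name__ == "__main__":
    # Made-up measurements for a 4-experiment, 3-option design.
    design = [[0, 0, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
    speedups = [1.00, 1.10, 1.02, 1.08]
    print(average_speedup_when_on(design, speedups))
    print(top_m_options(design, speedups, 2))
```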

Algorithm – Step 1(cont.)

- Step 1.2: Iteratively add new options to the already obtained sets, to get a maximum set of positively reinforcing optimizations
- Ex: If using options A and B together produces a better setting than just using A, then add B
- If using {A, B} and C together produces a better setting than {A, B}, then add C to {A, B}
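The iterative growth in Step 1.2 might look like the greedy loop below. The slides do not specify how a candidate set is scored, so `measure` is a hypothetical callable that returns the speedup of a set of options (for example, estimated from the experiments already run).

```python
def grow_set(seed, candidates, measure):
    """Greedily add options to `seed` while each addition improves the score.

    `measure` is an assumed scoring function: set of options -> speedup.
    """
    current = frozenset(seed)
    best = measure(current)
    improved = True
    while improved:
        improved = False
        for option in candidates:
            if option in current:
                continue
            trial = current | {option}
            score = measure(trial)
            if score > best:  # keep the option only if it helps
                current, best = trial, score
                improved = True
    return current, best

if __name__ == "__main__":
    # Toy scoring model (made up): additive gains, but B and C clash.
    gains = {"A": 0.05, "B": 0.03, "C": 0.02, "D": -0.01}

    def toy_measure(options):
        penalty = 0.04 if {"B", "C"} <= options else 0.0
        return 1.0 + sum(gains[o] for o in options) - penalty

    print(grow_set({"A"}, ["B", "C", "D"], toy_measure))
```

In the toy model, B is added to {A}, while C (which clashes with B) and D (which slows things down) are rejected.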

Algorithm – Step 2

- Take the sets that we already have and try to combine them together, assuming that they do not negatively influence each other.
- This is done to maximize the number of settings turned on for each set
- Example:
- If {A, B, C} and {D, E} do not counteract each other, then we can combine them into {A, B, C, D, E}
- Otherwise, leave them separate
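A minimal sketch of the merging step follows. The slides do not define "do not counteract"; the criterion used here (the union scores at least as well as the better of the two sets) is an assumption, as are the `combine_sets` name and the `measure` callable.

```python
def combine_sets(sets, measure):
    """Repeatedly merge pairs of option sets that do not hurt each other.

    Assumed criterion: merge two sets when their union scores at least as
    well as the better of the two on its own.
    """
    merged = [frozenset(s) for s in sets]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                union = merged[i] | merged[j]
                if measure(union) >= max(measure(merged[i]), measure(merged[j])):
                    # Replace the two sets by their union and start over.
                    merged = [s for idx, s in enumerate(merged) if idx not in (i, j)]
                    merged.append(union)
                    changed = True
                    break
            if changed:
                break
    return merged

if __name__ == "__main__":
    # Toy scoring model (made up): additive gains, but B and C clash.
    gains = {"A": 0.05, "B": 0.03, "C": 0.02, "E": 0.01}

    def toy_measure(options):
        penalty = 0.04 if {"B", "C"} <= options else 0.0
        return 1.0 + sum(gains[o] for o in options) - penalty

    print(combine_sets([{"A", "B"}, {"C"}, {"E"}], toy_measure))
```

Each merge reduces the number of sets by one, so the loop always terminates.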

Algorithm – Step 3

- Take the resulting sets from step 2, and select the one with the best overall improvement.
- The result would be the ideal combination of optimization settings, according to this methodology.
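The final selection is a simple maximum over the candidate sets. The sketch below also renders the winner as gcc flags using the `-f<name>` / `-fno-<name>` convention; the helper names, the `measure` callable, and the choice of options are illustrative assumptions, not taken from the slides.

```python
def select_best(sets, measure):
    """Pick the candidate set with the highest score under `measure`."""
    return max(sets, key=measure)

def to_gcc_flags(enabled, all_options):
    """Turn each option on with -f<name> or off with -fno-<name>."""
    return [
        f"-f{name}" if name in enabled else f"-fno-{name}"
        for name in sorted(all_options)
    ]

if __name__ == "__main__":
    # Made-up scores for two candidate sets of gcc options.
    scores = {
        frozenset({"gcse", "schedule-insns"}): 1.09,
        frozenset({"peephole2"}): 1.02,
    }
    best = select_best(list(scores), scores.get)
    print(to_gcc_flags(best, ["gcse", "schedule-insns", "peephole2"]))
```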

Comparing results

- The compiler setting obtained by this methodology outperforms -O1, -O2, and -O3 on almost all the SPECint95 benchmarks
- -O3 performs better on li (39.2% vs. 38.4%)
- The new setting delivers the best performance for perl (18.4% vs. 10.5%)

Conclusion

- The paper introduced a systematic way of combining compiler optimization settings
- Used a reduced search space, constructed as an orthogonal array
- Can be done with no knowledge of what the individual options actually do
- Can be done independently of architecture
- Can be applied to a wide variety of applications

Future work

- Using the same methodology to find a good optimization setting for a particular domain of applications
- Applying the methodology to newer versions of the gcc compiler, such as gcc 4.0.1
