Tutorial on code optimization

Tips for Optimizing C/C++ Code

1. Code for correctness first, then optimize!

  • This does not mean write a fully functional ray tracer for 8 weeks, then optimize for 8 weeks!

  • Perform optimizations on your ray tracer in multiple steps.

  • Write for correctness, then if you know the function will be called frequently, perform obvious optimizations.

  • Then profile to find bottlenecks, and remove the bottlenecks (by optimization or by improving the algorithm). Often improving the algorithm drastically changes the bottleneck – perhaps to a function you might not expect. This is a good reason to perform obvious optimizations on all functions you know will be frequently used.

2. People I know who write very efficient code say they spend at least twice as long optimizing code as they spend writing code.

3. Jumps/branches are expensive. Minimize their use whenever possible.

  • Function calls require two jumps, in addition to stack memory manipulation.

  • Prefer iteration over recursion.

  • Use inline functions for short functions to eliminate function overhead.

  • Move loops inside function calls (e.g., change for(i=0;i<100;i++) DoSomething(); into DoSomething() { for(i=0;i<100;i++) { … } } ).

  • Long if…else if…else if…else if… chains require lots of jumps for cases near the end of the chain (in addition to testing each condition). If possible, convert to a switch statement, which the compiler sometimes optimizes into a table lookup with a single jump. If a switch statement is not possible, put the most common clauses at the beginning of the if chain.
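
As a rough sketch of the "move loops inside function calls" tip above: the names scaleOne, callerLoopBefore, and scaleAll are hypothetical, invented only for this illustration. An optimizing compiler may inline the small helper anyway, so the gain should be measured rather than assumed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-element helper: one function call per value.
void scaleOne(float& value, float factor) {
    value *= factor;
}

// Before: the loop sits outside the helper, so (unless the compiler
// inlines scaleOne) every iteration pays the call overhead.
void callerLoopBefore(float* values, std::size_t count, float factor) {
    for (std::size_t i = 0; i < count; ++i) {
        scaleOne(values[i], factor);
    }
}

// After: the loop lives inside the function, so the caller makes one call.
void scaleAll(float* values, std::size_t count, float factor) {
    assert(values != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        values[i] *= factor;
    }
}
```

Both versions produce the same values; only the number of calls differs.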

4. Avoid/reduce the number of local variables.

  • Local variables are normally stored on the stack. However, if there are few enough, they can instead be stored in registers. In this case, the function not only gets the benefit of the faster access to data stored in registers, but it may also avoid the overhead of setting up a stack frame.

  • (Do not, however, switch wholesale to global variables!)

5. Reduce the number of function parameters.

  • For the same reason as reducing local variables – they are also stored on the stack.

6. If you do not need a return value from a function, do not define one.

7. Try to avoid casting where possible.

  • Integer and floating point instructions often operate on different registers, so a cast requires a copy.

  • Shorter integer types (char and short) still require the use of a full-sized register, and they need to be padded to 32/64 bits and then converted back to the smaller size before being stored back in memory. (However, this cost must be weighed against the additional memory cost of a larger data type.)

8. Use shift operations >> and << instead of integer multiplication and division by powers of two, where possible.
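
A small sketch of what this looks like in practice, using hypothetical helper names (timesEight, divideByFour, timesTen, scaleByPowerOfTwo). It sticks to unsigned values on purpose: for negative signed integers, >> rounds toward negative infinity on typical platforms while / truncates toward zero, so the two are not interchangeable there. Modern optimizing compilers usually perform this substitution themselves for constant powers of two.

```cpp
#include <cassert>
#include <cstdint>

// x * 8 written as a left shift by 3 (8 == 2^3).
std::uint32_t timesEight(std::uint32_t x) {
    return x << 3;
}

// x / 4 written as a right shift by 2; exact for unsigned values.
std::uint32_t divideByFour(std::uint32_t x) {
    return x >> 2;
}

// x * 10 decomposed as x * 8 + x * 2.
std::uint32_t timesTen(std::uint32_t x) {
    return (x << 3) + (x << 1);
}

// x * 2^k; shifting by the full width or more is undefined, hence the check.
std::uint32_t scaleByPowerOfTwo(std::uint32_t x, unsigned k) {
    assert(k < 32);
    return x << k;
}
```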

9. Simplify your equations on paper!

  • In many equations, terms cancel out… either always or in some special cases.

  • The compiler generally cannot find these simplifications, but you can. Eliminating a few expensive operations inside an inner loop can speed up your program more than days of work on other parts.

10. Consider ways of rephrasing your math to eliminate expensive operations.

  • If you perform a loop, make sure computations that do not change between iterations are pulled out of the loop.

  • Consider if you can compute values in a loop incrementally (instead of computing from scratch each iteration).
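
The two bullets above can be sketched together. The function names and the formula out[i] = sqrt(a*a + b*b) * (start + i * step) are made up for illustration. Note that the incremental version accumulates floating-point rounding, so its results can differ slightly from the from-scratch version over long loops.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Straightforward version: recomputes the square root and the full
// expression from scratch on every iteration.
void fillNaive(double* out, std::size_t n,
               double a, double b, double start, double step) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::sqrt(a * a + b * b) * (start + static_cast<double>(i) * step);
    }
}

// Optimized version: the loop-invariant scale is hoisted out of the loop,
// and each value is derived from the previous one with a single addition.
void fillIncremental(double* out, std::size_t n,
                     double a, double b, double start, double step) {
    assert(out != nullptr || n == 0);
    const double scale = std::sqrt(a * a + b * b);  // invariant, computed once
    const double delta = scale * step;              // change between iterations
    double value = scale * start;                   // value for i == 0
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = value;
        value += delta;
    }
}
```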

11. Avoid unnecessary data initialization.

  • If you must initialize a large chunk of memory, consider using memset().
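
A minimal sketch of the memset() suggestion, with hypothetical function names. memset() writes bytes, so it is only appropriate for plain data types and for values such as 0 whose byte pattern is the same in every byte; it should not be used on objects with constructors or virtual functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Element-by-element clear, shown for comparison.
void clearWithLoop(int* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = 0;
    }
}

// Clears the whole block in one library call.
void clearWithMemset(int* data, std::size_t count) {
    assert(data != nullptr);
    std::memset(data, 0, count * sizeof(int));
}
```

Compilers often turn the simple loop into a memset() call on their own, so the difference may only show up in unoptimized builds.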

12. For most classes, use the operators +=, -=, *=, and /= instead of the operators +, -, *, and /.

  • The simple operators need to create an unnamed, temporary intermediate object.

  • For instance: Vector v = Vector(1,0,0) + Vector(0,1,0) + Vector(0,0,1); creates five unnamed, temporary Vectors: Vector(1,0,0), Vector(0,1,0), Vector(0,0,1), Vector(1,0,0) + Vector(0,1,0), and Vector(1,0,0) + Vector(0,1,0) + Vector(0,0,1).

  • The slightly more verbose code: Vector v(1,0,0); v += Vector(0,1,0); v += Vector(0,0,1); only creates two temporary Vectors: Vector(0,1,0) and Vector(0,0,1). This saves 6 function calls (3 constructors and 3 destructors).
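
To make the counting concrete, here is a sketch with a hypothetical Vector class that counts every object it constructs. The class and the two count functions are assumptions for illustration, not the Vector from any particular ray tracer. The exact count for the + version depends on copy elision (guaranteed from C++17 on, usually applied by earlier compilers), so the sketch only relies on + constructing more objects than +=.

```cpp
#include <cassert>

// Hypothetical 3D vector that counts how many objects get constructed.
struct Vector {
    static int constructed;
    double x, y, z;

    Vector(double x_, double y_, double z_) : x(x_), y(y_), z(z_) { ++constructed; }
    Vector(const Vector& o) : x(o.x), y(o.y), z(o.z) { ++constructed; }

    // Modifies this object in place; no temporary is needed.
    Vector& operator+=(const Vector& r) {
        x += r.x; y += r.y; z += r.z;
        return *this;
    }
};
int Vector::constructed = 0;

// Has to build a new Vector to hold the result.
Vector operator+(const Vector& a, const Vector& b) {
    return Vector(a.x + b.x, a.y + b.y, a.z + b.z);
}

// Returns how many Vectors the chained + expression constructs.
int countWithPlus() {
    Vector::constructed = 0;
    Vector v = Vector(1, 0, 0) + Vector(0, 1, 0) + Vector(0, 0, 1);
    assert(v.x == 1 && v.y == 1 && v.z == 1);
    return Vector::constructed;
}

// Returns how many Vectors the += version constructs.
int countWithPlusEquals() {
    Vector::constructed = 0;
    Vector v(1, 0, 0);
    v += Vector(0, 1, 0);
    v += Vector(0, 0, 1);
    assert(v.x == 1 && v.y == 1 && v.z == 1);
    return Vector::constructed;
}
```

Calling both functions and comparing the returned counts shows the difference directly.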

13. Delay declaring local variables.

  • Declaring an object variable always involves a function call (to the constructor).

  • If a variable is only needed sometimes (e.g., inside an if statement), declare it only where it is needed, so the constructor is only called if the variable will be used.
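
A sketch of the idea, assuming a hypothetical Matrix class whose constructor does some work and counts its calls. The names Matrix, transformEarly, and transformLate are invented for this example.

```cpp
#include <cassert>

// Hypothetical 4x4 matrix; the constructor fills in an identity matrix
// and counts how often it runs.
struct Matrix {
    static int constructed;
    double m[16];

    Matrix() {
        ++constructed;
        for (int i = 0; i < 16; ++i) {
            m[i] = (i % 5 == 0) ? 1.0 : 0.0;  // diagonal entries are 0, 5, 10, 15
        }
    }

    double at(int i) const {
        assert(i >= 0 && i < 16);
        return m[i];
    }
};
int Matrix::constructed = 0;

// Declared at the top: the constructor runs on every call,
// even when the matrix is never used.
double transformEarly(double x, bool needsTransform) {
    Matrix identity;
    if (needsTransform) {
        return identity.at(0) * x;
    }
    return x;
}

// Declared inside the branch: the constructor runs only when needed.
double transformLate(double x, bool needsTransform) {
    if (needsTransform) {
        Matrix identity;
        return identity.at(0) * x;
    }
    return x;
}
```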

14. Avoid dynamic memory allocation during computation.

  • Dynamic memory is great for storing the scene and other data that does not change during computation.

  • However, on many (most) systems dynamic memory allocation requires the use of locks to control access to the allocator. For multi-threaded applications that use dynamic memory, you may actually get a slowdown by adding additional processors, due to the wait to allocate and free memory!

  • Even for single-threaded applications, allocating memory on the heap is more expensive than allocating it on the stack. The allocator (and sometimes the operating system) needs to perform some computation to find a memory block of the requisite size.
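
One common way to follow this advice is to allocate scratch space once and reuse it inside the hot loop. The sketch below assumes a hypothetical per-pixel median filter; medianAllocating and medianReusing are illustrative names, and std::vector stands in for whatever buffer type the program actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Allocates a fresh heap buffer on every call.
double medianAllocating(const std::vector<double>& samples) {
    assert(!samples.empty());
    std::vector<double> sorted(samples);  // heap allocation per call
    std::sort(sorted.begin(), sorted.end());
    return sorted[sorted.size() / 2];     // upper median for even sizes
}

// Reuses caller-owned scratch space; once the scratch vector has grown
// to the largest sample count, later calls typically do not allocate.
double medianReusing(const std::vector<double>& samples,
                     std::vector<double>& scratch) {
    assert(!samples.empty());
    scratch.assign(samples.begin(), samples.end());
    std::sort(scratch.begin(), scratch.end());
    return scratch[scratch.size() / 2];
}
```

In a multi-threaded renderer, each thread would keep its own scratch vector so that no locking is needed.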


I understand why the class should use += instead of +, but for basic data types, why not use += ? Aren’t they the same for basic data types ? ( i += 1 ) == ( i = i + 1 ) ?? And what about ++ and -- ?

thx for this tutorial, it was really helpful : )

you are welcome : )
