I've been using the Visual Studio 2008 beta for a while now, and recently I discovered its built-in support for unit testing. It's pretty slick and well integrated, and I soon found myself using it to answer some lingering performance questions about how best to pass large value types around in C#. In my case, I'm interested in Matrix3D, a struct holding a 4x4 matrix: 16 doubles, weighing in at 128 bytes. That's quite a chunk of data to pass as a parameter to a function or to copy from the stack as a return value. Because so much of my code passes matrices and other large value types around, I decided to use the new unit testing features to measure which approach performs best. This is what I found out.
I've attached the .cs file containing these tests below. The tests are pretty simple and come in two kinds: getters and setters. Each kind has several forms: using properties, using ref or out, using pass/return by value, and direct member access. This seems to cover all the possible ways to get value data into or out of another object. For each permutation, I also added a variant that does a small amount of logic to keep the optimizer from making any test-breaking optimizations. As it turns out, that wasn't necessary: the results show the simpler forms of the functions have a distinct timing profile. I have left the variant results out of the table below, but the tests are in the attached file if you are interested.
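To make the forms concrete, here is a minimal sketch of one object exposing the same data four ways. `BigMatrix` and `Holder` are made-up names standing in for Matrix3D and the test class; this is an illustration of the idea under those assumptions, not a copy of the attached file.

```csharp
// Hypothetical stand-in for Matrix3D: 16 doubles, 128 bytes.
public struct BigMatrix
{
    public double M11, M12, M13, M14;
    public double M21, M22, M23, M24;
    public double M31, M32, M33, M34;
    public double M41, M42, M43, M44;
}

// Hypothetical holder exposing the same matrix four different ways.
public class Holder
{
    // 1. Direct member access: a public field.
    public BigMatrix Field;

    // 2. Property: get hands back a copy, set receives a copy.
    public BigMatrix Value
    {
        get { return Field; }
        set { Field = value; }
    }

    // 3. Pass/return by value through ordinary methods.
    public BigMatrix GetByValue() { return Field; }
    public void SetByValue(BigMatrix m) { Field = m; }

    // 4. out/ref: the caller passes the address of its own variable.
    public void GetByOut(out BigMatrix m) { m = Field; }
    public void SetByRef(ref BigMatrix m) { Field = m; }
}
```

A caller would write `h.SetByRef(ref m);` to set and `BigMatrix d; h.GetByOut(out d);` to get, versus `h.Value = m;` or `BigMatrix b = h.GetByValue();` for the copying forms.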
When setting the values, I also tried explicit construction and member setting just to be thorough (and they were the slowest of all, which was no surprise). Then I added a set of setter functions that pass the parameter to an intermediate function before it reaches the internal property or reference setting function. This simulates a multi-layer case, where a matrix is passed to a function that then stores it in the final destination. These multi-layer functions come in two forms: one that has a single layer below it, and another that passes the matrix down through two functions before reaching the destination. I know that isn't the clearest description, so check out the attached code for clarification.
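As a rough picture of the multi-layer case, the sketch below forwards a matrix through one or two intermediate functions, once by value and once by ref. All names here (`BigMatrix`, `LayeredHolder`, and the method names) are hypothetical and chosen for illustration only.

```csharp
// Hypothetical stand-in for Matrix3D: 16 doubles, 128 bytes.
public struct BigMatrix
{
    public double M11, M12, M13, M14;
    public double M21, M22, M23, M24;
    public double M31, M32, M33, M34;
    public double M41, M42, M43, M44;
}

public class LayeredHolder
{
    private BigMatrix stored;

    // Final destination: the functions that actually store the matrix.
    private void StoreByValue(BigMatrix m) { stored = m; }
    private void StoreByRef(ref BigMatrix m) { stored = m; }

    // One layer above the destination.
    public void SetByValueOneLayer(BigMatrix m) { StoreByValue(m); }
    public void SetByRefOneLayer(ref BigMatrix m) { StoreByRef(ref m); }

    // Two layers above the destination.
    public void SetByValueTwoLayers(BigMatrix m) { SetByValueOneLayer(m); }
    public void SetByRefTwoLayers(ref BigMatrix m) { SetByRefOneLayer(ref m); }

    // Small accessor so a caller can check what was stored.
    public double FirstElement { get { return stored.M11; } }
}
```

In the by-value chain each call conceptually passes its own copy of the struct down, while the ref chain passes the same address through every layer. The JIT is free to inline small forwarding methods like these, so a sketch this simple may not reproduce the timing differences on its own.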
The timing results come from performing the various get and set functions 100,000,000 times. I'm not convinced that the test system is measuring timing correctly, because there are some inconsistencies between runs that I can't explain. However, multiple runs do show an average that is reasonably consistent. In my attempts to stabilize the timing, I realized that I needed to ensure all the functions were JITted before the timed runs started, so in the test class constructor I invoke each test function once so they are ready for the actual timing test. Once that was in place, timing became much more consistent, though not completely.
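The warm-up idea can be sketched with a plain `Stopwatch` loop. This is only an illustration: it assumes manual timing with `System.Diagnostics.Stopwatch` rather than the unit test framework's own timing, and `BigMatrix`, `Holder`, and `TimeSetByRef` are hypothetical names.

```csharp
using System.Diagnostics;

// Hypothetical stand-in for Matrix3D: 16 doubles, 128 bytes.
public struct BigMatrix
{
    public double M11, M12, M13, M14;
    public double M21, M22, M23, M24;
    public double M31, M32, M33, M34;
    public double M41, M42, M43, M44;
}

public class Holder
{
    private BigMatrix stored;
    public void SetByRef(ref BigMatrix m) { stored = m; }
    public double FirstElement { get { return stored.M11; } }
}

public static class TimingSketch
{
    // Returns the elapsed milliseconds for `iterations` calls to SetByRef.
    public static long TimeSetByRef(int iterations)
    {
        Holder h = new Holder();
        BigMatrix m = new BigMatrix();
        m.M11 = 1.0;

        // Warm-up: the first call triggers JIT compilation of SetByRef,
        // so that cost is paid before the stopwatch starts.
        h.SetByRef(ref m);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            h.SetByRef(ref m);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Calling `TimingSketch.TimeSetByRef(100000000)` mirrors the iteration count used here. Other processes, CPU frequency changes, and optimizer decisions can still move the numbers from run to run, so averaging several runs remains a good idea.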
Here is the result of the timing on my machine, a Vista 64 AMD X2 4400:
[Table image: Net Parameter Timing]
Over the course of multiple runs, the variance in timing suggests that setting by reference is nearly as fast as setting by direct member access. Getting data via direct member access is by far the fastest way (about 29 times faster than the next fastest). Unfortunately, direct member access is only occasionally an option: in many (if not most) situations, parameters need to percolate through a variety of functions before reaching their final destination, and side effects need to occur as a result of the data change. In short, direct member access is great if it is an option, but it usually isn't.
The Winners:
Get: Using Get functions with out parameters.
As you can see from the table, direct member access is the fastest way to extract value type data from an object. There is no question that if you want code that pulls a Matrix3D or other large value type from some other object to perform quickly, you should try to provide direct member access to it. However, for the reasons mentioned above, this is not always an option. The next best thing is a Get function with an out parameter. While it is true that the return value and property forms are nearly as fast, they both suffer substantial penalties if the value must be returned up the stack more than a single frame, which is very common. The out form is effectively immune to that problem, with only the slight overhead of passing a pointer in to the getter function.
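Here is a small sketch of what "returned up the stack more than a single frame" looks like, next to the equivalent out form. `BigMatrix`, `Inner`, and `Outer` are hypothetical names, and the comments describe the conceptual cost rather than exactly what the JIT emits.

```csharp
// Hypothetical stand-in for Matrix3D: 16 doubles, 128 bytes.
public struct BigMatrix
{
    public double M11, M12, M13, M14;
    public double M21, M22, M23, M24;
    public double M31, M32, M33, M34;
    public double M41, M42, M43, M44;
}

public class Inner
{
    private BigMatrix stored;
    public Inner(double m11) { stored.M11 = m11; }

    public BigMatrix GetByValue() { return stored; }
    public void GetByOut(out BigMatrix m) { m = stored; }
}

public class Outer
{
    private Inner inner;
    public Outer(Inner inner) { this.inner = inner; }

    // The value is returned from Inner, then returned again from Outer:
    // conceptually one full struct copy per frame it travels through.
    public BigMatrix GetByValue() { return inner.GetByValue(); }

    // The caller's variable is handed down by address and written once,
    // at the bottom of the call chain.
    public void GetByOut(out BigMatrix m) { inner.GetByOut(out m); }
}
```

With this shape, `BigMatrix m; outer.GetByOut(out m);` only passes an address through each frame, no matter how many layers sit between the caller and the stored matrix.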
Set: Using Set functions with ref parameters.
Setting data by direct member access is only slightly faster than setting via a function call with a ref parameter. Unfortunately, if you expose the data as a public member variable so it can be read directly, you are also committed to exposing it to writes at any time. This has some pretty severe implications for application design, and often it is simply not an option when there are other data structures that need to be updated when something changes. As for the alternatives, the performance overhead of setting by value through two layers of function calls is nearly twice that of passing by reference.
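As an example of the kind of side effect a public field can't provide, the sketch below marks an object dirty whenever its matrix is set through a ref setter. `BigMatrix`, `Node`, and the dirty flag are all hypothetical, invented only to illustrate the design trade-off.

```csharp
// Hypothetical stand-in for Matrix3D: 16 doubles, 128 bytes.
public struct BigMatrix
{
    public double M11, M12, M13, M14;
    public double M21, M22, M23, M24;
    public double M31, M32, M33, M34;
    public double M41, M42, M43, M44;
}

public class Node
{
    private BigMatrix transform;   // private, so every write goes through SetTransform
    private bool dirty;

    public void SetTransform(ref BigMatrix m)
    {
        transform = m;   // copied once into the field
        dirty = true;    // side effect: dependent data needs to be rebuilt
    }

    public void GetTransform(out BigMatrix m) { m = transform; }

    public bool IsDirty { get { return dirty; } }
    public void ClearDirty() { dirty = false; }
}
```

Because the field stays private, nothing can change the matrix without also setting the flag, which is the guarantee a public member variable gives up.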
Conclusions
It is clear that with Matrix3D, the answer is to use references instead of passing or getting by value. One thing to consider, however, is that Matrix3D is a pretty large value type. Because pass by value costs just under double what pass by reference does for nested calls, it may be that smaller value types (such as Vector3) would get better or equivalent performance from pass by value. Further testing would be required to know the answer to that.
That is all I have time for today, but I do hope to revisit this in the future. Please let me know if you found this helpful.
| Attachment | Size |
|---|---|
| ParameterTests.cs | 9.49 KB |