Ask a Jedi: Why is one UDF faster than another? Variables?
Patrick Stormthunder asks:
After some testing on my company's site for the purpose of optimization, I came across something that has shaken my faith in, well, variables. I am hoping you can help me understand these results.
What follows is a simple implementation of a "null" value with the test for null (isNull()) written two ways, the first with a variable reference and the second without one. Running the first test 1000 times takes 7000 milliseconds on my development machine. Running the second test takes 25 milliseconds.
Before showing Patrick's code, I'll say right away that I'm not a big fan of the 'run this code block 1 zillion times' speed test. I don't think it's very reliable or realistic. Yes, I've done it myself, but at the end of the day I think you can do better. As an example of why I don't trust speed tests, when I ran his code, my results were incredibly different. The slow code took, on average, 30ms, while the fast code took about 8ms. At that level, I'd simply not worry about it and use what is more appropriate/better written/etc. Let's take a quick look at his code though:
<cfset variables.null="|\| |_| |_ |_"/>
<cffunction name="isNull" hint="is the valueless?" access="public" output="false">
<cfargument name="value" required="true" />
<cfreturn arguments.value EQ variables.null />
</cffunction>
<cftimer label="test1">
<cfloop index="i" from="0" to="1000">
<cfset myTestVar=isNull(i) />
</cfloop>
</cftimer>
<cfset variables.null="|\| |_| |_ |_"/>
<cffunction name="isNull2" hint="is the valueless?" access="public" output="false">
<cfargument name="value" required="true" />
<cfreturn arguments.value EQ "|\| |_| |_ |_" />
</cffunction>
<cftimer label="test2">
<cfloop index="i" from="0" to="1000">
<cfset myTestVar=isNull2(i) />
</cfloop>
</cftimer>
As you can see, he has two UDFs. Each takes a value and compares it to a "null" marker, which is just a string. The first UDF is the slow one, but as I said, even though my test showed it to be about 3 times as slow, it still only took around 30 ms. What bugs me more, and what I think is more important, is the break in proper encapsulation. The UDF directly accesses the outside Variables scope, and shoot, perhaps that is the reason for the slowdown. Either way, even if the speed tests were reversed, I'd recommend the second UDF simply because it is better written and more encapsulated.
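One way to keep a single definition of the null marker without having the UDF reach into the Variables scope is to pass the marker in as an argument. This is only a sketch, not a recommendation backed by timings: the function name isNullOf and the argument name nullValue are made up for illustration, and it has not been run through the tests above.

```cfml
<cfset variables.null = "|\| |_| |_ |_" />

<!--- Hypothetical variant: the marker is passed in by the caller,
      so the function only reads its own Arguments scope. --->
<cffunction name="isNullOf" access="public" output="false">
    <cfargument name="value" required="true" />
    <cfargument name="nullValue" required="true" />
    <cfreturn arguments.value EQ arguments.nullValue />
</cffunction>

<!--- The caller still owns the one and only definition of the marker. --->
<cfset myTestVar = isNullOf(42, variables.null) />
```

The trade-off is that every caller now has to supply the marker, which is more typing but makes the dependency explicit.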
On a whim I went ahead and added a third test:
<cftimer label="test3">
<cfloop index="i" from="0" to="1000">
<cfset someres = (i eq variables.null)>
</cfloop>
</cftimer>
This returned results right in the middle of the two UDFs. I then wrote one more test:
<cftimer label="test4">
<cfloop index="i" from="0" to="1000">
<cfset someres = compare(i, variables.null) is 0>
</cfloop>
</cftimer>
This code block ran the fastest, although it was only slightly faster than the second UDF, and I'm sure if I used compare in the UDF it would catch up. But you know what? I hate compare. I can never remember its API, so at the end of the day I'll use EQ just to make things more readable.
p.s. I'll use this blog entry as another excuse to recommend ColdFire. Check out how nicely the timer results are displayed at the bottom of my browser:
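For anyone who wants to try compare() inside the UDF, a minimal sketch could look like the following. The names isNull3 and test5 are placeholders that do not appear in Patrick's code, and no timing is claimed for this version. Keep in mind that compare() is a case-sensitive string comparison that returns 0 on a match, while EQ is case-insensitive; for this particular marker, which contains no letters, that difference should not matter.

```cfml
<cffunction name="isNull3" hint="compare()-based variant (hypothetical)" access="public" output="false">
    <cfargument name="value" required="true" />
    <!--- compare() returns 0 when the two strings match exactly. --->
    <cfreturn compare(arguments.value, "|\| |_| |_ |_") IS 0 />
</cffunction>

<cftimer label="test5">
    <cfloop index="i" from="0" to="1000">
        <cfset myTestVar = isNull3(i) />
    </cfloop>
</cftimer>
```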

Comments
The initial post had me wondering whether the sample code was simplified from a CFC, and Patrick's follow-up has confirmed this.
My experience - and this is experience from investigating someone's question on the Adobe CF forums, not my own "real world" experience - is that accessing the variables scope in CFC methods has quite an overhead. It seems that this situation could be another example of this.
My - completely baseless - suspicion here is that when the code is compiled, the direct-access-to-the-variables-scope is replaced with a getter method (and a poorly performing one): it's the only "sensible" thing I could come up with.
Can I recommend to Patrick that he tries the same tests, except using a function-local variable (ie: a VAR) for the NULL variable, and sees what he gets?
--
Adam
This:
<cffunction name="isNull" hint="is the valueless?" access="public" output="false">
<cfargument name="value" required="true" />
<cfset var null="|\| |_| |_ |_"/>
<cfreturn arguments.value EQ null />
</cffunction>
was taking between 7000 and 9000 ms for 1000 iterations
while this:
<cffunction name="isNull" hint="is the valueless?" access="public" output="false">
<cfargument name="value" required="true" />
<cfset var null="|\| |_| |_ |_"/>
<cfreturn arguments.value EQ "|\| |_| |_ |_" />
</cffunction>
was taking 25 to 30 ms for 1000 iterations.
This timing, by the way, is based on ColdFusion's native debugging output, not <cftimer> calls, which I will try for comparison when I get back into work tomorrow. The 7000 ms I mentioned in my first post was the rough average of 30 or so trials.
I was initially led into this testing when I was wondering what was making certain of my more iterative functions take a long time to run, and I found that isNull calls seemed to be a frequent culprit. They consistently run slowly in every instance in my application. I wonder if Ray's vastly different testing results indicate a different ColdFusion or Java version than mine.
I take his point on this 1000 times method being a poor man's benchmark, but I was having trouble isolating what was making isNull run slowly, and this was the best I could come up with at the time.
I'm still confused as to how you are seeing 7k-9k ms. If you run _my_ test code, do you see the same? Are you recreating the CFC on each loop iteration or just calling the method each time?
In answer to his first question, here are the times I got for three trials of the tests he proposed:
test1: 5822, 9112, 8746
test2: 26, 25, 45
test3: 4449, 5308, 6329
test4: 7, 7, 6
So it looks like there's something in my environment that consistently makes EQ take much longer with variables on the right side than with constants - and I got a similarly fast result for compare().
I am running CF 8,0,0,176276 and JVM 1.6.0_01-b06 on a 32 bit Vista machine.
A coworker just ran Ray's 4 tests with results very similar to Ray's. She has the same CF, the same JVM, and the same OS as me. Also, I just got similar results to Ray's running his tests on our QA server. So this seems like a personal problem that I should stop worrying myself and others about. Thanks again for all the support on this.


The variable named null that isNull() tests for is defined only once so as to be able to set variables to null or test them for null without duplicate definitions for the null value. The actual definition is:
<cfset null="|\| |_| |_ |_"/>
<cffunction name="isNull" hint="is a variable valueless?"
access="public" output="false">
<cfargument name="value" required="true" />
<cfif NOT IsSimpleValue(arguments.value)>
<cfreturn false />
</cfif>
<cfreturn arguments.value is null />
</cffunction>
So if I wanted to change the value of null from "|\| |_| |_ |_" to "poodle", I wouldn't have to change both the cfset statement and the function, only the cfset. To me, this approach seems like good practice. Its purpose is encapsulation: preventing other parts of the application from having to know what null is (they don't even have to know it's a string), but allowing them to set things to it and test for it.
Do you still think this is the wrong approach, or had I just not provided enough information the first time round?