We run a pretty big application (10 million query hits per day). We have 5 load balanced Coldfusion servers, although 1 MySQL server. With this setup, we've been fairing pretty well. We have about 1% or less failed calls to the database because of the NULL pointer error mentioned above. Either way 1% is a lot to me, so I had to fix this error. Okay, so here's everything I was able to figure out about this error:
Coldfusion has a setting if you want to pool your connections or not. If you pool your connections, each database that is being connected to (even if it is on same Mysql server) still creates its own connection. The only time the same connection is actually used is if you query the same database. Although there are obvious performance gains.
Secondly, Mysql has its own set of settings. There are 4 that concern us: interactive_timeout, connect_timeout, wait_timeout, and max_connections. The important one here is the wait_timeout. That is the number of seconds before Mysql internally kills a connection. Now, to nearly remove all NULL pointer errors (as we have down to below 1%) you can set the Mysql wait_timeout to 120 seconds, and the Coldfusion connection timeout to 1 minute from the datasource page in your Administrator and still enable connection pooling. This way, Coldfusion kills a connection before Mysql does.
Unfortunately, Coldfusion is so dumb that it still doesn't necessarily kill all connections in exactly 1 minute. I love the software, but I must admit - this is not its strong point. As a matter of fact, it doesn't even get around to all of them in 120 seconds, and hence why I still saw 1% NULL pointer errors. Shame.
So, if you have a high load application like ours, you know that disabling connection pooling is not much of an option with a Mysql timeout of even 120 seconds, because we reach over 1000 simultaneous connections during peak hours, and 1000 is about all that Mysql can handle on 1 box that we have. Not to mention the performance impact without connection pooling.
But the performance is actually not THAT significant as you may imagine. It really depends on your application. If you have a lot of queries PER request that connect to the SAME database, then yes, you can probably be better off with connection pooling. But, if you have only 2-3, you'll fair fine without it. So here's a solution for those that have a lot of requests, but not a lot of queries per request: DISABLE CONNECTION POOLING from your Coldfusion administrator. Then tune your Mysql wait_timeout to 5 seconds. This will kill all connections that are longer than 5 seconds therefore you won't ever have hundreds or thousands of simulatenous Mysql connections. As a matter of fact, you can tune it to 1 second timeout, but I do not recommend that.
So this was the "recommendation" by everyone. But I noticed yet again a bug in this specific implementation. If you have 1 query running at 12:00:00 and then a second on that same page running at 12:00:06 and your timeout is only 5 seconds, that second query will fail since Coldfusion still tried to use the same connection with POOLING DISABLED! Coldfusion, still pools the connections on a PER REQUEST basis! Ugh!
So you might think, let's just catch a database exception using cftry/cfcatch, and re-run the query. That will NOT work. And the reason is because Coldfusion creates a connection on the first query ran on the requesting page. It then re-uses the connection for the rest of the request. If you re-run the query, it still tries to use the same connection.
So once again, we are back to square one. And unfortunately, our application has 1 big feature that runs for about 60-120 seconds. So there is no way for us to use a large timeout (too many sleeping connections), or lower the number of seconds some parts of our application run - since they are designed to run that way.
Two things I have thought of, one of them I have tested.
1) Creating 2 Coldfusion datasources that connect to same database. If a connection fails on 1 datasouce, catch the exception and re-try the query on the second datasource.
2) Do what we did - abandon CFQUERY.
Yes, we have abandoned CFQUERY and instead used Java objects to create our own connections whenever we wanted to, and kill / open new connections whenever / however we want to.
The advantage - not one single error from null pointers on our application. Zero. Application is running smoothly and we can still support all these queries on 1 Mysql box with a timeout of 5 seconds. Also, you have much more control over your queries.
The disadvantage - well - you are not using Cfquery. That means, no Cfqueryparam, therefore you have to write your own Prepared Statements (since that is what Cfqueryparam mainly does for you). What I do is place the query the same way, except in a CFSAVECONTENT tag. Then I just pass it to the Java code which executes it and returns me the Coldfusion query. Here are three links on getting you running with executing queries through Java objects via Coldfusion:
Once again, shame on Adobe for not fixing this bug 1 year after release and hundreds of complaints. All they have to do is check if the connection exists, if it doesn't, create a new connection and issue a query with the new connection. How HARD can that possibly be? For all the thousands we pay for the software they can't do that? Or at least give us a tag/function to create a new Coldfusion mysql connection from within the request.
This is not the only work around from them that I had to implement with Java. Sometimes I think that maybe it is worth jump-shipping to .NET since I am almost writing the same amount of code with all these work arounds.
Best of luck to anyone that comes across this problem. Don't forget to shoot yourself at the end.