Using CFDBINFO and CFZIP for quick database backups

All major database products have tools to let you back up their databases. MySQL makes it super simple with its command line tools. But what if you want to do it with ColdFusion? Doesn't everyone want to do everything with ColdFusion? I know I do! So let's look at a quick example.

My database backup code will work like so:

  1. Get a list of tables from a datasource.
  2. Select all rows from each table.
  3. Convert the query into WDDX.
  4. Zip the giant XML string and store the result.

The code for all this is incredibly simple:

<cfset datasource="blogdev">

<cfdbinfo datasource="#datasource#" name="tables" type="tables">

This code simply uses the new cfdbinfo tag to query the tables from the datasource.

<!--- One struct to rule them all... --->
<cfset data = structNew()>

I'm going to store all my queries in one struct.

<cfloop query="tables">
   <!--- grab all data from table --->
   <cfquery name="getData" datasource="#datasource#">
   select   *
   from   #table_name#
   </cfquery>
   
   <cfset data[table_name] = getData>
</cfloop>

Then I loop over each table and select *. Notice I store the query into the struct. By the way - the cfdbinfo tag also lets you get the columns from a database table. But since this is a "quickie" script, I don't mind using the select *.
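
As a minimal sketch of that column lookup, assuming a table named tblblogentries exists in the datasource (the table name and the cols and colList variables are only illustrations), cfdbinfo can return the column names, which can then replace the *:

<cfdbinfo datasource="#datasource#" table="tblblogentries" name="cols" type="columns">

<!--- build a comma separated list of the column names --->
<cfset colList = valueList(cols.column_name)>

<cfquery name="getData" datasource="#datasource#">
select   #colList#
from   tblblogentries
</cfquery>

Column names that clash with reserved words would need quoting for your particular database, which this sketch does not attempt.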

<!--- Now serialize into one ginormous string --->
<cfwddx action="cfml2wddx" input="#data#" output="packet">

Then we convert the structure into one XML packet.

<!--- file to store zip --->
<cfset zfile = expandPath("./data.zip")>

<!--- Now zip this baby up --->
<cfzip action="zip" file="#zfile#" overwrite="true">
   <cfzipparam content="#packet#" entrypath="data.packet">
</cfzip>

Next I store the string into a zip using the new cfzip and cfzipparam tags. Notice how I feed the string data to the zip using cfzipparam. I don't have to store the text into a temporary file.
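
To check the result, the packet can be read straight back out of the zip. This is a small sketch that assumes the zfile variable from the code above; the storedPacket variable name is made up for the example.

<!--- read the entry back into a variable (hypothetical name: storedPacket) --->
<cfzip action="read" file="#zfile#" entrypath="data.packet" variable="storedPacket">

<cfoutput>
The stored packet is #len(storedPacket)# characters long.
</cfoutput>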

<cfoutput>
I retrieved #tables.recordCount# tables from datasource #datasource# and saved it to #zfile#.
</cfoutput>

The last thing I do is output a simple result message so you know how much data was backed up. Here is the complete source in one listing:

<cfset datasource="blogdev">

<cfdbinfo datasource="#datasource#" name="tables" type="tables">

<!--- One struct to rule them all... --->
<cfset data = structNew()>

<cfloop query="tables">
   <!--- grab all data from table --->
   <cfquery name="getData" datasource="#datasource#">
   select   *
   from   #table_name#
   </cfquery>
   
   <cfset data[table_name] = getData>
</cfloop>

<!--- Now serialize into one ginormous string --->
<cfwddx action="cfml2wddx" input="#data#" output="packet">

<!--- file to store zip --->
<cfset zfile = expandPath("./data.zip")>

<!--- Now zip this baby up --->
<cfzip action="zip" file="#zfile#" overwrite="true">
   <cfzipparam content="#packet#" entrypath="data.packet">
</cfzip>

<cfoutput>
I retrieved #tables.recordCount# tables from datasource #datasource# and saved it to #zfile#.
</cfoutput>

Comments

You could save some space by serializing the query to json instead of wddx.
# Posted By Jordan Clark | 11/28/07 5:41 PM
Sweet! I'll have to give this a try... Make sure to post the follow up on how to restore a database with ColdFusion!
# Posted By Dan | 11/28/07 6:02 PM
Since you are going to zip the file anyway, saving it to json doesn't gain much over saving it to xml. The XML version is a little more descriptive, and since it is WDDX, it can be read and used by more than just CF. AFAIK, the json encoding of queries is specific to CF.

I was thinking about writing something similar to this, only generating the create and insert statements required to extract a DB in a form that could easily be moved to a different server.
# Posted By Sam Curren | 11/28/07 6:39 PM
The other day I was looking into a Java solution for this called ddlutils. It can be used to do this very thing (as well as create an XML representation of your table structure). The cool thing about it is that it supposedly can be used to take that same data/schema and import it into a completely different DBMS. More to come on my experiences with it (ping me offline if you want to know more).
# Posted By todd sharp | 11/28/07 8:16 PM
I'm having problems reading your "code blocks" on my Mac (Safari 3); the font size is way too small.

Am I the only one? (I know you are on a Mac too, Raymond)
# Posted By Raul Riera | 11/29/07 2:31 AM
Raul, I see it. I'll look into fixing it a bit later today.
# Posted By Raymond Camden | 11/29/07 6:16 AM
Very interesting! What do you suppose would happen if this were done on a database with several tables of 100,000+ records? Timeout issues? Processing load an issue?
# Posted By James Edmunds | 11/29/07 10:31 AM
The world - as we know it - would come to an end. ;)

Um, I'd think it would probably time out. You could add sanity checks in there - checking record count, etc.
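
A rough sketch of such a check, assuming the datasource, tables and data variables from the post. The maxRows limit of 50,000 and the 600 second timeout are placeholder numbers, not tested values:

<!--- hypothetical limits, tune for your own server --->
<cfset maxRows = 50000>
<cfsetting requesttimeout="600">

<cfloop query="tables">
   <!--- count first, so huge tables can be skipped --->
   <cfquery name="getCount" datasource="#datasource#">
   select   count(*) as total
   from   #tables.table_name#
   </cfquery>

   <cfif getCount.total lte maxRows>
      <cfquery name="getData" datasource="#datasource#">
      select   *
      from   #tables.table_name#
      </cfquery>

      <cfset data[tables.table_name] = getData>
   </cfif>
</cfloop>

Tables over the limit are silently left out here, so a real script would probably want to report which ones were skipped.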
# Posted By Raymond Camden | 11/29/07 10:36 AM
Any suggestions for using cfdbinfo to select all tables except a list of tables to skip?
# Posted By David Buhler | 11/29/07 10:49 AM
The result is a query. If you had a list of 'bad tables':

<cfset badlist = "foo,moo">

Then in your cfloop, you just do:

<cfif not listfind(badlist, table_name)>

Ie, if you don't find the table in the bad list, carry on.
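
Put together with the loop from the post, that might look like the sketch below. It assumes the same datasource, tables and data variables, and it swaps in listFindNoCase in case the table names come back in a different case than the list:

<cfset badlist = "foo,moo">

<cfloop query="tables">
   <!--- only back up tables that are not in the skip list --->
   <cfif not listFindNoCase(badlist, tables.table_name)>
      <cfquery name="getData" datasource="#datasource#">
      select   *
      from   #tables.table_name#
      </cfquery>

      <cfset data[tables.table_name] = getData>
   </cfif>
</cfloop>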
# Posted By Raymond Camden | 11/29/07 10:52 AM
Ray, nice, this would allow you to, say, omit CDATA and CGLOBAL and use this on an otherwise digestibly-sized database.
# Posted By James Edmunds | 11/29/07 10:59 AM
If you wanted to get fancy, you could check the columns and ignore BLOBS and CLOBS.
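
One way that could look, as an untested sketch: ask cfdbinfo for the columns of each table and leave out any whose type name contains "lob". The type names are database specific, so that match is an assumption, and thisTable and colList are names invented for the example.

<cfloop query="tables">
   <cfset thisTable = tables.table_name>
   <cfdbinfo datasource="#datasource#" table="#thisTable#" name="cols" type="columns">

   <!--- keep every column whose type does not look like a BLOB/CLOB --->
   <cfset colList = "">
   <cfloop query="cols">
      <cfif not findNoCase("lob", cols.type_name)>
         <cfset colList = listAppend(colList, cols.column_name)>
      </cfif>
   </cfloop>

   <cfif len(colList)>
      <cfquery name="getData" datasource="#datasource#">
      select   #colList#
      from   #thisTable#
      </cfquery>

      <cfset data[thisTable] = getData>
   </cfif>
</cfloop>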
# Posted By Raymond Camden | 11/29/07 11:05 AM
That's pretty cool.

IMO, a "skip-list" seems like a great way to compromise between incremental back-ups, and full back-ups for tables that never change.

Thanks Ray!
# Posted By David Buhler | 11/29/07 12:29 PM
I love this script! I truly do!
Now that said (and it works great), is it possible to save that backup data into a file other than a wddx packet?

Say, a text file that is zipped up instead?
Just curious...
# Posted By James Harvey | 1/30/08 8:32 AM
Well, WDDX is a nice way to quickly convert data into a string. If you wanted to do it some other way, perhaps with JSON, that would be trivial in CF8 as well. Point is - you have to find _some_ way to convert a data structure (the query) into a simple string.
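
A minimal sketch of the JSON route, assuming the data struct and zfile variable from the post. The data.json entry name is made up:

<!--- serialize the struct of queries as JSON instead of WDDX --->
<cfset packet = serializeJSON(data)>

<cfzip action="zip" file="#zfile#" overwrite="true">
   <cfzipparam content="#packet#" entrypath="data.json">
</cfzip>

Whether the nested queries come back as real query objects from deserializeJSON is something to test before relying on this for a restore.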
# Posted By Raymond Camden | 1/30/08 9:27 AM
Just curious, since I am now doing this exact type of thing. Is there any reason why you query each of the tables to gather all the data, then put it in a packet instead of doing a single BACKUP DATABASE **** TO DISK='***' type query and just zip up the .bak file?
# Posted By Erik | 4/15/08 11:12 AM
The idea was to write a generic solution in CF. Something DB specific would be better, but this could be applied to _anything_.
# Posted By Raymond Camden | 4/15/08 11:16 AM
Yaay! Now I can back up... but how do we restore?
I'm gonna read a bit more about JSON and WDDX packets :)

Good one
# Posted By ramzi | 4/27/08 4:18 AM
Another great little snippet there Ray! Just what I was after for my backup script, though WDDX isn't something I have come across before.

As an aside, for bigger DBs what would you or anyone else recommend as a way of maybe splitting the file into "emailable" chunks?
# Posted By Mike | 6/19/08 5:35 AM
Considering it's all just text, I'd just use cfzip.
# Posted By Raymond Camden | 6/19/08 7:05 AM
Ray, any chance of a follow up on this on how you would use the wddx to get info back?
# Posted By Mike | 8/29/08 2:25 PM
That would involve:

a) First, you want to convert the WDDX packet back into native data. That is as simple as running cfwddx again, but with action=wddx2cfml.

b) This gives us a structure with table names as keys, and queries as data. For each key you would:

c) Loop over the query and do an insert of the data back into the table. Remember that CF gives you functions to inspect queries (columnlist) and get the individual cells, so this is something you can handle easily enough. :)
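
A rough, untested sketch of those three steps, assuming the zfile and datasource variables from the post and that the target tables already exist and are empty. It sends every value as a plain string through cfqueryparam and treats empty strings as NULL, both of which are simplifying assumptions; dates, binary columns, identity columns and foreign key order would all need more care. The variable names (tableName, q, cols, row, i, colName, cell) are made up.

<!--- a) read the packet from the zip and turn it back into CF data --->
<cfzip action="read" file="#zfile#" entrypath="data.packet" variable="packet">
<cfwddx action="wddx2cfml" input="#packet#" output="data">

<!--- b) each key is a table name, each value is a query --->
<cfloop collection="#data#" item="tableName">
   <cfset q = data[tableName]>
   <cfset cols = q.columnList>

   <!--- c) insert the rows one by one --->
   <cfloop from="1" to="#q.recordCount#" index="row">
      <cfquery name="insertRow" datasource="#datasource#">
      insert into #tableName# (#cols#)
      values (
         <cfloop from="1" to="#listLen(cols)#" index="i">
            <cfset colName = listGetAt(cols, i)>
            <cfset cell = q[colName][row]>
            <cfqueryparam value="#cell#" null="#not len(cell)#">
            <cfif i lt listLen(cols)>,</cfif>
         </cfloop>
      )
      </cfquery>
   </cfloop>
</cfloop>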
# Posted By Raymond Camden | 8/30/08 9:23 AM