
Memory backup

This post was imported from my old WordPress installation. If you notice display problems, broken links or missing images, please just leave a comment here. Thanks.


Some weeks ago, I wrote a post about aggregated logging: merging many "UPDATE ... SET count=count+1" SQL statements into one. But there's a problem: all of the data held in memory is lost if the task or server crashes, or if the database dies and doesn't come back before the task exits. Here is how I solved this problem.

I added the aggregated logging to a recent project which currently merges more than 25 million UPDATE queries into less than 10,000 - each day! (A short sketch of the in-memory aggregation follows the list below.) But I'm somewhat paranoid about data loss and won't build such a solution without securing the data against crashes. My preconditions:
  • lightweight, low server resource usage
  • very stable
  • no external (database) server
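To recap the idea from the earlier post, here is a minimal sketch of what such an in-memory aggregation might look like. This is not the original worker code; %counters and count_event are made-up names used for illustration only:

use strict;
use warnings;

my %counters;   # $counters{$tablename}{$key} = summed amount

sub count_event {
    my ($tablename, $count_column, $amount, @unique_keys) = @_;

    # Merge this event into the in-memory counter instead of running an UPDATE right away
    my $key = join("\x00", $count_column, @unique_keys);
    $counters{$tablename}{$key} += $amount;

    return;
}

Millions of single log events collapse into a few thousand hash keys this way, and each key later becomes a single "UPDATE ... SET count=count+N" query.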
I decided to create a server-local transaction file, which is much less complicated than you might think. A local file will survive any process or server crash but might be lost on a disk crash. We're running at least RAID-1 on all servers plus hardware and S.M.A.R.T. monitoring, which is safe enough for my data. The aggregated logging Gearman worker process receives single log events and flushes them as blocks into the database. The script has to open a file somewhere at the beginning:
use Fcntl qw(SEEK_SET SEEK_CUR O_RDWR O_CREAT O_EXCL);

sysopen my $transaction_fh, "/tmp/transaction.$$", O_RDWR | O_CREAT | O_EXCL, oct(666)
    or die 'Error opening transaction file: '.$!;
You'll notice that I'm using sysopen instead of the default Perl open. Commands without the "sys" prefix do buffering for performance reasons, but my transaction file should survive process crashes where the Perl buffer might not have been flushed. Every single log event is written by a sub:
sub add_trans {
    my $line = shift;

    # Append a custom EOF mark to the line-to-be-written
    $line .= "\n\xff\n";

    # Write the data line to the transaction file and seek back before the EOF mark
    syswrite $transaction_fh, $line;
    sysseek $transaction_fh, -2, SEEK_CUR; # Set file pointer backwards two bytes (the EOF mark will be overwritten at the next write)

    return;
}

# Within the log event processing:
add_trans(join("\x00", $tablename, $count_column, $amount, @unique_keys));

Every log event will be one line in the transaction file. This sample is using \x00 (ASCII code 0), \n (line feed) and \xff (ASCII code 255) as control chars; they must not be part of the data to be written under any circumstances! Every line gets a \xff suffix (and an additional line break) as my custom end-of-file (EOF) marker, and sysseek is used to set the file pointer to the position of the newly written EOF marker. The next line will overwrite the EOF marker and add a new one, but the EOF marker will stay there if anything goes wrong and the file isn't closed properly. Always add the data and the EOF marker in one single syswrite call. You might still end up with multiple write kernel calls, but that's much less likely than with two separate Perl syswrite lines. The process will continue to collect log events and the transaction file will grow until the data should be flushed into the database. The process will loop through its memory structures to write all table counters into the database; there is no need to ever read the transaction file. Once a table is flushed (and the database has confirmed the UPDATE or INSERT query), a flush record is written into the transaction file:
add_trans(join("\x00", '***FLUSH***', $tablename));
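For illustration, a flush pass could look roughly like this. This is a hedged sketch, not the original code: it uses the %counters hash from the sketch above, and flush_table() stands for whatever routine runs the aggregated UPDATE or INSERT queries through DBI and dies on errors.

for my $tablename (keys %counters) {
    flush_table($tablename, $counters{$tablename});     # hypothetical helper, dies if the database rejects the queries
    delete $counters{$tablename};                       # these counters are safe in the database now
    add_trans(join("\x00", '***FLUSH***', $tablename)); # record the successful flush
}

The important detail is the order: the flush record is only written after the database has confirmed the queries.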
You might run into trouble if any of your tables is called ***FLUSH***, but I'm pretty sure that most SQL servers won't accept a star as a valid character in table names. Remember the EOF marker? It's important once all tables are flushed. I don't want to close, delete and reopen the same file every few minutes because that wouldn't be "lightweight", so I just reset the file. There is no need to actually reduce the filesize because it will usually grow to about the same size before the next flush is done. This is why I have my EOF marker:
sysseek $transaction_fh, 0, SEEK_SET;
syswrite $transaction_fh, "\xff\n";
sysseek $transaction_fh, -2, SEEK_CUR; # Set file pointer backwards two bytes
These lines write an EOF marker as the very first characters of the transaction file and reset the file position to the beginning. The next add_trans() call will overwrite the EOF marker as usual. A regular, planned exit of the process will delete the transaction file after everything has been flushed successfully. That's all within the log worker. I started the aggregated logging a few months ago and only had two "orphaned" transaction files during that time. Every log worker checks for transaction files in the directory dedicated to them and reports any file which doesn't belong to a running process. The two files were left over after crashes for different reasons, and I had to read them. Reading a transaction file is much easier than writing it. The file contains one line per action performed by the original creator. It has no "final state record", but if you want to know the result - play the game again exactly as it was played before. Imagine this transaction file (shown as a table):
VIEW_COUNT    2013-01-16    1
VIEW_COUNT    2013-01-16    2
CLICK_COUNT   2013-01-16    1
VIEW_COUNT    2013-01-16    2
***FLUSH***   CLICK_COUNT
\xff
VIEW_COUNT    2013-01-16    1
CLICK_COUNT   2013-01-16    1
***FLUSH***   CLICK_COUNT
***FLUSH***   VIEW_COUNT
\xff
Let's replay what happened before:
  1. Count 1 (last column) for table VIEW_COUNT and date 2013-01-16. Create a hash key from tablename and date and set the value to 1 (increase from nothing to 1).
  2. Add another 2 for table VIEW_COUNT and date 2013-01-16. The hash key from tablename and date already exists, so increase its value by 2 (the counter is now 3).
  3. Create a hash key from tablename CLICK_COUNT and date 2013-01-16 and increase the value by 1.
  4. Add another 2 for table VIEW_COUNT and date 2013-01-16. Increase the value of the same hash key by 2 again (the counter is now 5).
  5. Table CLICK_COUNT has been flushed successfully: Remove it from the hash because there is no need to rescue this count - it wasn't lost.
  6. EOF marker ( \xff ) found - do not read anything below line 6.
Always stop reading at the first EOF marker - anything below it is left over from a previous run and has already been flushed (otherwise the file pointer wouldn't have been reset to the start of the file). Let's look at the hash: the table CLICK_COUNT has been deleted because it was flushed successfully before, so only VIEW_COUNT for 2013-01-16 is left, with a counter value of 5. Add these 5 to the database and you're done.
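The replay itself is only a few lines of Perl. Here is a simplified sketch of such a recovery reader - not my actual script; it assumes the field layout used by add_trans() above, and all variable names are made up:

use strict;
use warnings;

my ($transfile) = @ARGV;   # path of the orphaned transaction file

open my $fh, '<', $transfile or die 'Error opening transaction file: '.$!;

my %rescue;
while (my $line = <$fh>) {
    chomp $line;
    last if $line eq "\xff";              # custom EOF marker found: ignore everything below it
    my ($tablename, @fields) = split /\x00/, $line;
    if ($tablename eq '***FLUSH***') {
        delete $rescue{ $fields[0] };     # this table reached the database, nothing to rescue
        next;
    }
    my ($count_column, $amount, @unique_keys) = @fields;
    $rescue{$tablename}{ join("\x00", $count_column, @unique_keys) } += $amount;
}
close $fh;

# Whatever is left in %rescue never reached the database and has to be written now.

Everything still in %rescue after the loop is exactly the data that was lost in memory and now has to be added to the database.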

This transaction file is not 100% perfect (direct disk I/O would be much safer), but it's safe enough for me. The crashes affected some 150,000 transaction lines which couldn't be flushed and would have been lost. I'm using a very simple script (launched manually) to read a leftover transaction file, write the missing data into the database and delete the file once everything is done.

