Aggregating database updates


Statistics are nice, but collecting them can increase database load. Counting page impressions, for example, may produce a lot of UPDATE statements that do nothing but set counter=counter+1. I tried to merge them using Gearman.
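To get a feeling for what merging buys, here is a small sketch in plain Perl. It only builds the statements instead of running them, and the page_stats table, its columns and the two helper subs are made up for this illustration; counting per page is also an assumption of the sketch.

```perl
use strict;
use warnings;

# Hypothetical table and column names, used only for illustration.
my $sql = 'UPDATE page_stats SET counter = counter + ? WHERE page = ?';

# Naive approach: one statement per page impression.
sub naive_statements {
    my @hits = @_;
    return map { [ $sql, 1, $_ ] } @hits;
}

# Aggregated approach: sum the hits per page first, then one statement per page.
sub aggregated_statements {
    my @hits = @_;
    my %count;
    $count{$_}++ for @hits;
    return map { [ $sql, $count{$_}, $_ ] } sort keys %count;
}

# 300 simulated page impressions on two pages.
my @hits = ('/index') x 250;
push @hits, ('/about') x 50;

my @naive      = naive_statements(@hits);
my @aggregated = aggregated_statements(@hits);

printf "naive: %d statements, aggregated: %d statements\n",
    scalar @naive, scalar @aggregated;
```

The database sees two statements instead of 300, and the final counter values are the same either way.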

Gearman can handle a huge number of jobs using few resources. It has no replication and usually keeps its queue in memory, but even with a SQLite backup file it requires much less disk I/O than a database updating rows and maybe indices.

The Gearman::Worker module (which is used by worker tasks) calls a "stop_if" callback after each single processed job and every 10 to 15 seconds if no job was received (but not during job execution). stop_if is currently undocumented but stable.

$worker->work(
    stop_if => sub {
        return 0;    # false: keep the main loop running
    },
);

A true return value from stop_if ends the Gearman::Worker main loop cleanly between two jobs. Any other way of stopping the worker may interrupt it while a job is being processed.
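The timing is easier to see in a toy model of the loop. The following sketch is plain Perl and does not use Gearman::Worker at all; run_loop, the event list and the idle flag passed to the callback are inventions of this example, not the module's real internals.

```perl
use strict;
use warnings;

# Toy stand-in for the worker main loop; NOT the real Gearman::Worker code.
# Each event is either a job (a code ref) or the string 'idle' (an idle timeout).
sub run_loop {
    my ($events, $stop_if) = @_;
    my $processed = 0;
    for my $event (@$events) {
        my $is_idle = !ref $event;
        unless ($is_idle) {
            $event->();      # a job always runs to completion
            $processed++;
        }
        # stop_if is consulted only here: between two jobs or after an idle period
        last if $stop_if->($is_idle);
    }
    return $processed;
}

my $count  = 0;
my $job    = sub { $count++ };
my @events = ($job, 'idle', $job, $job, 'idle', $job, $job);

# Leave the loop as soon as three jobs have been counted.
my $processed = run_loop(\@events, sub { return $count >= 3 ? 1 : 0 });

print "processed $processed jobs, count is $count\n";
```

The loop stops right after the third job has finished; the remaining events are never touched and no job is cut off halfway.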

Putting everything together

A simple worker would add up all incoming statistics and flush them into the database in one aggregated query.
use Gearman::Worker;

# Create storage for the number of calls
my $count = 0;

# Create the worker object
my $worker = Gearman::Worker->new(job_servers => \@my_gearman_dispatchers);

# Register a job
$worker->register_function('AggregateJob', 0, sub {
    my $job = shift;
    ++$count;    # Increase the storage
    return 1;
});

# Enter the main loop
$worker->work(
    stop_if => sub {
        return 1 if $count >= 100;    # Safely exit the main loop after 100 items
        return 0;
    },
);

# Flush the counted number of items into the database
# ($dbh is an already connected DBI handle; do() expects the attributes before the bind values)
$dbh->do('INSERT INTO statistics(datetime,amount) VALUES(NOW(),?)', undef, $count);

Every call of the AggregateJob job triggers a call of the registered anonymous sub which increases the number of items stored in $count. At some point (here: once $count reaches 100), the main loop is finished (by a true return value from the stop_if callback) and $count is inserted into the database.

The stop_if sub may also do the flush itself without exiting the worker:

stop_if => sub {
    return 0 if $count < 100;    # Wait for enough values to flush
    # Flush the count, but reset it only on success
    $dbh->do('INSERT INTO statistics(datetime,amount) VALUES(NOW(),?)', undef, $count)
        and $count = 0;
    return 0;
}

It flushes and resets the counter once 100 counts have accumulated and then continues the main loop.
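Resetting only on success means a failed flush loses nothing: the count simply keeps growing until the next attempt works. Here is a minimal sketch of that behaviour in plain Perl, assuming the insert returns false on failure; fake_insert and $db_available are hypothetical stand-ins for the real $dbh->do call and for a database outage.

```perl
use strict;
use warnings;

my $count        = 0;
my $threshold    = 100;
my @flushed;              # stands in for the statistics table
my $db_available = 0;     # simulate a database outage at first

# Hypothetical stand-in for $dbh->do(...): true on success, false on failure.
sub fake_insert {
    my ($amount) = @_;
    return 0 unless $db_available;
    push @flushed, $amount;
    return 1;
}

# Same shape as the stop_if callback above: it never stops the worker.
my $stop_if = sub {
    return 0 if $count < $threshold;       # not enough values yet
    fake_insert($count) and $count = 0;    # reset only if the flush worked
    return 0;
};

# 150 jobs while the database is down: every flush attempt fails.
for (1 .. 150) {
    $count++;
    $stop_if->();
}
print "pending after outage: $count\n";

# The database comes back: the next call flushes the whole backlog at once.
$db_available = 1;
$count++;
$stop_if->();
print "flushed: @flushed, pending: $count\n";
```

Note that this relies on the insert reporting failure through its return value. With a DBI handle that has RaiseError enabled, a failed do() would throw an exception instead and would have to be caught.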


stop_if is called after each job and also every 10 to 15 seconds while the worker is idle, so do not increase the counter within the sub!

What happens if the worker dies for some reason, the server crashes or anything else? I'll take care of this in another post.



