Don't use exec and database connections

This post was imported from my old WordPress installation. If you notice display problems, broken links, or missing images, please leave a comment here. Thanks.

Typical Perl scripts (and other programs run as CGI scripts) start, handle one request, and exit. That is not very efficient at medium or high request rates. Persistent solutions like FCGI and mod_perl avoid the repeated interpreter loading and compile phases, but they become challenging as soon as any source file changes.

A classic CGI script doesn't care: it exits after one request and everything is reloaded before the next run. Persistent interpreters, however, need to check for changed files and trigger some kind of "recompile" themselves. For various reasons, I can't require developers to trigger a manual restart.

I'll explain how to detect whether a recompile is necessary in another blog post, but how do you recompile something in Perl? You can load the source file and push it through eval(), or use do() or require, but all of them have major drawbacks: if one file is replaced within a running script, file-scope variables might end up in an unexpected state, and initialization is done twice or not at all. That is not really usable in a production environment.
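To make the reload drawback concrete, here is a minimal sketch. The module name Counter, its contents, and the helper write_module are made up for this demonstration and do not come from the worker described in this post. It reloads a changed file with require and shows that the file-scope variable is initialized a second time, so the state collected so far is lost.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical module, written to a temporary directory for this demo only.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'Counter.pm');

# Write a version of Counter.pm whose bump() adds $step to the counter.
sub write_module {
    my ($step) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "package Counter;\n",
                "our \$count = 0;   # file-scope initialization\n",
                "sub bump { \$count += $step; return \$count }\n",
                "1;\n";
    close $fh or die "Cannot close $file: $!";
    return;
}

unshift @INC, $dir;

# First load: the counter collects some state.
write_module(1);
require Counter;
Counter::bump() for 1 .. 3;
my $before = $Counter::count;

# The source file changes; force require to load it again.
write_module(10);
delete $INC{'Counter.pm'};
require Counter;

my $after = $Counter::count;    # initialization ran a second time
my $next  = Counter::bump();    # the new code is active

print "before reload: $before, after reload: $after, next bump: $next\n";
```

The new code is picked up, but everything the old copy had stored in its file-scope variable is gone. Objects or closures that still point at the old code are a further problem that this sketch does not cover.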

The YAWF framework has a simple, small development web server which does a best-effort mod_perl simulation. It doesn't bother with detecting changed files or reloading, because it's very likely that at least one source file or template changes between two requests to a developer web server. Instead, every request runs in a new, forked process that uses the web server's Perl interpreter but gets a fresh and clean environment. That is good for testing, but not much better than classic CGI scripts in a production environment.
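The fork-per-request idea can be sketched roughly like this. It is not YAWF's actual code: handle_request and the request strings are placeholders, and a real development server would accept socket connections instead of looping over a list.

```perl
use strict;
use warnings;

# Hypothetical request handler. A real one would load modules and
# templates here, inside the child, so every request sees current files.
sub handle_request {
    my ($request) = @_;
    return length($request) > 0 ? 0 : 1;    # exit code, 0 means success
}

my @statuses;
for my $request ('GET /', 'GET /about', '') {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: handle exactly one request, then exit.
        exit handle_request($request);
    }

    # Parent: wait for the child and record its exit code.
    waitpid($pid, 0);
    push @statuses, $? >> 8;
}

print "exit codes: @statuses\n";
```

Each child starts from the parent's already loaded interpreter, so only the per-request work is repeated. Whatever the child loads or changes disappears when it exits.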

I recently wrote a basic worker/request handler and started to push real traffic to it today. It used exec $0, @ARGV; to replace itself with a fresh process, which would compile everything from scratch on any source change. Everything was fine until other tasks started to get "Too many connections" errors from a MySQL server.

It turned out that those exec calls successfully refreshed everything, including some network connections, but not the database connections: they stayed open across each exec, so every restart left another connection behind on the server. I can't switch to system because those processes are under the control of a scheduler: the original process would end once the system call returns, and the scheduler would treat it as "finished" and restart it.
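A likely explanation, which is my assumption and not something I verified against the MySQL driver, is the close-on-exec flag: Perl sets it on descriptors it opens itself above $^F (normally 2), while a socket opened inside a C client library usually does not have it. The following sketch simulates both cases with two pipes and checks which descriptor is still open after an exec. The helper survives_exec is made up for this demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# Returns 1 if descriptor number $fd is still open in a process
# that was started with exec, 0 otherwise.
sub survives_exec {
    my ($fd) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: replace the process image, like the worker's exec did.
        # The new perl tries to dup() the descriptor number it is given.
        exec($^X, '-MPOSIX', '-e',
             'exit(defined(POSIX::dup($ARGV[0])) ? 0 : 1)', $fd)
            or warn "exec failed: $!";
        exit 127;
    }

    waitpid($pid, 0);
    return ($? >> 8) == 0 ? 1 : 0;
}

# A descriptor opened by Perl: close-on-exec is set automatically.
pipe(my $perl_r, my $perl_w) or die "pipe failed: $!";

# Stand-in for a socket opened by a C library: clear the flag by hand.
pipe(my $raw_r, my $raw_w) or die "pipe failed: $!";
my $flags = fcntl($raw_r, F_GETFD, 0);
die "F_GETFD failed: $!" unless defined $flags;
my $ok = fcntl($raw_r, F_SETFD, $flags & ~FD_CLOEXEC);
die "F_SETFD failed: $!" unless defined $ok;

my $perl_survives = survives_exec(fileno($perl_r));
my $raw_survives  = survives_exec(fileno($raw_r));

print "Perl-opened descriptor survives exec: $perl_survives\n";
print "descriptor without the flag survives exec: $raw_survives\n";
```

If the database socket behaves like the second pipe, each exec $0 keeps the old connection open on the server while the fresh process opens a new one.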

The check for source file changes now runs every 240 + int(rand(60)) seconds, and the workers are respawned 10 to 300 seconds after their exit. The scheduler could respawn them as soon as 1 second after they're done, but if any of them has a huge, resource-eating endless loop, its copies might crash the whole server.
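The interval formula gives each worker a delay between 240 and 299 seconds, so workers that started together don't all check at the same moment. A small sketch, where the function name next_check_delay is made up for illustration:

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Seconds until the next source check: 240 plus 0..59 seconds of jitter.
sub next_check_delay {
    return 240 + int(rand(60));
}

# Draw many delays to see the range the formula actually produces.
my @delays = map { next_check_delay() } 1 .. 1000;
printf "shortest delay: %d s, longest delay: %d s\n",
    min(@delays), max(@delays);
```

Because rand(60) is always below 60, int(rand(60)) never exceeds 59, and the delay never reaches 300 seconds.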


3 comments

  1. Paul "LeoNerd" Evans

    Another solution may just be to mark those database connections as FD_CLOEXEC. Of course, actually getting the underlying file descriptor out of the MySQL connector library to set this flag may be non-trivial... :(

  2. Max

    You have to set the flag FD_CLOEXEC on each file descriptor before exec'ing. See man fcntl(), commands F_GETFD and F_SETFD.

    Otherwise, you can close all the file descriptors you don't need anymore before exec'ing. See the implementation of close_all_fds_except() in the module AnyEvent::Util for an example.


  3. Sebastian

    DBI provides a function that returns a DBI-internal string which could be passed to the "new" process, which could then re-open those db connections, but I think this is too risky for this use case...
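The FD_CLOEXEC suggestion from the comments can be sketched in Perl with fcntl and the constants from the Fcntl module. The helpers set_cloexec and has_cloexec are made up for this example, and a pipe stands in for the database socket. How to get the real descriptor out of the database driver is not covered here.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# Returns 1 if the close-on-exec flag is set on the filehandle, else 0.
sub has_cloexec {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFD, 0);
    die "F_GETFD failed: $!" unless defined $flags;
    return ($flags & FD_CLOEXEC) ? 1 : 0;
}

# Sets the close-on-exec flag and keeps all other descriptor flags.
sub set_cloexec {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFD, 0);
    die "F_GETFD failed: $!" unless defined $flags;
    my $ok = fcntl($fh, F_SETFD, $flags | FD_CLOEXEC);
    die "F_SETFD failed: $!" unless defined $ok;
    return;
}

# Stand-in for a descriptor that was opened without the flag.
pipe(my $reader, my $writer) or die "pipe failed: $!";
my $cleared = fcntl($reader, F_SETFD, 0);
die "F_SETFD failed: $!" unless defined $cleared;

my $before = has_cloexec($reader);
set_cloexec($reader);
my $after = has_cloexec($reader);

print "close-on-exec before: $before, after: $after\n";
```

If only a descriptor number is available, it can be wrapped in a filehandle with open(my $fh, '+<&=', $fd) before calling set_cloexec.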
