
DESTROY or __DIE__ - choose one

This post was imported from my old WordPress installation. If you notice display problems, broken links or missing images, please leave a comment. Thanks.


Some people always complain, even about a language they have been working with for years while they have other options. One of them recently complained that Perl cleans up objects in an unpredictable order during global destruction (after exit and all END blocks). Here is what happened, how we tracked it down and how we solved it.

A simplified version of the module in question:

sub new {
    my $class = shift;

    my $self = bless { dbh => Projects::Database::connect() }, $class;

    # Mark the run as failed if the script dies
    $SIG{__DIE__} = sub { Projects::Database::do($self->{dbh}, "UPDATE logtable SET result='failed' WHERE pid=$$"); };

    return $self;
}

sub DESTROY {
    my $self = shift;
    # Mark the run as done when the object goes away
    Projects::Database::do($self->{dbh}, "UPDATE logtable SET done=1 WHERE pid=$$");
}

The calling script basically created an object with ->new(), called some methods and finally exited. Everything was fine until someone added a mostly unrelated patch to the project's database module. Suddenly the script started crashing within the DESTROY method.

The new patch used an object stored in a module-wide variable of another module. The code had been working for months without any problems, being called millions of times a day, until now.

The first fix looked easy: the calling script kept the object in a script-wide variable, so clearing it with "undef" should drop the last reference and run the DESTROY method before global destruction, and therefore before Perl starts cleaning up, in unpredictable order, items the DESTROY handler might still need.
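In the simple case, where the script-wide variable holds the only reference, this idea does work. A minimal sketch, assuming a made-up class Demo::Tracker that only records when its DESTROY runs:

```perl
use strict;
use warnings;

package Demo::Tracker;   # hypothetical class, only for this illustration

our @log;                # records the order of events

sub new     { my $class = shift; return bless {}, $class }
sub DESTROY { push @log, 'destroyed' }

package main;

my $object = Demo::Tracker->new;
undef $object;                            # last reference gone: DESTROY runs right here
push @Demo::Tracker::log, 'after undef';

print join(', ', @Demo::Tracker::log), "\n";   # destroyed, after undef
```

DESTROY is logged before the line following the undef, long before global destruction starts.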

But it didn't work! We verified that the object was still alive after it should have been destroyed.

Self reference?

I assumed a copy of the object reference within the object hash like
$VAR1 = { 'x' => $VAR1 };
...but I was wrong. We had a similar case some time ago (a hashref somewhere within the hash tree of an object contained a reference to the object), but not this time.
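For reference, this is roughly what such a self reference does to destruction. The sketch below uses a hypothetical class Demo::Node and counts DESTROY calls; the second block breaks the cycle with weaken from Scalar::Util:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

package Demo::Node;      # hypothetical class, only for this illustration

our $destroyed = 0;      # how many objects have been destroyed so far

sub new     { return bless {}, shift }
sub DESTROY { $destroyed++ }

package main;

{
    my $node = Demo::Node->new;
    $node->{self} = $node;              # the object now references itself
}
my $after_cycle = $Demo::Node::destroyed;    # the cycle keeps the object alive

{
    my $node = Demo::Node->new;
    $node->{self} = $node;
    weaken($node->{self});              # weak references do not count
}
my $after_weaken = $Demo::Node::destroyed;   # freed at the end of the block

print "cycle: $after_cycle, weakened: $after_weaken\n";   # cycle: 0, weakened: 1
```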

Reference count

I was lost on this problem for two days until some page pointed me at a CPAN module for reading out the reference count of a Perl object! Unfortunately I forgot both the page URL and the module name.

I built a small subroutine based on the module, similar to the memory leak hunter subroutine:

sub _rc {
    use B;
    # Print the caller's line number and the reference count of the passed object
    print scalar((caller)[2])."\t".B::svref_2object(shift)->REFCNT."\n";
}
It turned out that the reference count was 1 throughout sub new, even just before the return $self, and became 2 just after the $object = Project::Module->new(); call, somewhere between the return and the assignment to $object. I started commenting out everything between bless and return, and got a reference count of 1 both just before the return and just after the ->new call.

Two variables hold the object's reference, $self and $object, but $self is destroyed at the same time $object gets the value. The "1 reference" within sub new is $self and the "1 reference" in the main script is $object.

Uncommenting the lines between bless and return brought back the "2 references" and I suddenly noticed the reason.
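The observation can be reproduced in a few lines. This is only a sketch: Demo::Plain and Demo::Closure are made-up classes, and $handler is a stand-in for $SIG{__DIE__} so that the example does not touch any global handler:

```perl
use strict;
use warnings;
use B ();

# Reference count of the thing $_[0] points to (the same trick as _rc)
sub refcount { return B::svref_2object($_[0])->REFCNT }

package Demo::Plain;     # hypothetical: constructor without a closure

sub new {
    my $class = shift;
    my $self  = bless {}, $class;
    return $self;
}

package Demo::Closure;   # hypothetical: constructor that captures $self

our $handler;            # stands in for $SIG{__DIE__}

sub new {
    my $class = shift;
    my $self  = bless {}, $class;
    $handler  = sub { return $self };   # the closure keeps $self alive
    return $self;
}

package main;

my $plain   = Demo::Plain->new;
my $closure = Demo::Closure->new;

my $plain_count   = refcount($plain);
my $closure_count = refcount($closure);

print "plain: $plain_count, closure: $closure_count\n";   # plain: 1, closure: 2
```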

For the sake of the developer

The solution was hidden in Perl's handling of variables in anonymous subroutines (it applies to named subs as well, but it's more obvious for anonymous ones):
sub foo {
    my $bar;
    my $baz = sub { print $bar; };
}
The subroutine referenced by $baz will have access to $bar even after the foo subroutine has finished and all other variables have been discarded. Such closures are a great advantage of Perl and really simplify callback routines. JavaScript, which uses callback functions much more, offers the same feature.
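A small, self-contained illustration of this behaviour (make_counter is a made-up name): each returned subroutine keeps its own $count alive after make_counter has returned.

```perl
use strict;
use warnings;

sub make_counter {
    my $count = 0;                      # private to this call of make_counter
    return sub { return ++$count };     # the closure captures $count itself
}

my $first  = make_counter();
my $second = make_counter();

$first->();
$first->();
my $first_value  = $first->();    # third call on the first counter
my $second_value = $second->();   # first call on the second counter

print "first: $first_value, second: $second_value\n";   # first: 3, second: 1
```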

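The anonymous subroutine is stored in $SIG{__DIE__} in this case and it uses $self. Not a copy of $self, but the original $self holding the return value of bless. This is why the reference count didn't increase just after the anonymous sub was defined. It is also why $self was not discarded when sub new returned: the handler kept it alive, so the object was referenced by both $self and $object, and undefining $object alone could never trigger DESTROY.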

This time the solution was to clear $SIG{__DIE__} at the same time $object was undefined, and it worked!
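The effect can be sketched as follows. Demo::Job is a hypothetical class that logs events instead of talking to a database, and resetting the handler to 'DEFAULT' is just one way of clearing it:

```perl
use strict;
use warnings;

package Demo::Job;       # hypothetical class, only for this illustration

our @events;             # records the order of events

sub new {
    my $class = shift;
    my $self  = bless {}, $class;
    # Same pattern as in the module above: the handler closes over $self
    $SIG{__DIE__} = sub { push @events, 'failed for ' . ref($self) };
    return $self;
}

sub DESTROY { push @events, 'DESTROY' }

package main;

my $object = Demo::Job->new;

undef $object;
push @Demo::Job::events, 'object undefined';   # no DESTROY yet: the handler holds $self

$SIG{__DIE__} = 'DEFAULT';                     # drop the closure and with it $self
push @Demo::Job::events, 'handler cleared';

print join(', ', @Demo::Job::events), "\n";
```

In this sketch DESTROY only shows up between "object undefined" and "handler cleared", which is the behaviour described above.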

TIMTOWTDI - but some are better

I don't like the global manipulation of the die handler. It might be OK for a script, but most of our scripts also run in persistent environments like FastCGI, mod_perl or Gearman worker processes. Manipulating %SIG may cause unexpected situations in completely unrelated files at some later time, but it's not easy to refactor that module quickly, so I need to accept it for the moment.
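Where a handler is only needed for a limited stretch of code, local keeps the change from leaking into the rest of the process. A minimal sketch, with risky_work and @seen as made-up names:

```perl
use strict;
use warnings;

my @seen;   # messages the temporary handler received

# risky_work is a made-up name for any code that wants its own die handler
sub risky_work {
    # local puts the previous handler back as soon as this sub returns
    local $SIG{__DIE__} = sub { push @seen, "handler saw: $_[0]" };
    eval { die "boom\n" };      # the handler runs, then eval catches the error
    return;
}

risky_work();
eval { die "again\n" };         # outside risky_work: the temporary handler is gone

print scalar(@seen), " message(s): @seen";
```

Only the die inside risky_work reaches the handler; the later one does not.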

I strongly suggest:

  • Don't change any global variable (reserved or not) without using local
  • Try to avoid keeping extra copies of object references, even for classes without a DESTROY method: the objects stay in memory and block it for nothing
  • Remember Scalar::Util and its weaken function (but using it may be bad style)
  • Remember this post and the sub _rc shown above. Copy it into your file and sprinkle _rc($object); calls over your source
  • Don't complain about your language, start debugging. There are very few Perl bugs left; most problems are based on a misunderstanding on the developer's side
 

4 comments

  1. Matthew Musgrove

    I've started using something like this in my OO modules.


    use Devel::GlobalDestruction;
    use Scalar::Util 'blessed';
    ...
    sub DESTROY
    {
        my $self = shift;

        if ( in_global_destruction )
        {
            warn " *** In global destruction ***\n";
            warn " *** You have created a closure around the ", blessed $self, " object. ***\n";
            return;
        }
        ...
    }

  2. Paul "LeoNerd" Evans

    Is it possible the module you were thinking of was either Devel::Refcount or Test::Refcount?


    I use Test::Refcount a lot for this sort of thing. In your case for example, you've discovered that testing is_oneref() after a constructor is a good idea. If the constructor already yields more references than 1 then there is likely a bug. This is especially powerful when combined with Devel::FindRef, as it will find where the leaked references are.


    Your Scalar::Util::weaken() idea might also benefit my idea about weasels:


    http://leonerds-code.blogspot.co.uk/2010/05/weasels-in-code.html

  3. My idiom: weaken $self when I reference it in a closure that needs $self.


    use Scalar::Util;

    sub get_callback
    {
        my $self = shift;
        Scalar::Util::weaken($self);
        return sub {
            $self // return;
            ...
            # do something with $self
        }
    }

  4. Sebastian

    I'd expect $self to disappear if it's weakened in sub new, but a simple copy which is weakened and used in the callback should be a good workaround.
