Frustrating search disability

This post was imported from my old WordPress installation. If you notice display problems, broken links, or missing images, please just leave a comment here. Thanks.

Encoding is fun. A shirt stating "Schei? encoding" is very popular among German developers. My boss discovered a nasty encoding problem yesterday, and we spent hours searching for the cause and a solution.

I need to encode and decode a complete hash tree (a value which might be a reference which might contain other references which might contain other references which...) and there is no "Encode::Tree" or "Hash::Encode" module on CPAN.
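The traversal described above can be sketched in a few lines of pure Perl. This is only an illustration built on the core Encode module, not the code of any CPAN module: the name encode_copy is hypothetical, encoding the hash keys as well as the values is my assumption, and circular references are deliberately ignored here.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not a CPAN API): returns an encoded deep copy of
# $data and leaves the original structure untouched.
# Assumption: hash keys are encoded as well as values.
# Limitation: no protection against circular references.
sub encode_copy {
    my ($encoding, $data) = @_;
    my $type = ref $data;

    if ($type eq 'HASH') {
        my %copy;
        for my $key (keys %$data) {
            $copy{ Encode::encode($encoding, $key) }
                = encode_copy($encoding, $data->{$key});
        }
        return \%copy;
    }
    if ($type eq 'ARRAY') {
        return [ map { encode_copy($encoding, $_) } @$data ];
    }
    if ($type eq 'SCALAR') {
        my $copy = encode_copy($encoding, $$data);
        return \$copy;
    }

    # Code refs, objects and undef are passed through unchanged.
    return $data if $type or not defined $data;

    # Plain string: convert characters to bytes in the target encoding.
    return Encode::encode($encoding, $data);
}

my $original = { a => 1, b => [ "abc", "\x{e4}" ] };   # "\x{e4}" is "ä"
my $encoded  = encode_copy("iso-8859-1", $original);
```

Because every level is rebuilt, $encoded shares no references with $original, which is the property I need.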

I wrote one for the job: Encode::Deep can encode and decode all kinds of nested references. It should have been on CPAN by now, but it isn't, because it turned out that Deep::Encode is already on CPAN, written in C and loaded via XS.

Deep::Encode is about 6.5 times faster than my pure-Perl version:

$ perl -MDeep::Encode -Ilib -MEncode::Deep -MBenchmark -le 'timethese(0,{DeepEncode => sub { deep_encode({ a => 1, b => ["abc", "ä"] },"iso-8859-1"); }, EncodeDeep => sub { Encode::Deep::encode("iso-8859-1", { a => 1, b => ["abc", "ä"] }); }});'

Benchmark: running DeepEncode, EncodeDeep for at least 3 CPU seconds...

DeepEncode: 2 wallclock secs ( 3.07 usr + 0.00 sys = 3.07 CPU) @ 213954.72/s (n=656841)

EncodeDeep: 3 wallclock secs ( 3.20 usr + 0.00 sys = 3.20 CPU) @ 33071.88/s (n=105830)

But Deep::Encode lacks one important ability: it converts everything in place, and I need a deep copy to avoid modifying other modules' data.

The core module "Storable" has a dclone function which can be used as a workaround:

$ perl -MDeep::Encode -Ilib -MEncode::Deep -MBenchmark -MStorable=dclone -le 'timethese(0,{DeepEncode => sub { deep_encode(dclone({ a => 1, b => ["abc", "ä"] }),"iso-8859-1"); }, EncodeDeep => sub { Encode::Deep::encode("iso-8859-1", { a => 1, b => ["abc", "ä"] }); }});'

Benchmark: running DeepEncode, EncodeDeep for at least 3 CPU seconds...

DeepEncode: 3 wallclock secs ( 3.24 usr + 0.00 sys = 3.24 CPU) @ 92233.33/s (n=298836)

EncodeDeep: 3 wallclock secs ( 3.25 usr + 0.00 sys = 3.25 CPU) @ 35525.23/s (n=115457)

Even with the extra dclone call, it is still about 2.6 times faster than my pure-Perl version. Yeah, I love doing work for nothing :-(
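Outside of a benchmark one-liner, the clone-then-encode workaround looks roughly like the sketch below. To keep it runnable with core modules only, encode_in_place is a hypothetical stand-in for an in-place encoder such as deep_encode; it is not the Deep::Encode implementation and only handles hashes and arrays.

```perl
use strict;
use warnings;
use Encode ();
use Storable qw(dclone);

# Hypothetical stand-in for an in-place encoder: it overwrites the values
# of the structure it is given and returns that same structure.
sub encode_in_place {
    my ($data, $encoding) = @_;

    if (ref $data eq 'HASH') {
        # values() yields aliases, so assigning to $value changes the hash.
        for my $value (values %$data) {
            if    (ref $value)     { encode_in_place($value, $encoding) }
            elsif (defined $value) { $value = Encode::encode($encoding, $value) }
        }
    }
    elsif (ref $data eq 'ARRAY') {
        # foreach aliases the array elements in the same way.
        for my $value (@$data) {
            if    (ref $value)     { encode_in_place($value, $encoding) }
            elsif (defined $value) { $value = Encode::encode($encoding, $value) }
        }
    }
    return $data;
}

# Data owned by some other module; it must not be changed.
my $shared = { a => 1, b => [ "abc", "\x{e4}" ] };   # "\x{e4}" is "ä"

# Clone first, then let the in-place encoder work on the private copy.
my $encoded = encode_in_place(dclone($shared), "iso-8859-1");
```

The price is one full extra pass over the data for the copy, which is where the drop from the first benchmark to the second comes from.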

Some people love to establish circular references: a hash holding a reference to itself as a value, or hidden somewhere deeper in the nested references. Nobody really likes them, but they happen from time to time and are sometimes used intentionally. My module can handle them, but the competitor runs into a segmentation fault:

$ perl -MDeep::Encode -le '%a = (a => 1); $a{b} = \%a; Deep::Encode::deep_encode(\%a, "iso-8859-1");'

Speicherzugriffsfehler (Speicherabzug geschrieben)
(The message is in German on my computer and means "Segmentation fault (core dumped)", so it is still a SEGFAULT.)
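One common way to survive such cycles while copying is to remember every reference that has already been visited and to hand back the existing copy when it shows up again. The sketch below shows that idea with refaddr from the core Scalar::Util module. It is not the actual Encode::Deep code; the name encode_cyclic and the decision to encode hash keys are assumptions made for illustration.

```perl
use strict;
use warnings;
use Encode ();
use Scalar::Util qw(refaddr);

# Hypothetical helper: encoded deep copy that tolerates circular references.
# $seen maps the address of each original hash/array to its copy.
sub encode_cyclic {
    my ($encoding, $data, $seen) = @_;
    $seen ||= {};
    my $type = ref $data;

    if (!$type) {
        return defined $data ? Encode::encode($encoding, $data) : undef;
    }

    # Anything that is not a plain hash or array is passed through as-is.
    return $data unless $type eq 'HASH' or $type eq 'ARRAY';

    # Already visited: return the copy made earlier instead of recursing.
    my $addr = refaddr $data;
    return $seen->{$addr} if exists $seen->{$addr};

    if ($type eq 'HASH') {
        my %copy;
        $seen->{$addr} = \%copy;    # register before descending
        for my $key (keys %$data) {
            $copy{ Encode::encode($encoding, $key) }
                = encode_cyclic($encoding, $data->{$key}, $seen);
        }
        return \%copy;
    }

    my @copy;
    $seen->{$addr} = \@copy;        # register before descending
    for my $item (@$data) {
        push @copy, encode_cyclic($encoding, $item, $seen);
    }
    return \@copy;
}

# The structure from the one-liner above: a hash that contains itself.
my %a = (a => 1);
$a{b} = \%a;
my $copy = encode_cyclic("iso-8859-1", \%a);
```

The copy keeps the shape of the original: $copy->{b} points back to $copy itself, not to %a.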

I'll pass the results to JSON, which also fails badly on circular references (though not as badly as a segmentation fault), so the XS-based Deep::Encode will be our choice for now. Encode::Deep went to CPAN anyway. Use it whenever you need to support circular references, and now please start slapping me for uploading a module that duplicates the functionality of an existing one.


7 comments

  1. simon

    Any reason why you didn't report the segfault bug on the Deep::Encode RT tracker?

