[imp] Excessive memory usage
Andrew Morgan
morgan at orst.edu
Tue Oct 7 05:12:16 UTC 2008
On Mon, 6 Oct 2008, Michael M Slusarz wrote:
> Your analysis, while logical, is flawed. Due to the way PHP handles
> variables internally, the memory is *not* being doubled by this
> particular function. I won't go into the details, but you can read
> about it here:
> http://porteightyeight.com/archives/111-The-Truth-About-PHP-Variables.html
>
> Here's a test script to demonstrate:
>
> class Test {
>     var $a = array();
>
>     function foo()
>     {
>         $this->a[] = str_repeat('0', 1000000);
>         return $this->a[0];
>     }
>
>     function bar()
>     {
>         $this->a = null;
>     }
> }
>
> print memory_get_usage() . "\n"; // Output: 86816
> $test = new Test();
> $b = $test->foo();
> print memory_get_usage() . "\n"; // Output: 1087848
> unset($b);
> // Output: 1087944 (the Test->a variable still exists)
> print memory_get_usage() . "\n";
> $test->bar();
> // Output: 87920 (sure enough, once all references to the data are
> // removed, the memory is recovered)
> print memory_get_usage() . "\n";
I think we are in agreement here. I inserted calls to memory_get_usage()
throughout the MIME Viewer code, trying to track down what was happening
with these AppleDouble attachments. In the process, I found that while
the AppleDouble processing is a worst-case scenario, there is still extra
memory usage even with regular attachments.
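
For anyone who wants to repeat this kind of instrumentation, here is a
minimal sketch of a checkpoint helper. The function names memoryCheckpoint()
and logMemory() are hypothetical, and error_log() is only a stand-in for
whatever logging call the application actually uses; the message format is
merely modelled on the debug output in this post.

```php
<?php
// Hypothetical helper: builds a "label: bytes" message from the
// current value of memory_get_usage().
function memoryCheckpoint($label)
{
    return sprintf('%s: %d', $label, memory_get_usage());
}

// Hypothetical wrapper: error_log() stands in for the real logger.
function logMemory($label)
{
    error_log('[imp] ' . memoryCheckpoint($label));
}

// Bracket the call under suspicion with two checkpoints.
logMemory('before fetch');
$data = str_repeat('0', 1000000); // stand-in for fetching a MIME part
logMemory('after fetch');
```

Comparing the two logged values gives the memory attributed to the
bracketed call, which is how the figures later in this message were read.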
Here is what seems to be happening, based on my reading of the code and
memory_get_usage() statements:
1. IMP parses the MIME headers of a message in preparation for displaying
the message.
2. For each MIME part, IMP determines the MIME type and calls the
appropriate MIME_Viewer driver.
3. The MIME_Viewer driver calls getMIMEPart(), which calls
getRawMIMEPart(), which calls _setContents(), which finally calls
getBodyPart().
4. getBodyPart() fetches the MIME part via imap_fetchbody(), stores a copy
of it in the _bodypart[] array, and returns the MIME part up the stack of
functions.
*By design* there are 2 copies of each attachment in memory: one in the
_bodypart[] array and the working copy that is returned up the stack.
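
The fetch-and-cache pattern in steps 3 and 4 can be sketched as follows.
This is not IMP's code: the class name BodyPartCacheSketch is hypothetical,
and str_repeat() stands in for imap_fetchbody(). It assumes PHP's usual
copy-on-write behaviour, under which the returned value shares storage with
the cached one until one of them is modified, so whether the working copy
costs extra memory depends on what the caller does with it.

```php
<?php
// Hypothetical stand-in for the fetch-and-cache pattern; not IMP's code.
class BodyPartCacheSketch
{
    public $_bodypart = array();

    public function getBodyPart($id)
    {
        if (!isset($this->_bodypart[$id])) {
            // str_repeat() stands in for imap_fetchbody().
            $this->_bodypart[$id] = str_repeat('0', 1000000);
        }
        return $this->_bodypart[$id];
    }
}

$cache = new BodyPartCacheSketch();

$m0 = memory_get_usage();
$work = $cache->getBodyPart('2'); // cached copy created, value returned
$m1 = memory_get_usage();
$work[0] = 'x';                   // writing forces a separate working copy
$m2 = memory_get_usage();
unset($work);                     // working copy released, cache remains
$m3 = memory_get_usage();

printf("fetch and cache:  %+d bytes\n", $m1 - $m0);
printf("modify the copy:  %+d bytes\n", $m2 - $m1);
printf("unset the copy:   %+d bytes\n", $m3 - $m2);
```

In this sketch the second megabyte only appears once the working copy is
written to, and it goes away again when that copy is unset.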
This makes perfect sense if the working copy's memory is freed after each
attachment is processed. In my test message, I have ten image
attachments, each roughly 3MB. As each attachment is processed, I would
expect to see the memory usage increase by 6MB (cache + working copy) and
then decrease by 3MB (working copy freed). Instead I see the memory usage
increase by 6MB each time, with no corresponding decrease. Obviously, all
the memory is freed when the script finishes, and there is some overhead
memory as these objects are created.
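
The pattern I would expect (up by two copies, then down by one) can be
reproduced in isolation. The sketch below is a simulation under stated
assumptions, not IMP code: 1 MB strings stand in for the roughly 3MB
attachments, a plain array stands in for the _bodypart[] cache, and the
working copy is deliberately modified so that it occupies its own memory.

```php
<?php
// Hypothetical simulation of "cache + working copy" per attachment.
$cache  = array();
$growth = array();

for ($i = 0; $i < 3; $i++) {
    $before = memory_get_usage();

    $cache[$i] = str_repeat('0', 1000000); // cached copy
    $work = $cache[$i];
    $work[0] = 'x';                        // force a separate working copy
    $peak = memory_get_usage();

    unset($work);                          // free the working copy
    $after = memory_get_usage();

    $growth[] = array(
        'peak' => $peak - $before,  // cache + working copy
        'kept' => $after - $before, // cache only
    );
}

foreach ($growth as $i => $g) {
    printf("attachment %d: peak %+d bytes, kept %+d bytes\n",
           $i + 1, $g['peak'], $g['kept']);
}
```

Here each iteration peaks at about two copies and settles at about one,
which is the shape that is missing from the debug output.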
Here is a small sample of my debug output:
Oct 06 14:31:33 HORDE [notice] [imp] before getDecodedMIMEPart 2: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before getRawMIMEPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before getMIMEPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] after getMIMEPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before setContents: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before getBodyPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before imap_fetchbody: 9437184 <--
Oct 06 14:31:33 HORDE [notice] [imp] after imap_fetchbody: 12845056 <--
Oct 06 14:31:33 HORDE [notice] [imp] after getBodyPart: 16252928 <--
Oct 06 14:31:33 HORDE [notice] [imp] after setting contents: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after setContents: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after getRawMIMEPart: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after transferDecodeContents: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after getDecodedMIMEPart 2: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after convertMIMEPart: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after MIME_Contents: 18874368
Oct 06 14:31:33 HORDE [notice] [imp] after buildMessage: 18874368
Oct 06 14:31:33 HORDE [notice] [imp] ending: 18874368
Oct 06 14:31:33 HORDE [notice] [imp] before getBodyPart: 18874368
Oct 06 14:31:33 HORDE [notice] [imp] before imap_fetchbody: 18874368 <--
Oct 06 14:31:34 HORDE [notice] [imp] after imap_fetchbody: 24117248 <--
Oct 06 14:31:34 HORDE [notice] [imp] after getBodyPart: 29360128 <--
Oct 06 14:31:34 HORDE [notice] [imp] after setting contents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getRawMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before setContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getBodyPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before imap_fetchbody: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after imap_fetchbody: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getBodyPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after setting contents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after setContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getRawMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after transferDecodeContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getDecodedMIMEPart 2: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getRawMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before setContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getBodyPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before imap_fetchbody: 29360128 <--
Oct 06 14:31:35 HORDE [notice] [imp] after imap_fetchbody: 34340864 <--
Oct 06 14:31:35 HORDE [notice] [imp] after getBodyPart: 39321600 <--
Oct 06 14:31:35 HORDE [notice] [imp] after setting contents: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after setContents: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after getRawMIMEPart: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after transferDecodeContents: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after getDecodedMIMEPart 2: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after convertMIMEPart: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after MIME_Contents: 42991616
Oct 06 14:31:35 HORDE [notice] [imp] after buildMessage: 42991616
Oct 06 14:31:35 HORDE [notice] [imp] ending: 42991616
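
To make the marked lines easier to read, the deltas can be worked out
directly from the figures above. The numbers in this sketch are copied from
the three flagged fetches in the log; the variable names are just for
illustration.

```php
<?php
// Byte counts copied from the three "<--" groups in the log:
// before imap_fetchbody, after imap_fetchbody, after getBodyPart.
$samples = array(
    array(9437184, 12845056, 16252928),
    array(18874368, 24117248, 29360128),
    array(29360128, 34340864, 39321600),
);

$deltas = array();
foreach ($samples as $s) {
    list($before, $afterFetch, $afterReturn) = $s;
    $deltas[] = array(
        'fetch'  => ($afterFetch - $before) / 1048576,      // in MiB
        'return' => ($afterReturn - $afterFetch) / 1048576, // in MiB
    );
}

foreach ($deltas as $i => $d) {
    printf("fetch %d: +%.2f MiB in imap_fetchbody, +%.2f MiB more after getBodyPart\n",
           $i + 1, $d['fetch'], $d['return']);
}
// prints:
// fetch 1: +3.25 MiB in imap_fetchbody, +3.25 MiB more after getBodyPart
// fetch 2: +5.00 MiB in imap_fetchbody, +5.00 MiB more after getBodyPart
// fetch 3: +4.75 MiB in imap_fetchbody, +4.75 MiB more after getBodyPart
```

In each case the second increase matches the first, and none of the later
checkpoints in the sample shows a decrease.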
> Caching data may not be used in this particular situation, but it may be
> used in other situations. And caching data is much less resource
> intensive than having to do another IMAP access, especially with a
> c-client function (you need to read the entire data into memory on both
> the PHP and IMAP server, the overhead of the IMAP transaction between
> the two, and then the cost of sending this data across the wire).
I don't disagree with caching the data, in principle, although I was never
able to show that the operations I was performing actually used the
_bodypart[] cache.
In summary, I think there are 2 separate issues I'd like addressed:
1. Working copies of the attachments aren't being freed.
2. The _bodypart[] cache doesn't seem to be used, so it's just using up memory.
I'll keep looking at the code, trying to understand what is happening.
Please let me know if you want me to test anything else, or provide more
information. My goal here is not to criticize the developers or the
project, but to improve the code and make Horde/IMP more scalable.
Thanks,
Andy