[imp] Excessive memory usage

Andrew Morgan morgan at orst.edu
Tue Oct 7 05:12:16 UTC 2008


On Mon, 6 Oct 2008, Michael M Slusarz wrote:

> Your analysis, while logical, is flawed.  Due to the way PHP handles
> variables internally, the memory is *not* being doubled by this
> particular function.  I won't go into the details, but you can read
> about it here:
> http://porteightyeight.com/archives/111-The-Truth-About-PHP-Variables.html
>
> Here's a test script to demonstrate:
>
> class Test {
>    var $a = array();
>
>    function foo()
>    {
>        $this->a[] = str_repeat('0', 1000000);
>        return $this->a[0];
>    }
>
>    function bar()
>    {
>        $this->a = null;
>    }
> }
>
> print memory_get_usage() . "\n";  // Output: 86816
> $test = new Test();
> $b = $test->foo();
> print memory_get_usage() . "\n";  // Output: 1087848
> unset($b);
> // Output: 1087944 (the Test->a variable still exists)
> print memory_get_usage() . "\n";
> $test->bar();
> // Output: 87920 (sure enough, once all references to the data are
> // removed, the memory is recovered)
> print memory_get_usage() . "\n";

I think we are in agreement here.  I inserted calls to memory_get_usage() 
throughout the MIME Viewer code, trying to track down what was happening 
with these Appledouble attachments.  In the process, I found that while 
the Appledouble processing is a worst-case scenario, there is still extra 
memory usage even with regular attachments.
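
For anyone who wants to reproduce this kind of trace, a minimal sketch of a 
checkpoint helper follows.  The name mem_checkpoint() is hypothetical and is 
not part of Horde or IMP; it writes through PHP's error_log() rather than the 
Horde logger, so the line format only approximates the debug output shown 
further down.

```php
<?php
// Hypothetical helper (not part of Horde/IMP): log a label plus the
// current memory usage, and return the line so callers can inspect it.
function mem_checkpoint($label)
{
    $line = sprintf('%s: %d', $label, memory_get_usage());
    error_log($line);
    return $line;
}

// Example: bracket an allocation with two checkpoints.
mem_checkpoint('before fetch');
$data = str_repeat('0', 1000000);   // stand-in for a fetched MIME part
mem_checkpoint('after fetch');
```

Placing one call before and one after each function in the chain is enough 
to see which call the growth is attributed to.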

Here is what seems to be happening, based on my reading of the code and 
memory_get_usage() statements:

1. IMP parses the MIME headers of a message in preparation for displaying 
the message.

2. For each MIME part, IMP determines the MIME type and calls the 
appropriate MIME_Viewer driver.

3. The MIME_Viewer driver calls getMIMEPart(), which calls 
getRawMIMEPart(), which calls _setContents(), which finally calls 
getBodyPart().

4. getBodyPart() fetches the MIME part via imap_fetchbody(), stores a copy 
of it in the _bodypart[] array, and returns the MIME part up the stack of 
functions.
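
The fetch-and-cache step in item 4 can be sketched roughly as below.  This 
is an illustration only, not IMP's code: the class BodyPartCache, the method 
fetchFromServer() (a stand-in for imap_fetchbody() so the sketch runs without 
an IMAP server) and the $fetches counter are all hypothetical, and the 
"check the cache before fetching" branch is an assumption about how such a 
cache would normally be consulted.

```php
<?php
// Illustrative stand-in for the pattern described in step 4.
class BodyPartCache
{
    private $bodypart = array();   // plays the role of _bodypart[]
    public $fetches = 0;           // counts simulated server fetches

    // Stand-in for imap_fetchbody(); returns a fake 1000-byte part.
    private function fetchFromServer($id)
    {
        $this->fetches++;
        return str_repeat('x', 1000);
    }

    public function getBodyPart($id)
    {
        // Assumed behaviour: only go to the server on a cache miss.
        if (!isset($this->bodypart[$id])) {
            $this->bodypart[$id] = $this->fetchFromServer($id);
        }
        // The caller receives the cached value as its working copy.
        return $this->bodypart[$id];
    }
}

$cache  = new BodyPartCache();
$first  = $cache->getBodyPart('2');
$second = $cache->getBodyPart('2');   // served from the cache
print $cache->fetches . "\n";         // prints 1
```

A counter like $fetches is also a cheap way to check whether a cache is ever 
hit for a given request.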

*By design* there are 2 copies of each attachment in memory: one in the 
_bodypart[] array and one working copy.

This makes perfect sense if the working copy memory is freed after each 
attachment is processed.  In my test message, I have ten image 
attachments, each roughly 3MB.  As each attachment is processed, I would 
expect to see the memory usage increase by 6MB (cache + working copy) and 
then decrease by 3MB (working copy freed).  Instead I see the memory usage 
increase by 6MB each time.  Obviously, all the memory is freed when the 
script finishes, and there is overhead memory as these objects are 
created.
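
The profile I would expect can be reproduced in isolation with the sketch 
below.  It assumes nothing about IMP: the variable names are made up, and 
the append is just a convenient way to force PHP to separate the working 
copy from the cached one.  Which step in IMP (if any) forces that separation 
is not something this sketch establishes.

```php
<?php
$cache = array();
$base  = memory_get_usage();

// "Fetch" a 3 MB part into the cache.
$cache['part'] = str_repeat('A', 3000000);
$afterFetch = memory_get_usage() - $base;

// Hand out a working copy: copy-on-write, so no new storage yet.
$working = $cache['part'];
$afterAssign = memory_get_usage() - $base;

// Modifying the working copy forces a real second copy.
$working .= 'B';
$afterWrite = memory_get_usage() - $base;

// Freeing the working copy should return usage to the cached size.
unset($working);
$afterUnset = memory_get_usage() - $base;

printf("fetch: %d  assign: %d  write: %d  unset: %d\n",
       $afterFetch, $afterAssign, $afterWrite, $afterUnset);
```

On a stock PHP the four figures should come out at roughly 3 MB, 3 MB, 6 MB 
and 3 MB.  The exact numbers vary by PHP version and allocator, so none are 
shown here.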

Here is a small sample of my debug output:

Oct 06 14:31:33 HORDE [notice] [imp] before getDecodedMIMEPart 2: 9437184 
Oct 06 14:31:33 HORDE [notice] [imp] before getRawMIMEPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before getMIMEPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] after getMIMEPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before setContents: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before getBodyPart: 9437184
Oct 06 14:31:33 HORDE [notice] [imp] before imap_fetchbody: 9437184 <--
Oct 06 14:31:33 HORDE [notice] [imp] after imap_fetchbody: 12845056 <--
Oct 06 14:31:33 HORDE [notice] [imp] after getBodyPart: 16252928 <--
Oct 06 14:31:33 HORDE [notice] [imp] after setting contents: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after setContents: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after getRawMIMEPart: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after transferDecodeContents: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after getDecodedMIMEPart 2: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after convertMIMEPart: 16252928
Oct 06 14:31:33 HORDE [notice] [imp] after MIME_Contents: 18874368
Oct 06 14:31:33 HORDE [notice] [imp] after buildMessage: 18874368
Oct 06 14:31:33 HORDE [notice] [imp] ending: 18874368

Oct 06 14:31:33 HORDE [notice] [imp] before getBodyPart: 18874368
Oct 06 14:31:33 HORDE [notice] [imp] before imap_fetchbody: 18874368 <--
Oct 06 14:31:34 HORDE [notice] [imp] after imap_fetchbody: 24117248 <--
Oct 06 14:31:34 HORDE [notice] [imp] after getBodyPart: 29360128 <--
Oct 06 14:31:34 HORDE [notice] [imp] after setting contents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getRawMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before setContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getBodyPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before imap_fetchbody: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after imap_fetchbody: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getBodyPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after setting contents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after setContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getRawMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after transferDecodeContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getDecodedMIMEPart 2: 29360128 
Oct 06 14:31:34 HORDE [notice] [imp] before getRawMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] after getMIMEPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before setContents: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before getBodyPart: 29360128
Oct 06 14:31:34 HORDE [notice] [imp] before imap_fetchbody: 29360128 <--
Oct 06 14:31:35 HORDE [notice] [imp] after imap_fetchbody: 34340864 <--
Oct 06 14:31:35 HORDE [notice] [imp] after getBodyPart: 39321600 <--
Oct 06 14:31:35 HORDE [notice] [imp] after setting contents: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after setContents: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after getRawMIMEPart: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after transferDecodeContents: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after getDecodedMIMEPart 2: 39321600 
Oct 06 14:31:35 HORDE [notice] [imp] after convertMIMEPart: 39321600
Oct 06 14:31:35 HORDE [notice] [imp] after MIME_Contents: 42991616
Oct 06 14:31:35 HORDE [notice] [imp] after buildMessage: 42991616
Oct 06 14:31:35 HORDE [notice] [imp] ending: 42991616
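
To make the marked lines easier to compare, the short script below computes 
the increase at each step.  The figures are copied from the "<--" lines 
above; the variable names are only for illustration.

```php
<?php
// before imap_fetchbody, after imap_fetchbody, after getBodyPart
$samples = array(
    array(9437184, 12845056, 16252928),
    array(18874368, 24117248, 29360128),
    array(29360128, 34340864, 39321600),
);

$deltas = array();
foreach ($samples as $s) {
    list($before, $afterFetch, $afterGet) = $s;
    $fetchDelta  = $afterFetch - $before;      // cost of the fetch itself
    $returnDelta = $afterGet - $afterFetch;    // cost of returning the part
    $deltas[] = array($fetchDelta, $returnDelta);
    printf("fetch: +%d  return: +%d  total: +%d\n",
           $fetchDelta, $returnDelta, $fetchDelta + $returnDelta);
}
// Output:
// fetch: +3407872  return: +3407872  total: +6815744
// fetch: +5242880  return: +5242880  total: +10485760
// fetch: +4980736  return: +4980736  total: +9961472
```

In all three cases the increase between "after imap_fetchbody" and "after 
getBodyPart" is exactly the same size as the increase caused by the fetch, 
and neither amount is given back before "ending".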

> Caching data may not be used in this particular situation, but it may be
> used in other situations.  And caching data is much less resource
> intensive than having to do another IMAP access, especially with a
> c-client function (you need to read the entire data into memory on both
> the PHP and IMAP server, the overhead of the IMAP transaction between
> the two, and then the cost of sending this data across the wire).

I don't disagree with caching the data, in principle, although I was never 
able to show that the operations I was performing actually used the 
_bodypart[] cache.

In summary, I think there are 2 separate issues I'd like addressed:

1. Working copies of the attachments aren't being freed.

2. The _bodypart[] cache doesn't seem to be used, so it's just using up 
memory.

I'll keep looking at the code, trying to understand what is happening. 
Please let me know if you want me to test anything else, or provide more 
information.  My goal here is not to criticize the developers or the 
project, but to improve the code and make Horde/IMP more scalable.

Thanks,
 	Andy

