[Tickets #9201] Re: Check for ISO-8859-1/Windows-1252 improper charset labeling
bugs at horde.org
Thu Aug 26 18:08:48 UTC 2010
DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.
Ticket URL: http://bugs.horde.org/ticket/9201
------------------------------------------------------------------------------
Ticket | 9201
Updated By | Michael Slusarz <slusarz at horde.org>
Summary | Check for ISO-8859-1/Windows-1252 improper charset
| labeling
Queue | IMP
Version | Git master
Type | Enhancement
State | Feedback
Priority | 1. Low
Milestone | 5.0
Patch |
Owners | Jan Schneider, Michael Slusarz, Horde Developers
------------------------------------------------------------------------------
Michael Slusarz <slusarz at horde.org> (2010-08-26 14:08) wrote:
> My guess is that there is something weird going on with the DOM
> encoding/loading. It seems to be working perfect on my system - but
> that could be because I am using en_US.UTF-8. It might not be
> working properly on, e.g., de or fr locales.
>
> I would suggest playing around with charsets in Horde_Domhtml
> (located in the horde/Util package).
For reference: when I view the HTML part in a new window,
Horde_Domhtml is called once. The initial loadHTML() call does not
auto-determine the encoding ($doc->encoding is empty), so the code then
moves into the forced loadHTML() call after converting the text to
UTF-8. The charset passed into the constructor is UTF-8.
Pseudocode:
public function __construct($text, $charset)  // $charset is 'UTF-8' in this call
{
    $doc = new DOMDocument();
    $doc->loadHTML($text);

    // $doc->encoding is empty: the charset was not auto-determined.
    $this->encoding = $doc->encoding;

    if (!is_null($charset)) {
        if (!$doc->encoding) {
            // Forced load: convert the text to UTF-8 and prepend an
            // encoding hint.
            $doc->loadHTML('<?xml encoding="UTF-8">' .
                Horde_String::convertCharset($text, $charset, 'UTF-8'));
            $this->encoding = 'UTF-8';
        }
    }
}
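
For anyone who wants to try the two-step load outside of Horde, here is a
minimal standalone sketch. It is not the Horde_Domhtml code: the function
name loadHtmlWithFallback() is hypothetical, mb_convert_encoding() stands in
for Horde_String::convertCharset(), and it assumes the dom and mbstring
extensions are available. Whether $doc->encoding ends up empty after the
first load may depend on the libxml version, so treat it as a starting
point for experiments with different charsets and locales.

```php
<?php
// Hypothetical helper for illustration; not part of Horde.
function loadHtmlWithFallback(string $text, ?string $charset = null): array
{
    // Collect libxml parse warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($text);
    $encoding = $doc->encoding; // empty when no charset was determined

    if ($charset !== null && !$encoding) {
        // Forced path: convert to UTF-8 and prepend an encoding hint.
        $utf8 = mb_convert_encoding($text, 'UTF-8', $charset);
        $doc = new DOMDocument();
        $doc->loadHTML('<?xml encoding="UTF-8">' . $utf8);
        $encoding = 'UTF-8';
    }

    // Restore the previous libxml error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$doc, $encoding];
}

// Usage: an ISO-8859-1 fragment with no charset declaration.
[$doc, $encoding] = loadHtmlWithFallback("<p>caf\xE9</p>", 'ISO-8859-1');
echo $encoding, "\n";
echo bin2hex($doc->getElementsByTagName('p')->item(0)->textContent), "\n";
```

Dumping the text as hex makes it easy to see whether the accented
character came through as a single UTF-8 sequence or was double-encoded.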