[Tickets #4340] Problems with german umlaut
bugs at bugs.horde.org
Tue Aug 29 01:31:06 PDT 2006
Ticket URL: http://bugs.horde.org/ticket/?id=4340
-----------------------------------------------------------------------
Ticket | 4340
Updated By | s_gatterbauer at idlm.net
Summary | Problems with german umlaut
Queue | Jonah
Type | Enhancement
State | Feedback
Priority | 1. Low
Owners |
-----------------------------------------------------------------------
s_gatterbauer at idlm.net (2006-08-29 01:31) wrote:
This evening I will try something like this in lib/Jonah.php:
look in the first 80 characters of the source file (which should contain the
XML declaration, ordered version, encoding, standalone) for the string after
encoding= (which should be the charset).
  if (preg_match('/.*;\s?charset="?([^"]*)/', $content_type, $match)) {
      $result['charset'] = $match[1];
+ } else {
+     // Look for encoding= in the XML declaration at the start of the body.
+     $head = substr($result['body'], 0, 80);
+     $t_pos = strpos($head, 'encoding=');
+     if ($t_pos !== false) {
+         // Skip 'encoding=' (9 characters) plus the opening double quote.
+         $t_start = $t_pos + 10;
+         $t_stop = strpos($head, '"', $t_start);
+         if ($t_stop !== false) {
+             $result['charset'] =
+                 strtolower(trim(substr($head, $t_start, $t_stop - $t_start)));
+         }
+     }
  }
Not very inventive (I do not know PHP), but it should extract the right
thing.
Yes, there is a problem with $t_stop if the encoding value is enclosed in
single quotes (I will look into it).
I am not sure about preg_match, but the following should also work:
  if (preg_match('/.*;\s?charset="?([^"]*)/', $content_type, $match)) {
      $result['charset'] = $match[1];
+ } elseif (preg_match('/.*\s?encoding="?([^"]*)/',
+                      substr($result['body'], 0, 80), $match)) {
+     $result['charset'] = $match[1];
  }
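
For illustration only, the same idea can be written as a standalone function
that also accepts single quotes around the encoding value. This is a rough
sketch, not the code in lib/Jonah.php: the function name detect_feed_charset
and its two parameters are made up for this example, and it assumes the XML
declaration (if any) sits within the first 80 bytes of the body.

```php
<?php
// Hypothetical helper, not part of Jonah: pick a charset from the
// Content-Type header, falling back to the XML declaration in the body.
function detect_feed_charset($content_type, $body)
{
    // 1. Prefer an explicit charset parameter in the HTTP header.
    if (preg_match('/;\s*charset=["\']?([^"\'\s;]+)/i', $content_type, $match)) {
        return strtolower($match[1]);
    }

    // 2. Otherwise inspect the start of the document for an XML declaration.
    //    The back-reference \1 requires the closing quote to match the
    //    opening one, so both quote styles are accepted.
    $head = substr($body, 0, 80);
    if (preg_match('/<\?xml[^>]*\sencoding=(["\'])([^"\']+)\1/i', $head, $match)) {
        return strtolower(trim($match[2]));
    }

    // 3. Nothing found; the caller decides on a default.
    return null;
}
```

With this sketch, a caller would assign the return value to
$result['charset'] only when it is not null, and keep whatever default
applies otherwise.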