[imp] problem with attachments in unicode (UTF16)
Michael M Slusarz
slusarz at horde.org
Thu Mar 27 18:44:58 UTC 2008
Quoting Otto Stolz <Otto.Stolz at uni-konstanz.de>:
> So, I think the best solution would be:
> - Provide a Charset field next to the file-selection widget
> for the user to specify the encoding of the file he chooses
> for uploading;
No, because 99.9% of users have no idea what a charset is. Even
I, as a somewhat experienced user, have no idea what charset my text
docs are in (nor do I care what their charset is).
> - if the user chooses a text file and a charset, tag the
> attachment so; optionally, warn if the uploaded file contains
> illegal data w.r.t. the charset chosen;
No. See above.
> - if the user chooses a text file, but leaves the Charset
> default value ‘unknown’ alone, try to guess the charset,
> as discussed earlier in this thread;
I would alter this a bit: if a user uploads a text file, attempt to
"guess" the charset. This will need to be done in PHP code. Possible
Perl modules that may be useful to port to PHP for this purpose:
http://search.cpan.org/dist/Encode-Detect/
http://search.cpan.org/~dankogai/Encode-2.24/ (more specifically, the
Encode::Guess module)
If the guess fails, fall back to the charset the browser is using,
since that is (most likely) the charset used by the underlying OS.
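
To make the guess-then-fall-back idea concrete, here is a rough sketch. It is
written in Python purely for illustration; it is not IMP code and not a port of
either Perl module. The function name guess_charset and the browser_charset
parameter are hypothetical, and the heuristic (check for a byte-order mark,
then try a strict UTF-8 decode) is only one simple way to guess.

```python
import codecs

# Byte-order marks, checked longest first so that the UTF-32 LE mark
# (FF FE 00 00) is not mistaken for the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_charset(data: bytes, browser_charset: str = "iso-8859-1") -> str:
    """Guess the charset of an uploaded text file (hypothetical helper).

    browser_charset is an assumption: the charset the browser reported,
    used only when no guess can be made from the bytes themselves.
    """
    # 1. A byte-order mark is the strongest hint available.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Text that decodes as strict UTF-8 is very likely UTF-8
    #    (plain ASCII also passes, which is harmless).
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # 3. No confident guess: fall back to the browser's charset.
    return browser_charset
```

A UTF-16 file that starts with a byte-order mark would be tagged "utf-16"
by this sketch, while a Latin-1 file with accented characters would fail the
UTF-8 check and be tagged with whatever the browser reported. UTF-16 without
a byte-order mark is not handled here and would need a real detector.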
michael
--
___________________________________
Michael Slusarz [slusarz at horde.org]