[imp] problem with attachments in unicode (UTF16)
    Michael M Slusarz 
    slusarz at horde.org
       
    Thu Mar 27 18:44:58 UTC 2008
    
    
  
Quoting Otto Stolz <Otto.Stolz at uni-konstanz.de>:
> So, I think the best solution would be:
> - Provide a Charset field next to the file-selection widget
>    for the user to specify the encoding of the file he chooses
>    for uploading;
No.  Because 99.9% of users have no idea what a charset is.  Even
I, as a somewhat experienced user, have no idea what charset my text
docs are in (nor do I care what their charset is).
> - if the user chooses a text file and a charset, tag the
>    attachment so; optionally, warn if the uploaded file contains
>    illegal data w.r.t. the charset chosen;
No.  See above.
> - if the user chooses a text file, but leaves the Charset
>    default value ‘unknown’ alone, try to guess the charset,
>    as discussed earlier in this thread;
I'd alter this a bit: if a user uploads a text file, attempt to "guess"
the charset.  This will need to be done in PHP code.  Perl modules
that may be useful to port to PHP for this purpose:
http://search.cpan.org/dist/Encode-Detect/
http://search.cpan.org/~dankogai/Encode-2.24/  (more specifically, the
Encode::Guess module)
If the guess fails, fall back to the charset the browser is using,
since that is (most likely) the charset used by the underlying OS.
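
A rough sketch of the guess-then-fall-back logic, to make the idea concrete.  It is written in Python purely for illustration (the real thing would have to be PHP), and it is not a port of either Perl module above.  The function name guess_charset and the browser_charset parameter are hypothetical, and the heuristic (byte-order mark first, then a strict UTF-8 decode) is only an assumption about what a minimal guesser might do:

```python
import codecs

# Byte-order marks, longest first, so that the UTF-32LE mark (FF FE 00 00)
# is not mistaken for the UTF-16LE mark (FF FE) that it begins with.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def guess_charset(data: bytes, browser_charset: str) -> str:
    """Guess the charset of an uploaded text file (hypothetical helper).

    browser_charset is whatever charset the browser reported; it is used
    only when no better guess is available.
    """
    # 1. A byte-order mark is the strongest hint (this covers most UTF-16 files).
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. Text in a legacy 8-bit charset is rarely valid UTF-8 by accident,
    #    so a clean strict decode is a reasonable signal.
    try:
        data.decode("utf-8", errors="strict")
        return "UTF-8"
    except UnicodeDecodeError:
        pass

    # 3. Otherwise fall back to the browser's charset.
    return browser_charset
```

Note that plain ASCII files come back as UTF-8 here, which is harmless since ASCII is a subset of UTF-8.  UTF-16 files without a byte-order mark are not detected by this sketch and would need the statistical approach the Perl modules take.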
michael
-- 
___________________________________
Michael Slusarz [slusarz at horde.org]
    
    