Charsets of files
How to check the charset of a specified file?
REF: https://stackoverflow.com/questions/805418/how-can-i-find-encoding-of-a-file-via-a-script-on-linux
On Linux/UNIX/OS X/cygwin (on OS X the option is -I, a capital i, instead of -i):
file -i xxx
Or use uchardet, an encoding detector library ported from Mozilla that can detect some Chinese standard charsets such as GB18030. Packages are available for various Linux distributions (Debian, Ubuntu, openSUSE, Arch Linux via pacman, etc.).
uchardet xxx
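The two commands above can be wrapped in a small helper. This is only a minimal sketch: the function names `detect_charset` and `is_valid_utf8` are hypothetical, and the code assumes that `file` (with the `--mime-encoding` option) or `uchardet` may be on the PATH, and that `iconv` is installed. Because detection is heuristic, the second function cross-checks a guess by asking iconv to decode the file as UTF-8.

```shell
#!/bin/sh
# detect_charset FILE: print a best-guess charset name for FILE.
# Hypothetical helper; uses whichever detector is installed.
detect_charset() {
    if command -v file >/dev/null 2>&1; then
        # -b omits the file name; --mime-encoding prints only the charset
        file -b --mime-encoding "$1"
    elif command -v uchardet >/dev/null 2>&1; then
        uchardet "$1"
    else
        echo "unknown"
    fi
}

# is_valid_utf8 FILE: succeed only if FILE decodes cleanly as UTF-8.
# Assumes iconv exits non-zero on an illegal input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}
```

For example, `detect_charset notes.txt` prints a single charset name, and `is_valid_utf8 notes.txt && echo ok` confirms that the file really is UTF-8 (`notes.txt` is a placeholder name).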
On Windows:
The (Linux) command-line tool 'file' is available on Windows via GnuWin32:
http://gnuwin32.sourceforge.net/packages/file.htm
If you have git installed, it's located in C:\Program Files\git\usr\bin.
How to convert the charset of a specified file on Linux?
On Linux/UNIX/OS X/cygwin:
GNU iconv, suggested by Troels Arvin, is best used as a filter: it reads the input file and writes the converted text to standard output. It seems to be universally available. Example:
$ iconv -f UTF-8 -t ISO-8859-15 in.txt > out.txt
As pointed out by Ben, there is an online converter using iconv.
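Since iconv works as a filter, redirecting its output onto the input file would truncate that file before it is read. A common workaround is to convert into a temporary file first. The sketch below assumes `iconv` and `mktemp` are available; the function name `convert_in_place` is hypothetical.

```shell
#!/bin/sh
# convert_in_place FROM TO FILE: re-encode FILE from charset FROM to charset TO.
# Hypothetical helper: iconv writes to a temporary file, and FILE is only
# overwritten when the conversion succeeded.
convert_in_place() {
    from=$1
    to=$2
    target=$3
    tmpfile=$(mktemp) || return 1
    if iconv -f "$from" -t "$to" "$target" > "$tmpfile"; then
        # Writing through cat keeps the original file's permissions and owner
        cat "$tmpfile" > "$target"
        rm -f "$tmpfile"
    else
        # Conversion failed: leave FILE untouched and report the error
        rm -f "$tmpfile"
        return 1
    fi
}
```

Usage: `convert_in_place UTF-8 ISO-8859-15 in.txt`. If the input contains a byte sequence that is not valid in the source charset, the function returns a non-zero status and the file is left unchanged.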
GNU recode (see its manual), suggested by Cheekysoft, converts one or more files in place. Example:
$ recode UTF8..ISO-8859-15 in.txt
This one uses shorter aliases:
$ recode utf8..l9 in.txt
Recode also supports surfaces, which can be used to convert between different line-ending types and transfer encodings:
Convert newlines from LF (Unix) to CR-LF (DOS):
$ recode ../CR-LF in.txt
Base64 encode file:
$ recode ../Base64 in.txt
You can also combine them.
Convert a Base64-encoded UTF-8 file with Unix line endings to a Base64-encoded Latin-1 file with DOS line endings:
$ recode utf8/Base64..l1/CR-LF/Base64 file.txt
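Where recode is not installed, the charset and line-ending part of such a conversion can be approximated with a pipeline. This sketch swaps in iconv and awk for recode's surfaces and is not the recode method itself. The function name `utf8_to_latin1_crlf` is hypothetical, and the code assumes `iconv` and a POSIX `awk` are present.

```shell
#!/bin/sh
# utf8_to_latin1_crlf FILE: write FILE to stdout as Latin-1 text with CR-LF
# line endings. Hypothetical helper; a substitute for recode surfaces.
utf8_to_latin1_crlf() {
    iconv -f UTF-8 -t ISO-8859-1 "$1" |
        # LC_ALL=C makes awk treat the Latin-1 bytes as plain bytes.
        # Any existing trailing CR is removed first, so lines that already
        # end in CR-LF are not doubled.
        LC_ALL=C awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

Usage: `utf8_to_latin1_crlf in.txt > out.txt`. Unlike recode, this writes to standard output and leaves the input file as it was.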
On Windows with PowerShell (Jay Bazuzi):
PS C:\> gc -en utf8 in.txt | Out-File -en ascii out.txt
(No ISO-8859-15 support though; it says that supported charsets are unicode, utf7, utf8, utf32, ascii, bigendianunicode, default, and oem.)