How can I convert a string like "Žvaigždės aukštybėj užges" or "äüöÖÜÄ" to "Zvaigzdes aukstybej uzges" or "auoOUA", respectively, using Bash?

Basically, I just want to replace every character that isn't in the basic Latin alphabet with its closest plain-ASCII equivalent.


12/29/2009 2:56:49 PM

Accepted Answer

Depending on the iconv implementation installed on your machine, you can try piping your strings through

iconv -f utf-8 -t ascii//translit

(replace utf-8 with your actual input encoding if it's something else)
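
To make the piping concrete, here is a minimal sketch. It assumes the input is UTF-8 and that your iconv understands the //TRANSLIT suffix (glibc and GNU libiconv do). The function name to_ascii is made up for illustration. The exact replacements depend on the iconv implementation and the current locale, so on some systems you may get ? or other approximations instead of the plain letter.

```shell
#!/usr/bin/env bash
# to_ascii is a hypothetical helper name, not part of iconv.
# It reads UTF-8 text on stdin and writes an ASCII approximation to stdout.
to_ascii() {
    iconv -f UTF-8 -t ASCII//TRANSLIT
}

# Pipe a string held in a variable through the helper.
text='Žvaigždės aukštybėj užges'
printf '%s\n' "$text" | to_ascii
```

If your input isn't UTF-8, iconv -l lists the encoding names your iconv knows about.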

12/29/2009 3:04:08 PM

You might be able to use iconv.

For example, suppose the file testutf8.txt is encoded as UTF-8 and contains the string:

Žvaigždės aukštybėj užges or äüöÖÜÄ

Running the command:

iconv -f UTF8 -t US-ASCII//TRANSLIT testutf8.txt

results in:

Zvaigzdes aukstybej uzges or auoOUA
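
If you want to try this end to end, the following is a rough sketch under a few assumptions: it creates the test file in a scratch directory made with mktemp, saves the result to a file I've called testascii.txt (a made-up name), and checks iconv's exit status, since iconv may fail or substitute ? where it has no transliteration for a character.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# Create the UTF-8 input file from the example above.
printf '%s\n' 'Žvaigždės aukštybėj užges or äüöÖÜÄ' > "$workdir/testutf8.txt"

# Transliterate the file and save the result.
# testascii.txt is a hypothetical output name chosen for this sketch.
if iconv -f UTF-8 -t US-ASCII//TRANSLIT "$workdir/testutf8.txt" > "$workdir/testascii.txt"; then
    cat "$workdir/testascii.txt"
else
    echo "iconv could not convert the file cleanly" >&2
fi
```

Redirecting to a separate file rather than back onto the input matters here: writing to testutf8.txt in the same command would truncate it before iconv reads it.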

