Toggling string case in linux bash

It’s quite an academic task, but anyway useful sometimes. I’ve collected different ways to do it in terminal in linux. Some of them work with UTF-8 characters (some it will toggle case for “й”, “ё” and so on. It will not in general handle special ligatures, such as “ß”” and “fi”.)

Ways are: sed, perl, python, awk, tr, bash, dd.

Using sed

Works with UTF-8 characters.

It is quite straightforward and allows to add custom rules easily. For example, I’ve added special handling for ligatures “ß”” and “fi”. I should note that conversion SS -> ß is not correct in general. So you may want to remove it.

$ echo ''Wie hйЮёßen Siefi тест'' | sed ''s/.*/\U&/;s/ß/SS/g;s/fi/FI/g''
WIE HЙЮЁSSEN SIEFI ТЕСТ

$ echo ''WIE HЙЮЁSSEN SIEFI ТЕСТ'' | sed ''s/ß/SS/g;s/fi/FI/g;s/.*/\L&/''
wie hйюёssen siefi тест

Using perl

Doesn’t work with UTF-8 characters.

I’m not a perl-ninja, may be there is a more efficient way. But it works.

$ echo ''Wie hйЮёßen Siefi тест'' | perl -ne ''print uc($_)''
WIE HйЮёßEN SIEfi тест

$ echo ''WIE HЙЮЁSSEN SIEFI ТЕСТ'' | perl -ne ''print lc($_)''
wie hЙЮЁssen siefi ТЕСТ

Using python

Doesn’t work with UTF-8 characters.

Python nowadays sometimes said to be replacement for perl. It can not convert Cyrillic letters (UTF-8) too.

$ echo ''Wie hйЮёßen Siefi тест'' | python -c "import sys; [sys.stdout.write(arg.upper()) for arg in raw_input()]; print "\n""
WIE HйЮёßEN SIEfi тест

$ echo ''WIE HЙЮЁSSEN SIEFI ТЕСТ'' | python -c "import sys; [sys.stdout.write(arg.lower()) for arg in raw_input()]; print "\n""
wie hЙЮЁssen siefi ТЕСТ

Using awk

Doesn’t work with UTF-8 characters in mawk, works with UTF-8 characters in gawk.

Default awk in Ubuntu 12.04 is mawk. To get UTF-8 support you have to install gawk and use it.

$ echo ''Wie hйЮёßen Siefi тест'' | gawk ''{for (i=1; i<=NF; i++) printf toupper($i)" "} END {print ""}''
WIE HЙЮЁßEN SIEfi ТЕСТ

$ echo ''WIE HЙЮЁSSEN SIEFI ТЕСТ'' | gawk ''{for (i=1; i<=NF; i++) printf tolower($i)" "} END {print ""}''
wie hйюёssen siefi тест

Using tr

Doesn’t work with UTF-8 characters.

It works with current locale. But I work in us locale and my native language is Russian.

It is the easiest way I believe. It also fits the purpose of tr — to translate and delete characters. It is possible to add custom rules such as “tr ‘ё’ ‘Ё’”, but it caused new strange symbols to appear in the output.

$ echo ''Wie hйЮёßen Siefi тест'' | tr ''[:lower:]'' ''[:upper:]''
WIE HйЮёßEN SIEfi тест

$ echo ''WIE HЙЮЁSSEN SIEFI ТЕСТ'' | tr ''[:upper:]'' ''[:lower:]''
wie hЙЮЁssen siefi ТЕСТ

Using bash

Doesn’t work with UTF-8 characters.

Warning! It’s weird way for converting strings, but a good way to convert variables in bash scripts.

$ export a=''Wie hйЮёßen Siefi тест'' ; echo ${a^^}
WIE HйЮёßEN SIEfi тест

$ export a=''WIE HЙЮЁSSEN SIEFI ТЕСТ'' ; echo ${a,,}
wie hЙЮЁssen siefi ТЕСТ

Using dd

Doesn’t work with UTF-8 characters.

The bad news it outputs more information than only toggled string. Just for collection.

$ echo ''Wie hйЮёßen Siefi тест'' | dd conv=ucase
WIE HйЮёßEN SIEfi тест
0+1 records in
0+1 records out
32 bytes (32 B) copied, 0.000124458 s, 257 kB/s

$ echo ''WIE HЙЮЁSSEN SIEFI ТЕСТ'' | dd conv=lcase
wie hЙЮЁssen siefi ТЕСТ
0+1 records in
0+1 records out
31 bytes (31 B) copied, 7.913e-05 s, 392 kB/s

Using php

Doesn’t work with UTF-8 characters.

PHP can be used as scripting language for general purposes with php-cli.

$ echo ''Wie hйЮёßen Siefi тест'' | php -r "print strtoupper(fgets(STDIN));"
WIE HйЮёßEN SIEfi тест

$ echo ''WIE HЙЮЁSSEN SIEFI ТЕСТ'' | php -r "print strtolower(fgets(STDIN));"
wie hЙЮЁssen siefi ТЕСТ