Script to fix filenames in the wrong encoding
If you have files with Cyrillic filenames (e.g. день) and pack them into a ZIP archive on Windows, then unpack the archive on a Mac with the standard Archive Utility, the filenames often come out in the wrong encoding. For example: бвгѓ•≠м
Here is a bash script that renames them to the correct ones:
function rename() {
    tr '†°Ґ£§•с¶І®©™Ђђ≠а-р' 'а-еёж-нр-яЁ' <<< "$1" | sed $'s/Г\xcc\x81/о/g;s/у\xcc\x81/п/g;s/ш\xcc\x86/щ/g'
}
function renamefile() {
    local new="$(rename "$2")"
    if [[ "$2" != "$new" ]]; then
        mv "$1/$2" "$1/$new"
        echo "$new"
    fi
}
function scan() {
    ls -1 "$1" | while read file; do
        if [ -d "$1/$file" ]; then
            scan "$1/$file"
        fi
        renamefile "$1" "$file"
    done
}
scan "${1-.}"
Usage:
<script> <dir_with_files_with_wrong_filenames>
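Because the script renames files in place with a plain mv, which overwrites an existing file that already has the target name, a preview mode may be worth having before a real run. The following is only a sketch: safe_move and the DRY_RUN variable are hypothetical names that are not part of the script above, and the function expects the caller to pass in the corrected name.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script):
# rename "$dir/$old" to "$dir/$new", but
#   - only print the planned rename when DRY_RUN=1 (assumed variable name),
#   - never overwrite a file that already exists under the new name.
safe_move() {
    local dir="$1" old="$2" new="$3"
    if [ "$old" = "$new" ]; then
        return 0    # name is unchanged, nothing to do
    fi
    if [ -e "$dir/$new" ]; then
        echo "skip (target exists): $dir/$new" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "would rename: $dir/$old -> $dir/$new"
    else
        mv -- "$dir/$old" "$dir/$new" && echo "$new"
    fi
}
```

Inside renamefile, the mv and echo lines could then be replaced by a call such as safe_move "$1" "$2" "$new". The existence check and the mv are two separate steps, so this is not safe against other processes creating files at the same time.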
However, some users complained:
You can't run it twice - the names will be corrupted again.
I threw the script into the Downloads directory, launched it, but for some reason it started renaming from the root directory instead - and corrupted filenames EVERYWHERE.
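The first complaint follows from the mapping itself: the а-р to р-я part of the tr call also matches letters in names that are already correct, so a second run shifts them again. One possible guard, as a heuristic sketch only, is to touch a name only if it contains a symbol that the wrong decoding produces and that is not a letter. The function name looks_garbled is hypothetical, and the marker list is an assumption: it deliberately leaves out Ґ, І, Ђ and ђ (real letters in some Cyrillic alphabets) and the combining marks.

```bash
#!/usr/bin/env bash
# Heuristic, not a guarantee: report a name as garbled only if it contains
# one of the non-letter symbols from the mis-decoded set.
# Known limits (assumptions of this sketch):
#   - a garbled name that consists only of letters is NOT detected;
#   - a correct name that really contains one of these symbols IS flagged.
looks_garbled() {
    local name="$1" marker
    for marker in '†' '°' '£' '§' '•' '¶' '®' '©' '™' '≠'; do
        case $name in
            *"$marker"*) return 0 ;;   # marker found: probably garbled
        esac
    done
    return 1                           # no marker: leave the name alone
}
```

In renamefile this could be used as an early exit, for example looks_garbled "$2" || return 0, so that names without any marker are skipped on every run.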
I then replaced
scan "${1-.}"
with
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" &> /dev/null && pwd)
scan "${1-${SCRIPT_DIR}}"
But I'm not sure whether this really fixes the second issue, or whether the script is generally safe enough. Could someone do a thorough safety review?
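For context on the second issue: "${1-.}" falls back to . only when no argument is given at all, in which case the script works on whatever the current working directory happens to be, and an empty argument ("") is passed through unchanged, so that paths built as "$1/$file" would start at /. Whether either of these is what the user ran into is not known. A stricter alternative is to refuse to guess a directory at all. The sketch below assumes a hypothetical helper named resolve_target, and its refusal to work on / or on $HOME itself is an assumed policy, not something the original script does.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the absolute path of the directory to scan,
# or fail with a message. It never falls back to a default directory.
resolve_target() {
    local dir="${1:-}" abs
    if [ -z "$dir" ]; then
        echo "usage: <script> <dir_with_files_with_wrong_filenames>" >&2
        return 2
    fi
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Resolve relative paths and symlinks inside the command substitution,
    # so the caller's working directory is left untouched.
    abs=$(CDPATH= cd -- "$dir" && pwd -P) || return 1
    # Assumed policy: never operate on the filesystem root or on $HOME itself.
    if [ "$abs" = "/" ] || [ "$abs" = "${HOME:-}" ]; then
        echo "refusing to scan $abs" >&2
        return 1
    fi
    printf '%s\n' "$abs"
}
```

The last line of the script could then become something like target=$(resolve_target "${1-}") || exit, followed by scan "$target", so that running the script without an argument prints the usage line instead of renaming anything.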