Script to fix filenames in the wrong encoding
If you have files with Cyrillic filenames (e.g. день) and pack them as a ZIP archive on Windows, and then unpack this archive on a Mac using the standard archive utility, the filenames often come out in the wrong encoding. For example: бвгѓ•≠м
Here is a bash script that renames them to the correct ones:
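# Map each garbled character back to the intended Cyrillic letter.
# The sed step handles letters that arrive as a base letter plus a combining mark.
function rename() {
    tr '†°Ґ£§•с¶І®©™Ђђ≠а-р' 'а-еёж-нр-яЁ' <<< "$1" | sed $'s/Г\xcc\x81/о/g;s/у\xcc\x81/п/g;s/ш\xcc\x86/щ/g'
}
# Rename file $2 inside directory $1 if the mapping changes its name.
function renamefile() {
    local new="$(rename "$2")"
    if [[ "$2" != "$new" ]]; then
        mv "$1/$2" "$1/$new"
        echo "$new"
    fi
}
# Walk directory $1 recursively, renaming children before their parent.
function scan() {
    ls -1 "$1" | while read file; do
        if [ -d "$1/$file" ]; then
            scan "$1/$file"
        fi
        renamefile "$1" "$file"
    done
}
scan "${1-.}"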
Usage:
<script> <dir_with_files_with_wrong_filenames>
However, some users complained:
You can't run it twice - the names will be corrupted again.
I threw the script into the Downloads directory, launched it, but for some reason it started renaming from the root directory instead - and corrupted filenames EVERYWHERE.
I then replaced
scan "${1-.}"
with
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" &> /dev/null && pwd)
scan "${1-${SCRIPT_DIR}}"
But I'm not sure this really fixes the second issue, or that the script is generally safe enough. Could someone do a thorough safety review?
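To make the two complaints concrete, here is a minimal sketch of the kind of guards a review might suggest. It is not part of the script above, and both helper names (looks_garbled, resolve_target) are hypothetical. The first is only a heuristic: it assumes a garbled name contains at least one symbol that is unlikely in a correct Russian filename, so a garbled name made up purely of letters would slip through. The second assumes it is acceptable to demand an explicit directory argument instead of falling back to a default, and to refuse / and the home directory outright.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, sketched for illustration only.

# Heuristic guard for the "run it twice" complaint: succeed only if the name
# contains a symbol that the tr mapping treats as garbled input.
# Assumption: these symbols do not occur in correctly encoded names.
looks_garbled() {
    case "$1" in
        *'†'*|*'°'*|*'£'*|*'§'*|*'•'*|*'¶'*|*'®'*|*'©'*|*'™'*|*'≠'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Guard for the "started from the root directory" complaint: require exactly
# one argument, resolve it to a physical absolute path, and refuse obviously
# dangerous targets instead of guessing a default.
resolve_target() {
    if [ "$#" -ne 1 ] || [ -z "$1" ]; then
        echo "usage: <script> <dir_with_files_with_wrong_filenames>" >&2
        return 2
    fi
    local dir
    # CDPATH is cleared so that cd cannot print an unrelated path.
    dir=$(CDPATH= cd -- "$1" 2>/dev/null && pwd -P) || {
        echo "not a directory: $1" >&2
        return 1
    }
    if [ "$dir" = "/" ] || [ "$dir" = "${HOME%/}" ]; then
        echo "refusing to scan $dir" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}
```

With these in place, the last line of the script could become `dir=$(resolve_target "$@") || exit` followed by `scan "$dir"`, and renamefile could start with `looks_garbled "$2" || return 0` so that already-correct names are left alone.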