Here are some problems I noticed in your article:
1) "These restrictions only apply to Windows - Linux, for example, allows
use of " * : < > ? \ / | even in NTFS." -- is "/" supposed to be in that list?
2) You state that changing IFS and banning newlines and tabs in filenames would make things like 'cat $file' safer, but you should also state that the shell glob characters (namely *?[]) would need to be banned as well, because an unquoted expansion is still subject to pathname expansion.
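A minimal sketch of the globbing point, assuming a POSIX-like sh and a scratch directory created with mktemp -d (which is widely available but not itself required by POSIX). The file names a.txt and b.txt and the variable names are made up for illustration.

```shell
# Sketch: a restricted IFS does not stop pathname expansion of an unquoted variable.
old_ifs=$IFS
start_dir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch a.txt b.txt

file='*.txt'              # a filename-like value containing a glob character
IFS=$(printf '\n\t')      # newline and tab only; spaces no longer split words

set -- $file              # unquoted: the shell still expands the glob
unquoted_count=$#

set -- "$file"            # quoted: the value is passed through unchanged
quoted_count=$#

echo "unquoted: $unquoted_count, quoted: $quoted_count"

# Restore the previous state and clean up the scratch directory.
IFS=$old_ifs
cd "$start_dir" || exit 1
rm -rf "$tmpdir"
```

With only those two files present, the unquoted expansion turns into two arguments while the quoted one stays a single argument.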
3) You state (or at least imply) that there is no way to reliably use filenames from find, but there is a POSIX-compliant and known portable method:
find . -type f -exec somecommand {} \;
or for more complex cases:
find . -type f -exec sh -c 'if true; then somecommand "$1"; fi' -- {} \;
For xargs fans, this works for all files except those with newlines in their names:
find . -type f | sed -e 's/./\\&/g' | xargs somecommand
Backslash-escaping every character like this is a feature of xargs and is specified by POSIX. It avoids the various quoting problems with xargs that you don't mention.
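Here is a rough sketch of both approaches run against a few awkward names, assuming a POSIX-like sh with find, sed, xargs and mktemp -d available. The file names are invented for the test, and a small sh -c script stands in for somecommand.

```shell
# Sketch: count files with awkward names via find -exec and via escaped xargs input.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain" "$tmpdir/with space" "$tmpdir/it's \"quoted\""

# find passes each name to sh as one argument ($1; the "--" fills $0),
# so the name is never split or reinterpreted.
exec_count=$(find "$tmpdir" -type f -exec sh -c 'printf "%s\n" "$1"' -- {} \; | wc -l | tr -d ' ')

# Putting a backslash before every character keeps xargs from treating
# blanks and quotes specially; names containing newlines would still break.
xargs_count=$(find "$tmpdir" -type f | sed -e 's/./\\&/g' | xargs sh -c 'echo "$#"' --)

echo "find -exec saw $exec_count files, xargs saw $xargs_count files"
rm -rf "$tmpdir"
```

Both counts should agree with the number of files created, which is the point: neither the space nor the quote characters cause a name to be split.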
4) Your setting of IFS to a value of tab and newline is overly complicated. Simply use IFS=`printf \\n\\t`. It is only trailing newlines that are removed. If the different behaviour this causes with "$*" is not desired, one can set IFS=`printf \\t\\n\\t`. I know of no tool or POSIX restriction that says characters may not be repeated in IFS.
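A small sketch of the IFS setting, assuming a POSIX-like sh. It uses the single-quoted printf format and $(...) rather than backquotes, which is my own choice here because backslash handling inside backquotes is easy to get wrong. The variable names are hypothetical.

```shell
# Sketch: set IFS to newline+tab and check how splitting and "$*" behave.
old_ifs=$IFS

# Command substitution strips only trailing newlines, so the tab goes last.
IFS=$(printf '\n\t')
ifs_len=${#IFS}            # number of characters now in IFS

names='one two
three'
set -- $names              # splits on the newline only, not on the space
field_count=$#
first=$1

# Variant with a leading tab: "$*" joins with the first IFS character.
IFS=$(printf '\t\n\t')
set -- a b
joined="$*"

IFS=$old_ifs
echo "IFS length: $ifs_len, fields: $field_count, first field: $first"
```

The leading-tab variant only matters if you care which character "$*" uses to join arguments; splitting behaves the same either way.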
Otherwise great article! It really would be so nice to use line-separated commands in `` and not have to worry about things breaking. And although most of the thoughts expressed here are well known to me, the idea of getting the kernel to check the validity of UTF-8 filenames is fantastic!
Posted Mar 28, 2009 19:50 UTC (Sat)
by dwheeler (guest, #1216)
Thanks for your comments! In particular, you're absolutely right about swapping the order of \t and \n in IFS - that makes it MUCH simpler. I prefer IFS=`printf '\n\t'` because then it's immediately obvious that \n and \t are the new values. I've put that into the document, with credit.
Wheeler: Fixing Unix/Linux/POSIX Filenames