changing your internal field separator
Wow; that sounds pretty exciting, doesn’t it?⌗
If you do much shell scripting (I mostly use bash), at some point you’ll come across “IFS” and learn to appreciate how useful manipulating it can be. Let’s look at a script showing that, and then dive into the details.
#!/usr/bin/env bash
## test function
ifs_test()
{
  cd ~/tmp || exit 1
  ## unquoted command substitution on purpose: the shell splits it on IFS
  for entry in $(ls -lah)
  do
    echo "$entry"
  done
}
## let’s see how it behaves
ifs_test
## let’s modify IFS and run it again
oIFS=$IFS
IFS=$'\n'
ifs_test
## let’s be anal and restore IFS explicitly
IFS=$oIFS
exit
So what’s happening here and how is it relevant?⌗
A number of times I’ve had to write scripts to automate the processing of files, which can mean any combination of renaming them, moving them around, or checking whether they are valid based on some set of rules. One thing that often bites me is file names with unexpected spaces. You may say, “but people know how to name the files correctly, there’s documentation for it even.” I’m sure you’re right, but you’re not accounting for the fact that these people are human. Humans make mistakes. You can get clever and pass results through another Unix tool, but I prefer to avoid extra dependencies when the shell I’m working in can handle it cleanly and efficiently.
As an example, look at the bash function ifs_test in our script. It’s just a trivial for loop that echoes the output of the ls -lah command. The first time it is called, this is the result we get:
[chadhs@mac scratch]$ ./ifstest.sh
drwxr-xr-x
4
chadhs
staff
136B
Mar
24
19:36
.
drwxr-xr-x+
39
chadhs
staff
1.3K
Mar
24
19:36
..
-rw-r--r--
1
chadhs
staff
576B
Jul
31
2012
pckeyboardhack.tmp
-rw-r--r--
1
chadhs
staff
0B
Mar
24
17:17
so-secret-filename-is-classified.md
Ouch. That was super ugly; how do we fix it?⌗
Before explaining that, let’s see how the output looks the second time the function ifs_test is called.
drwxr-xr-x 4 chadhs staff 136B Mar 24 19:36 .
drwxr-xr-x+ 39 chadhs staff 1.3K Mar 24 19:36 ..
-rw-r--r-- 1 chadhs staff 576B Jul 31 2012 pckeyboardhack.tmp
-rw-r--r-- 1 chadhs staff 0B Mar 24 17:17 so-secret-filename…
Ahh… that’s much better.⌗
So to fully understand what is going on each time the function ifs_test is called, we need to understand what the variable IFS is. IFS stands for Internal Field Separator, and by default this internal shell variable treats spaces, tabs, and newlines as field delimiters. Since each line of output from the ls -lah command contains multiple spaces, the shell treats every run of whitespace as a delimiter (compare the first result to the second), so every column becomes a separate item in the list that the for loop iterates over.
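To see the splitting in isolation, here is a minimal sketch that counts how many fields the shell produces from the same string under two IFS settings. The sample string and the count_fields helper are made up for illustration and are not part of the original script; the sketch also assumes the shell starts with the default IFS.

```bash
#!/usr/bin/env bash
# Hypothetical sample: two "lines" of output, each containing a space.
sample=$'first file.txt\nsecond file.txt'

# Hypothetical helper: count the fields the shell produces from its argument.
count_fields() {
  local count=0 field
  for field in $1; do   # unquoted on purpose so the shell splits on IFS
    count=$((count + 1))
  done
  echo "$count"
}

# Default IFS: spaces, tabs, and newlines all separate fields.
default_count=$(count_fields "$sample")

# IFS set to a newline, but only inside this command substitution's subshell.
newline_count=$(IFS=$'\n'; count_fields "$sample")

echo "default IFS: $default_count fields"
echo "newline IFS: $newline_count fields"
```

With the default IFS the string falls apart into four fields; with a newline-only IFS it stays as two, one per line.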
Now for the fix⌗
You’ll notice in our example that before the function ifs_test is called the second time, we store the current value of IFS in a variable called oIFS, and then change IFS so that only a newline is treated as a field separator. Once we no longer need this behavior, we set IFS back to its original value. [1]
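If you’d rather not save and restore by hand, one alternative (not what the script above does) is to declare IFS as a local variable inside a function, so the change disappears when the function returns. The sketch below assumes bash; the list_entries helper, the entries array, and the demo file names are hypothetical. Note that an unquoted expansion is still subject to globbing, so names containing characters like * would need extra care.

```bash
#!/usr/bin/env bash
# Hypothetical helper: collect the names in a directory into the global
# array "entries", one element per line of ls output.
list_entries() {
  local IFS=$'\n'            # only newlines split words inside this function
  entries=( $(ls -- "$1") )  # unquoted on purpose so the shell splits on IFS
}

# Throwaway directory with one file name that contains a space.
demo_dir=$(mktemp -d)
touch "$demo_dir/my report.txt" "$demo_dir/notes.txt"

ifs_before=$IFS              # remembered only to show that IFS is untouched
list_entries "$demo_dir"

printf 'found %d entries\n' "${#entries[@]}"
printf '  %s\n' "${entries[@]}"

if [ "$IFS" = "$ifs_before" ]; then
  echo "IFS outside the function is unchanged"
fi

rm -r -- "$demo_dir"
```

The file name with a space comes back as a single array element, and the caller’s IFS never changes, so there is nothing to restore afterwards.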
Wrapping it all up⌗
The reason I spent a little time writing this up is simple: learning how to alter IFS to control how the shell splits results saved me some pain in a recent project. My scripts continued to run and do their job, but I’d get a lot of little errors when someone didn’t follow the documentation, and trusting humans is just bad for business. ;-)
[1] In theory we can just unset IFS, which makes the shell fall back to its default splitting behavior, but I like to restore it manually and explicitly.