May 31, 2020, by Joe Lafiosca

Q&A: Shell Command Parsing


A software development student using bash as her shell was having trouble changing into a specific directory at the command line. She had listed her current directory and seen something like this:

~/project$ ls
'Advanced css'     some.workspace     README.md

She then tried to enter the Advanced css directory with this command:

~/project$ cd Advanced css
bash: cd: too many arguments

I showed her how to use bash's tab completion feature, and I mentioned that the problem with the command above was that the space in the directory name needed to be escaped. She asked what that meant, and I shared the following short lesson.

Your shell (e.g. bash) is just another program which provides an interface between you and the operating system. Its main mode is an input loop. You type a command, and the shell interprets and runs the command. When you type that command, the shell receives it as a string of input from you, and it has to parse that string to figure out what you're trying to do.
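
To make that input loop concrete, here is a minimal sketch written as a shell function. The name toy_shell is made up for illustration, and the loop only reports what it read rather than running anything; a real shell's parser is far more involved.

```shell
# toy_shell is a hypothetical name for this sketch, not a real command.
# It reads one line at a time, splits it on whitespace, and reports
# which word it would treat as the command.
toy_shell() {
  while IFS= read -r line; do
    set -f            # turn off wildcard expansion while we split
    set -- $line      # unquoted on purpose: split the line on whitespace
    set +f
    [ "$#" -eq 0 ] && continue   # skip blank lines
    echo "command: $1, arguments after it: $(( $# - 1 ))"
  done
}

# Feed the loop a single line of input, as if it had been typed.
printf 'cd Advanced css\n' | toy_shell
```

Given that one line of input, the loop should report cd as the command, with two arguments after it.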

If you type in cd Advanced css, the shell receives that entire string and splits it on the whitespace, yielding an array like:

[
  'cd', // argument 0
  'Advanced', // argument 1
  'css', // argument 2
]

The shell assumes the first item of that array, in the 0-index position, is the program or built-in command that you are trying to run, and the rest of the items are arguments of that command. (Technically the command is also an argument: argument 0.)

Now, on her system the cd command expected a single argument, the directory to change into. But bash looked at the list and saw Advanced and css as two separate arguments. Both were passed to cd, and it didn't know what to do with them; hence it complained that there were too many arguments. When you want to pass a single argument that contains spaces in it, like Advanced css, you need to communicate that to the shell somehow.
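
One way to see what a command actually receives is to swap cd for a small helper that prints its arguments. This is only a sketch; show_args is a hypothetical function name, not something bash provides.

```shell
# show_args is a hypothetical helper: it prints each argument it
# receives on its own line, prefixed with its position.
show_args() {
  i=1
  for arg in "$@"; do
    echo "argument $i: $arg"
    i=$((i + 1))
  done
}

# The unescaped space makes the shell hand over two separate arguments.
show_args Advanced css
```

Run as written, this prints two lines, one for Advanced and one for css, which is the same split that cd was given.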

In bash there are many ways of accomplishing this. Here are five of them:

  1. You can put a backslash before each space: cd Advanced\ css
  2. You can wrap the entire string in single quotes: cd 'Advanced css'
  3. You can wrap the entire string in double quotes: cd "Advanced css"
  4. You can use tab completion to let the shell do it automatically: cd Adv<tab>
  5. You can use a glob pattern instead of the full string: cd Adv*
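
The first three approaches can be tried safely in a throwaway directory. The sketch below assumes mktemp is available; demo_escaping and scratch are names invented for this example. Tab completion is interactive, so it is left out.

```shell
# demo_escaping is a hypothetical name for this sketch. It creates a
# scratch directory containing "Advanced css" and enters it three ways.
demo_escaping() {
  scratch=$(mktemp -d)
  mkdir "$scratch/Advanced css"

  # Each attempt runs in a subshell, so the caller's directory is unchanged.
  (cd "$scratch" && cd Advanced\ css && echo "backslash: $(basename "$PWD")")
  (cd "$scratch" && cd 'Advanced css' && echo "single quotes: $(basename "$PWD")")
  (cd "$scratch" && cd "Advanced css" && echo "double quotes: $(basename "$PWD")")

  rm -r "$scratch"   # clean up the scratch directory
}

demo_escaping
```

Each line of output should end with Advanced css, showing that all three forms deliver the name to cd as a single argument.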

Approaches 1 through 3 above are examples of "escaping" the spaces. They are called this because the special characters (backslash, single quote, and double quote) tell bash's parser to temporarily escape from its normal parsing mode, which splits arguments on whitespace. Approaches 4 and 5 instead have bash complete the name for you, given enough information to identify the directory you want. With tab completion, bash fills in the rest of the name and inserts the escaping automatically. With the wildcard in approach 5, bash replaces the pattern with the matching name only after the line has already been split into arguments, so the space in the result does not cause a further split. (It's important to note that these examples would not work as given if there were another file or directory present whose name also started with Adv.)
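
The caveat about a second name starting with Adv can be checked with a short sketch. The names count_args, demo_glob, and the extra directory Advanced HTML are all invented for this example, and mktemp is assumed to be available.

```shell
# count_args is a hypothetical helper that prints how many arguments it got.
count_args() { echo "$#"; }

# demo_glob is a hypothetical name for this sketch.
demo_glob() {
  scratch=$(mktemp -d)
  mkdir "$scratch/Advanced css"

  # One directory matches Adv*, so the pattern becomes a single argument
  # even though the matching name contains a space.
  (cd "$scratch" && count_args Adv*)

  # Add a second match (an invented name) and the same pattern now
  # expands to two arguments, which cd would reject.
  mkdir "$scratch/Advanced HTML"
  (cd "$scratch" && count_args Adv*)

  rm -r "$scratch"   # clean up the scratch directory
}

demo_glob
```

The first count is 1 and the second is 2, which is why cd Adv* only works while the pattern matches exactly one name.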

When using any of these methods, the argument list bash parses looks like this instead:

[
  'cd', // argument 0
  'Advanced css', // argument 1
]

It can then verify that there's a directory named Advanced css and change into it.

Part of my purpose in explaining the above is to try to demystify the shell by breaking it into components that are familiar even to new programmers. As I stated at the beginning, it is simply a program that accepts inputs and does things with them. It's essentially just the program that your operating system automatically starts for you when you open up a terminal session. You can write your own simple shell, and perhaps we can explore that together in future articles.