September 3, 2018

The Smallest Bash Script in the Universe

Welcome back 👋🏽

As usual, if you find my ramblings interesting and want to read more, or think they're rubbish and want to tell me about it, remember to scroll down to leave a comment before, during, or after your read.

tl;dr

If you're here to see the smallest Bash script, look no further.

Here it is:

#!/bin/bash

Read on if you're wondering why it's not just an empty file, like 10-years-ago-me thought.

The Smallest Kinda Sorta Bash Script in the Universe

I'll let you in on a little secret.

Back in the day I had this misconception that a Bash script was any text file with execute permission. In other words, I thought that shell_script in the following example would always be executed using bash as the interpreter:

# Create an empty file
$ : >shell_script

# Mark that file as executable
$ chmod u+x shell_script

# Run that file
$ ./shell_script
$

To explain why shell_script is only kinda sorta the smallest Bash script, we gotta go a little deeper and learn how programs are executed on Linux.

fork and execve

Most programs running on Linux run at least some code written in the C programming language.

If they're not directly written in C, they're very likely either running code from the C runtime or are running in an interpreter that uses the C runtime. And BTW, for the purposes of this post you can think of the C runtime as a bunch of functions that have already been written for C programmers. This code sits in a shared library, ready to run.

The two functions we'll be focusing on are ones you wouldn't ever want to implement yourself as they interact with the kernel to do fun and exciting things that you don't want to get wrong. Whoever had to review my implementations of these functions in that university operating systems course can attest to that. I'm sorry for subjecting you to that, professor I Forgot What your Name Is.

Let's follow a C program named shelly as it runs on a Linux box and executes another program named shell_script. Whenever shelly wants to run another program, it needs to run two functions from the C runtime: fork() and execve().

[Diagram: shelly running another program with fork() and execve()]

Process Splitting with fork

fork() tells the kernel to clone the currently running process into two processes. It's sort of like how cells split in a petri dish. In Linux, however, the resulting pair of processes forms a hierarchy with the original process becoming the parent of the new process. Once it's done, the first instruction of either process is a return from fork().

So after fork() completes, there will be two shelly processes.
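
You can watch a fork happen without writing any C. Here's a minimal sketch, assuming a Bash-like shell: Bash creates a subshell (the part in parentheses) by forking itself, so the child starts life as a copy of its parent. The variable name greeting is made up for the demo.

```shell
greeting="hello"

(
  # This subshell is a forked copy of the shell, so it starts out
  # with the same variables as its parent...
  echo "child sees: $greeting"

  # ...but it has its own memory, so this change stays in the child.
  greeting="changed in the child"
)

# Back in the parent, the original value is untouched.
echo "parent still has: $greeting"
```

Both lines report hello, because the child only ever changed its own copy.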

[Diagram: shelly after fork(), now a parent process and a child process]

Process Replacement with execve

execve() is a little more interesting.

It asks the kernel to replace the program running in the current process with some other program. execve() quite literally obliterates the memory image of the running program and replaces it with a new memory image for the desired program. When it's done, execution continues at the first instruction of that new program. So in our example above, after execve() completes, there will be one process running the code for the shelly program and one process running the code for the shell_script program.
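
The shell's exec builtin is a handy way to see this replacement in action, since it runs a program in place of the shell instead of forking first. A rough sketch, assuming sh and an external echo program can be found on your PATH:

```shell
# The inner sh prints a line, then replaces itself with the echo program.
# The final echo never runs: by then, the sh that would have run it is gone.
output=$(sh -c 'echo "before exec"; exec echo "now running echo"; echo "never printed"')

echo "$output"
```

Only the first two messages show up. Nothing after the exec survives the replacement.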

Since no remnants of shelly's code will survive execve(), the call accepts data that the kernel will pass to the new program when it starts. This data is passed in two forms which you may have heard of:

environment variables (a list of key-value pairs)
arguments (an array of strings).

Arguments usually describe what a process should call itself and what precisely it should do. Environment variables usually describe the system that a process is running in or other software systems that a process may need to work with.

Since the environment of a process will very likely be the same as that of its children, that set of key-value pairs is usually just copied from the already running program to execve() with maybe a few additions.

Arguments are usually invocation-specific, and so are instead passed to execve() explicitly every time it's run.
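
Here's a small sketch of both kinds of data arriving in a freshly started program. It assumes the env utility and /bin/sh are available. GREETING, shelly_child, and world are made-up values for illustration.

```shell
# env -i starts from an empty environment and adds exactly one variable, GREETING.
# The words after the -c script are arguments: sh exposes the first one as $0
# (the name the script goes by) and the second one as $1.
output=$(env -i GREETING=hello /bin/sh -c 'echo "$GREETING from $0, greeting $1"' shelly_child world)

echo "$output"
# prints: hello from shelly_child, greeting world
```

The environment variable describes the surroundings, while the arguments say who the program is and what it should do this time around.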

[Diagram: execve() replacing the program in the child shelly process]

What does execve actually do?

execve()'s only purpose in life is to ask the kernel to run programs from executable files.

It does this by asking the kernel to identify the type of program in an executable file and to run the appropriate kernel code to load it and set it executing. The kernel has handlers for programs in a handful of different binary formats, but the handler we're interested in is the one for scripts: load_script() in fs/binfmt_script.c.

Scripts Must Begin with #!

Here's the beginning of the real magic of executing a script file:

[Screenshot: kernel source for load_script(), with the #! check highlighted]

The most important code (highlighted) checks the first two characters of the first line of the program file. If these characters are # followed by !, the program is considered a script. It then goes on to parse out the text following the #! into two words: an interpreter and an optional argument to that interpreter.

With these values in hand, the kernel repeats what it was asked to do with a few changes:

Instead of loading and executing the script file, load and execute the interpreter.
Use the optional argument from the #! line as the first argument to the interpreter.
Use the script's name as the second argument.
For the remaining arguments, simply copy them from the arguments execve() was originally given.
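
You can see this argument shuffling for yourself by using a program that just prints its arguments as the interpreter. This is a sketch that assumes /bin/echo exists, as it does on most Linux systems, and that your temporary directory allows executing files. The names show_args and optional-arg are made up for the demo.

```shell
workdir=$(mktemp -d)

# A "script" whose interpreter is echo, with one optional argument.
printf '#!/bin/echo optional-arg\n' > "$workdir/show_args"
chmod u+x "$workdir/show_args"

# The kernel runs: /bin/echo optional-arg <path to show_args> one two
output=$("$workdir/show_args" one two)
echo "$output"

rm -rf "$workdir"
```

The output starts with optional-arg, followed by the path to show_args, followed by one and two, which is exactly the order described in the list above.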

Running the Actual Smallest Bash Script in the Universe

So let's say shelly tried to run execve() on a program with the following contents:

$ cat shell_script
#!/bin/bash
$

When shelly forks and calls execve() on shell_script, the kernel will actually execute /bin/bash, with shell_script as its first argument.
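
If you want to try it, here's a quick sketch that builds the file, measures it, and runs it. It assumes Bash is installed at /bin/bash and that your temporary directory allows executing files. The file name smallest is made up.

```shell
workdir=$(mktemp -d)
script="$workdir/smallest"

# The entire script: an interpreter line and a newline.
printf '#!/bin/bash\n' > "$script"
chmod u+x "$script"

# Count the bytes in the file, then run it and record its exit status.
size=$(wc -c < "$script")
"$script"
status=$?

echo "size: $((size)) bytes, exit status: $status"
# prints: size: 12 bytes, exit status: 0

rm -rf "$workdir"
```

Bash starts, finds nothing to do, and exits successfully.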

The Smallest Bash Script

And that's why the smallest Bash script is:

#!/bin/bash

...and why it's important to always begin shell scripts with #!.

The end.

Um...really?

You may notice that if you list a bunch of commands in a text file and run it in Bash, those commands still execute as if the file were a Bash script:

# Create a file containing two lines of shell
$ cat > shell_script <<COMMANDS
echo Hello
echo world
COMMANDS

# Mark the file as executable
$ chmod u+x shell_script

# Execute the file
$ ./shell_script
Hello
world
$

I haven't been lying to you. Attempts to ask the kernel to execute a text file that doesn't start with #! will always fail. If Bash is doing the executing, though, the shell will do you a solid.

Don't believe me? Take a peek at the source:

[Screenshot: excerpt from Bash's source code for executing commands]

Your script may not always be executed by Bash (e.g. from a cron job, by the backend of a web service, or by a continuous integration system) so it's important to always start scripts with #! followed by the interpreter and an optional argument.

Twenty Two Tabs
