> /dev/null

All you need to be proficient at Bash scripting

Disclaimer

  • This is not intended to be a complete guide

  • I am self-taught in Bash

Plan

  • What is bash?

  • Usage for developers

  • How should I write it?

  • Cheatsheet

  • Scripting advice

  • Pitfalls

  • Bash swiss army knife

  • Portability

What is bash?

  • Free, open source Unix Shell

  • Written by Brian Fox for GNU Project in 1989

  • Replacement for Bourne Shell

  • Largely POSIX compliant (stricter with bash --posix)

  • Bash means Bourne-Again Shell

  • Default version on macOS is 3.2 (2006), for licensing reasons (later versions are GPLv3)

  • Default version on most Linux distributions is 4.4 (2016)

  • Latest version is 5.2.15 (2022)

Usage for developers

  • On CI for builds

  • On developer computers for tooling

  • On staging stacks via SSH for debugging

How should I write it?

Lint away!

 

Shellcheck in editors

ShellCheck is available as a plugin in most editors.

No excuses!

Cheatsheet

A cheatsheet you can come back to at any moment
A longer course from MIT

Process input

For programs that read standard input

sort <   hello.txt          # Sorts hello.txt
sort <<< "${var}"           # Sorts content of variable var
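
A small sketch of the two forms above, feeding the same data to sort from a variable and from a file. The variable fruits and the temporary file are made up for the illustration.

```bash
# Hypothetical data, only for illustration
fruits=$'pear\napple\nfig'

# Here-string: the variable content becomes sort's standard input
sortedFromVar="$(sort <<< "${fruits}")"

# File redirection: same result when the data lives in a file
tmpFile="$(mktemp)"
printf '%s\n' "${fruits}" > "${tmpFile}"
sortedFromFile="$(sort < "${tmpFile}")"
rm -f "${tmpFile}"

printf '%s\n' "${sortedFromVar}"
```

Both variables should hold the same three sorted lines.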

Process output

printf 'hello' >  out.txt          # Writes standard output (hello) to out.txt
printf 'hello' >> out.txt          # Appends standard output (hello) to out.txt
command 2> err.txt                 # Writes standard error to err.txt
command &> all.txt                 # Writes standard output & error to all.txt
command > all.txt 2>&1             # Writes standard output & error to all.txt (POSIX way)
printf 'hello' > /dev/null         # Discards standard output
printf 'hello' 1>&2                # Writes hello to standard error (syntactic sugar: >&2)
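
To see the redirections at work, here is a minimal sketch with a made-up function, speak, that writes one line to each stream. Note that redirections are applied left to right, which is why 2>&1 > /dev/null keeps only standard error.

```bash
# Hypothetical function that writes to both streams
speak() {
  printf 'to stdout\n'
  printf 'to stderr\n' >&2
}

onlyOut="$(speak 2> /dev/null)"      # stderr discarded, stdout captured
onlyErr="$(speak 2>&1 > /dev/null)"  # stderr goes to the capture, then stdout is discarded
both="$(speak 2>&1)"                 # both streams captured

printf '%s\n' "${onlyOut}" "${onlyErr}"
```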

Mixing input & output

printf 'hello' | grep 'toto'      # Pipes standard output (hello) to grep

Variables

name='Toto'                  # /!\ No spaces around = in bash assignments /!\
echo "Hello ${name}"         # Variable is substituted in double quotes, prints 'Hello Toto'
echo 'Hello ${name}'         # Variable is not substituted in single quotes, prints 'Hello ${name}'
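
Quotes also decide how many arguments a command receives. A sketch, assuming the default IFS, with a made-up helper countArgs that prints its argument count:

```bash
name='Toto Junior'   # hypothetical value containing a space

countArgs() { printf '%s' "$#"; }   # helper: prints how many arguments it received

unquoted="$(countArgs ${name})"     # word splitting: 2 arguments
quoted="$(countArgs "${name}")"     # kept whole: 1 argument
literal="$(countArgs '${name}')"    # single quotes: 1 argument, no substitution

printf '%s %s %s\n' "${unquoted}" "${quoted}" "${literal}"
```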

Functions

functionName() {        # Classic syntax for declaration
    scriptName=$0       # $0 is the script name, even inside a function
    firstArgument=$1    # $n is the function's nth argument
    allArgs=("$@")      # "$@" is all the arguments, kept as separate words
    argsNumber=$#       # $# is the number of arguments
}
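
A usage sketch with a made-up function, greetAll. It relies on local and shift, which are not covered above: local keeps a variable inside the function, shift drops the first argument.

```bash
# Hypothetical function: greets every name it receives
greetAll() {
  local greeting="$1"  # first argument of the function
  shift                # drop it, "$@" now holds the remaining arguments
  local person
  for person in "$@"; do
    printf '%s %s\n' "${greeting}" "${person}"
  done
}

greetings="$(greetAll 'Hello' 'Toto' 'Titi Tata')"
printf '%s\n' "${greetings}"
```

Thanks to the quotes around "$@", 'Titi Tata' stays a single argument.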

Return codes

grep toto <<< 'toto'        # Return code is 0 === success
grep toto <<< 'tata'        # Return code is 1. Any code !== 0 is an error
lastReturnCode=$?           # $? contains the return code of the last command
command && { printf 'OK'; } # The block after && executed if command succeeds
command || { printf 'KO'; } # The block after || executed if command fails
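
Functions return codes too: the code of their last command, unless return says otherwise. A sketch with a made-up function, isEven:

```bash
# Hypothetical function: succeeds only for even numbers
isEven() {
  (( $1 % 2 == 0 ))  # (( )) returns 0 when the expression is true, 1 otherwise
}

isEven 4; evenCode=$?

oddCode=0
isEven 3 || oddCode=$?   # || catches the failure and lets us read its code

if isEven 10; then parity='even'; else parity='odd'; fi
printf '%s %s %s\n' "${evenCode}" "${oddCode}" "${parity}"
```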

Share results

# $() retrieves the standard output of a command
greeting="$(echo "Hello $name")"
# Less known, <() exposes the standard output of a command as a file
# The outer command receives a path (e.g. /dev/fd/63) from which the greeting can be read
# The path is only valid for that command, so don't store it in a variable
cat <(echo "Hello $name")
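
A typical use of <() is handing command outputs to a tool that expects file names, such as diff. A sketch with made-up lists; the exact path printed at the end depends on the system.

```bash
listA=$'apple\nfig'    # hypothetical data
listB=$'apple\npear'

# Each <(...) is replaced by a path that diff opens like a regular file
if diff <(printf '%s\n' "${listA}") <(printf '%s\n' "${listB}") > /dev/null; then
  same='yes'
else
  same='no'
fi

# The substituted path itself can be observed
substitutedPath="$(printf '%s' <(true))"
printf '%s %s\n' "${same}" "${substitutedPath}"
```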

Conditionals

Conditionals in bash test the return code of a command.

Return code 0 = true, any other is false

if grep --silent 'toto' <<< 'tata'; then
  : # Executed if grep returns 0
else
  : # Executed if grep returns anything else
fi

[[ "${var}" = 'toto' ]] # Expressions in [[]] return 0 if true, non-zero otherwise
[[ "${var}" = 'toto' || "${var}" = 'tata' ]] # Composite conditionals

# Which means you usually write
if [[ "${var}" == 'toto' ]]; then
  : # Executed if var is toto
fi
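
Since if only looks at a return code, your own functions work as conditions too. A sketch with a made-up helper, isKnownName:

```bash
# Hypothetical helper: succeeds when the value is toto or tata
isKnownName() {
  [[ "$1" == 'toto' || "$1" == 'tata' ]]
}

if isKnownName 'tata'; then known='yes'; else known='no'; fi
if ! isKnownName 'tutu'; then unknown='yes'; else unknown='no'; fi

printf '%s %s\n' "${known}" "${unknown}"
```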

Globbing

In a folder containing

.
├── bar
├── img.png
├── img.jpg
├── img.svg
├── foo1
├── foo2
├── foo3
└── foo99
rm foo?          # Removes foo1, foo2 & foo3
rm foo??         # Removes foo99
rm foo*          # Removes foo1, foo2, foo3 & foo99
rm img.{svg,png} # Removes img.svg & img.png
rm img.*         # Removes img.svg img.jpg & img.png
rm foo{1..2}     # Removes foo1 & foo2
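rm foo{*,99}     # Removes ?
# Braces expand first, to: rm foo* foo99
# So it removes foo1, foo2, foo3 & foo99, then complains about the second foo99!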
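
To preview what a pattern expands to without deleting anything, swap rm for echo. A sketch that rebuilds the folder above in a throw-away directory (sandbox is a made-up name):

```bash
sandbox="$(mktemp -d)"   # throw-away folder, nothing real is touched
touch "${sandbox}"/{bar,img.png,img.jpg,img.svg,foo1,foo2,foo3,foo99}

questionMark="$(cd "${sandbox}" && echo foo?)"       # foo1 foo2 foo3
braceAndStar="$(cd "${sandbox}" && echo foo{*,99})"  # foo1 foo2 foo3 foo99 foo99
images="$(cd "${sandbox}" && echo img.{svg,png})"    # img.svg img.png

rm -rf "${sandbox}"
printf '%s\n' "${questionMark}" "${braceAndStar}" "${images}"
```

Braces are expanded before globs, which is why foo99 shows up twice in the second result.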

Regex

[[ 'tototo' =~ (to){3} ]]   # Returns 0, the string matches the regex
regex='(to){3}'
[[ 'toto' =~ $regex ]]      # Returns 1, the string does not match the regex (only 2 repetitions)
[[ '(to){3}' =~ "$regex" ]] # Returns 0, the quoted regex is matched as a plain string /!\
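
After a successful match, Bash stores the capture groups in the BASH_REMATCH array (not shown above). A sketch with a made-up input string:

```bash
version='bash-5.2.15'                # hypothetical input
pattern='^bash-([0-9]+)\.([0-9]+)'   # kept in a variable, used unquoted below

major=''
minor=''
if [[ "${version}" =~ $pattern ]]; then
  major="${BASH_REMATCH[1]}"   # first capture group
  minor="${BASH_REMATCH[2]}"   # second capture group
fi

printf '%s.%s\n' "${major}" "${minor}"
```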

Parameter substitution - Bourne


# Before starting: $var = ${var}, $1 = ${1}
# Substitutions look like: ${<varName><substitutionCharacter><fallbackIfVariableIsUnset>}

# For substitutions below, adding : before the substitution character adds empty value to failure cases
# Otherwise, only unset values fall in failure cases
toto=${var1-Nope}                # fallback value (Nope) if var1 is unset
toto=${1?Missing parameter toto} # error with message if $1 is unset
length=${#var}                   # returns the length of $var
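
The effect of the : is easier to see side by side. A sketch with made-up variable names:

```bash
unset missing          # hypothetical names: one unset, one empty
empty=''

a="${missing-Nope}"    # unset          -> Nope
b="${empty-Nope}"      # set but empty  -> stays empty
c="${empty:-Nope}"     # with ':' an empty value also falls back -> Nope

printf '[%s] [%s] [%s]\n' "${a}" "${b}" "${c}"
```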

Parameter substitution - Bash

offset=${var:2}                  # returns the value of var, starting with an offset of 2
offset=${var:2:5}                # same but only returns 5 characters from the offset start

replaced=${var/[0-9]/?}          # replaces the first number in $var with ?
replaced=${var//[0-9]/?}         # replaces all numbers in $var with ?

TOKEN='Dr0w554P'
TOKEN_VAR='TOKEN'
secretToken=${!TOKEN_VAR}         # secretToken == "$TOKEN" (indirection), useful for hard-wiring global vars

# /!\ The commands below only work with Bash 4+
upperCaseVar=${var^^}
lowerCaseVar=${var,,}
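
The same substitutions applied to a concrete, made-up value:

```bash
var='build-2024-rc1'          # hypothetical value

fromOffset="${var:6}"         # 2024-rc1
window="${var:6:4}"           # 2024
firstDigit="${var/[0-9]/?}"   # build-?024-rc1
allDigits="${var//[0-9]/?}"   # build-????-rc?

printf '%s\n' "${fromOffset}" "${window}" "${firstDigit}" "${allDigits}"
```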

You probably know enough though

Arrays

declare -a array         # Create indexed array: keys are integers
declare -A array         # Create associative array (Bash 4+): keys are whatever the hell you want (strings)
array=()                 # Create an indexed array without values
array=('titi tata' toto) # Create an indexed array with 2 values

arraySize=${#array[@]}                 # Gives the size: 2
array2ndElement=${array[1]}            # Gives the second element, toto

array+=(tutu tete)                     # Appends to an array
array[2]='tee-tee'                     # Updates array item
for item in "${array[@]}"; do ... done # Loops on an array
sliced=("${array[@]:1:2}")             # Slices array, offset 1, 2 elements

declare -p array     # Log the array in the form: declare -a array=([0]="titi tata" [1]="toto")
printf '%s\n' "${array[@]}" # A simpler version: one element per line
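
Associative arrays are declared above but not used. A sketch with made-up data, assuming Bash 4+:

```bash
declare -A ports=([http]=80 [https]=443)   # hypothetical data
ports[ssh]=22                              # adds a key

sshPort="${ports[ssh]}"
count="${#ports[@]}"
# "${!ports[@]}" lists the keys; their order is not guaranteed, hence the sort
keys="$(printf '%s\n' "${!ports[@]}" | sort)"

printf '%s %s\n' "${sshPort}" "${count}"
printf '%s\n' "${keys}"
```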

Scripting advice

Scripts have curves

Scripts shape - goals

Maintainability means:

  • Readability

  • Usability

  • Debug-ability

Script shape - example

#!/usr/bin/env bash

set -euxo pipefail # Fail ASAP, log commands before running them
test -f '/path/to/lib' && { source "$_"; } # Load libs from FS

main() ( # One big function executed in a sub-shell
  if isHelp; then displayHelpAndReturn; fi

  arg1="${1?Missing path to lib}" # Check inputs first
  arg2="${2?Missing bla bla}"

  _importedCommand "$arg1" # Use functions to make script readable
  command1 "$arg1"
  command2 "$arg2"
)

command1() ( # Declared below main's declaration, but above its call!
  arg1="$1"
  ...
)

command2() ( ... )

main "$@" # Passing all script params to main
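
The example calls an isHelp helper without showing it. One possible sketch, under two assumptions of mine: the script arguments are passed along (isHelp "$@"), and the help flags are -h and --help.

```bash
# Hypothetical helper: succeeds if any argument asks for help
isHelp() {
  local arg
  for arg in "$@"; do
    if [[ "${arg}" == '-h' || "${arg}" == '--help' ]]; then
      return 0
    fi
  done
  return 1
}

if isHelp 'build' '--help'; then helpAsked='yes'; else helpAsked='no'; fi
printf '%s\n' "${helpAsked}"
```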

Conventions

Disclaimer: these are my own, they’ve helped me a lot though

  • Rule: use double quotes only when there is a substitution
    Reason: makes it easier to spot constants from templated strings

  • Rule: quote everything unless you have a good reason not to
    Reason: minimizes errors due to word splitting

  • Rule: prefix external methods/constants with _
    Reason: makes it easier to spot them and find where they are implemented

  • Rule: UPPER_SNAKE_CASE for constants and global variables, lowerCamelCase otherwise
    Reason: makes them easier to differentiate

  • Rule: use full flags in CLI tools
    Reason: more explicit, e.g. jq --raw-output vs. jq -r

Pitfalls

Deprecated syntaxes

files=`ls`    # Back-tick syntax is legacy: hard to nest and to read
files="$(ls)" # The syntax to use
[ -n "${fileName}" ]   # Regular command (builtin twin of /usr/bin/[), arguments get word-split
test -n "${fileName}"  # Same command under another name (builtin twin of /usr/bin/test)
[[ -n "${fileName}" ]] # Prefer the [[ keyword, can't be messed with

Variable names and side effects

  • Some variable names are reserved

  • Some are used by other tools

  • Use precise/namespaced names for upper-case variables

Imports

Relative imports are resolved from cwd in bash, not from the script's folder!


# Bad way
source ./utils.sh # Resolves to ~/utils.sh if cwd=~ and /tmp/utils.sh if cwd=/tmp

# Better way
dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" # Gets script dir, see one-liners
source "${dir}/utils.sh"

Subshells - what is it?

A subshell is a separate instance of the command processor (here, Bash)

Subshells - how do I create one?

( cd /tmp; ls )  # Anything inside parentheses happens in a subshell
$( cd /tmp; ls ) # Same thing, just capturing stdout too

# Piping creates a subshell
cat file.txt | while read -r line; do
  # In a subshell here
done

# This does not create a subshell
while read -r line; do
  # Not in a subshell here
done < file.txt

Subshells - what are the impacts?

In a subshell, you can’t modify the outside state, meaning

( cd /tmp; ls ) # cwd != /tmp after this line, only the subshell got cd-ed

var=toto; (var=tata); echo "${var}" # Outputs toto

while read -r line; do
  var=tutu
done < file.txt
echo "${var}" # Outputs tutu
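
The piped version of the same loop shows the difference. A sketch with made-up input, assuming default shell options (lastpipe off):

```bash
lines=$'a\nb\nc'   # hypothetical input

count=0
printf '%s\n' "${lines}" | while read -r line; do
  count=$((count + 1))   # increments a copy living in the pipe's subshell
done
afterPipe="${count}"     # still 0

count=0
while read -r line; do
  count=$((count + 1))   # runs in the current shell
done <<< "${lines}"
afterRedirect="${count}" # 3

printf '%s %s\n' "${afterPipe}" "${afterRedirect}"
```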

Set -e

This is a tricky one


set -e # The script fails on any uncaught errors from now on

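# What do you think this does?
grep --silent toto <<< 'tata' && { printf 'OK!\n'; }

# It prints nothing and returns 1. set -e ignores a failure on the left of &&,
# but as the last line of a function or script, that 1 becomes its return code
# and the caller EXITS!

grep --silent 'a' <<< 'b' || { printf 'OK!\n'; } # The error is caught here
if grep --silent 'a' <<< 'b'; then # Prefer using if, it returns 0 when the test fails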
  echo 'OK'
fi
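
A sketch to check how set -e treats these forms. Each case runs in its own bash process so a failure cannot stop the current shell; false stands in for the failing grep and check is a made-up function name.

```bash
# Failure on the left of && in the middle of a script: execution continues
midScript="$(bash -c '
  set -e
  false && printf "OK\n"
  printf "reached\n"
')"

# Same list as the last command of a function: the function returns 1
# and the plain call to it stops the script
lastInFunction="$(bash -c '
  set -e
  check() { false && printf "OK\n"; }
  check
  printf "reached\n"
' || true)"

# With if, the statement returns 0 when no branch is taken
insideIf="$(bash -c '
  set -e
  if false; then printf "OK\n"; fi
  printf "reached\n"
')"

printf '[%s] [%s] [%s]\n' "${midScript}" "${lastInFunction}" "${insideIf}"
```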

Bash swiss army knife

A nice collection of things to know/install

Commands and one-liners

Save them somewhere!


# Create a temporary file and write its path on stdout
mktemp /tmp/toto-XXXXXX # The trailing Xs are replaced by random characters

# Retrieve the path to the folder of the file being executed
dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# What it does:
# * ${BASH_SOURCE[0]} contains the path to the current script
# * dirname           extracts the containing folder path
# * cd                cds to it
# * pwd               prints the absolute path to the current directory
#                     cd/pwd is the most portable link resolver
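
Temporary files are easy to forget. A sketch pairing mktemp with trap, which is not covered above: the EXIT trap runs when the (sub)shell ends, even after an error. The file content is made up.

```bash
content="$(
  workFile="$(mktemp)"             # path printed on stdout, captured here
  trap 'rm -f "${workFile}"' EXIT  # cleanup runs when this subshell exits
  printf 'draft\n' > "${workFile}"
  cat "${workFile}"
)"

printf '%s\n' "${content}"
```

Running it inside $() keeps the trap local to the subshell, so it doesn't replace a trap set by the main script.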

Tips & tricks


# Escape any character with \, even line breaks, improves readability
command \
  | sort \
  | uniq \
  > output.txt
cat ~/Horrible\ file.txt # Also works with names containing spaces

# Group outputs with blocks
command1 >  output.txt # Big chance of using a > by mistake later
command2 >> output.txt
command3 >  output.txt # And ruining the beginning of the file, oops!

{
  command1; command2; command3
} > output.txt # No risk of error!

# Pushd with auto-popd!
set -e
(
  cd "$folder"
  commandThatMayFailAndWouldNotResetCwdIfInMainShell
)

env SHELLOPTS=xtrace bash ./script.sh # Run any script with set -x

I’ll enrich this section as I go along

Useful packages

  • curl THE most used CLI HTTP client

  • htop process inspector

  • fd to replace find

  • rg (RipGrep) to replace grep

  • tree replaces ls for nested folders

  • fuck fix typos in previous command

  • bat cat on steroids (syntax coloration, git integration…​)

  • jq JSON parser (use 1.5+ to keep property order)

  • yq YAML equivalent of jq

  • xmlstarlet XML parser

  • xsv CSV parser

Portability

Making sure it executes properly on all systems by avoiding non-portable commands

  • sed
    Reason: the GNU (Linux) and BSD (macOS) versions differ on some flags. Syntax is hard to read.
    Recommendation: for structured formats (JSON etc.) use a real parser. Otherwise, use awk

  • echo
    Reason: some flags (-e) don't behave the same on Linux and macOS
    Recommendation: use printf

  • readlink
    Reason: macOS does not have the GNU version of readlink
    Recommendation:
    resolvedLink="$(cd "$path" && pwd -P)" # $path must be a directory

  • Direct shebangs
    Reason: programs are not always installed at the same place
    Recommendation:
    #!/usr/bin/env python
    # Instead of #!/usr/bin/python for example

  • Shell flags in shebangs
    Reason: some systems will ignore them or crash
    Recommendation:
    #!/usr/bin/env bash
    set -e
    # Instead of #!/usr/bin/env bash -e
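
For the echo recommendation, a sketch of what printf gives you, with a made-up value:

```bash
name='Toto'   # hypothetical value

# printf interprets \n in the format string the same way on every platform
line="$(printf 'Hello %s\n' "${name}")"

# %s prints its argument verbatim, even when it looks like a flag or an escape
tricky="$(printf '%s' '-e \n')"

printf '%s\n' "${line}" "${tricky}"
```

Keeping the data out of the format string (always through %s) avoids surprises when it contains % or backslashes.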

Nice sources

Q&A

Ask me anything