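Shell beginner's notes

These are essentially my reading notes on Shell Programming in Unix, Linux and OS X, fourth edition.

If you want to get started even faster, I strongly recommend Derek Banas's one-hour video, Shell Scripting Tutorial.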

Basic concepts

What is a shell?

shell is simply a program that reads in the commands you type in and converts them into a form more readily understood by the system.

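In other words, a shell is just a program: it reads the commands the user types and converts them into a form the system understands more readily. Nothing more, nothing less.

The shell's responsibilities:

Program execution
Variable and filename substitution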
I/O redirection
Pipeline hookup
Environment control
Interpreted programming language

Basic commands

Some basic commands, for example:

date
who
echo
ls
cat
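wc: counts the lines, words, and bytes in a file

wc has three commonly used options, and they are easy to remember: -l counts lines, -w counts words, and -c counts bytes (use -m to count characters, which can differ from bytes for multibyte text).

With no options, wc prints the line count, the word count, and the byte count, in that order.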
cp
mv

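One thing to watch with cp and mv: if the destination file already exists, it is replaced. For example, given an existing file old_file, running:
```
cp file1 old_file
# or
mv file1 old_file
```
overwrites the existing old_file in both cases.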
rm
pwd
cd
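mkdir, rmdir

rmdir only removes empty directories and reports an error otherwise. To remove a directory together with its contents, rm -r dir is the usual choice.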
ln
```
ln from to
```
Without options, ln creates a regular (hard) link. To create a symbolic link, use -s:
```
ln -s from to
```
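The difference between a symbolic link and a regular (hard) link: a symbolic link merely points to the original file by name, so if the original is deleted the link stops working. A hard link is another name for the same file data, so it still exists, and still works, after the original name is removed.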
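The difference can be checked with a small experiment. This is only a sketch: the file names original, hard, and soft are made up for illustration, and everything happens in a temporary directory.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
echo "hello" > "$workdir/original"

ln "$workdir/original" "$workdir/hard"       # hard link: a second name for the same data
ln -s "$workdir/original" "$workdir/soft"    # symbolic link: a pointer to the name "original"

rm "$workdir/original"

hard_content=$(cat "$workdir/hard")          # still readable after the original name is gone
if [ -e "$workdir/soft" ]; then              # -e follows the link, so a dangling link fails
    soft_state="valid"
else
    soft_state="dangling"
fi

echo "hard link says: $hard_content"
echo "symbolic link is: $soft_state"
rm -r "$workdir"
```

After the original name is removed, the hard link still returns the content, while the symbolic link points at nothing.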
Standard input & output redirection

Standard input is redirected with < and standard output with >:
```
who > users
wc -l < users
# Output
5
```
To append to a file instead, use >>:
```
echo hello > test
echo world >> test
cat test
# Output
hello
world
```
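Standard error redirection

Standard error is redirected with command 2> file. In some scripts you may see a command like this:
```
source ~/.bash_profile 2> /dev/null
```
Here /dev/null is the system's "garbage can". Think of it as a black hole: any output you don't need can be thrown into it.

Other forms:
```
&>/dev/null # redirect both standard output and standard error to /dev/null (bash)
echo "errors" >&2 # write the message to standard error instead of standard output
curl -A curl -s github.com > /dev/null 2>&1 # redirect standard error to wherever standard output points, which is /dev/null here
```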
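To see which stream ends up where, here is a minimal sketch. The report function is a hypothetical helper that writes one line to each stream; note that the order of the redirections matters.

```shell
# Hypothetical helper that writes one line to each stream.
report() {
    echo "to stdout"
    echo "to stderr" >&2
}

only_stdout=$(report 2> /dev/null)      # stderr is discarded
only_stderr=$(report 2>&1 > /dev/null)  # stderr goes to the capture, then stdout is discarded
both=$(report 2>&1)                     # both streams are captured

echo "only_stdout: $only_stdout"
echo "only_stderr: $only_stderr"
```

In 2>&1 > /dev/null, standard error is first pointed at the current standard output (the capture), and only then is standard output sent to /dev/null, which is why only the error line survives.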
Running a command in the background

Append & to the command. For example:
```
sort bigdata > out &
```
ps

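Shows the status of the processes running on the system.

Common tools

I wrote an earlier post that covers cut, sed, tr, and paste in detail, so I'm skipping those four here.

cut (skipped)
paste (skipped)
sed (skipped)
tr (skipped)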

grep

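Finds lines that match a given pattern.

Basic form:

grep pattern files

Examples:

# find puma
ps | grep puma

# find food in checklist
grep food checklist

# It is best to wrap the pattern in single quotes. For example,
# to find the * character in checklist: without the quotes, the shell would first expand * into the list of files in the current directory and pass them, together with checklist, as arguments to grep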

grep '*' checklist

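Common grep options: -v, -l, -n

-v: print the lines that do not match the pattern (inverted match)

-l: print only the names of the files that contain a match, not the matching lines themselves (the example below makes this clearer)

-n: prefix each matching line with its line number

Examples:

cat students
# Output
张三
李四
王五
赵六

# print everything that does not match 张三
grep -v '张三' students
# Output
李四
王五
赵六

# among files whose names start with s, print the names of those that contain 张三
grep -l '张三' s*
# Output
students

# include the line number of the line containing 张三
grep -n '张三' students
# Output
1:张三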

sort

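Sorts lines, as the name suggests.

Example:

sort students

Some common options: -u, -r, -o, -k

-u: unique, so sort -u is equivalent to sort | uniq

-r: reverse, i.e. sort in descending order

-o: write the sorted result to the given file

-k2n: when the input has multiple columns, sort on the second field (skipping the first) and compare it numerically; -k10n sorts on the tenth field

Examples: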

cat numbers
# Output
22Charlie
12Emanuel
43Fred
90Lucy
33Ralph
44Tony
44Tony

sort numbers
# Output
12Emanuel
22Charlie
33Ralph
43Fred
44Tony
44Tony
90Lucy

sort -u numbers
# Output
12Emanuel
22Charlie
33Ralph
43Fred
44Tony
90Lucy

sort -ur numbers
# Output
90Lucy
44Tony
43Fred
33Ralph
22Charlie
12Emanuel

cat food
# Output
pizza    4    good
chesses    5    not so bad
coco    10    great!
chips    99    yeah!
sweets    9    I like them!

sort -k2n food
# Output
pizza    4    good
chesses    5    not so bad
sweets    9    I like them!
coco    10    great!
chips    99    yeah!

uniq

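Removes duplicate lines. Note that uniq only collapses adjacent duplicates, so the input usually needs to be sorted first.

Basic form:

uniq in_file out_file

Example:

cat numbers
# Output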
22Charlie
12Emanuel
43Fred
90Lucy
33Ralph
44Tony
44Tony

uniq numbers
# Output
22Charlie
12Emanuel
43Fred
90Lucy
33Ralph
44Tony

Common options: -d, -c

-d: print only the lines that are repeated

-c: prefix each line with the number of times it occurs

Examples:

uniq -d numbers
# Output
44Tony

uniq -c numbers
# Output
1 22Charlie
1 12Emanuel
1 43Fred
1 90Lucy
1 33Ralph
2 44Tony
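
The numbers file above happens to have its duplicate lines next to each other. The sketch below, which uses a made-up list of fruit names in a temporary file, shows what happens when duplicates are not adjacent and why sort usually comes first.

```shell
datafile=$(mktemp)
printf '%s\n' apple banana apple banana apple > "$datafile"

# No two equal lines are adjacent, so uniq alone removes nothing.
unsorted_count=$(uniq "$datafile" | wc -l | tr -d ' ')

# Sorting first puts equal lines next to each other.
sorted_count=$(sort "$datafile" | uniq | wc -l | tr -d ' ')

# A common idiom: the most frequent line, with its count in front.
top_line=$(sort "$datafile" | uniq -c | sort -rn | head -n 1 | tr -s ' ' | sed 's/^ //')

rm "$datafile"
echo "$unsorted_count $sorted_count $top_line"
```

The tr and sed calls at the end only normalize the padding that uniq -c puts in front of the count.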

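Regular expressions

Two pattern syntaxes are easy to confuse: filename substitution patterns, which the shell expands, and regular expressions, which tools such as grep and sed interpret.

Filename substitution:

* : matches zero or more characters
? : matches any single character
[0-9] : matches a single character from 0 to 9
[!a] : matches any single character other than a

Regular expressions:

. : matches any single character
* : matches zero or more occurrences of the preceding character
.* : matches zero or more of any character
[0-9] : matches a single character from 0 to 9
[^a] : matches any single character other than a
^ : anchors the match at the beginning of the line
$ : anchors the match at the end of the line
\{min,max\} : repetition count. For example, /[a-z]\{4,6\}/ matches a run of 4 to 6 lowercase letters, and /[a-z]\{4\}/ matches exactly 4 lowercase letters
\(...\) : saves the matched text in registers 1, 2, and so on, in order. For example, /^\(.\).*\1$/ matches every line whose first and last characters are the same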
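
The interval and back-reference forms are easier to follow with grep. A small sketch, using a made-up word list in a temporary file:

```shell
words=$(mktemp)
printf '%s\n' level hello abcda x radar tea > "$words"

# Lines whose first and last characters are the same (back-reference \1).
same_ends=$(grep '^\(.\).*\1$' "$words" | xargs)

# Lines made up of 4 to 6 lowercase letters and nothing else (interval \{4,6\}).
mid_length=$(grep '^[a-z]\{4,6\}$' "$words" | xargs)

rm "$words"
echo "same ends:  $same_ends"
echo "mid length: $mid_length"
```

The single-character line x is not matched by the first pattern, because the pattern needs one character for the group and another for \1. xargs is used here only to join the matching lines with spaces.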

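Variables

Variable names consist of letters, digits, and underscores, and cannot start with a digit
In an assignment there must be no spaces around =; a variable that has never been assigned is null
The shell has no concept of data types, so variables are untyped and every value is treated as a string

Reference a variable with $variable. Note that the shell performs variable substitution before it executes commands.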

number=99
echo There are $number cups of tea on the table.

Built-in integer arithmetic:

$((expression))

Here expression may contain only numbers, arithmetic operators, and variables.

The expr command does something similar.

Examples:

a=1
echo $a
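# Output
1

: $(( a = a + 1)) # ":" is the null command: $((a = a + 1)) is evaluated, but nothing is printed
echo $a
# Output
2

echo $(( 100 * 2 ))
# Output
200

echo $(( a < 1 )) # false is 0, true is 1
# Output
0

echo $(expr $a + 1)
# Output
3

echo $(expr 9 + 9)
# Output
18

echo $(expr 9 \* 9)  ## note the backslash before *: without it, the shell would expand * into the list of files in the current directory, causing a syntax error
# Output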
81

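A more common use of expr, though, is expr expr1 : expr2, where expr2 is a regular expression. It prints the number of characters at the start of expr1 that match the pattern expr2 (the match is anchored at the beginning).

Examples:

expr "hello" : ".*"
# Output
5

expr "hello" : "[A-Z]*"
# Output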
0
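
A few more cases of the : operator, as a sketch with made-up strings. It also shows the \(...\) form, where expr prints the captured text instead of a count.

```shell
prefix_len=$(expr "hello world" : 'hello')       # 5 characters matched at the start
letters_len=$(expr "abc123" : '[a-z]*')          # 3 leading lowercase letters
no_match=$(expr "abc123" : '[0-9]*' || true)     # 0: the string does not start with digits
captured=$(expr "abc123" : '[a-z]*\([0-9]*\)')   # prints the text saved by \(...\)

echo "$prefix_len $letters_len $no_match $captured"
```

expr exits with a nonzero status when its result is 0 or empty, which is why the third call is followed by || true.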

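Arguments

Whenever a shell program is executed, the shell automatically stores the arguments passed to it, in order, in the special variables $1, $2, $3, and so on. These special variables are called positional parameters. To reference the tenth argument or beyond, use ${n}, for example ${11}.

Some other variables related to arguments:

$*: the list of all arguments. When quoted as "$*", the shell replaces it with a single word, "$1 $2 ..."

$@: the same as $*, except that when quoted as "$@", the shell replaces it with separate words, "$1" "$2" ...

$#: the number of arguments

An example:

Suppose the shell file args.sh contains the following: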

#!/usr/bin/env sh

echo $# arguments passed
echo the first argument is $1

# $@ vs $*
for arg in "$@"
do
    echo $arg
done

for arg in "$*"
do
    echo $arg
done

Run the file:

sh args.sh a b c d e f
# Output
6 arguments passed
the first argument is a
a
b
c
d
e
f
a b c d e f

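Here the loop over "$*" printed all the arguments together, as a single item.

If "$*" in the for loop is replaced with an unquoted $*, the output here is the same as with "$@", because none of these arguments contains whitespace.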
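
The three forms only differ visibly once an argument contains whitespace. A minimal sketch, assuming the default IFS; count_args is a hypothetical helper that just reports how many arguments it received.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

set -- "a b" c                   # two positional parameters; the first contains a space

quoted_at=$(count_args "$@")     # each parameter stays one word
quoted_star=$(count_args "$*")   # everything is joined into a single word
unquoted_star=$(count_args $*)   # the joined text is split again on whitespace

echo "$quoted_at $quoted_star $unquoted_star"
```

Only "$@" preserves the original two arguments, which is why it is the usual choice when passing arguments on to another command.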

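Working with arguments may also involve the shift command, which removes positional parameters from the left, like taking items off the front of a queue: the value of $2 is assigned to $1, and so on.

Modify args.sh above: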

#!/bin/bash

echo $# arguments passed
echo the first argument is $1
shift
echo $# arguments passed
echo the first argument is $1
shift
echo $# arguments passed
echo the first argument is $1
shift
echo $# arguments passed
echo the first argument is $1
shift

Run it:

sh args.sh a b c d
# Output
4 arguments passed
the first argument is a
3 arguments passed
the first argument is b
2 arguments passed
the first argument is c
1 arguments passed
the first argument is d

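Arguments are really just variables, and there are some special forms for referencing, assigning, and substituting them:

${params}: returns the value of params

${params:?value}: if params is not null, returns params; otherwise writes value to standard error and exits. If value is omitted, a default error message is written instead.

${params:-value}: if params is not null, returns params; otherwise returns value

${params:=value}: the same as ${params:-value}, except that value is also assigned to params

${params:+value}: the opposite of ${params:-value}. If params is not null, returns value; otherwise returns params (null), that is, it substitutes nothing.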
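
These forms can be tried one after another. The sketch below uses a made-up variable called name; the :? case runs in a subshell so that the error it triggers does not end the surrounding script.

```shell
unset name
fallback=${name:-guest}          # name is unset, so "guest" is substituted
after_fallback=${name:-}         # name itself is still empty

assigned=${name:=guest}          # same result, but name is now set to "guest"
after_assign=$name

alt=${name:+has a value}         # name is set, so the alternative text is substituted
unset name
alt_unset=${name:+has a value}   # name is unset, so nothing is substituted

# :? writes the message to standard error and makes the (sub)shell exit.
required_failed="no"
( unset name; : "${name:?name is required}" ) 2> /dev/null || required_failed="yes"

echo "$fallback|$after_fallback|$assigned|$after_assign|$alt|$alt_unset|$required_failed"
```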

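There are also a few special ones:

$0: the name of the script being executed

$!: the PID of the most recent command run in the background

${#variable}: the length of the value of variable, that is, the number of characters in the string. For a bash array, ${#array[@]} gives the number of elements.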

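Double quotes, single quotes, etc.

Single quotes: all special characters inside single quotes lose their special meaning.

Double quotes: all special characters lose their special meaning except $, \, and backquotes (`).

Backslash: the character immediately following it is taken literally.

Command substitution: `command`, equivalent to $(command)

The examples make this clear: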

x=5

echo '$x'
# Output
$x

echo "$x"
# Output
5

echo \$x
# Output
$x

eval echo \$x   # eval: the command line is scanned twice
# Output
5

echo date
# Output
date

echo `date`
# Output
Thu Sep 5 18:37:43 CST 2019

echo $(date)
# Output
Thu Sep 5 18:37:43 CST 2019

filecount=$(ls | wc -l | sed 's/ //g')
echo $filecount
# Output
28

Conditionals

This mainly covers if and case.

The general form of if:

if command1
then
    command2
    ......
else
    command3
    ......
fi

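Before using if, let's look at how to get an exit status and at the test command, which is used to evaluate conditions.

$? holds the exit status of the most recently executed command. A script's own exit status defaults to that of the last command it ran, but you can also end the program explicitly with exit n, where n is an integer from 0 to 255 and 0 means success.

test is a shell built-in command. Its basic form:

test expression

where expression is the condition to be tested.

For example: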

x=1
test $x = 1
echo $?
# Output
0

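It is best to wrap $x in double quotes here. If x has not been assigned, test $x = 1 fails with a syntax error, because the shell replaces $x with nothing and test receives only two arguments, = and 1.

An example to get a feel for it:

name=
test $name = 'Ruby'
# Output (the exact wording depends on the shell)
parse error: condition expected: =

blanks="  "
test -z "$blanks" # -z: true (0) if the string is null (zero length), false (1) otherwise
echo $?
# Output
1

test -n "$blanks" # -n: true (0) if the string is not null (nonzero length), false (1) otherwise
echo $?
# Output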
0

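test also has another form, which looks like this:

[ expression ]

So the things that appear after if are really just test commands. Feels like a small epiphany, doesn't it?

Note the difference between [[ ]] and [ ] here. In flow control, [[ ]] (a bash and ksh extension) lets you use logical operators such as || and && directly, whereas inside [ ] you use -a (and) and -o (or).

For example: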

#!/bin/bash

x=2
if [ "$x" -ge 2 -a "$x" -le 3 ] 
then
    echo "x is between 2 and 3."
fi

if [[ "$x" -ge 2 && "$x" -le 3 ]]
then
    echo "x is between 2 and 3."
fi

In general, though, -a and -o are not recommended inside [ ]; prefer [ p ] && [ q ] or [ p ] || [ q ].
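
The same range check written with two separate [ ] commands might look like this. in_range is a hypothetical helper name, and the bounds 2 and 3 are taken from the example above.

```shell
# Hypothetical helper: succeeds when its argument is an integer from 2 to 3.
in_range() {
    [ "$1" -ge 2 ] && [ "$1" -le 3 ]
}

if in_range 2; then two="inside"; else two="outside"; fi
if in_range 5; then five="inside"; else five="outside"; fi

echo "2 is $two, 5 is $five"
```

Each [ ] is an ordinary command with its own exit status, so && and || combine them the same way they combine any other commands.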

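Here is an example of testing a file. Because it is an if/elif chain, only the first test that succeeds prints its message: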

#!/bin/bash

file1="./test_file1"

if [ -e "$file1" ]; then
    echo "$file1 exists"
elif [ -f "$file1" ]; then
    echo "$file1 is a normal file"
elif [ -r "$file1" ]; then
    echo "$file1 is readable"
elif [ -w "$file1" ]; then
    echo "$file1 is writable"
elif [ -x "$file1" ]; then
    echo "$file1 is executable"
elif [ -d "$file1" ]; then
    echo "$file1 is a directory"
elif [ -L "$file1" ]; then
    echo "$file1 is a symbolic link"
elif [ -p "$file1" ]; then
    echo "$file1 is a named pipe"
elif [ -S "$file1" ]; then
    echo "$file1 is a socket"
elif [ -G "$file1" ]; then
    echo "$file1 is owned by the effective group ID"
elif [ -O "$file1" ]; then
    echo "$file1 is owned by the effective user ID"
fi

Now for case.

General form:

case value in
pattern1)    command1
                    ......
                    command;;
pattern2)    command2
                    ......
                    command;;
.......
patternn)    commandn
                    ......
                    command;;
esac

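The patterns here use the same syntax as filename substitution (*, ?, [...]), not regular expressions.

A pattern can also use | for logical or. For example, pat1 | pat2 | ... | pat-n matches if any one of pat1, pat2, ..., pat-n matches.

An example:

cat greetings
# Output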
#!/bin/bash

hour=$(date +%H)

case "$hour"
in
    0? | 1[01]) echo "Good morning";;
    1[2-7]) echo "Good afternoon";;
    *) echo "Good evening";;
esac

sh greetings
# Output
Good afternoon

Also note two special constructs:

command1 && command2: command2 is executed only if command1 returns an exit status of 0 (success).

command1 || command2: command2 is executed only if command1 returns a nonzero exit status (failure).
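
A short sketch of both constructs, using mkdir in a temporary directory. The directory name logs is made up for illustration.

```shell
scratch=$(mktemp -d)

# The assignment runs only because mkdir succeeded.
made="no"
mkdir "$scratch/logs" && made="yes"

# The second mkdir fails (the directory already exists), so the part after || runs.
second="created again"
mkdir "$scratch/logs" 2> /dev/null || second="already there"

rm -r "$scratch"
echo "$made / $second"
```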

Loops

Mainly for, while, and until.

for

Basic form:

for var in word1 word2 ... wordn
do
    command1
    command2
    .......
done

Here is a slightly special usage: for lets you omit the trailing in word1 word2 ... wordn:

for var
do
    command1
    command2
    .......
done

The shell then iterates over all the arguments automatically. The code above is equivalent to:

for var in "$@"
do
    command1
    command2
    .......
done

Example:

cat for_example
# Output
#!/bin/bash

for arg
do
    echo $arg
done

sh for_example 1 2 3
# Output
1
2
3

while

General form:

while commandt
do
    command1
    command2
    .......
done

Example:

cat while_example
# Output
#!/bin/bash

while [ "$#" -ne 0 ]
do
    echo "$1"
    shift
done

sh while_example 1 2 3
# Output
1
2
3

until

General form:

until commandt
do
    command1
    command2
    .......
done

Example:

cat until_example
# Output
#!/bin/bash

until [ "$#" -eq 0 ]
do
    echo "$1"
    shift
done

sh until_example 1 2 3
# Output
1
2
3

Leaving a loop with break

break: exits the current loop

break n: exits n levels of enclosing loops. For example:

for file
do
    while [ "$x" -gt 1 ]
    do
        .....
        if [ -n "$error" ]
        then
            break 2
        fi
        ....
    done
  ......
done

If error is not null, both the for loop and the while loop are exited.
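
A runnable version of the same idea, as a sketch with made-up loop values: the inner loop stops both loops as soon as it reaches the combination 2b.

```shell
log=""
for outer in 1 2 3
do
    for inner in a b c
    do
        if [ "$outer$inner" = "2b" ]
        then
            break 2        # leave the inner and the outer loop
        fi
        log="$log $outer$inner"
    done
done
log=${log# }               # drop the leading space
echo "$log"
```

With a plain break, only the inner loop would stop and the outer loop would carry on with 3.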

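Skipping with continue

Equivalent to next in Ruby. (As a matter of naming, next is better than continue: it conveys more clearly that the remaining commands in the current iteration are skipped and we go straight to the next iteration.)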
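
A minimal sketch of continue, collecting only the odd numbers from a made-up list:

```shell
odds=""
for n in 1 2 3 4 5 6
do
    if [ $(( n % 2 )) -eq 0 ]
    then
        continue           # even number: skip the rest of this iteration
    fi
    odds="$odds$n"
done
echo "$odds"
```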
Running a loop in the background

As with running a command in the background, just add & at the end of the loop.
```
for file in memo[1-4]
do
    run $file
done &
```
getopts

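A shell built-in command for processing command-line arguments. I will cover it in a separate post, so I'm skipping it here.

read and printf

read

Reads data from the terminal or from a file.

read returns a nonzero (failure) exit status only when it reaches end-of-file. Concretely, when reading from the terminal that happens when the user presses Ctrl+D; when reading from a file, it happens when there is no more data in the file.

Example: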
```
cat addi
# Output
#!/bin/bash

while read n1 n2
do
    echo $(( $n1 + $n2 ))
done

sh addi
111 222
# Output
333
# press Ctrl+D to exit
```
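The same loop can read from a file by redirecting the file into the loop, and it ends on its own when read reaches end-of-file. This is a sketch with made-up input; -r is added so that read leaves backslashes alone.

```shell
pairs=$(mktemp)
printf '%s\n' "111 222" "1 2" > "$pairs"

sums=""
while read -r n1 n2
do
    sums="$sums$(( n1 + n2 )) "
done < "$pairs"            # the whole loop takes its standard input from the file
sums=${sums% }             # drop the trailing space

rm "$pairs"
echo "$sums"
```

Because the input comes from a redirection rather than a pipe, the loop runs in the current shell and sums is still set afterwards.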

printf

Formatted output.

General form:

printf "format" arg1 arg2 ....

This is actually a fairly complex topic, so here are just a few simple examples:

printf "This is a number: %d\n" 10
# Output
This is a number: 10

printf "The octal value for %d is %o\n" 10 10
# Output
The octal value for 10 is 12

printf "The hexadecimal value for %d is %x\n" 10 10
# Output
The hexadecimal value for 10 is a

printf "A string: %s and a character: %c\n" hello A
# Output
A string: hello and a character: A

printf "Just the first character: %c\n" hello
# Output
Just the first character: h

printf "%.5d %.4X\n" 10 27
# Output
00010 001B
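
Two more printf behaviours that come up often, shown as a sketch with made-up values: field widths for aligned columns, and the way the format string is reused when there are more arguments than conversions.

```shell
# %-10s pads on the right to 10 columns, %5d pads on the left to 5, %05d pads with zeros.
row=$(printf '%-10s|%5d|%05d' apple 3 42)

# Two conversions, four arguments: the format is applied twice.
pairs_out=$(printf '%s=%d\n' a 1 b 2)

echo "$row"
echo "$pairs_out"
```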

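exec and eval

This reminds me of instance_exec and instance_eval in Ruby...

I'm covering them separately because the two are easy to mix up.

eval: it already appeared in an earlier example. It makes the shell scan the command line one more time before executing it (perhaps because the command line is so good-looking 😄). For example:

x=1
echo \$x
# Output
$x

eval echo \$x
# Output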
1

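exec: replaces the current program. The general form is exec program: the current process is replaced by program, so nothing after the exec line runs. exec can also be used to redirect standard input and output for the current shell.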
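
Both uses of exec can be sketched briefly. The file descriptor number 3 and the messages are arbitrary choices; the process replacement runs in a child sh so that the current shell survives.

```shell
logfile=$(mktemp)

exec 3> "$logfile"         # open file descriptor 3 for writing; the shell keeps running
echo "first line" >&3
echo "second line" >&3
exec 3>&-                  # close file descriptor 3

line_count=$(wc -l < "$logfile" | tr -d ' ')
first=$(head -n 1 "$logfile")
rm "$logfile"

# In the child shell, exec replaces sh with echo, so the second echo never runs.
replaced=$(sh -c 'exec echo "replaced by echo"; echo "never printed"')

echo "$line_count / $first / $replaced"
```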

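Debugging

I recommend using set -euxo pipefail directly in your scripts. Note that pipefail is not available in every sh; bash supports it.

Some commonly used set options:

-x: turns on tracing, the same as running the script with sh -x while debugging

-u: handles undefined variables. Treat unset variables and parameters other than the special parameters ‘@’ or ‘*’ as an error when performing parameter expansion.

-e: exit as soon as anything fails, stopping execution and returning a nonzero value. Exit immediately if a pipeline, which may consist of a single simple command, a list, or a compound command returns a non-zero status.

However, set -e alone does not catch failures inside a pipeline, because a pipeline's exit status is that of its last command. The fix is set -eo pipefail.

Putting it all together, the complete error-handling line for a script: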

set -euxo pipefail
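
The effect of -e and -u can be observed without risking the current shell by running tiny scripts in a child sh. This is only a sketch: the variable undefined_var is assumed not to exist in the environment, and pipefail is left out because not every sh supports it.

```shell
# Without -e, the script carries on after a failing command.
plain_output=$(sh -c 'false; echo "still running"')

# With -e, the child shell stops at the first failing command.
e_status=0
e_output=$(sh -c 'set -e; false; echo "still running"') || e_status=$?

# With -u, expanding an unset variable is an error and the child shell exits.
u_status=0
u_output=$(sh -c 'set -u; echo "value: $undefined_var"' 2> /dev/null) || u_status=$?

echo "plain: $plain_output"
echo "set -e: status $e_status, output '$e_output'"
echo "set -u: status $u_status, output '$u_output'"
```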

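Besides debugging, set can also be used to reassign the positional parameters, while its sibling unset removes a variable's definition from the current environment (unset has no effect on readonly variables, though):

set a b c                         # assigns a, b, c to $1, $2, $3 respectively
echo $1:$2:$3
# Output
a:b:c

x=hi
echo $x
# Output
hi

unset x
echo $x
# Output (an empty line)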

References

Shell Programming in Unix, Linux and OS X, fourth edition

Advanced Bash-Scripting Guide

Shell Scripting Tutorial