A pretty decent introduction to Bash for pipelines and interactive use.
Bash is the Bourne Again SHell.
Files are streams of bytes. A stream may be bi-directional, but it is usually opened for either reading or writing.
A stream supports a small set of operations: open, read, write, seek (when the underlying file allows it), and close.
Unix/Linux is all about files. Just about everything is accessible as a file.
By default a program has 3 files open: standard input (file descriptor 0), standard output (file descriptor 1), and standard error (file descriptor 2).
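A quick sketch of the three streams and how redirection targets them by descriptor number:

```shell
# fd 0 = standard input, fd 1 = standard output, fd 2 = standard error
echo "normal output"                     # goes to standard output
echo "error output" >&2                  # redirected to standard error
ls /no/such/path 2>/dev/null || true     # discard only the error stream
wc -l </etc/passwd                       # feed a file to standard input
```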
Ever need a file for a small task, but don't want to go to the effort of coming up with a good name? Temporary files are a lifesaver: they don't stick around, and they're guaranteed to have a unique name.
A subshell captures what `mktemp` prints, and the result is assigned to `MY_TEMP_FILE`. You can write to the file using `cat` and a redirection; close the stream with CTRL+D.
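Putting that together (in a script, a heredoc stands in for typing input and pressing CTRL+D):

```shell
# Capture the unique path that mktemp prints.
MY_TEMP_FILE=$(mktemp)

# Write to the file with cat and a redirection. Interactively you would
# type lines and press CTRL+D; here a heredoc supplies the input.
cat > "$MY_TEMP_FILE" <<'EOF'
some scratch data
EOF

cat "$MY_TEMP_FILE"     # prints: some scratch data
rm "$MY_TEMP_FILE"      # tidy up when finished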
These are some common commands to figure out where you are and what you want to do.
This utility can split sections of a line by bytes or by a delimiter. It's most useful with simple CSVs. For example, suppose the first and third columns are needed from a CSV.
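A minimal sketch of that case, using a made-up `sample.csv`:

```shell
printf 'alice,30,boston\nbob,25,denver\n' > sample.csv

# -d sets the delimiter to a comma, -f selects fields 1 and 3
cut -d, -f1,3 sample.csv
# alice,boston
# bob,denver

rm sample.csv
```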
For more information read the man page.
Awk is an entire language, but most people use it for line-oriented processing; for many purposes it's a fancier version of cut. It matches lines against regexes and runs blocks of actions on the lines that match. The default field separator is whitespace.
Generally you won't see much awk in these pipelines, since a few simple pipeline patterns are usually quicker to write and digest than an awk program. Just know it's very powerful, and can do a lot with very few characters.
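As a taste of that power, summing a column takes one line (the input here is made up):

```shell
# Fields are split on whitespace by default; $2 is the second field.
# The END block runs after all input lines have been processed.
printf 'apples 3\npears 5\nplums 4\n' | awk '{ total += $2 } END { print total }'
# 12
```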
`sort` sorts lines in a file. `uniq` removes duplicate lines that appear together.
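Because `uniq` only collapses adjacent duplicates, it is almost always paired with `sort`:

```shell
printf 'pear\napple\npear\n' | sort | uniq
# apple
# pear

# -c prefixes each surviving line with its count
printf 'pear\napple\npear\n' | sort | uniq -c
```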
Cat takes a series of files and prints their contents in order.
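For example, with two throwaway files:

```shell
printf 'one\n' > a.txt
printf 'two\n' > b.txt

cat a.txt b.txt
# one
# two

rm a.txt b.txt
```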
Grep stands for "globally search for a regular expression and print"; it prints the lines of its input that match a pattern. Commands are generally in the following form:
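Roughly, `grep [flags] PATTERN [FILE...]`. A small sketch with a throwaway file:

```shell
printf 'error: disk full\ninfo: all good\n' > app.log

grep 'error' app.log       # print matching lines
# error: disk full

grep -c 'error' app.log    # -c prints the count of matching lines instead
# 1

rm app.log
```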
Grep is often used in log processing, like counting how many times a page was viewed, or how many POST requests occurred, in Apache logs.
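For instance, counting POST requests against a made-up Apache-style log:

```shell
printf '%s\n' \
  '1.2.3.4 - - "GET /index.html" 200' \
  '5.6.7.8 - - "POST /login" 200' \
  '1.2.3.4 - - "POST /login" 401' > access.log

grep -c '"POST' access.log
# 2

rm access.log
```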
Find accepts a series of starting points to traverse the directory hierarchy, and supports various filtering operations. Examples:
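A few sketches against a throwaway directory:

```shell
mkdir -p demo/sub
touch demo/a.txt demo/sub/b.txt demo/old.log

find demo -name '*.txt'           # match by name pattern, recursively
find demo -type d                 # directories only
find demo -name '*.log' -delete   # act on the matches

rm -r demo
```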
These are some common operations, but it supports many more advanced operations. It supports filtering by access time, modified time, permissions, and many other attributes.
Curl is one of the most commonly used programs ever. It makes sending web requests on a ton of protocols super easy. It's great for making API calls or downloading web pages. The result of the request is written to standard out.
Curl supports many flags, but it's a large topic. Generally curl is used like so:
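A few common shapes (the URLs here are placeholders):

```shell
# Fetch a page; the response body goes to standard out.
curl https://example.com/

# -s silences the progress meter, -o writes the body to a file instead.
curl -s -o page.html https://example.com/

# -X sets the HTTP method, -H adds a header, -d supplies a request body.
curl -s -X POST -H 'Content-Type: application/json' \
  -d '{"name": "demo"}' https://example.com/api
```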
Many times we need to select a subset of files and perform an operation on them, possibly in parallel.
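One common pattern is `find` piped into `xargs` (a sketch; the `notes` directory and `archive.zip` name are made up):

```shell
mkdir -p notes
printf 'a\n' > notes/a.txt
printf 'b\n' > 'notes/b c.txt'

# find emits matching paths; xargs batches them into arguments for zip.
# -print0 / -0 keep file names containing spaces intact.
# Adding -P 4 to xargs would run up to 4 invocations in parallel.
find notes -name '*.txt' -print0 | xargs -0 zip archive.zip

rm -r notes archive.zip
```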
This adds all .txt files in your directory hierarchy to a zip file.
Pipes allow commands to be composed, feeding the output of one operation into the next. You saw one in the xargs example. To link one command's output to another's standard input, separate the commands with a pipe (`|`) character.
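A small chain, ranking lines by frequency:

```shell
# sort groups duplicates, uniq -c counts them,
# sort -rn puts the biggest count on top.
printf 'pear\napple\npear\npear\n' | sort | uniq -c | sort -rn
```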
Many times it's necessary to feed the results of a pipeline into a program that accepts a file. Bash can present those results as a file with the following syntax.
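A sketch with made-up file names; `<( ... )` presents the pipeline's output as a readable file:

```shell
printf 'alice,1\nbob,2\n' > users.csv                   # made-up data
printf 'alice logged in\ncarol logged in\n' > server.log

# grep -f reads its patterns from a file; here that "file" is
# the output of a pipeline extracting column 1 of the CSV.
grep -f <(cut -d, -f1 users.csv) server.log
# alice logged in

rm users.csv server.log
```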
The above feeds a series of patterns into grep from a given pipeline. A more advanced example is locating files which lack a frontmatter field but contain an image. If one were working with Jekyll, it might look like this:
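One possible shape (a sketch; the `_posts` layout and the `image:` frontmatter field are assumptions about the site):

```shell
# First list: files whose frontmatter has an image: field.
# Second list: files containing a Markdown image (![...]).
# comm -13 prints lines only in the second list, i.e.
# posts with an image but no image: field.
comm -13 <(grep -l 'image:' _posts/*.md | sort) \
         <(grep -l '!\[' _posts/*.md | sort)
```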
There are plenty of sources for getting help with Bash and for figuring out the flags to a given program.