Case studies
Here is a collection of my own and contributed awk scripts.
General topic
- Awk and Jmol
  awk uses input data to write JavaScript code to plot vectors in Jmol
- Multiple input files - first approach
  awk collects and assembles data from multiple files in memory
- Multiple input files - second approach
  awk collects data from multiple files but keeps only the necessary data to save memory
- Multiple output files
  MUST KNOW feature - covered during the workshop
- Color output with custom keywords
  use a simple awk script to highlight keywords in your output
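As a rough illustration of the in-memory approach to multiple input files, here is a minimal sketch (not taken from the case study itself). It assumes two whitespace-separated files whose first column is a shared key; the helper name `join_by_key` and the column layout are hypothetical. It relies on the common `NR == FNR` idiom, which holds only while the first file is being read, and it assumes that the first file is not empty.

```shell
# join_by_key FILE1 FILE2
# Hypothetical helper: load FILE1 into memory, then annotate matching rows of FILE2.
join_by_key() {
  awk '
    # First file only (NR == FNR): remember column 2 under the key in column 1.
    NR == FNR { value[$1] = $2; next }
    # Second file: print key, stored value, and the new value when the key is known.
    $1 in value { print $1, value[$1], $2 }
  ' "$1" "$2"
}
```

Rows are printed in the order of the second file, and keys that appear in only one file are dropped. Keeping the whole first file in an array is what costs memory, which is the motivation for the second approach.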
Bioinformatics oriented
- bioawk
  Bioawk is an extension to Brian Kernighan's awk, adding the support of several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names.
- Fasta file format tips
  worth knowing if you often work with files in multi-fasta format
- Multiline fasta to single line fasta
  a single cryptic-looking line that will be deciphered during the workshop
- Sequence clustering with awk
  application of the multiple files approach - contribution by Martín González Buitrón
- Substitute scientific with common species names in a phylogenetic tree file
- Statistics on very large columns of values
- Manipulating and getting statistics for .vcf and .gff files
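For orientation, one common way to turn multi-line fasta into single-line fasta is sketched below. It is not necessarily the line used in the workshop; the helper name `fasta_unwrap` is hypothetical, and the input is assumed to be plain fasta in which header lines start with `>`.

```shell
# fasta_unwrap [FILE...]
# Hypothetical helper: join wrapped sequence lines so each record is header + one line.
fasta_unwrap() {
  awk '
    # Header line: flush the sequence collected so far, then print the header.
    /^>/ { if (seq != "") print seq; print; seq = ""; next }
    # Any other line is part of the current sequence: append it.
    { seq = seq $0 }
    # Flush the last record.
    END { if (seq != "") print seq }
  ' "$@"
}
```

With no file arguments the function reads standard input, so it can sit in the middle of a pipeline.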
Math oriented
- Discrete histogram
  very handy discrete histogram awk code
- Gaussian smearing
  trivial task done with awk - an example of how to use functions
- Linear interpolation
  use linear interpolation to resample your data on a different grid
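To give a flavor of the histogram item, here is a minimal sketch under stated assumptions, which may differ from the linked case study. Values are taken from column 1, are assumed to be non-negative, and are counted in fixed-width bins. The helper name `histogram` and its `BIN_WIDTH` argument are hypothetical.

```shell
# histogram BIN_WIDTH [FILE...]
# Hypothetical helper: count column-1 values in bins [k*w, (k+1)*w), assuming values >= 0.
histogram() {
  width=$1
  shift
  awk -v w="$width" '
    # int() truncates toward zero, which equals floor for non-negative values.
    { count[int($1 / w)]++ }
    # Print the lower edge of each bin and the number of values in it.
    END { for (b in count) print b * w, count[b] }
  ' "$@" | LC_ALL=C sort -n
}
```

The final `sort -n` is needed because awk's `for (b in count)` visits array indices in no guaranteed order. Empty bins are not printed.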
Physics oriented
- Dipole moment example
  simple calculations should not be difficult to code - here is an example
- Multiple files - VASP CHGCAR difference
  a simplified example of how to read multiple files (bzip-ed) line-by-line simultaneously to save memory
- POSCAR: reorder atom types
  a simple task creates a programming nightmare
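As a small illustration of how little code a dipole moment calculation needs, the sketch below sums charge times position for a set of point charges. The input layout (one line per charge: `q x y z`) and the helper name `dipole` are assumptions made for this example, not the format used in the case study. The result is in whatever charge and length units the input uses.

```shell
# dipole [FILE...]
# Hypothetical helper: input lines are "q x y z"; prints mu_x mu_y mu_z |mu|.
dipole() {
  awk '
    # Accumulate each component of mu = sum_i q_i * r_i.
    { mx += $1 * $2; my += $1 * $3; mz += $1 * $4 }
    # Print the components and the magnitude of the vector.
    END { printf "%.4f %.4f %.4f %.4f\n", mx, my, mz, sqrt(mx*mx + my*my + mz*mz) }
  ' "$@"
}
```

For a neutral set of charges the result does not depend on the choice of origin; for a charged system it does.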
Primarily used as reference
- Awk and Gnuplot
  an outdated problem, but it shows how to send data to another program and read back the results
- Awk writes Python
  collecting your data might be so tedious to program that you might want to use awk to write Python instead
- Gaussian - extract geometry from output .log file
  example
- ProLiant Status check
  simple check on some values, collected and mailed
- Awk and Networking
  awk has some advanced network protocol capabilities...