Week 4 Notes

Software Management

  • Using Package Management Systems

    • Tools for installing, updating, removing and managing software
    • Install new / updated software across network
    • Package - File look up, both ways
      • Which files are given by a particular package and which package contains a given file
    • Database of packages on the system including versions (compatibility and requirements)
    • Dependency checking
    • Signature verification tools (to check authenticity of source of the software)
    • Tools for building packages (to build packages from source code - particularly true for kernel modules)
  • Package types

    • Package
      • RPM
        • Red Hat
          • CentOS
          • Fedora
          • Oracle Linux
        • SUSE Enterprise Linux
          • OpenSUSE
      • DEB
        • Debian
          • Ubuntu
            • Mint
          • Knoppix
  • Commands

    • lsb_release -a to find version of Operating System
    • When searching for packages for this version of the OS you can search by OS code name eg: focal
  • Architectures

    • amd64 | x86_64
    • i386 | x86
    • arm (a RISC design; the Shakti processors are also RISC but use RISC-V, not ARM)
    • ppc64el | OpenPOWER
    • all | noarch |src (not tied to any architecture)
  • Commands

    • uname -a gives the kernel version and the type of architecture.
  • Tools

    • Package Type
      • RPM
        • Yellowdog Updater, Modified (yum)
          • Red Hat Package Manager (rpm)
          • Dandified YUM (dnf)
      • DEB
        • synaptic (GUI)
        • aptitude (Command Line)
          • Advanced Package Tool (apt)
            • dpkg
              • dpkg-deb
  • Package management in Ubuntu using apt

    • Inquiring package db
      • Search packages for a keyword
        • apt-cache search keyword
      • List all packages
        • apt-cache pkgnames
        • apt-cache pkgnames | sort | less for page by page sorted display
        • apt-cache pkgnames nm for all packages starting with nm
      • Display package records of a package
        • apt-cache show -a package
  • Package Names

    • Package
      • RPM
        • package-version-release.architecture.rpm
      • DEB
        • package_version-revision_architecture.deb
        • eg : pool/universe/n/nmap/nmap_7.80+dfsg1-2build1_amd64.deb
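
The DEB naming scheme above can be taken apart with plain shell parameter expansion. The following is a minimal sketch that reuses the nmap file name from the example; the variable names are made up for illustration.

```shell
#!/bin/sh
# Split a .deb file name of the form package_version-revision_architecture.deb.
deb_path='pool/universe/n/nmap/nmap_7.80+dfsg1-2build1_amd64.deb'

file_name=${deb_path##*/}     # strip the directory part
base_name=${file_name%.deb}   # strip the .deb suffix

# The three fields are separated by underscores.
pkg_name=${base_name%%_*}     # text before the first underscore
pkg_arch=${base_name##*_}     # text after the last underscore
rest=${base_name#*_}          # drop "package_"
pkg_version=${rest%_*}        # drop "_architecture"

# The revision is whatever follows the last hyphen in the version field.
upstream_version=${pkg_version%-*}
pkg_revision=${pkg_version##*-}

echo "package:  $pkg_name"
echo "version:  $upstream_version"
echo "revision: $pkg_revision"
echo "arch:     $pkg_arch"
```

For a package that is already installed, dpkg-query can report the same fields directly, so parsing the file name is mainly useful for files that have only been downloaded.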
  • Package Priorities

    • required : essential to proper functioning of the system
    • important : provides functionality that enables the system to run well
    • standard : included in a standard system installation
    • optional : can omit if you do not have enough storage
    • extra : could conflict with packages with higher priority, has specialized requirements, install only if needed.
    • The priority of a package appears in the Priority field of the output of apt-cache show nmap or apt-cache show wget, for example.
  • Package Sections

  • Checksums

    • Even a small change in the original file produces a very different checksum. This is useful to check whether the original file has been tampered with.
    • Can be used to verify that nothing has gone wrong to the contents of the file while downloading.
    • md5sum
      • 128 bit string
      • md5sum filename
    • SHA1
      • 160 bit string
      • sha1sum filename
    • SHA256
      • 256 bit string
      • sha256sum filename
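
The two uses of checksums described above (detecting a small change and verifying a file later) can be tried out as follows. This is a small sketch, assuming GNU coreutils is installed; the file names and contents are invented for the demonstration.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

# Two files that differ only in the case of one letter.
printf 'hello world\n' > original.txt
printf 'hello World\n' > modified.txt

sum_original=$(sha256sum original.txt | cut -d ' ' -f 1)
sum_modified=$(sha256sum modified.txt | cut -d ' ' -f 1)
echo "original: $sum_original"
echo "modified: $sum_modified"

# Record the checksum, then verify the file against the record.
sha256sum original.txt > SHA256SUMS
sha256sum -c SHA256SUMS
verify_before=$?     # 0 while the file is unchanged

# Simulate tampering and verify again.
printf 'tampered\n' >> original.txt
sha256sum -c SHA256SUMS 2>/dev/null
verify_after=$?      # non-zero once the contents differ
echo "before: $verify_before, after: $verify_after"
```

The same record-then-check pattern works with md5sum and sha1sum, which accept the -c option as well.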

4.2

  • Who can install packages in Linux OS ?
    • administrators
    • sudoers in the case of Ubuntu
    • Only sudoers can install/upgrade/remove packages
    • a sudo command can be executed by those who are listed in /etc/sudoers
    • View the file with sudo cat /etc/sudoers. If the current $USER is not in the sudoers file, the incident will be reported.
    • In the file the users listed under # User privilege specification have sudo permission.
    • sudo attempts and authentication failures get recorded in /var/log/auth.log. View using sudo tail -n 100 /var/log/auth.log
  • When installing a package the system knows the website/server from which the packages have to be downloaded
    • This information is stored in the folder /etc/apt
    • Uncommented lines in the file sources.list have the debian/ubuntu sources
    • A directory sources.list.d stores sources for third party software. Allows apt update to know new versions to download from repositories stored in these files
    • Synchronize package index files - sudo apt-get update downloads the latest package lists from the configured sources and caches them locally. It does not install or upgrade anything.
    • Upgrade all installed packages - sudo apt-get upgrade upgrades the packages. It lists how many packages will be upgraded and how much data will be downloaded.
    • sudo apt autoremove to remove unused packages that were earlier installed to satisfy a particular dependency but are not needed now.
    • Install a package - sudo apt-get install packagename
    • sudo apt-get remove packagename to remove a particular package
    • sudo apt-get reinstall packagename to fix problems caused by accidental file deletions.
    • Clean local repository of retrieved package files - apt-get clean
    • Remove a package together with its configuration files - apt-get purge package
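
Since only the uncommented lines of sources.list are active, it can be handy to filter the comments out. The sketch below does this with grep on a made-up sample file written in the one-line sources.list style; the function name active_sources is hypothetical.

```shell
#!/bin/sh
# Print only the active (uncommented, non-empty) lines of APT source files.
active_sources() {
  # -h: no file name prefix, -E: use ERE, -v: invert the match
  grep -hEv '^[[:space:]]*(#|$)' "$@"
}

# Made-up sample in the style of /etc/apt/sources.list.
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
# deb cdrom:[Ubuntu 20.04 LTS]/ focal main restricted
deb http://archive.ubuntu.com/ubuntu focal main restricted
# deb-src http://archive.ubuntu.com/ubuntu focal main restricted

deb http://archive.ubuntu.com/ubuntu focal-updates main restricted
EOF

active_sources "$sample_file"
active_count=$(active_sources "$sample_file" | wc -l | tr -d ' ')
echo "active entries: $active_count"
```

On a real system the function could be pointed at /etc/apt/sources.list and the .list files under /etc/apt/sources.list.d instead of the sample file.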
  • Package management in Ubuntu using dpkg
    • Allows installation directly from a .deb file. Package management at a lower level.
    • /var/lib/dpkg has some information about the packages
      • Files - arch,available,status
        • cat arch displays the architectures for which packages have been installed on the system - amd64,i386
        • less available displays list of packages with info.
        • less status displays if a particular package is installed or not
      • Folder - info
        • contains a set of files for each of the packages that have been installed
        • ls wget* will give files with information about wget
          • more wget.conffiles gives location of configuration file
          • more wget.list displays list of files that would get installed on the system with the package
          • more wget.md5sums displays the list of md5sums of the installed files. (Used to catch tampering)
  • Using dpkg
    • List all packages whose names match the pattern
      • dpkg -l pattern
    • List installed files that came from packages
      • dpkg -L package
    • Display/Report the status of packages
      • dpkg -s package
    • Search installed packages for a file
      • dpkg -S pattern
      • eg : dpkg -S /usr/bin/perl shows the package from which the executable has come. ie : perl-base
    • To query the dpkg database about all the packages - dpkg-query
      • Example dpkg-query -W -f='${Section} ${binary:Package}\n' | sort | less
      • Example where output is filtered dpkg-query -W -f='${Section} ${binary:Package}\n' | grep shells
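
The "Section Package" lines produced by the dpkg-query examples above are easy to post-process. Below is a small sketch that counts packages per section; it runs on invented sample lines so that it works without dpkg, and the function name count_by_section is hypothetical.

```shell
#!/bin/sh
# Count how many packages fall into each section.
# Input: lines of the form "section package" on standard input.
count_by_section() {
  awk '{ count[$1]++ } END { for (s in count) print s, count[s] }' | sort
}

# Invented sample in the format printed by
# dpkg-query -W -f='${Section} ${binary:Package}\n'
sample_listing='shells bash
shells dash
net wget
editors nano
net nmap'

section_counts=$(printf '%s\n' "$sample_listing" | count_by_section)
echo "$section_counts"
```

On a Debian or Ubuntu machine the real listing can be piped in instead: dpkg-query -W -f='${Section} ${binary:Package}\n' | count_by_section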
  • Installing a deb package
    • dpkg -i package_version-revision_architecture.deb
    • not a good idea since it may have some dependencies that will have to be taken care of manually
    • Do not download deb files from unknown sources and install it on the system
    • By default use package management pointing to a reliable repository
    • Uninstalling packages using dpkg is NOT recommended. You may be removing a package that is required by many other packages.
  • When compatibility issues cannot be resolved and a particular version of a package cannot be installed, snap or docker can be used as alternatives.

4.3

Pattern Matching

  • Regular Expressions regex and grep commands
    • POSIX standard
      • IEEE 1003.1-2001 IEEE Standard for IEEE Information Technology – Portable Operating System Interface (POSIX(TM))
      • Refer
    • POSIX defines regular expressions to be of 2 different types - Basic and Extended.
  • Regex
    • regex is a pattern template to filter text
    • BRE: POSIX Basic Regular Expression engine
    • ERE: POSIX Extended Regular Expression engine
  • Why learn regex?
    • Process some input from the user or perform some string operations.
    • Languages: Java, Perl, Python, Ruby, ...
    • Tools: grep, sed, awk, ...
    • Applications: MySQL, PostgreSQL, ...
  • Usage
    • grep 'pattern' filename - to operate on every line in the file
    • command | grep 'pattern'
      • the grep command operates line by line, a common feature of many Linux utilities.
      • enclose the pattern in single quotes so that the shell does not interpret its special characters
    • Default engine: BRE
    • Switch to use ERE in 2 ways:
      • egrep 'pattern' filename (recent GNU grep releases mark egrep as obsolescent; prefer grep -E)
      • grep -E 'pattern' filename
  • Special characters (BRE & ERE)

Character   Description
.           Any single character except null or newline
*           Zero or more of the preceding character / expression
[]          Any of the enclosed characters; hyphen (-) indicates character range
^           Anchor for beginning of line or negation of enclosed characters
$           Anchor for end of line
\           Escape special characters
  • Special characters (BRE)

Character   Description
\{n,m\}     Range of occurrences of the preceding pattern: at least n and at most m times
\( \)       Grouping of regular expressions
  • Special characters (ERE)

Character   Description
{n,m}       Range of occurrences of the preceding pattern: at least n and at most m times
()          Grouping of regular expressions
+           One or more of preceding character / expression
?           Zero or one of preceding character / expression
|           Logical OR over the patterns
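
The difference between the two engines shows up as soon as braces, + or | are used. The following sketch runs the same sample lines (invented for this purpose) through grep in BRE and ERE mode and counts the matching lines.

```shell
#!/bin/sh
# Three sample lines: cat, caat and a+b.
sample=$(printf '%s\n' 'cat' 'caat' 'a+b')

# Interval: braces are escaped in BRE and bare in ERE.
bre_interval=$(printf '%s\n' "$sample" | grep -c 'a\{2\}')
ere_interval=$(printf '%s\n' "$sample" | grep -Ec 'a{2}')

# In BRE a bare + is an ordinary character; in ERE it means "one or more".
bre_plus=$(printf '%s\n' "$sample" | grep -c 'a+')
ere_plus=$(printf '%s\n' "$sample" | grep -Ec 'a+')

# Alternation with | is an ERE feature; \+ makes the plus literal again.
ere_alt=$(printf '%s\n' "$sample" | grep -Ec 'cat|a\+b')

echo "a\\{2\\} (BRE): $bre_interval"
echo "a{2} (ERE):   $ere_interval"
echo "a+ (BRE):     $bre_plus"
echo "a+ (ERE):     $ere_plus"
echo "cat|a\\+b:     $ere_alt"
```

Both interval forms select only caat. The bare a+ selects just the line a+b in BRE mode but all three lines in ERE mode, which is a common source of surprises when a pattern is moved between grep and grep -E.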
  • Character Classes

Class         Description
[[:print:]]   Printable (including space)
[[:alnum:]]   Alphanumeric
[[:alpha:]]   Alphabetic
[[:lower:]]   Lower case
[[:upper:]]   Upper case
[[:digit:]]   Decimal digits
[[:blank:]]   Space / Tab
[[:space:]]   Whitespace
[[:punct:]]   Punctuation
[[:xdigit:]]  Hexadecimal
[[:graph:]]   Visible characters (printable, excluding space)
[[:cntrl:]]   Control characters
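
A class name such as [:digit:] is only valid inside a bracket expression, which is why the table shows double brackets. The sketch below applies a few of the classes to invented sample lines; -x makes grep match whole lines only and -c prints the number of matching lines.

```shell
#!/bin/sh
# Five sample lines; the second holds only spaces and the last is empty.
sample_lines() {
  printf '%s\n' 'abc123' '   ' 'Hello, World!' '42' ''
}

only_alnum=$(sample_lines | grep -Exc '[[:alnum:]]+')      # abc123 and 42
only_digits=$(sample_lines | grep -Exc '[[:digit:]]+')     # 42
has_punct=$(sample_lines | grep -c '[[:punct:]]')          # Hello, World!
has_upper=$(sample_lines | grep -c '[[:upper:]]')          # Hello, World!
blank_or_empty=$(sample_lines | grep -Exc '[[:space:]]*')  # spaces-only and empty

echo "alphanumeric only: $only_alnum"
echo "digits only:       $only_digits"
echo "with punctuation:  $has_punct"
echo "with upper case:   $has_upper"
echo "blank or empty:    $blank_or_empty"
```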
  • Backreferences

    • \1 through \9
    • \n matches whatever was matched by nth earlier parenthesized subexpression
    • A line with two occurrences of hello will be matched using: \(hello\).*\1
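
Backreferences are easiest to understand on a few concrete lines. The sketch below uses BRE syntax on invented sample text; note that \b (word boundary) is a GNU grep extension rather than part of POSIX, so this assumes GNU grep.

```shell
#!/bin/sh
sample_lines() {
  printf '%s\n' 'hello world' \
                'hello there, hello again' \
                'this is is a typo' \
                'abcabc'
}

# Lines in which hello occurs twice.
twice_hello=$(sample_lines | grep -c '\(hello\).*\1')

# Lines in which a group is immediately repeated.
doubled_abc=$(sample_lines | grep -c '\(abc\)\1')

# Lines with the same word twice in a row. The word boundaries stop
# the "is" at the end of "this" from counting as a word of its own.
repeated_word=$(sample_lines | grep '\b\([[:alpha:]][[:alpha:]]*\) \1\b')

echo "hello twice:   $twice_hello"
echo "abc doubled:   $doubled_abc"
echo "repeated word: $repeated_word"
```

The \1 does not repeat the pattern; it repeats the exact text that the first group matched, which is why the last command finds "is is" but would not flag "is it".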
  • BRE operator precedence

Highest to Lowest
[..] [==] [::] char collation
\metachar
[ ] Bracket expansion
\( \) \n subexpressions and backreferences
* \{ \} Repetition of preceding single char regex
Concatenation
^ $ anchors
  • ERE operator precedence

Highest to Lowest
[..] [==] [::] char collation
\metachar
[ ] Bracket expansion
( ) grouping
* + ? { } Repetition of preceding regex
Concatenation
^ $ anchors
| alternation
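
The precedence order has practical consequences: alternation binds more loosely than anchors and concatenation, while repetition binds more tightly than concatenation. A short sketch on invented sample lines, using grep -E:

```shell
#!/bin/sh
sample_lines() {
  printf '%s\n' 'abxx' 'xxcd' 'ab' 'cd' 'xabx' 'abbb' 'abab'
}

# Read as (^ab)|(cd$): lines that start with ab OR end with cd.
loose_count=$(sample_lines | grep -Ec '^ab|cd$')

# Parentheses make the anchors apply to both alternatives.
strict_count=$(sample_lines | grep -Ec '^(ab|cd)$')

# + applies to b only; with parentheses it applies to the group ab.
# -x requires the whole line to match.
repeat_char=$(sample_lines | grep -Ex 'ab+')
repeat_group=$(sample_lines | grep -Ex '(ab)+')

echo "^ab|cd\$   matches $loose_count lines"
echo "^(ab|cd)\$ matches $strict_count lines"
echo "ab+   selects: $(echo $repeat_char)"
echo "(ab)+ selects: $(echo $repeat_group)"
```

Every line except xabx either starts with ab or ends with cd, so the ungrouped pattern selects six lines, while the grouped one selects only ab and cd.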
  • Examples using grep

    • Example File names.txt (Containing Names/Roll-No)
    • Basic use
      • grep 'Raman' names.txt matches line with Raman Singh
      • cat names.txt | grep 'ai' matches line with Snail
    • Usage of .
      • cat names.txt | grep 'S.n' matches lines with Singh and Sankaran
    • Usage of $
      • cat names.txt | grep '.am$' matches lines that end with any single character followed by am
    • Escaping a .
      • cat names.txt | grep '\.' matches lines that have a .
    • Using anchors at the beginning
      • cat names.txt | grep '^M' matches lines beginning with M
    • Case insensitive matching with the i flag
      • cat names.txt | grep -i '^e' matches lines beginning with e or E.
    • Word boundaries \b
      • cat names.txt | grep 'am\b' matches lines with words that end with 'am'
    • Use of square brackets [] to give options
      • cat names.txt | grep 'M[ME]' matches lines containing 'MM' or 'ME'
      • cat names.txt | grep '\bS.*[mn]' matches lines in which a word beginning with S is followed, anywhere later on the line, by m or n.
      • cat names.txt | grep '[aeiou][aeiou]' matches lines that have 2 vowels side by side
      • cat names.txt | grep 'B90[1-4]' matches lines containing B90 followed by a digit in the range 1-4.
      • cat names.txt | grep 'B90[^1-4]' matches lines containing B90 followed by any character outside the range 1-4. A hat inside square brackets implies negation
    • Specifying occurrences using escaped braces
      • cat names.txt | grep 'M\{2\}' matches lines which have 'MM'
      • cat names.txt | grep 'M\{1,2\}' matches lines which have at least one 'M' (one or two in a row satisfy the pattern)
    • Grouping patterns that are matched using parentheses. Repeating whatever is matched by using \1
      • cat names.txt | grep '\(ma\)' matches lines containing 'ma'
      • cat names.txt | grep '\(ma\).*\1' matches a pattern beginning with 'ma' and ending with 'ma' eg: U'mair Ahma'd. The \1 back-references the first parenthesis.
      • cat names.txt | grep '\(.a\).*\1' matches a pattern like 'Mary Ma'nickam
      • cat names.txt | grep '\(a.\)\{3\}' matches a pattern like S'agayam'
    • Using Extended Regular Expression Engine
      • cat names.txt | egrep 'M+' will match lines where M occurs one or more times.
      • cat names.txt | egrep '^M+' will match lines where M occurs one or more times at the beginning of a line.
      • cat names.txt | egrep '^M*' matches every line, because zero occurrences of M at the beginning also count as a match
        • cat names.txt | egrep '^M*a' matches lines that begin with zero or more 'M' followed by 'a'
        • cat names.txt | egrep '^M.*a' matches lines where 'M' has to occur at the beginning of a line followed by any number of characters and ending with 'a'
        • Watch out for the interpretation of *
      • cat names.txt | egrep '(ma)+' 'ma' could occur one or more times.
      • cat names.txt | egrep '(ma)*' 'ma' could occur zero or more times, so every line matches.
    • Use of pipe as an alternation between 2 patterns of strings to be matched
      • cat names.txt | egrep '(ED|ME)' matches lines containing 'ED' or 'ME'
      • cat names.txt | egrep '(Anu|Raman)' matches lines containing 'Anu' or 'Raman'. Length of string on both sides of pipe need not be the same.
      • cat names.txt | egrep '(am|an)$' matches lines containing 'am' or 'an' at the end.
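
The original names.txt is not reproduced in these notes, so the sketch below builds a small stand-in file from names mentioned in the examples (the roll numbers are invented) and reruns a few of the patterns, counting matches to make the behaviour of * and + visible.

```shell
#!/bin/sh
# Stand-in for names.txt; the roll numbers are made up.
names_file=$(mktemp)
cat > "$names_file" <<'EOF'
Mary Manickam MM22B001
Raman Singh ME22B002
Umair Ahmad ED22B003
Anu Sankaran MM22B004
EOF

total_lines=$(wc -l < "$names_file" | tr -d ' ')

# ^M* also matches the empty string, so it selects every line.
star_matches=$(grep -Ec '^M*' "$names_file")

# ^M+ needs at least one M at the start of the line.
plus_matches=$(grep -Ec '^M+' "$names_file")

# Alternation between strings of different lengths.
either_name=$(grep -Ec '(Anu|Raman)' "$names_file")

# Back-reference: the text ma must occur twice on the line.
double_ma=$(grep '\(ma\).*\1' "$names_file")

echo "lines in file: $total_lines"
echo "^M*  matches:  $star_matches"
echo "^M+  matches:  $plus_matches"
echo "Anu or Raman:  $either_name"
echo "ma twice:      $double_ma"
```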

4.4

  • More Examples using grep and egrep

    • Get package names that are exactly 4 characters long
      • dpkg-query -W -f'${Section} ${binary:Package}\n' | egrep ' .{4}$'
    • Get package names that are from the math section
      • dpkg-query -W -f'${Section} ${binary:Package}\n' | egrep '^math'
    • Example File chartype.txt (Containing few lines with control character)
      • control character inserted using echo $'\cc' >> chartype.txt
    • get lines that have an alphanumeric character at the beginning of the line
      • cat chartype.txt | grep '^[[:alnum:]]'
    • get lines that have digits at the end of the line
      • cat chartype.txt | grep '[[:digit:]]$'
    • get lines that have a ctrl character
      • cat chartype.txt | grep '[[:cntrl:]]'
      • cat chartype.txt | grep -v '[[:cntrl:]]' will show the reverse including the empty lines
    • get lines that do not have a ctrl character
      • cat chartype.txt | grep '[^[:cntrl:]]' (This does not work as intended: it matches every line that contains at least one non-control character. Use grep -v '[[:cntrl:]]' instead)
    • get lines that have printable characters (exclude blank lines)
      • cat chartype.txt | grep '[[:print:]]'
    • get lines that have blank space characters (exclude blank lines)
      • cat chartype.txt | grep '[[:blank:]]'
    • [[:graph:]] is used to match any non space character
    • To skip blank lines
      • cat chartype.txt | egrep -v '^$' Here -v excludes and '^$' captures empty lines
    • Identify a line with a 12 digit number
      • egrep '[[:digit:]]{12}' patterns.txt
    • Identify a line with a 6 digit number (Use word boundaries)
      • egrep '\b[[:digit:]]{6}\b' patterns.txt
    • Match lines containing Roll Number of the form MM22B001
      • egrep '\b[[:alpha:]]{2}[[:digit:]]{2}[[:alpha:]][[:digit:]]{3}\b' patterns.txt
    • Match urls without the http
      • egrep '\b[[:alnum:]]+\.[[:alnum:]]+\b' patterns.txt
    • Trimming text
      • top to bottom using head and tail
      • sideways or horizontal trimming of lines using cut
        • cut -c 1-4 fields.txt displays only first 4 characters. Can also use -4 for beginning to 4th place or 2- to cut from 2nd place to end.
        • cat fields.txt | cut -d " " -f 1 - This uses " " as a delimiter -d and prints only the first field -f 1
        • cat fields.txt | cut -d ' ' -f 1-2 - to get both fields
        • Capture hello world from 1234;hello world,line 1
          • cat fields.txt | cut -d ';' -f 2 | cut -d "," -f 1
          • egrep ';.*,' fields.txt (To trim pass the output of grep to sed)
        • Combining this with top to bottom trimming
          • cat fields.txt | cut -d ';' -f 2 | cut -d "," -f 1 | head -n 2 | tail -n 1
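
The note about passing the output of grep to sed can be made concrete. The sketch below extracts the text between the first ; and the first , in two ways, with cut and with grep plus sed, on a stand-in fields.txt: the first line comes from the example above and the second is invented.

```shell
#!/bin/sh
fields_file=$(mktemp)
cat > "$fields_file" <<'EOF'
1234;hello world,line 1
5678;good morning,line 2
EOF

# Variant 1: two cut stages, first on ';' and then on ','.
with_cut=$(cut -d ';' -f 2 "$fields_file" | cut -d ',' -f 1)

# Variant 2: grep selects the lines, sed trims both ends.
#   s/^[^;]*;//  removes everything up to and including the first ;
#   s/,.*$//     removes everything from the first , onwards
with_sed=$(grep -E ';.*,' "$fields_file" | sed 's/^[^;]*;//; s/,.*$//')

# Horizontal and vertical trimming combined: second line only.
second_only=$(cut -d ';' -f 2 "$fields_file" | cut -d ',' -f 1 | head -n 2 | tail -n 1)

echo "$with_cut"
echo "$with_sed"
echo "$second_only"
```

Both variants give the same result here. The cut version is simpler when the delimiters are single characters, while sed is the more flexible choice when the boundaries have to be described by a pattern.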
  • Own experiments using regex

    • Get strictly alphanumeric words
      • cat test.txt | egrep '\b([a-z]+[0-9]+|[0-9]+[a-z]+)\b'
  • REPLIT Code with Us session

    • Getting files with a specific permission pattern from a file
      • cat lsinfo.txt | grep 'rw-r--r--'
    • Get all files excluding directories in lsinfo.txt whose last modified date is in January
      • cat lsinfo.txt | grep '^[^d].*Jan'
    • To count the number of lines that start with a capital letter and contain the word it (case-sensitive)
      • cat twocities.txt | grep -c '^[[:upper:]].*\bit\b'
    • to display all the lines that do not contain the word "we"
      • cat twocities.txt | egrep -v '\bwe\b'
    • using cut to display only the countries and their capitals of file.txt in the format Country, Capital (eg in file.txt : India, New Delhi; Asia)
      • cat file.txt | cut -d ';' -f 1
    • all the countries in the file file.txt sorted alphabetically by name in reverse order
      • cat file.txt | cut -d ',' -f 1 | sort -r
    • cut command to extract the continents (including the one white space in the beginning) of the first 5 lines of file.txt and store it in another file named continent.txt
      • head -n 5 file.txt | cut -d ';' -f 2 > continent.txt
    • list the names of all the c++ files in the current directory that contain a line which starts with the string void main() and ends with the character {. There should be one or more spaces/tabs between the characters ) and {.
      • egrep '^void[[:space:]]main\(\)[[:space:]]+{$' *.cpp | cut -d '.' -f 1
      • grep '^void[[:space:]]main()[[:space:]][[:space:]]*{$' *.cpp | cut -d '.' -f 1
    • print the count of these files in the following line
      • egrep -l '^void[[:space:]]main\(\)[[:space:]]+{$' *.cpp |tee /dev/tty | wc -l
      • |tee /dev/tty is used to print the output to terminal and also pipe the output to the next command.
      • -l flag for grep and egrep prints the name of each input file that matches
    • command to list all the packages installed on your machine and their versions in the format Package Version in a sorted manner
      • dpkg-query -W -f='${Package} ${Version}\n' | sort
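
The tee trick from the void main() exercise can be tried on a few throwaway files. The sketch below is an illustration under two assumptions: the three .cpp files are invented, and /dev/stderr is used in place of /dev/tty so that the script also runs where no terminal is attached.

```shell
#!/bin/sh
# Create three small source files in a temporary directory.
src_dir=$(mktemp -d)
cd "$src_dir" || exit 1
printf 'void main() {\n}\n'     > a.cpp
printf 'int main() {\n}\n'      > b.cpp
printf 'void main()\t  {\n}\n'  > c.cpp

# grep -l prints each matching file name once.
# tee copies the names to stderr (visible on the terminal)
# while wc -l counts them on the pipe.
match_count=$(grep -El '^void[[:space:]]main\(\)[[:space:]]+\{$' *.cpp \
              | tee /dev/stderr | wc -l | tr -d ' ')

echo "matching files: $match_count"
```

Here a.cpp and c.cpp match, while b.cpp does not because its line starts with int. The brace is written as \{ so that it is treated as a literal character in ERE mode.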