UNIX for
Download
1 / 57

UNIXLINUX tutorial - EMBnet Norway Home - PowerPoint PPT Presentation


  • 286 Views
  • Uploaded on

UNIX for Biologists – An introduction © George B. Magklaras - 2006 The Norwegian EMBnet node, The Biotechnology Centre of Oslo Originated as a research project at AT&T Bell Labs in 1969 by Ken Thompson and Dennis Ritchie.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'UNIXLINUX tutorial - EMBnet Norway Home' - issac


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Slide1 l.jpg

UNIX for Biologists – An introduction

© George B. Magklaras - 2006

The Norwegian EMBnet node,

The Biotechnology Centre of Oslo


What is unix the history l.jpg

Originated as a research project at AT&T Bell Labs in 1969 by Ken Thompson and Dennis Ritchie.

Developed in several different versions for various hardware platforms (Sun Sparc, Power PC, Motorola, HP RISC Processors).

In 1991, Linus Torvalds created a UNIX-like system to run on the Intel 386 processor. Intel had already started dominating the PC market, but UNIX was nearly absent from the initial Intel market.

In January 2000, Apple announced MAC OS X, a UNIX/Mach hybrid that provides UNIX command line features.

What is UNIX (The history)?


Is linux really unix l.jpg

Well, yes and no: by Ken Thompson and Dennis Ritchie.

Yes, because it has essentially the same look and feel like any UNIX operating System.

Yes, because it offers the ability to run nearly any program that runs on UNIX systems (through API conventions such as POSIX, etc..).

No because the heart of the system (kernel) has a lot of new features that go beyond the classical design philosophy of UNIX kernels.

Is LINUX really UNIX?


Why should you choose unix linux l.jpg

Noted for its reliability, multi-tasking performance and network application capabilities.

Because it has started life in academia, where most of the scientific analysis programs are developed, you will find a lot of good scientific software that runs on UNIX/LINUX. Windows has certainly scored to a smaller extent on this area.

Unix/LINUX is rich in commands and software development capabilities. Every UNIX/LINUX OS comes with a built set of compilers/debug tools that have are used widely by the world scientific community. This is not true for Windows.

Why should you choose UNIX/LINUX?


What is gnu l.jpg

GNU stands for “ network application capabilities. GNU's Not Unix”. The three letter abbreviation is not a joke. It emphasizes a major project of the Free Software Foundation (FSF) that really created the LINUX operating system with many of its popular tools.

Richard Stallman created FSF, in order to encourage the development and use of freely redistributable code.

Freely means the freedom of redistributing your code under certain conditions. It does NOT mean zero financial cost!

The Gnu Public License (GPL) defines the terms and conditions of redistributing the LINUX kernel and other tools that make it usable, forming a LINUX distribution.

What is GNU?


What is a linux distribution l.jpg

The GPL framework allowed LINUX distributions (RedHat, SuSE, Mandrake) to be formed. These are organised bindings of the LINUX kernel together with a set of programs (text editors, compilers, office and scientific suites) to make a system suitable for a particular task.

The MCC distribution made by Manchester Computer Centre at the University of Manchester in England together with 'SoftLanding Systems' (now called 'Slackware') were the first official LINUX distributions.

RedHat, SuSE , ‘Ubuntu’ and others followed being more successfull today, simply because they are more user friendly.

What is a LINUX distribution?


The case for linux l.jpg

LINUX is certainly a cheaper alternative to other proprietary UNIX systems such as Sun Solaris or HP-UX because:

It runs on a wider range of hardware than them.

You could either download it for free (no support) or purchase it a set of installation media (CD/DVDs) with support from a commercial LINUX vendor (see references) at a cost which is a small fraction of the TCO of a UNIX system.

Today LINUX can also give you an integrated desktop environment with Word Processing, Spreadsheet and development tools at a substantially lower cost than purchasing a Microsoft Windows system.

The case for LINUX


The unix shell l.jpg

Provides a proprietary UNIX systems such as Sun Solaris or HP-UX because:powerful interface to the UNIX Operating System, so you can manipulate data and execute several applications under certain conditions.

Also known as the 'command-line' interface, a bit like the old “Command Prompt” in Windows/DOS systems, but it is not the same.

Comes under different flavours, but all of them do the same thing in slightly different ways.

Knowing the shell well is the ONLY WAY to make the most out of a UNIX system. It can be a bit difficult at the beginning, but since you get used to it, you have made a good friend that will help you address every computational problem!

The UNIX 'shell':


Logging in to the shell l.jpg

In order to be able to use the UNIX shell, you will have to authenticate yourself (tell the system who you are). This process is commonly called the 'login' process, and it involves two steps.

- Know your username and a password.

- Have a means of communicating with the UNIX shell, so you can provide this kind of information.

The first step is quite easy. You contact your system administrator or relevant authority and you obtain a login name and a password for the system. The second step requires a little bit more attention.

Logging in to the shell:


Connecting to a unix system 1 l.jpg

Back in the old UNIX days, users had dedicated machines called terminals that used to display text only information.

Today, most people connect to a UNIX system by means of faster TCP/IP network connections from another UNIX-like workstation or a Microsoft Windows machine.

The safest and most widespread way to connect is the Secure Shell (SSH) protocol. This allows for secure point-to-point communication between your system and the UNIX machine you are trying to login to.

Connecting to a UNIX system (1):


Using ssh from a unix workstation l.jpg

UNIX/LINUX and MAC OS X workstations make the SSH UNIX login process very easy. You will need the IP address or the DNS name of the UNIX workstation. If, for example, you username is ‘georgios’ and you want to login to the UNIX server ‘frigg.uio.no’ ,I would type:

ssh [email protected] [press enter]

The server would then ask me the password for username georgios. If I type the password correctly (note you won't see the password as you type it), then I will be greeted by the shell prompt.

Using SSH from a UNIX workstation:


Using ssh from a windows workstation l.jpg

Windows will almost certainly require the installation of additional SSH terminal client software for this purpose. Programs such as 'F-Secure SSH 'or 'PuTTY' give a basic terminal window for interacting with the Shell.

If you need to display graphics generated on the UNIX machine, you will need an X-Windows client for Windows such as Exceed or X-Win32. Additional configuration steps need to be made, in order to make X-Windows communication possible.

Ask the help of your local system administrator to set up these programs.

Using SSH from a Windows Workstation:


Basic shell principles 1 l.jpg

There is a basic syntax for all commands executed at the shell:

command argument1 argument2 argument3...

command is the name of the actual shell command you wish to

execute. Every command may take a certain number of

arguments (or operands). For example:

cd /mn/proteas/data

“cd” is the actual command and it takes one argument

“/mn/proteas/data”.

Always make sure that you have a space between a shell command and its argument(s).

Basic Shell Principles (1):


Basic shell principles 2 l.jpg

All UNIX shells are shell:case sensitive with regards to both the commands and their arguments, in contrast to versions of Windows/DOS systems. This means that typing:

cd /mydirectory/programs

is not the same as typing:

CD /MYDIRECTORY/PROGRAMS

or even:

Cd /MyDirectory/Programs

Usually, shell commands are lower case, unless otherwise stated.

Basic Shell Principles (2):


The shell prompt l.jpg

When you login into a UNIX system, you will encounter the shell:shell prompt. The shell prompt is an indication that the system is ready to execute your commands, but it also contains useful information. A typical shell prompt looks like the one below:

[email protected] /usr/bin/virexp $

saying that I am currently logged in as user georgios at a server called frigg and I am currently in a directory called virexp that resides under a directory with name /usr/bin/. The $ sign says 'you can type now' and it should have a (sometimes blinking) cursor after it.

The Shell Prompt:


The shell execution path l.jpg

Every shell session has a collection of variables collectively known as the “shell environment”. They control a number of issues like the appearance of the Shell prompt, what program might be your default text editor and many other issues.

Perhaps the most important of these variables that can affect your actions is the “execution path”. This is a list of directories that the shell remembers all the time, in order to automatically reference certain applications (without you remembering where they are). Type echo $PATH at the shell prompt to see this list of directories.

The Shell Execution Path:


Filesystem basics 1 l.jpg

A Filesystem is a special part of the Operating System that is responsible for organising the storage of your data inside a computer.

Again, like the shell, there are several different types of filesystems, but they all perform essentially the same functions (transparent and efficient data storage).

However, for large server systems, the choice of filesystem usually makes or breaks issues such as performance, reliability and storage efficiency.

Network-aware filesystems deserve a special mention, since they allow for efficient and transparent data access via computer networks. Examples: CIFS(Windows and UNIX) and NFS(Unix).

Filesystem basics (1):


Filesystem basics 2 unix files l.jpg

UNIX files are named locations on the computer's storage device. Each filename points to a special filesystem record that contains information about:

The type of file (plain data, executable program, special device)

The user who created the file

Access permissions for the file

The beginning and end of the file record in the filesystem area, as well as its exact position in the filesystem.

Filesystem basics (2): UNIX files


Filesystem basics 3 directories and file paths l.jpg

Directories (or folders) are containers in which files can be grouped.

In a UNIX system, they are arranged in hierarchical mode, starting from the top-level “root” directory ( / ). The root directory branches into several files and root subdirectories.

The consequence of this hierarchy is that each file can be uniquely identified by a 'path'. A 'path' begins with a / (hint:root directory) and continues through a list of subdirectories, all the way down to the filename:

For example: /home/gm/mydata/bac1.seq

Remember not to confuse the term 'path' with the shell's execution path, as described in earlier slides.

Filesystem Basics (3): Directories and file 'paths':


Directory hierarchy diagram l.jpg

/ be grouped.

usr

home

bin

etc

gm

mydata

Directory Hierarchy Diagram:

<- toplevel

<-1st level down

<-2nd level down

/home/gm/mydata/

<-3rd level down

Back1.seq


Navigating the filesystem l.jpg

Use be grouped. pwd to Print your Working Directory. For example, if I login to the host 'biotin' and I type pwd, I get the following:

[email protected] ~ $ pwd

/mn/proteas/u1/georgios

[email protected] ~ $

This means that I am currently in a directory georgios, which is under a directory called u1. This directory itself is under the proteas directory, which lives under the mn directory. Finally the mn directory is under the root (toplevel) directory.

Navigating the filesystem:


Navigating the filesystem 2 l.jpg

In the previous slide, be grouped. /mn/proteas/u1/georgios is your “home” directory (note the ~ symbol after the hostname frigg). This means that whenever you login as an ordinary user, you always have an entry point position in the filesystem.

Your supervisor is now saying: “Under your home directory, you will find a directory called “mysequences”. Could you go to that directory and tell me what kind of files exist under it?”

“Certainly” you reply. “I can use the cd command to get there”

[email protected] ~ $ cd mysequences

./mysequences

[email protected] ~/mysequences $

Navigating the Filesystem (2):


Navigating the filesystem 3 l.jpg

The “cd” command (Change Directory) can be used for moving around the filesystem. It takes a path as its argument.

The path can be “absolute”. For example:From your home directory, you can go to the /usr/bin directory by typing:

[email protected] ~ $ cd /usr/bin

[email protected] /usr/bin $

The path can also be “relative”. For example: If you are already under the /usr directory, you could just type:

[email protected] /usr $ cd bin

./bin

[email protected] /usr/bin $

Navigating the Filesystem (3):


Navigating the filesystem 4 l.jpg

The command “ moving around the filesystem. It takes a path as its argument.cd ..” will get you one level up. For example, if we go back to your supervisor’s request and we assume that you are under the mysequences directory, if you want to go back to the toplevel of your home directory, you type:

[email protected] ~/mysequences $ cd ..

[email protected] ~ $

“..” is a shorthand notation for the previous directory level and it can really save you from typing long directory names that you cannot remember. It always works in a relative path context.

The alternative would be to give an “absolute” path to the cd command:

[email protected] ~/mysequences $ cd /mn/proteas/u1/georgios

[email protected] ~ $

Navigating the Filesystem (4):


Listing files 1 l.jpg

You are back at the mysequences directory under your home directory. Your supervisor asked you to list the files in the directory:

[email protected] ~/mysequences $ ls

seqdocs v2.3_admin.pdf xlrhodop.fasta

[email protected] ~/mysequences $

The ls command lists all the directory contents and is the equivalent of the dir command in DOS/Windows.

Listing Files (1):


Listing files 2 l.jpg

Your supervisor says: “That's not good enough. I want details (file size, permissions, etc). Why don't you use the -la options of the ls command?”

[email protected] ~/mysequences $ ls -la

total 340

drwx------ 3 georgios biotek 62 Mar 26 16:31 .

drwx--x--x 63 georgios biotek 8192 Mar 28 08:45 ..

drwx------ 2 georgios biotek 6 Mar 26 16:31 seqdocs

-rw------- 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

-rwxrw---- 1 georgios biotek 1777 Mar 26 15:22 xlrhodop.fasta

Listing Files (2):


Listing files 3 l.jpg

[email protected] ~/mysequences $ details (file size, permissions, etc). Why don't you use the -la options of the ls command?”ls -la

total 340

drwx------ 3 georgios biotek 62 Mar 26 16:31 .

drwx--x--x 63 georgios biotek 8192 Mar 28 08:45 ..

drwx------ 2 georgios biotek 6 Mar 26 16:31 seqdocs

-rw------- 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

-rwxrw---- 1 georgios biotek 1777 Mar 26 15:22 xlrhodop.fasta

The third column from the left states the user owner of the listed files (georgios). The biotek indication of the fourth column indicates the file group (concept introduced later). The fifth column indicates the size of the file in bytes.

Listing Files (3):


Locating files in the directory tree l.jpg

The supervisor says: “Help! I have placed a file called xlrhodop.fast or xlrhodop.fasta (I can't remember the name) and now I can't find it. Can you help me locate it?”

In order to save the day, you can employ the find command. Its generic syntax is:

find [starting point] -name filename -print

starting point indicates the directory tree position that we wish to start searching. Filename could be an approximation of the file name (it doesn't have to be exact).

Locating files in the directory tree:


Locating filenames in the directory tree 2 l.jpg

[email protected] ~ $ find ~/ -name xlrhodop.fas* xlrhodop.fast or xlrhodop.fasta (I can't remember the name) and now I can't find it. Can you help me locate it?”

/mn/proteas/u1/georgios/xlrhodop.fasta

/mn/proteas/u1/georgios/mysequences/xlrhodop.fasta

Note that the wildcard character (*) towards the end of the filename we are trying to search for. This says that we know that the name contains the string “xlrhodop.fas”. This would match all relevant filenames (reporting their exact location in the directory tree)

/mn/biotroll/u1/georgios/xlrhodop.fasta

/mn/biotroll/u1/georgios/mysequences/xlrhodop.fasta

Locating filenames in the directory tree (2)


File permissions 1 l.jpg

You ask the supervisor: “What are these strange characters in the left most column of the ls -la output mean (-rw------- )??”.

Every file in UNIX has a set of permission flags that define in a strict way, who is allowed to read, write (modify) or execute that file.

For example, let's take one of the listed files of the ls -la output command:

-rwx------ 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

Starting from the left, this says: The file xlrhodop.fasta can be read (r)read, (w)modified,(x)executed by its owner (georgios). Ignore the rest of the flags for now.

File Permissions (1):


File permissions 2 l.jpg

Directories (remember they are UNIX special files) are no exception to this rule and they also have permission flags. For example:

drwx------ 2 georgios biotek 6 Mar 26 16:31 seqdocs

Note the leftmost flag (d). This indicates that seqdocs is a directory and user georgios has full permissions (read, write and execute) for that directory. Hence, what we say about file permissions is true for directory permissions with a few exceptions (see special file permission consideration slides).

File Permissions (2)


Changing file permissions 1 l.jpg

The supervisor says “The exception to this rule and they also have permission flags. For example:file v2.3_admin.pdf is quite important and should not be modified. Can we have it as read only please? Use the chmod (change mode) command.”

The generic syntax for the chmod command is:

chmod [u|g|o (+|-) (r,w,x)] [filename]

DON'T PANIC! We will explain this cryptic syntax with some examples!

Changing File Permissions (1):


Changing file permissions 2 l.jpg

In order to satisfy your supervisor’s request, the file permissions were:

-rw------- 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

Thus, in order to make the file read only we need to remove the (w) flag. We type at the prompt:

[email protected] ~/mysequences $ chmod u-w v2.3_admin.pdf

The above says remove (-) the write permission (w) for the user (u) who is the owner of the file. This is the meaning of the u-w flag. After this action, ls -la should now indicate:

-r-------- 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

This is now a read-only (ro) file.

Changing File Permissions (2):


Changing file permissions 3 l.jpg

If we wanted to add back the write permission flag, we would type:

[email protected] ~/mysequences $ chmod u+w v2.3_admin.pdf

The + sign says add write permissions (w) for the user (u) that owns the file.

You can also add/remove more than one flag at a time:

[email protected] ~/mysequences $ chmod u-wx v2.3_admin.pdf

This would remove write (w) and execute permissions (x).

Changing File Permissions (3):


Special file permission considerations 1 l.jpg

The execute permission is important when you are dealing with programs that you wish to run. Whether these programs are binary files or collections of shell commands (scripts) it doesn't matter. In order to run those programs, you will always have to set the (x) permission flag.

When changing permissions for directories, keep in mind that under some special cases, you will need to enable the x flag, in order to allow access to the directory. Read permission might not be enough to allow access to the directory. This varies amongst different UNIX flavours.

Special File Permission Considerations (1):


Special file permission considerations 2 l.jpg

If a chmod command fails to execute by giving you an error message of the type, when typing

chmod: changing permissions of `testfile.fasta': Operation not permitted

make sure you check who owns the file with the ls -la command. If you try and change the permissions of a file you do not own, the operation will fail. In fact, insufficient permissions can affect the behaviour of all UNIX shell commands, not only chmode.

Special File Permission Considerations (2):


Deleting files l.jpg

Given the right permissions, you can remove a file using the rm command. If, for example, you have a file named testfile.fasta and you want to remove it, you type:

[email protected] ~/mysequences $ rm testfile.fasta

CAUTION:Take great care when you use the rm command. Whatever you delete, you WILL NOT BE ABLE TO UNDELETE. There is no “Recycle Bin” in command line UNIX. Always check where you are with the pwd command

Deleting files :


Viewing the file contents l.jpg

The supervisor says: “How do I view the contents of a file? I want a simple shell command that will show the file contents.”

The cat command is probably one of the most frequently used commands. It displays the contents of the file. For example:

cat xlrhodop.fasta

will display the contents of the file xlrhodop.fasta on the screen, despite the fact that this command is used for concatenatig files.

An alternative way of viewing the file contents is to use a text editor. We are going to cover the basics of text-editors in the tutorial later in the course.

Viewing the file contents:


Viewing file contents 2 l.jpg

Be carefull NOT to attempt to view the contents of an executable (binary) file with cat. Your terminal will be filled with garbage characters and you might loose your connection.Here is a sample output of viewing a binary file with cat:

000731 (Red H▒├ Li┼┤│ 7.2 2.96-10701.001.001.001.001.001.001.001.001.001.001.001.001.001.01.sy└├▒b.s├r├▒b.shs├r├▒b.i┼├erp.┼o├e.ABI-├▒±.h▒sh.dy┼sy└.dy┼s├r.±┼┤.┬ersio┼.±┼┤.┬ersio┼_r.re┌.dy┼.re┌.p┌├.i┼i├.├e│├.°i┼i.rod▒├▒.d▒├▒.eh_°r▒└e.dy┼▒└ic.c├ors.d├ors.±o├.bss.co└└e┼├.┼o├e???# 1(?(X7????

If you are uncertain about whether a file is a text file or a binary one, you can use the strings command. This will give you all the valid alphanumeric characters of the file and will certainly prevent you from loosing your terminal connection:

strings xlrhodop.fasta

Viewing file contents (2):


Viewing file contents 3 l.jpg

The supervisor says: “Ohh! I tried to use cat to view a file but the output is too long for my terminal screen. The text keeps scrolling and I loose the first lines of the text. Can I stop this somehow?”

The less command can actually allow you to view a file, but it will stop the scrolling of the output, when your terminal window is filled. You can then press enter to gradually scroll down to the rest of the file content output:

less xlrhodop.fasta

The more command would do exactly the same thing.

Viewing file contents (3):


Viewing file contents 4 l.jpg

Alternatively, if you suspect that the information you want to retrieve is towards the beginning or the end of the file, you can use head:

head xlrhodop.fasta

This displays the beginning of the file. On the other hand, tail can display the end of the file.

tail xlrhodop.fasta

Both of these commands can be tailored to display a certain number of lines from the beginning (head) or the end (tail of the file):

head -3 xlrhodop.fasta -> displays the first 3 lines of the file

tail -3 xlrhodop.fasta -> displays the last 3 lines of the file

OR

head –n 3 xlrhodop.fasta -> newer systems use –n instead of -

tail –n 3 xlrhodop.fasta -> newer systems use –n instad of -

Viewing File Contents (4):


Creating directories l.jpg

The supervisor says: “We need a new directory to store all the pdf documents. Could you create a new directory called pdfdoc under the mysequences directory?”

[email protected] ~/mysequences $ mkdir pdfdoc

[email protected] ~/mysequences $ ls -la

total 340

drwx------ 4 georgios biotek 75 Mar 28 15:15 .

drwx--x--x 63 georgios biotek 8192 Mar 28 14:53 ..

drwx------ 2 georgios biotek 6 Mar 28 15:15 pdfdoc

drwx------ 2 georgios biotek 6 Mar 26 16:31 seqdocs

-r-------- 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

-r-------- 1 georgios biotek 1777 Mar 26 15:22 xlrhodop.fasta

Creating Directories:


Removing directories 1 l.jpg

“What about the seqdocs directory?”, you ask. “Delete it using the rmdir command”, the supervisor replies.

[email protected] ~/mysequences $ rmdir seqdocs

So your directory structure should now look like this.

total 340

drwx------ 3 georgios biotek 61 Mar 28 15:25 .

drwx--x--x 63 georgios biotek 8192 Mar 28 14:53 ..

drwx------ 2 georgios biotek 6 Mar 28 15:15 pdfdoc

-r-------- 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

-r-------- 1 georgios biotek 1777 Mar 26 15:22 xlrhodop.fasta

Removing Directories (1):


Removing directories 2 l.jpg

The it using the rmdir command will promptly remove a directory if and only if it is empty. If the directory you are trying to remove (example:pfddoc) contains files, rmdir will fail with the following error message:

rmdir: `pdfdoc': File exists

You then have to delete all the files under the directory pdfdoc and then issue the rmdir command.

The alternative would be to use the rm command. Remember, directories are 'special' files, so you could remove them with rm. The next slide shows you how.

Removing Directories (2):


Removing directories 3 l.jpg

rm -r -f [directory name] it using the

The -r option says delete directories recursively. The -f option forces the command to go ahead, despite the fact that the target is a directory and has files under it. Both options are required. For example, in order to delete a directory pdfdoc under the ~/mysequences directory, you would type:

rm -r -f pdfdoc/

CAUTION: The usage of rm in this way is even more dangerous, because it will delete EVERYTHING at a selected directory tree point, all the way down to the leaf nodes. Always check where you are with pwd first. If you delete the files, they will be gone forever!

Removing Directories (3):


Copying files 1 l.jpg

The supervisor says:”Under the ~/mysequences directory there is a file called v2.3_admin.pdf . Could you make another copy of that file with the name 23adminbeta.pdf ?”

You can now use the cp command. The command's general syntax is:

cp [sourcefilepath] [destfilepath]

sourcefilepath:absolute or relative path of the file we want to copy.

destfilepath:absolute or relative path of the new file. This might include a new filename. If you specify a different directory for the new destination file and NOT a filename, the source file's name is used by default.

Some examples to illustrate these points follow.

Copying Files (1):


Copying files 2 l.jpg

copy the there is a file called v2.3_admin.pdf file as 23adminbeta.pdf under the same directory (~/mysequences),we type the following:

[email protected] ~/mysequences $ cp v2.3_admin.pdf 23adminbeta.pdf

As a result, we should now have two files with exactly identical contents. Note that the size and the permission contents indicate that the files are identical.

-r-------- 1 georgios biotek 325479 Mar 28 17:01 23adminbeta.pdf

-r-------- 1 georgios biotek 325479 Mar 26 15:22 v2.3_admin.pdf

Also note that cp was executed this time with relative paths for the source and destination files.

Copying Files (2):


Copying files 3 l.jpg

If the supervisor had said:”Could you make a copy of the there is a file called v2.3_admin.pdf file into the pdfdoc directory with the name 23adminbeta.pdf“, you could then type:

cp v2.3_admin.pdf perldoc/23adminbeta.pdf

By default, the copy command preserves the permissions and ownership rights of files. If in doubt, use the -p flag. This situation can occur when performing a copy of the file from computer to computer using specialist filesystems such as NFS..

Copying Files (3):


Copying directories l.jpg

You could copy entire directories recursively (including any files and their entire subdirectories) by using the cp command. If the sourcefilepath is a directory and the command is called with the -p and -r flags. For example, to make an exact copy of the pdfdoc directory under the ~/mysequences directory, type:

[email protected] ~/mysequences $ cp -p -r pdfdoc/ pdfcopy/

The -p flag preserves the permission and ownership properties and the -r instructs copy to copy all subdirectories under pdfdoc (recursive copy).

Copying Directories:


Moving files l.jpg

cp copies a file (or directory) under a different name or new location, but it leaves the source file in its old place. However, sometimes we wish to move the file, in that we wish to copy the file to a new location without preserving the old one. This is when we can use the mv command, with the following syntax:

mv sourcefilepath destfilepath

sourcefilepath:absolute or relative path of the file we want to copy.

destfilepath:absolute or relative path of the new file. This might include a new filename. If you specify a different directory for the new destination file and NOT a filename, the source file's name is used by default.

Moving Files:


Moving or renaming files 2 l.jpg

In order to move the file xlrhodop.fasta to myxlr.fasta we type:

[email protected] ~/mysequences $ mv xlrhodop.fasta myxlr.fasta

This removes the xlrhodop.fasta file and re-generates it with the name myxlr.fasta, under the same directory.

-r-------- 1 georgios biotek 1777 Mar 26 15:22 myxlr.fasta

mv does not only preserve file permissions and ownership rights but it does also preserve timestamps, so it is an effective way to rename a file. The UNIX shell has a rename command, but mv could be used effectively to rename a file. All the points we have made about mv for files are also true for directories.

Moving (or renaming) Files (2):


Redirecting command output l.jpg

The type:> symbol is the output redirection operator and can be used to re-direct the output of any UNIX command that prints something on the screen.

Lets suppose that you want to merge two fasta sequences into a single file. We have seen earlier that the cat command can be used to concatenate (ie join) the contents of a file. So, if you type something like:

cat myseq1.fasta myseq2.fasta

it would print the contents of boh files one-after the other on the screen (stdout). But what you really want is to place this output to a file. You can then type:

cat myseq1.fasta myseq2.fasta > mergedseq.fasta

to place the output in the file mergedseq.fasta .

Redirecting command output:


Redirecting command input l.jpg

Suppose that you have finished an extensive blast search, and you want to mail the results to your lecturer or colleague. Mail is a UNIX-level program that performs the function of sending simple e-mails. Normally, you would type mail and then type the message on the keyboard. However, if you just want to mail the results, you could type:

mail [email protected] < blast_report

So here, you utilise the input redirector (<) to say to the mail program, “don't expect input from the keyboard, but mail all the file contents instead as input”.

Redirecting command input:


Chaining command outputs and inputs using pipes l.jpg

Quite often in UNIX, we need to direct the standard output of one command to the standard input of another. The most commonly used operator to do that is the pipe oparator | .

Suppose for example that we need to count the number of lines of a text file to see how long it is. The command wc -l can perform this action. However, this command expects its standard input from the keyboard. So, what if we redirect all the file contents to this command by doing something like:

cat mytext.txt | wc -l

The cat command will print all the lines of the file. However, instead of doing that on the screen, it gives all the output to the wc -l command. The result is an integer representing the number of lines of the mytext.txt file.

Chaining command outputs and inputs using 'pipes':


Powerful pipes l.jpg

Just to demonstrate the power of the UNIX shell, this line extracts all the DNA triplets from the output of a compseq command and places them on a single file. All in one shell command line (watch the demonstration). Remember, your ability to extract pattern from files is a key tool for those of you who are going to follow the bioinformatics path!

cat l02870.timer | grep ^[ATCG] | grep -iv tot | cut -c1-3

Powerful Pipes:


References l.jpg

UNIX has a wealth of on-line and hard-copy references available. This tutorial is by no means exhaustive and you should consult a variety of sources to further enhance your knowledge.

UNIX has a built-in reference manual. The man command should be you best friend, whenever you need help for a particular command. For example, type man cat at your shell prompt. Every UNIX system should have this facility.

However, man is good when you roughly know the basics of the command you are having a problem with. What if you don't know which command to use. Then you use the apropos command. Let's say for example that I am looking for pattern matching commands. I would type apropos pattern at the shell prompt, and this would give me a list of relevant commands.

References:


References 2 l.jpg

EMBnet UNIX Quick Quide: Useful summary of basic UNIX commands:

http://www.no.embnet.org/EMBNET/quickguides/UNIX03.pdf

University of Surrey Unix Tutorial for Beginners on the World Wide Web: http://www.ee.surrey.ac.uk/Teaching/Unix/

The LINUX tutorial:http://www.tldp.org/LDP/gs/node1.html

“Developing Bioinformatics Computer Skills”, O'REILLY PRESS, ISBN: 1-56592-664-1, useful for Biologists and Bioinformaticians, especially for beginners.

References (2):


ad