Instructions for using the SVARAP program: Plan
  • Principle of SVARAP program
  • Use of SVARAP:
    • GDE Alignment
    • Formatting the GDE alignment
    • Variability analysis
      • Activation of « macros »
      • Pasting the GDE alignment
      • Checking the GDE alignment format
      • Raw variability data by nucleotide site
      • Variability analysis by window of 50 nucleotides over a length of 2000 nucleotides
      • Variability analysis by nucleotide site over a length of 2000 nucleotides
  • ASVARAP program: study of amino acid variability
  • Examples
  • Download / References
  • Contact
Principle of SVARAP program
  • « SVARAP » (Sequence VARiability Analysis Program) analyses, highlights and graphically represents the variability, or genetic diversity, of nucleotide sequences. It uses a Microsoft Excel® file that can simultaneously analyse up to 100 sequences of up to 4000 nucleotides.
  • Variability is defined as the proportion of analysed sequences for which the nucleotide at a given position is not the most frequent one in the studied set of sequences.
  • The program generates graphs and calculates the mean, median, minimal and maximal values, and the coefficient of variation, for windows of 50 nucleotides. It also analyses variability site by site.
  • Classically, sequence alignment tools identify the sites and nature of nucleotide differences. Quantitative analysis of variability or diversity adds information that helps to find discriminant or conserved regions, which could be targeted by PCR, or highly polymorphic « spots ».
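
The definition of variability given above can be restated as a formula. The notation is not from the original slides and is introduced here only for illustration: N is the number of sequences analysed, and n_x(i) is the number of sequences carrying nucleotide x at site i.

```latex
V(i) = 100 \times \left( 1 - \frac{\max_{x \in \{A, C, G, T\}} n_x(i)}{N} \right)
```

A site where all sequences carry the same nucleotide therefore has V(i) = 0, and a site where the most frequent nucleotide is present in three sequences out of four has V(i) = 25.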

Thompson J. D., Gibson T. J., Plewniak F., Jeanmougin F., Higgins D. G. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 1997, 25(24) : 4876-82.


How SVARAP works
  • Sequences are aligned, and the alignment in GDE format is copied and pasted into a cell of the program, which formats the sequences to facilitate the subsequent analysis. Notably, each nucleotide is placed in its own cell, so that the nucleotides corresponding to the same site fall in the same column.
  • The consensus nucleotide at each site (defined as the most frequent nucleotide at this position in the studied set of sequences) is generated automatically.
  • The program simultaneously calculates the absolute counts of each of the 4 nucleotides (G, A, C, T), as well as of deletions or insertions, and their frequencies (in %). Diversity, or variability, is defined as the proportion of sequences in which, at a given site, the nucleotide differs from the one most frequently found in the studied set of sequences. It is calculated with the formula: 100 − (maximal frequency, in %, among the four nucleotides at a given site). The program also calculates the number of different nucleotides found at a given site. The results are then summarised, for windows of 50 nucleotides, as the median, mean, minimal and maximal values of variability. Concomitantly, a site-by-site analysis is performed and reported over a length of 2000 nucleotides.
  • Finally, SVARAP graphically represents the diversity/variability.
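
The per-site calculation described above can be sketched in a few lines of Python. This is not SVARAP's own code (SVARAP is an Excel workbook); the function names are hypothetical, and the sketch assumes that frequencies are taken over all sequences in a column, gaps included in the denominator, which the slides do not specify.

```python
from collections import Counter

NUCLEOTIDES = "GACT"

def site_variability(column):
    """Return (consensus, variability in %) for one alignment column.

    Assumption: frequencies are computed over every sequence in the
    column, so gap characters count in the denominator.
    """
    counts = Counter(base.upper() for base in column)
    total = len(column)
    # Most frequent of the four nucleotides at this site
    consensus = max(NUCLEOTIDES, key=lambda n: counts[n])
    max_freq = 100.0 * counts[consensus] / total
    # Formula from the text: 100 - maximal frequency (%)
    return consensus, 100.0 - max_freq

def analyse(sequences):
    """Apply site_variability to every column of equal-length sequences."""
    return [site_variability(col) for col in zip(*sequences)]

alignment = ["GATTACA", "GATTACA", "GACTACA", "GATTGCA"]
results = analyse(alignment)
```

Here the third site (T, T, C, T) has consensus T and a variability of 25 %, because one sequence out of four differs from the most frequent nucleotide.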
Alignment of sequences in GDE format
  • The starting « material » is a set of sequences (maximum 100 sequences).
  • SVARAP uses an alignment in GDE (Genetic Data Environment) format. First, the sequences are aligned with ClustalX v.1.8 [Thompson, 1997], after requesting the creation of a GDE file in the Output Format Options. The alignment is then copied and pasted into a cell of our Microsoft Excel® file named « AnaVarNuc_Pos… ».
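
For readers who want to inspect a GDE file outside Excel, here is a minimal Python sketch of a reader. It assumes that each record starts with a line beginning with « # » (nucleotide) or « % » (protein) followed by the sequence name, with the sequence on the following lines; `parse_gde` is a hypothetical helper, not part of SVARAP or ClustalX.

```python
def parse_gde(text, max_sequences=100):
    """Parse GDE-style text into {name: sequence}.

    Assumption: a record starts with '#name' (nucleotide) or '%name'
    (protein), and the sequence continues on the following lines.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line[0] in "#%":
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    # SVARAP accepts at most 100 sequences, according to the text
    if len(records) > max_sequences:
        raise ValueError(f"{len(records)} sequences; limit is {max_sequences}")
    return {n: "".join(parts) for n, parts in records.items()}

example = "#seq1\ngatt\naca-\n#seq2\ngact\naca-\n"
parsed = parse_gde(example)
```

The check on the number of sequences mirrors the 100-sequence limit stated above.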


To get an alignment in GDE format using ClustalX v1.8 (1/2)
  • Open ClustalX (v1.8) and append the sequences in FASTA format.
  • Select the « Alignment » menu, then « Output Format Options... »


To get an alignment in GDE format using ClustalX v1.8 (2/2)
  • Select GDE format.
  • Start alignment.
  • Locate the GDE file.
Formatting the GDE alignment using Microsoft Word®
  • As for most sequence analyses, the sequences must first be formatted.
  • Copy the alignment and paste it into a Microsoft Word® document, then: 1/ delete all paragraph marks; 2/ replace the « - » characters with another character (« . » for instance) that does not cause a line break; 3/ add a paragraph mark before the name of each sequence. Then insert a paragraph mark (<Enter>) after the name of each sequence (and before the first nucleotide).
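
The same result can be obtained with a short script instead of manual editing. The sketch below is an alternative to the Word procedure, not the authors' method: it assumes sequence names are on lines starting with « # », and `format_gde_for_paste` is a hypothetical name. It produces one line per name followed by one unbroken line per sequence, with « - » replaced by « . ».

```python
def format_gde_for_paste(gde_text, gap_replacement="."):
    """Put each name on its own line, followed by its sequence on one line.

    Assumption: names are the lines starting with '#'. Dashes are only
    replaced inside sequence lines, so names are left untouched.
    """
    out = []
    seq_parts = []
    for line in gde_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            if seq_parts:
                # Close the previous sequence before starting a new record
                out.append("".join(seq_parts))
                seq_parts = []
            out.append(line)
        else:
            seq_parts.append(line.replace("-", gap_replacement))
    if seq_parts:
        out.append("".join(seq_parts))
    return "\n".join(out)

raw = "#seq1\ngat-\naca\n#seq2\ngact\naca\n"
formatted = format_gde_for_paste(raw)
```

Unlike a global find-and-replace in Word, this leaves any dash inside a sequence name unchanged.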
Activating « macros »
  • The Microsoft Excel® file contains « macros ». They must be activated to use the file; it is possible to suppress this step.
Pasting the GDE alignment in SVARAP

How to analyse > 4000 nucleotides or > 2000 nucleotides simultaneously.

Link to final analysis

  • 1. Two files are downloadable, analysing variability for nucleotides 1 to 2000 and 2001 to 4000 respectively, because 4000 nucleotides cannot be analysed simultaneously.
  • 2. When using this program, click on column B and press the <Suppr> (Delete) key to clear any previous work.
  • 3. Paste the GDE alignment, formatted using Microsoft Word®, into a single cell (the white area, cell B2).
  • Sheet « Paste the alignment »

Verify format of GDE alignment (1/2)
  • Column A should contain only sequence names, and columns F, I, L and O only sequences. Check that the number of sequences is correct.
  • If not, check the GDE alignment.
  • Sheet « Sep1000 »


Verify format of GDE alignment (2/2)
  • Column B should contain only sequence names, and column C only sequences. Check that the number of sequences is correct.
  • If not, check the GDE alignment.
  • Sheets « Nuc 1-1000 » and « Nuc 1001-2000 »
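
The same kind of check can be made on the alignment before it is pasted. The sketch below is illustrative only: `check_alignment` is a hypothetical helper, and the set of allowed characters (the four nucleotides plus « . » and « - » for gaps) is an assumption.

```python
def check_alignment(records, expected_count, allowed="ACGT.-"):
    """Return a list of human-readable problems; empty if none is found.

    records maps sequence names to aligned sequences.
    Assumption: only A, C, G, T and the gap characters '.' and '-'
    are expected in a sequence.
    """
    problems = []
    if len(records) != expected_count:
        problems.append(
            f"expected {expected_count} sequences, found {len(records)}")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        problems.append(f"sequences have different lengths: {sorted(lengths)}")
    for name, seq in records.items():
        bad = sorted(set(seq.upper()) - set(allowed))
        if bad:
            problems.append(f"{name}: unexpected characters {bad}")
    return problems

good = {"seq1": "gat.aca", "seq2": "gactaca"}
broken = {"seq1": "gat.aca", "seq2": "gact\naca!"}
```

A leftover line break inside a sequence, as in `broken`, is reported both as a length mismatch and as an unexpected character.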
Analysis of variability
  • This sheet and its table contain the core of the variability analysis. The level of variability (1.) corresponds to the proportion of sequences in which, at a given nucleotide site, the nucleotide differs from the one most frequently found in the studied set of sequences. The positions shown (2.) correspond to those defined in your set of sequences. The number of distinct variations (3.) corresponds to the number of different nucleotides observed at a given site.
  • The analysis is carried out in windows of 200 bases for reasons related to the Microsoft Excel® software (4.).
  • 5. Analysis in absolute values. 6. Analysis in %.
  • Sheets « Var... »


Consensus sequence on a length of 2000 nucleotides
  • The consensus nucleotide is calculated for each nucleotide site over the whole length of the studied sequences.
  • « # » (1.) indicates an indetermination, for example when two nucleotides are equally the most frequent, or when an insertion or deletion is the most frequent state.
  • Sheet « Consensus »
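
The rule for « # » can be sketched as follows. This Python example is an interpretation of the two cases named in the text (a tie for the highest count, or a gap as the most frequent state); the function names and the gap characters are assumptions.

```python
from collections import Counter

def consensus_symbol(column, gap_chars=".-"):
    """Return the consensus nucleotide, or '#' when it is undetermined.

    '#' is returned when two or more symbols tie for the highest count,
    or when a gap (insertion/deletion) is the most frequent state.
    """
    counts = Counter(c.upper() for c in column)
    ranked = counts.most_common()
    top_symbol, top_count = ranked[0]
    tie = len(ranked) > 1 and ranked[1][1] == top_count
    if tie or top_symbol in gap_chars:
        return "#"
    return top_symbol

def consensus(sequences):
    """Build the consensus string column by column."""
    return "".join(consensus_symbol(col) for col in zip(*sequences))

result = consensus(["gat.a", "gac.a", "gat.c", "gcc.c"])
```

In this example the third and fifth sites are ties and the fourth site contains only gaps, so all three are reported as « # ».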


Raw variability data by nucleotide site on a length of 2000 nucleotides
  • The variability is calculated for each nucleotide position over the whole length of the studied sequences.
  • Sheet « Consensus »
Analysis by window of 50 nucleotides
  • Variability is calculated and analysed by windows of 50 nucleotides over the whole length of the studied sequences. The analysis is available:
  • in tables: sheet « Data fen 50 »
  • as a graph: sheet « Fig 1-2000 fen 50 »
Analysis by nucleotide site for a length of 2000 nucleotides (1/2)
  • A graph of the variability calculated for each nucleotide site, over the whole length of the studied sequences, is always generated.
  • Sheet « Fig var par position »
  • Each window of 250 nucleotides can be printed separately, or copied and pasted into other software (1.). Alternatively, all 2000 nucleotides can be printed at once.

Analysis by nucleotide site for a length of 2000 nucleotides (2/2)
  • Print preview of the variability calculated for each nucleotide position over the whole length of the studied sequences.
  • Sheet « Fig var par position »
How to analyse more than 4000 nucleotides

The program is not strictly limited in the length of the sequences it can handle. It can analyse more than 4000 nucleotides, and more than 2000 nucleotides at the same time.

To analyse more than 4000 nucleotides:

  • Copy the file « AnaVarNuc_Pos 1-2000 »
  • Go to sheet « Paste alignment »
  • Unhide all columns (<Format><Colonnes><Afficher> in the French edition, i.e. Format > Columns > Unhide)
  • In cells F2 to F201, replace 1 with the first site to analyse in your alignment (e.g. 8000, or 10224); then, in cells G2 to G201, replace 1001 with the value written in column F plus 1000 (e.g. 9000, or 11224)
  • You have thus set up the analysis of nucleotides 8000 to 10000, or 10224 to 12224.
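
The arithmetic of this step is small but easy to get wrong when repeated for several blocks. The helper below is a hypothetical sketch that only reproduces the rule and the examples given in the text (column G = column F + 1000, and a block spanning 2000 nucleotides from the starting site).

```python
def block_settings(start_site, half_block=1000):
    """Values to enter for a 2000-nucleotide block starting at start_site.

    Follows the rule in the text: column G = column F + 1000. The range
    end follows the text's own examples (8000 -> 10000, 10224 -> 12224).
    """
    return {
        "column_F": start_site,
        "column_G": start_site + half_block,
        "analysed_range": (start_site, start_site + 2 * half_block),
    }

settings = block_settings(8000)
```

For a starting site of 8000 this gives 8000 for column F and 9000 for column G, matching the example above.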
How to analyse more than 2000 nucleotides at the same time

To analyse more than 2000 nucleotides at the same time:

  • Use the variability values for the 2000 nucleotide sites as stored in the sheet called « Consensus ». By copying these values, 2000 nucleotides at a time, from several files into a new Microsoft Excel® file, you can create graphs for the desired length.
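
Outside Excel, the same concatenation can be sketched in Python. The input structure is hypothetical: a dictionary mapping the first site of each block to the list of variability values copied from the corresponding file.

```python
def merge_blocks(blocks):
    """Merge per-block variability values into one position-ordered series.

    blocks maps the first site of each block to its list of values
    (a hypothetical structure standing in for values copied from
    several workbooks).
    """
    merged = []
    for first_site in sorted(blocks):
        for offset, value in enumerate(blocks[first_site]):
            merged.append((first_site + offset, value))
    return merged

series = merge_blocks({2001: [10.0, 0.0], 1: [0.0, 25.0]})
```

The result is a list of (site, variability) pairs in site order, ready to be plotted over the full length.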
Applications for SVARAP

An example of use of SVARAP

  • SVARAP rapidly produces graphical representations that are easy to interpret.
  • As a first step, it analyses the genetic diversity in a set of sequences by windows of 50 nucleotides.
  • More precise information is then available from the site-by-site analysis.


Contact
Download

  • Download the instructions for use of SVARAP
  • Download SVARAP to analyse nucleotide positions 1 to 2000 (Microsoft Excel® v97)
  • Link to Clustal X v1.8
  • Download SVARAP to analyse nucleotide positions 2001 to 4000 (Microsoft Excel® v97)
  • Download ASVARAP to analyse amino acid positions 1 to 1000 (Microsoft Excel® v97)

References

  • URL: http://ifr48.free.fr/recherche/jeu_cadre/jeu_rickettsie.html
1/ Delete the paragraph marks

In Microsoft Word® v97 - French edition:

  • <Edition><Remplacer><Plus><Spécial><Marque de paragraphe><Remplacer tout> (Edit > Replace > More > Special > Paragraph mark > Replace all)
2/ Replace dashes

The replacement character can be copied from here and pasted.

In Microsoft Word® v97 - French edition:

  • <Edition><Remplacer> (Edit > Replace)
  • In « Rechercher » (Find what): -
  • In « Remplacer par » (Replace with): ―
  • <Remplacer tout> (Replace all)
3/ Add paragraph marks before and after the name of sequences.

In Microsoft Word® v97 - French edition:

  • <Edition> (Edit menu)
  • In « Rechercher » (Find what): #
  • In « Remplacer par » (Replace with): a paragraph mark (« Marque de paragraphe ») followed by #
Application ASVARAP
  • Variability can also be studied for amino acid sequences (amino acids 1 to 1000). The principle and use are the same as for SVARAP:

Download