
Exercise 1: Importing Illumina data


haines


  1. Exercise 1: Importing Illumina data
     • Using the Import tool
       • File / Import folder. Select the folder IlluminaTeratospermiaHuman6v1_BS1
       • In the Import files window, choose the action "Use import tool" and click OK.
       • Click the Mark title row button and click on the title row of the data file. Click Next.
       • Click the Identifier button and click on the TargetID column.
       • Click the Sample button and click on the AVG column.
       • Click Finish.
     • Alternative: Importing a whole BeadStudio data file directly
       • File / Import files. Select the file IlluminaForLumiHuman6v1_BS1.tsv
       • In the Import files window, choose the action "Import directly" and click OK. This way the file is imported as it is.
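For readers who want to see what the Identifier and Sample choices amount to, here is a minimal sketch that reads a tab-separated BeadStudio-style file outside Chipster. It is not how the Import tool is implemented. The function name `read_beadstudio`, the toy header names, and the rule "every column whose name contains AVG is a sample column" are assumptions made for illustration.

```python
import csv
import io


def read_beadstudio(handle, id_column="TargetID", sample_marker="AVG"):
    """Return {identifier: {sample_column: value}} from a tab-separated file.

    Assumption: the first row is the title row, and sample columns are
    the ones whose header contains `sample_marker`.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    sample_columns = [name for name in reader.fieldnames if sample_marker in name]
    table = {}
    for row in reader:
        table[row[id_column]] = {name: float(row[name]) for name in sample_columns}
    return table


# Toy data with made-up header names, not taken from the course files.
toy = (
    "TargetID\tAVG_Signal-s1\tBEAD_STDEV-s1\tAVG_Signal-s2\n"
    "GENE_A\t120.5\t3.1\t98.0\n"
    "GENE_B\t15.0\t1.2\t22.5\n"
)
data = read_beadstudio(io.StringIO(toy))
print(data["GENE_A"])
```

Only the identifier and the AVG columns survive the import; other columns (here the hypothetical BEAD_STDEV one) are dropped, which mirrors marking just TargetID and AVG in the Import tool.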

  2. Exercise 2: Normalizing Illumina data
     • Using the IlluminaTeratospermiaHuman6v1_BS1 dataset (separate files)
       • In the workflow view, double-click on the box "13 files" to select all of them.
       • In the analysis tool section, choose Normalization and Illumina.
       • Click Show parameters and set the chiptype to Human-6v1.
       • Click Run.
       • Repeat the run using the same chiptype, but setting normalize.chips to none.
     • Using the file IlluminaForLumiHuman6v1_BS1.tsv (one whole BS file)
       • Select the file IlluminaForLumiHuman6v1_BS1.tsv
       • Choose Normalization and Illumina – lumi pipeline.
       • Click Show parameters and set the chiptype to Human-6v1.
       • Click Run.
       • Repeat the run using the same chiptype, but setting normalize.chips to none.
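The slides do not say which between-chip method normalize.chips applies when it is not set to none. One widely used option is quantile normalization, so the sketch below assumes that method purely for illustration; `quantile_normalize` is a hypothetical helper, not the tool's code.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length lists (one list per chip).

    Each value is replaced by the mean, across chips, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: average of the chips at each rank.
    reference = [sum(col[i] for col in sorted_cols) / len(columns) for i in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(chips)
print(normalized)
```

After this step every chip has the same set of values, only in a different gene order. Running the tool with normalize.chips set to none skips such a step, which is what makes the "mock-normalized" comparison in Exercise 4 informative.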

  3. Exercise 3: Describe the experiment
     • Using the IlluminaTeratospermiaHuman6v1_BS1 dataset (separate files)
       • Double-click the phenodata file.
       • In the phenodata editor, enter 1 in the group column for the control samples and 2 for the affected samples.
     • Using the file IlluminaForLumiHuman6v1_BS1.tsv (one whole BS file)
       • Double-click the phenodata file.
       • In the phenodata editor, click on the original name column to sort the samples. In the group column, mark the replicates with the same number (1, 2 and 3).

  4. Exercise 4: Illumina quality control
     • Using the IlluminaTeratospermiaHuman6v1_BS1 dataset
       • Run the tools Statistics / NMDS and Visualization / Dendrogram for both the normalized and the "mock-normalized" data files.
       • View the result files side by side (use the Detach button).
     • Using the IlluminaForLumiHuman6v1_BS1.tsv dataset
       • As above

  5. Exercise 5: Filtering
     • Select the normalized data and experiment with different filters
       • Preprocessing / Filter by SD
       • Preprocessing / Filter by CV
       • Preprocessing / Filter by IQR
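The three filters differ only in how they measure each gene's variability across samples. The sketch below computes standard deviation (SD), coefficient of variation (CV = SD / mean) and interquartile range (IQR) per gene and keeps the most variable fraction. The names `spread` and `filter_genes`, the `keep_fraction` parameter and the toy values are assumptions for illustration; the actual tools may use different cut-off rules or quantile definitions.

```python
import statistics


def spread(values, method):
    """Variability of one gene across samples."""
    if method == "sd":
        return statistics.stdev(values)
    if method == "cv":
        return statistics.stdev(values) / statistics.mean(values)
    if method == "iqr":
        q1, _, q3 = statistics.quantiles(values, n=4)
        return q3 - q1
    raise ValueError(f"unknown method: {method}")


def filter_genes(expression, method="sd", keep_fraction=0.5):
    """Return the names of the most variable genes, most variable first."""
    scores = {gene: spread(vals, method) for gene, vals in expression.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy expression values (made up), one list of samples per gene.
expression = {
    "flat": [7.0, 7.1, 6.9, 7.0],
    "variable": [5.0, 9.0, 4.0, 10.0],
    "mild": [6.0, 6.5, 5.5, 6.2],
    "noisy_low": [1.0, 2.0, 0.5, 2.5],
}
for method in ("sd", "cv", "iqr"):
    print(method, filter_genes(expression, method))
```

Note how CV ranks the low-intensity gene above the high-intensity one because it divides by the mean; this is the kind of difference worth looking for when comparing the three filters in Chipster.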

  6. Exercise 6: Statistical testing
     • t-test
       • Select the sd-filter.tsv of the teratospermia dataset.
       • Run Statistics / Two group test using the method t-test.
     • Empirical Bayes
       • Select the normalized.tsv of the teratospermia dataset.
       • Run Statistics / Two group test using the method empirical Bayes and turning the P-value adjustment off.
       • Run Preprocessing / Filter by SD on the result file two-group.tsv.
       • Run Statistics / Adjust P-values on the result file sd-filter.tsv (you have to specify the P-value column in the parameters).
     • Compare the results using the Venn diagram
     • Save the analysis session
       • File / Save session
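The order of filtering and P-value adjustment matters because the adjustment depends on how many tests are being corrected for. The slides do not name the adjustment method, so the sketch below assumes the Benjamini-Hochberg (false discovery rate) procedure; `adjust_bh` is a hypothetical helper and the P-values are made up.

```python
def adjust_bh(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adjusted = adjust_bh(raw)
print([round(p, 4) for p in adjusted])
```

Each raw P-value is multiplied by m / rank, where m is the number of tests. Filtering out uninteresting genes before the adjustment lowers m, so the same raw P-value receives a smaller adjusted value.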

  7. Exercise 7: Linear modelling - taking several covariates into account at the same time
     • Use a kidney cancer dataset of 17 samples
       • Start a new session.
       • File / Import folder, select the folder AffyNormalized and choose Import directly.
       • Right-click the normalized.tsv and link it to the phenodata.tsv. Check which columns the phenodata contains.
     • Linear modelling
       • Select the normalized.tsv and run Statistics / Linear modelling. Set group, kidney side and gender as the three main effects. Set donor as the pairing information.
       • Select the result file pvalues.tsv and run the tool Utilities / Extract genes using a P-value once for each main effect P-value column (three times in total).
     • Save the session

  8. Exercise 8: Clustering
     • Open your Illumina session
     • Hierarchical clustering
       • Select the adjust-pvalues.tsv
       • Run Clustering / Hierarchical with default parameters.
       • Repeat the run using bootstrapping: set the resampling parameter to bootstrap and the number of replicates to 10.
       • How reliable are the branches?
     • K-means clustering
       • Select the adjust-pvalues.tsv
       • Run the tool "K-means – estimate K".
       • Run K-means clustering, setting the parameter number of clusters according to your estimated K.
       • View the clusters using the visualization method Expression profiles.
       • Extract the genes from cluster 1 using Utilities / Extract genes from clustering.

  9. Exercise 9: Annotation
     • Annotate genes
       • Select the file adjust-pvalues.tsv
       • Run Annotation / Illumina gene list.
       • Open the result file annotations.html and click the links in the gene and pathway columns to read more about one of the genes.
       • Open the result file annotations.tsv and sort it by the pathway column. Slide the pathway column next to the description column and make it wider.

  10. Exercise 10: Pathway analysis
     • Gene enrichment analysis
       • Select the file adjust-pvalues.tsv
       • Run Pathways / Hypergeometric test for KEGG.
       • Are any KEGG pathways enriched in your list of differentially expressed genes?
       • Using the file annotations.tsv, work out which genes contributed to the top pathway.
     • Gene set test
       • Select the file normalized.tsv
       • Run Pathways / Gene set test and set the parameter pathways.or.genelist to KEGG.
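The hypergeometric test asks how likely it is to see at least the observed number of pathway genes in the gene list if the list had been drawn at random from all genes on the chip. A minimal sketch follows, with made-up counts and a hypothetical function name; how the Chipster tool defines the background set and corrects for multiple pathways is not stated in the slides.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, list_size, overlap):
    """P(X >= overlap) when drawing list_size genes without replacement
    from `population` genes, of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 genes on the chip, 5 in the pathway,
# 4 differentially expressed genes, 3 of them in the pathway.
p = hypergeom_enrichment_p(population=20, pathway_size=5, list_size=4, overlap=3)
print(round(p, 4))
```

A small P-value means the pathway is over-represented in the gene list; the genes counted in `overlap` are the ones to look up in annotations.tsv.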

  11. Exercise 11: Promoter analysis
     • Pattern discovery: do the promoters of similarly expressed genes share a sequence motif?
       • Select the file extract.tsv containing the genes from cluster 1.
       • Run Promoter analysis / Weeder. What is the most interesting motif? Check in the matrix (Best occs) which positions are most conserved.
       • Run Promoter analysis / Cosmo. As judged by the sequence logo, do you find similar motifs?

  12. Exercise 12: Saving and running a workflow
     • Save a workflow
       • Prune your teratospermia dataset workflow if necessary.
       • Select the file normalized.tsv and click Workflow / Save starting from selected. Give your workflow a meaningful name and save it.
     • Run a workflow
       • Open the session called sessionIlluminaTeratospermia.cs
       • Select the file normalized.tsv and choose Workflow / Run recent.
