Beginning BioPerl for Biologists
Jun Wang, MPI Ploen
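Perl modules
• Collections of “object” definitions and functions
• Usage:
  • Put the module in a directory on Perl's module search path (@INC, which can be extended through the PERL5LIB environment variable)
  • Or add the module directory with “use lib”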
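As a minimal sketch of the “use lib” route, the snippet below adds a directory to the module search path and prints the resulting list. The directory name is a hypothetical placeholder, not a path from the slides.

```perl
use strict;
use warnings;

# Hypothetical directory holding locally installed modules; adjust to your setup.
use lib '/home/user/perl5/lib';

# @INC is the list of directories Perl searches when it sees "use Some::Module".
print "$_\n" for @INC;
```

Setting PERL5LIB in the shell has a similar effect without editing the script.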
1, BioPerl and Bio::SeqIO
• Begin!
• Use “Data::Dumper”
• Open a sequence file with Bio::SeqIO
• Check the data with Dumper
• Write another sequence file using Bio::SeqIO
• The Bio::Seq object
• Functions related to Bio::Seq
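The steps on this slide could look roughly like the sketch below. It assumes BioPerl is installed and that a FASTA file exists. The file names input.fasta and output.gb are made up for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;
use Bio::SeqIO;

# Hypothetical file names; replace with your own files.
my $in  = Bio::SeqIO->new(-file => 'input.fasta', -format => 'fasta');
my $out = Bio::SeqIO->new(-file => '>output.gb',  -format => 'genbank');

while (my $seq = $in->next_seq) {
    # Dump the whole Bio::Seq object to see how it is built internally.
    print Dumper($seq);

    # A few common Bio::Seq methods.
    print $seq->display_id, "\t", $seq->length, "\n";
    print $seq->subseq(1, 10), "\n" if $seq->length >= 10;
    print $seq->revcom->seq, "\n"   if $seq->alphabet eq 'dna';

    # Write the same record in another format.
    $out->write_seq($seq);
}
```

Reading one format and writing another is the usual way to convert sequence files with Bio::SeqIO.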
2, Do something (BLAST)
• Use Bio::Tools::Run::RemoteBlast
• Adapt the usage example from the Perl module's documentation
• Get a hit list and print scores, etc.
• Get the sequences by their accession numbers
• use Bio::DB::GenBank;
• Get the sequences
• Save them
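A hedged sketch of this workflow, adapted from the polling loop in the Bio::Tools::Run::RemoteBlast documentation, is shown below. It assumes network access to NCBI, a nucleotide query in a hypothetical file query.fasta, and an arbitrary cap of 10 hits. The program, database, and E-value settings are illustrative choices, not values from the slides.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::Run::RemoteBlast;
use Bio::DB::GenBank;

# Illustrative settings: nucleotide query against the NCBI 'nt' database.
my $factory = Bio::Tools::Run::RemoteBlast->new(
    -prog       => 'blastn',
    -data       => 'nt',
    -expect     => '1e-10',
    -readmethod => 'SearchIO',
);

# Hypothetical query file.
my $query = Bio::SeqIO->new(-file => 'query.fasta', -format => 'fasta')->next_seq;
$factory->submit_blast($query);

my $max_hits = 10;    # assumption: keep only the first few hits
my @accessions;

# Poll NCBI until every submitted request has been retrieved.
while (my @rids = $factory->each_rid) {
    foreach my $rid (@rids) {
        my $rc = $factory->retrieve_blast($rid);
        if (!ref $rc) {
            # Negative return code means failure; otherwise the job is still running.
            $factory->remove_rid($rid) if $rc < 0;
            sleep 5;
            next;
        }
        my $result = $rc->next_result;
        $factory->remove_rid($rid);
        while (my $hit = $result->next_hit) {
            last if @accessions >= $max_hits;
            next unless defined $hit->accession;
            print join("\t", $hit->name, $hit->accession,
                             $hit->bits, $hit->significance), "\n";
            push @accessions, $hit->accession;
        }
    }
}

# Fetch each hit from GenBank by accession and save it.
my $gb  = Bio::DB::GenBank->new;
my $out = Bio::SeqIO->new(-file => '>hits.gb', -format => 'genbank');
for my $acc (@accessions) {
    my $seq = eval { $gb->get_Seq_by_acc($acc) };
    next unless $seq;    # skip accessions that could not be retrieved
    $out->write_seq($seq);
}
```

Bio::DB::GenBank serves nucleotide records, which is why the sketch uses a nucleotide search; protein hits would need a different database class.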
2, Do something more (ClustalW)
• use Bio::Tools::Run::Alignment::Clustalw
• Read all the sequences you got from BLAST
• Do an alignment and save it
• Make a guide tree
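The alignment step might be sketched as follows, assuming the clustalw program itself is installed and that the BLAST hits were saved as FASTA in a hypothetical file hits.fasta. The output file names and the ktuple setting are likewise illustrative. The tree call is made in list context because the exact return form may differ between versions of the wrapper.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::AlignIO;
use Bio::TreeIO;
use Bio::Tools::Run::Alignment::Clustalw;

# Read every sequence into an array (hypothetical input file).
my $in = Bio::SeqIO->new(-file => 'hits.fasta', -format => 'fasta');
my @seqs;
while (my $seq = $in->next_seq) {
    push @seqs, $seq;
}
die "Need at least two sequences to align\n" if @seqs < 2;

# Run clustalw through the BioPerl wrapper.
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(ktuple => 2);
my $aln     = $factory->align(\@seqs);

# Save the alignment in ClustalW format.
Bio::AlignIO->new(-file => '>hits.aln', -format => 'clustalw')->write_aln($aln);

# Build a tree from the alignment and save it in Newick format.
my ($tree) = $factory->tree($aln);
Bio::TreeIO->new(-file => '>hits.dnd', -format => 'newick')->write_tree($tree)
    if $tree;
```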
3, Sequence annotations (features)
• Also save a GenBank file for the BLAST results
• Get information about the sequences: Bio::SeqFeatureI
• Also try to make a feature (add annotations to a raw sequence): Bio::SeqFeature::Generic
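Both halves of this slide, reading existing features and attaching a new one, can be sketched as below. The file names, the toy sequence, the coordinates, and the tag values are all made up for illustration.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;
use Bio::SeqFeature::Generic;

# Part 1: list the features stored in a GenBank file (hypothetical file name).
my $in = Bio::SeqIO->new(-file => 'hits.gb', -format => 'genbank');
while (my $seq = $in->next_seq) {
    for my $feat ($seq->get_SeqFeatures) {
        print join("\t", $seq->display_id, $feat->primary_tag,
                         $feat->start, $feat->end), "\n";
        # Each feature carries tag/value pairs such as gene or product.
        for my $tag ($feat->get_all_tags) {
            print "    $tag: ", join(', ', $feat->get_tag_values($tag)), "\n";
        }
    }
}

# Part 2: annotate a raw sequence with a new feature (toy data).
my $raw = Bio::Seq->new(
    -id       => 'my_seq',
    -seq      => 'ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG',
    -alphabet => 'dna',
);
my $feat = Bio::SeqFeature::Generic->new(
    -start   => 1,
    -end     => 24,
    -strand  => 1,
    -primary => 'misc_feature',
    -tag     => { note => 'hypothetical example annotation' },
);
$raw->add_SeqFeature($feat);

# Writing as GenBank keeps the new feature in the feature table.
Bio::SeqIO->new(-file => '>my_seq.gb', -format => 'genbank')->write_seq($raw);
```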
7, Primary Bio::Fastq
• Get a FASTQ file (e.g. through ftp.ebi.ac.uk/)
• Get standard sequence information
• Understand quality scores and do filtering
• Check MIDs and assign groups (maybe on raw data)
• Some recent updates in processing NGS data
• Call programs using either a module or system
• Parse the results from each one and connect them to form a pipeline
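To make the quality-score bullet concrete, here is a small sketch in plain Perl, without BioPerl. It assumes four-line FASTQ records and Phred+33 encoding, which may not match every data set. The helper names and the threshold of 20 are illustrative choices.

```perl
use strict;
use warnings;

# Decode a Phred+33 quality string into a list of integer scores.
sub phred33_scores {
    my ($qual_string) = @_;
    return map { ord($_) - 33 } split //, $qual_string;
}

# Mean quality of a read; returns 0 for an empty quality string.
sub mean_quality {
    my ($qual_string) = @_;
    my @scores = phred33_scores($qual_string);
    return 0 unless @scores;
    my $sum = 0;
    $sum += $_ for @scores;
    return $sum / @scores;
}

# Read four-line FASTQ records and keep reads with mean quality >= $min_mean.
sub filter_fastq {
    my ($fh, $min_mean) = @_;
    my @kept;
    while (defined(my $header = <$fh>)) {
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        last unless defined $qual;    # stop on a truncated record
        chomp($header, $seq, $plus, $qual);
        next unless mean_quality($qual) >= $min_mean;
        push @kept, { id => substr($header, 1), seq => $seq, qual => $qual };
    }
    return @kept;
}

# Two toy reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
my $fastq = "\@read1\nACGT\n+\nIIII\n\@read2\nACGT\n+\n####\n";
open my $fh, '<', \$fastq or die "Cannot open in-memory FASTQ: $!";
my @good = filter_fastq($fh, 20);
close $fh;

print "$_->{id}\t", mean_quality($_->{qual}), "\n" for @good;
```

With the toy data only read1 passes the filter. For real files, Bio::SeqIO with the fastq format can replace the hand-written parser.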