
Databases






Presentation Transcript


  1. Databases: storage and retrieval

  2. Different kinds of DBs deal with biological information retrieved by various means, at the level of DNA, RNA, protein, and phenotype: • Protein sequences • Translated nucleotide sequences • Protein domains • Protein structure • Diseases • Polymorphism • Gene expression • Protein-protein interactions • DNA sequences (individual genes or complete genomes) • cDNA • ESTs • Non-coding RNA

  3. Common to all databases • A database is a structured collection of information. • A database is composed of basic objects called records or entries. • Each record is composed of fields, which hold defined data related to that record. Let’s consider the following database of students studying bioinformatics at HUJI.

  4. Databases • A database can be thought of as a large table, where the rows represent records and the columns represent fields. • ID (accession numbers): unique identifiers of the database records. • Each record has a unique identifier. • For some records there is only partial information: some fields contain no data (this reflects the quality of the DB). • Some records contain similar data in some of the fields.
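The table view above can be sketched in Python as a rough illustration. The field names, ID values, and records below are hypothetical placeholders (the actual student table from the slide is not reproduced here); only the ideas of rows as records, columns as fields, unique IDs, and empty fields come from the slide.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class StudentRecord:
    # All field names are assumptions made for this illustration.
    student_id: str                     # unique identifier, like an accession number
    first_name: Optional[str] = None
    last_name: Optional[str] = None
    gender: Optional[str] = None
    comments: Optional[str] = None      # None means the field holds no data


# Hypothetical rows of the table.
records = [
    StudentRecord("S001", "Sharon", "Asulin", "female", None),
    StudentRecord("S002", "Dan", "Levi", None, "exchange student"),
]


def ids_are_unique(rows):
    """True if no two records share the same identifier."""
    ids = [r.student_id for r in rows]
    return len(ids) == len(set(ids))


def empty_fields(row):
    """Names of the fields that hold no data in this record."""
    return [f.name for f in fields(row) if getattr(row, f.name) is None]
```

Calling `empty_fields` on each record gives a quick view of how complete the data is, which is one way to think about the "quality of DB" remark.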

  5. Data Retrieval • The purpose of databases is not merely to collect and organize data, but mainly to allow advanced data retrieval. • A query is a method to retrieve information from the database. • The organization of each record into predetermined fields allows us to run queries on specific fields.

  6. The best search strategy…

  7. (1) Think: phrase your scientific question. (2) Choose an appropriate database. (3) Phrase your query: fields, syntax, keywords, Boolean operators. (4) Access additional entries discussing the same or similar entities via links to additional databases. (5) Think, evaluate. The computer is just a machine. You are (hopefully) a thinking organism.

  8. Phrasing a query… Terms/words for search [field] + (BOOLEAN OPERATOR) + Terms/words for search [field]

  9. Boolean Operators • 1 AND 2: cell AND cycle • 1 OR 2: cell OR cycle • 1 NOT 2: cell NOT cycle • Exact phrase: “cell cycle” • Wildcard: cell* (cell, cells, cellular, etc.)
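The three Boolean operators behave like set operations on the records matched by each term. The following is a minimal sketch, assuming a made-up index that maps each term to hypothetical record IDs; the wildcard part uses Python's `fnmatch` module as a stand-in and is not how any particular database implements truncation.

```python
from fnmatch import fnmatchcase

# Hypothetical mini-index: term -> IDs of the records that contain it.
index = {
    "cell": {1, 2, 3},
    "cycle": {2, 3, 4},
}

both = index["cell"] & index["cycle"]        # cell AND cycle: records with both terms
either = index["cell"] | index["cycle"]      # cell OR cycle: records with at least one term
only_cell = index["cell"] - index["cycle"]   # cell NOT cycle: records with cell but without cycle

# Wildcard: cell* matches every word that starts with "cell".
words = ["cell", "cells", "cellular", "cycle"]
wildcard_hits = [w for w in words if fnmatchcase(w, "cell*")]
```

AND narrows the result set, OR widens it, and NOT removes records from it, which is why the choice of operator directly affects how much noise a search returns.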

  10. The secretary wants to locate the record of the student Sharon Asulin but does not remember the last name, so she searches for Sharon. The search was not limited to a certain field: Sharon[all fields]

  11. OOPS !! Retrieved too many records that don’t match the required data - too much noise.

  12. Evaluating Search Results: search results vs. the “scientific truth”

  13. What can we do to reduce/eliminate false positives without reducing true positives?

  14. Sensitivity: the ability of a method to detect positives, irrespective of how many false positives are reported. Selectivity: the ability of a method to reject negatives, irrespective of how many true positives are wrongly rejected (false negatives).
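One common way to put numbers on these two notions is sketched below. It assumes that selectivity is read as the true-negative rate (often called specificity); some texts use the word for a different ratio, so treat the formula as an assumption rather than the slide's definition. The counts are invented for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of the truly relevant records that the search retrieved."""
    return true_pos / (true_pos + false_neg)


def selectivity(true_neg, false_pos):
    """Fraction of the irrelevant records that the search correctly left out.

    Assumption: selectivity is taken to mean the true-negative rate.
    """
    return true_neg / (true_neg + false_pos)


# Hypothetical outcome of a broad search over 10 records:
# the 1 relevant record was found, along with 3 irrelevant ones.
tp, fn, fp, tn = 1, 0, 3, 6

broad_sensitivity = sensitivity(tp, fn)   # nothing relevant was missed
broad_selectivity = selectivity(tn, fp)   # a third of the irrelevant records leaked in
```

A broad search like this one scores high on sensitivity and lower on selectivity; restricting the query to a field typically pushes the balance the other way.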

  15. Let’s refine our search: find all students whose first name is Sharon. Sharon[first name] Syntax (NCBI): keyword[field definition]
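The difference between an all-fields search and a field-restricted search can be sketched as follows. The records, field names, and the `search` helper are hypothetical, and the simple case-insensitive substring match is only a stand-in for a real database's search engine.

```python
# Hypothetical student records; only the name Sharon Asulin comes from the slides.
students = [
    {"id": "S001", "first name": "Sharon", "last name": "Asulin", "comments": ""},
    {"id": "S002", "first name": "Dan", "last name": "Sharon", "comments": ""},
    {"id": "S003", "first name": "Maya", "last name": "Levi",
     "comments": "shares a flat with Sharon"},
]


def search(records, keyword, field=None):
    """Return the IDs of records in which `keyword` occurs (case-insensitive).

    field=None mimics keyword[all fields]; otherwise only that field is checked.
    """
    keyword = keyword.lower()
    hits = []
    for rec in records:
        names = [field] if field else [k for k in rec if k != "id"]
        if any(keyword in str(rec.get(name, "")).lower() for name in names):
            hits.append(rec["id"])
    return hits


all_fields_hits = search(students, "Sharon")                # like Sharon[all fields]
first_name_hits = search(students, "Sharon", "first name")  # like Sharon[first name]
```

In this toy data the all-fields search returns all three records, while the field-restricted search returns only the record whose first name is Sharon.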

  16. Now we don’t retrieve any answer (false negative?) and we are still not distracted by the noise. The original search phrase sharon[all fields] would have retrieved all the noise but not the required info.

  17. The secretary wants to locate the record of the female student who comes from Cuba but does not remember her name. Search: female[gender] AND *cuba*[comments] Syntax (NCBI): keyword[field definition], with two such terms joined by a Boolean operator
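A query of this shape, two field-restricted terms joined by AND, can be imitated in a few lines. The records and the `field_matches` helper are hypothetical, and Python's `fnmatchcase` is used here as a stand-in for the database's wildcard handling.

```python
from fnmatch import fnmatchcase

# Hypothetical student records with a free-text comments field.
students = [
    {"id": "S001", "gender": "female", "comments": "born in Havana, Cuba"},
    {"id": "S002", "gender": "male", "comments": "Cuban father"},
    {"id": "S003", "gender": "female", "comments": "plays the cello"},
]


def field_matches(record, pattern, field):
    """Case-insensitive wildcard match of `pattern` against one field.

    A pattern without wildcards must equal the whole field value;
    "*cuba*" matches any value that contains "cuba".
    """
    return fnmatchcase(record.get(field, "").lower(), pattern.lower())


# female[gender] AND *cuba*[comments]
hits = [
    rec["id"]
    for rec in students
    if field_matches(rec, "female", "gender")
    and field_matches(rec, "*cuba*", "comments")
]
```

Each term alone matches two records in this toy data; only their intersection, produced by AND, isolates the single record the secretary is after.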

  18. And the main thing, the main thing: do not be afraid at all.
