Representative sets and Clustering. Tom Oldfield. Representative sets. A subset of data that provides a statistically valid sample set for the complete data. A set structure fragments that best represent the “protein databank” or “protein space” during data analysis . What is the PDB ?.
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
A subset of data that provides a statistically valid sample set for the complete data.
A set structure fragments that best represent the “protein databank” or “protein space” during data analysis
Lysoyme was used in a systematic survey to study the structural effect of mutating each residue.
It is a list of solutions to experiment
difficult to use both data in some analysis.
Fold space is biased by experiment – no membrane proteins.
80 % of the information for a X-ray structure is unknown (phase problem)
Another story !
Current Fold space is “not-closed”
Should use all the structures with that site.
Representatives selected by non-fold analysis
MSDmine has the facility to define your own list
A group of similar things
Moderately easy (direct solution) and well defined and fast. 1D
Difficult (iterative & non-exhaustive, non-transitive data, multiple solutions, non-closed data)
Needs biological/chemical knowledge first