
GSC-II Classifications Oct 2000 Annual Meeting

GSC-II Classifications Oct 2000 Annual Meeting. V. Laidler G. Hawkins, R. White, R. Smart, A. Rosenberg, A. Spagna. Preliminary Classification. Goal: Classify as well as possible to plate limit Metric: Minimize overall number of errors Procedure:


Presentation Transcript


  1. GSC-II Classifications
     Oct 2000 Annual Meeting
     V. Laidler
     G. Hawkins, R. White, R. Smart, A. Rosenberg, A. Spagna

  2. Preliminary Classification
     Goal: Classify as well as possible to plate limit
     Metric: Minimize overall number of errors
     Procedure:
     • Use ranks to handle plate-to-plate variation
     • Match training population to sky population
     • OC1 oblique decision tree (Murthy et al.)
     • Build several decision trees and let them vote
     • Classification categories: star / nonstar / defect
     Classification / Laidler
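The "oblique tree" and "let them vote" steps on slide 2 can be illustrated with a minimal Python sketch. It is not the OC1 code itself: the one-node "trees", their coefficients, and the feature values are hypothetical, and a simple majority vote is assumed. The only point it carries over is that an oblique split tests a linear combination of features rather than a single feature.

```python
from collections import Counter


def oblique_test(features, coefficients, offset):
    """Oblique split: test a linear combination of features against zero."""
    return sum(a * x for a, x in zip(coefficients, features)) + offset > 0


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical one-node trees: (coefficients, offset). A real tree has many nodes.
trees = [
    ((1.0, -1.0), 0.0),
    ((0.5, 0.5), -0.5),
    ((-1.0, 1.0), -0.5),
]

# Hypothetical ranked feature vector for one object.
obj = (0.2, 0.9)

predictions = [
    "star" if oblique_test(obj, coefficients, offset) else "nonstar"
    for coefficients, offset in trees
]
voted_class = majority_vote(predictions)
print(predictions, voted_class)
```

Using an odd number of trees avoids ties in a two-class vote; with three categories a tie-breaking rule would still be needed.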

  3. Next Step Classification
     Goal: Provide reliable guide stars to V~19(?)
     Metric: Minimize contamination of “stars” to Vlim while maintaining sufficient completeness for adequate coverage
     Contamination: We called it a star but it’s really nonstellar
     Completeness: Everything that is really a star is called a star
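The two definitions on slide 3 can be written as a short Python sketch. It assumes contamination is reported as a fraction of the objects called stars and completeness as a fraction of the true stars; the slide gives the definitions in words only, so the denominators and the sample labels below are assumptions.

```python
def contamination_and_completeness(true_labels, predicted_labels, target="star"):
    """Return (contamination, completeness) for the target class."""
    pairs = list(zip(true_labels, predicted_labels))
    # True labels of everything the classifier called a star.
    called_star = [truth for truth, pred in pairs if pred == target]
    # Predicted labels of everything that really is a star.
    really_star = [pred for truth, pred in pairs if truth == target]

    # Contamination: called a star, but really nonstellar.
    contamination = (
        sum(truth != target for truth in called_star) / len(called_star)
        if called_star else 0.0
    )
    # Completeness: really a star, and called a star.
    completeness = (
        sum(pred == target for pred in really_star) / len(really_star)
        if really_star else 0.0
    )
    return contamination, completeness


# Hypothetical labels for six objects.
truth = ["star", "star", "star", "galaxy", "galaxy", "star"]
called = ["star", "star", "nonstar", "star", "nonstar", "star"]
contamination, completeness = contamination_and_completeness(truth, called)
print(contamination, completeness)  # 0.25 0.75
```

A more conservative star selection lowers the first number at the cost of the second, which is the trade-off the metric describes.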

  4. Development Areas
     • Multi-plate weighted voting
     • Training set magnitude distribution
     • Training set sources
     • Classification categories
     • Classification features
     • Object selection
     Legend: Available / In progress / Future

  5. Multi-plate Weighted Voting
     • Weights calculated empirically from percentages of misclassifications (NED, NPM, ~4 plates per survey)
     • Compensates for observed bias in classifier and breaks ties
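A minimal Python sketch of weighted voting across plates follows. The slide says only that the weights come empirically from misclassification percentages; the specific formula used here (weight = 1 minus the misclassified fraction), the survey names, and the percentages are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(plate_votes, weights):
    """Combine per-plate classifications into a single class.

    plate_votes: list of (survey, predicted_class), one entry per plate.
    weights: dict mapping (survey, predicted_class) to a vote weight.
    """
    totals = defaultdict(float)
    for survey, predicted in plate_votes:
        totals[predicted] += weights[(survey, predicted)]
    # The class with the largest total weight wins. Unequal weights also
    # settle cases where the raw vote counts are tied.
    return max(totals, key=totals.get), dict(totals)


# Hypothetical misclassification percentages per (survey, class called).
misclassified_pct = {
    ("blue", "star"): 10.0,
    ("blue", "nonstar"): 30.0,
    ("red", "star"): 20.0,
    ("red", "nonstar"): 15.0,
}
# Assumed weighting: a call that is wrong less often counts for more.
weights = {key: 1.0 - pct / 100.0 for key, pct in misclassified_pct.items()}

# Two plates disagree: one vote each, so an unweighted vote would tie.
votes = [("blue", "star"), ("red", "nonstar")]
winner, totals = weighted_vote(votes, weights)
print(winner, totals)
```

Because each weight depends on both the survey and the class called, a classifier that over-calls one class on one survey is discounted there, which is one way to read "compensates for observed bias".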

  6. MP weighted voting compared to Mendez Galaxy model
     • Current classification comes from a single plate
     • Multi-plate weighted voting is a straightforward DB operation
     Conservative star selection further reduces contamination; coverage remains adequate

  7. Training set mag distribution: What happens to V < Vlim objects?
     Preliminary approach
     • occupy 20% of ranked hyperspace
     • are outnumbered when counting errors
     • contain the same classification bias as the sky
     Optimized approach
     • have more dynamic range
     • contribute all the weight when counting errors
     • are free of classification bias

  8. Training Set Sources; Classification Categories
     • Decision trees can be improved by using training sets with smaller dispersion in parameter space
     • Catalog objects will likely provide cleaner, better separated populations
     • Galaxies and blends are different => reside in different areas of parameter space => individually constitute better defined populations than when combined
     • Galaxy / blend classifications are value added to the catalog

  9. New training set
     • Magnitude balanced to F=17: bright only
     • Star/galaxy/blend classifications
     • Stars, galaxies from catalogs: NED, NPM, CAMC, LCRS
     • Blends from deblender “parent” objects
     • 1200 objects XP330, XP853, XP005 b={48,41,28}

  10. New training set: Compare to production classifier
     • “Above all, do no harm”
     • Visually examine objects that changed classifications

  11. New training set: Compare to external catalogs
     Significant improvement in magnitude range of training set
     • Extend training set: can we extend this performance to Vlim?
     • Possibly use star/galaxy/blend to Vlim, star/nonstar/defect below

  12. Future work: Classification features
     The “curse of dimensionality” tells us that tree performance can be improved by reducing the number of features
     Edinburgh group has used two features specifically to separate blends from galaxies
     Current classification features:
     • Maximum Density
     • Integrated Density
     • Semimajor axis
     • Semiminor axis
     • Ellipticity
     • Unweighted semimajor axis
     • Unweighted semiminor axis
     • Unweighted ellipticity
     • 4 texture features
     • 2 spike features
     • 16 areas

  13. Future work: Object Selection
     • Object selection can be considered an additional classification step
     • Select based on:
        • Blend status
        • Multi-plate information
        • Probability
     • Select for functional or science goals:
        • Minimize contamination
        • Maximize completeness
     • Probability comes from leaf population
     • Final probability comes from averaging probabilities from each tree
     • Can we use probabilities to further optimize guide star selection?
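The two probability statements on slide 13 (probability from the leaf population, final probability as the average over trees) can be sketched in Python as below. The leaf counts, the three-tree ensemble, and the 0.75 selection threshold are hypothetical; the slide does not say how a threshold would be chosen.

```python
def leaf_probabilities(leaf_counts):
    """Class probabilities from the training objects that landed in one leaf."""
    total = sum(leaf_counts.values())
    return {cls: count / total for cls, count in leaf_counts.items()}


def average_probabilities(per_tree_probabilities):
    """Final probability: mean of the per-tree leaf probabilities."""
    classes = sorted({cls for probs in per_tree_probabilities for cls in probs})
    n_trees = len(per_tree_probabilities)
    return {
        cls: sum(probs.get(cls, 0.0) for probs in per_tree_probabilities) / n_trees
        for cls in classes
    }


# Hypothetical training populations of the leaf one object reaches in each tree.
leaves = [
    {"star": 8, "nonstar": 2},
    {"star": 6, "nonstar": 4},
    {"star": 10, "nonstar": 0},
]
final = average_probabilities([leaf_probabilities(leaf) for leaf in leaves])

# Hypothetical conservative selection: keep only high-probability stars.
STAR_THRESHOLD = 0.75
is_guide_star_candidate = final["star"] >= STAR_THRESHOLD
print(final, is_guide_star_candidate)
```

Raising the threshold trades completeness for lower contamination, which is the kind of optimization the last bullet asks about.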

  14. What do the probabilities mean?
     • Do the probabilities measure the observed population? No. This is not unexpected. Decision trees are optimized to produce correct answers, not to produce accurate models of the probability function.
     • Do the probabilities indicate reliability? Yes.
     • Conclusion: We can use the probabilities to construct a “class quality” field, but should not take them at face value.

  15. How to Improve a Classifier

  16. Classification / Laidler

  17. Using Ranks
     • Sort the objects in order by the raw feature
     • Assign a ranked feature based on position in the list
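The two steps on slide 17 can be sketched in Python as follows. Scaling the position to the range 0 to 1 and breaking ties by original order are assumptions; the slide says only that the ranked feature is based on position in the sorted list. The raw values are hypothetical.

```python
def rank_feature(raw_values):
    """Replace each raw feature value by its normalized position in sorted order."""
    n = len(raw_values)
    # Step 1: sort the objects by the raw feature (stable, so ties keep input order).
    order = sorted(range(n), key=lambda i: raw_values[i])
    # Step 2: assign a ranked feature from the position in the sorted list.
    ranks = [0.0] * n
    for position, index in enumerate(order):
        ranks[index] = position / (n - 1) if n > 1 else 0.0
    return ranks


# Hypothetical raw densities for the same four objects on two plates whose
# responses differ by a scale factor and an offset.
plate_a = [120.0, 300.0, 45.0, 210.0]
plate_b = [2.0 * value + 10.0 for value in plate_a]

ranks_a = rank_feature(plate_a)
ranks_b = rank_feature(plate_b)
print(ranks_a)
print(ranks_a == ranks_b)  # True
```

Any monotonic difference in plate response leaves the ranks unchanged, which is why ranks help with plate-to-plate variation.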
