SDSS Data Release 6 Access to DR6 and SEGUE Catalog Data



Presentation Transcript


  1. SDSS Data Release 6: Access to DR6 and SEGUE Catalog Data • Ani Thakar, Alex Szalay, Maria Nieto-Santisteban, Nolan Li, Wil O’Mullane, Adrian Pope, Tamas Budavari, George Fekete, Jordan Raddick, Sam Carliles (JHU) • Brian Yanny, Svetlana Lebedeva (FermiLab) • Jim Gray (Microsoft Research)

  2. Outline • SDSS and Data Overview • SDSS-II and DR6 • CAS Data Access • SkyServer, ImgCutout, CasJobs • Help resources and sample queries • Restricted collab access • SDSS and other datasets • VO services • EPO content JHU CAS Seminar, March 6, 2007

  3. SDSS • Digital map in 5 spectral bands covering ¼ of the sky • 40+ TB of raw pixel data • Photometric catalog with more than 200 million objects • Spectra of ~1 million objects • Data Release 5 (DR5) last public release: 240 M images, 740 k spectra • JHU contributions: • Multi-Fiber Spectrograph • 20” Photometric Telescope • Catalog Archive Server DBMS • All data is served from FermiLab (master archive site) • SDSS-II is the continuation of SDSS through 2008 • [Images: Apache Point Observatory, NM; Multi-Fiber Spectrograph (JHU); Catalog Archive Server (JHU)]

  4. SDSS Data Overview • Catalog Archive Server (CAS) • Science parameters extracted to catalogs • Stuffed into relational DBMS (SQL Server) • Heavily indexed, optimized • Online access via SkyServer • Several levels of access, query tools • Data Archive Server (DAS) • FITS files (raw data) • Images, spectra, corrected frames, atlas images, binned images, masks • Online form-based access • Rsync and wget file retrieval • [Diagram: SDSS Data Release, with sites www.sdss.org, cas.sdss.org, skyserver.sdss.org, das.sdss.org/DRx-cgi-bin/DAS]

  5. SDSS Data Releases • [Figure: timeline of data releases EDR, DR1, DR2, DR3, DR4, DR5 and (DR6), on an axis running from Jan 2001 to Jan 2007 with January and June ticks; releases are labeled with a percentage and a data volume: 5%, 200GB; 20%, 1TB; 35%, 2TB; 52%, 3TB; 66%, 3.8TB; 80%, 4.5TB]

  6. SDSS-II • Legacy • Continuation of SDSS-I (fill out 10k sq.deg.) • Completeness is same as for SDSS-I • Flux limits are the same • Target all galaxies with r_petro < 17.77, plus LRGs • SEGUE • Detailed 3-d map of the Galaxy • Spectra of 240,000 stars in the disk and spheroid • Age, composition and phase space distribution • CAS component cataloged in SegueDRx DB • Supernova Survey • Repeated scans of SDSS Southern Stripe over 3 mths/yr • Data not available in CAS yet, will be on DAS soon

  7. Publication Policy • Click on Collaboration link on www.sdss.org • Scroll to bottom • Click on SDSS Publication Procedures • Proprietary data • Announce project to SDSS Projects Page • Add SDSS credits/acknowledgements to papers • Reference SDSS Technical Papers • Post manuscript to SDSS Publications Page • External Collaborators and Participants • Post requests to sdss-coco and sdss-general mailing lists • Ask your local CoCo rep (he won’t bite)

  8. CAS Datasets • BestDRx • Latest, greatest calibration of the data • Photometric and spectroscopic objects • The default and most accessed (by far) dataset • TargDRx • The calibration from which spectroscopic targets were chosen • RUNS • All the runs (processings) other than Best, Target • SegueDRx • New with DR6 (SDSS-II)

  9. CAS Data Model (Best DB)

  10. DR6 CAS • BestDR6, TargDR6 and SegueDR6 databases • SegueDR6 = SEGUE stripes • May be rolled into BestDRx in the future • SkyServer: http://cas.sdss.org/collabdr6 • Also pwd access at http://cas.sdss.org/collabdr6pw for non-collab IPs • See message sdss-archive/2935 for uname/pwd • CasJobs: • DR6 and DR6QA targets point to BestDR6 DB • TARGDR6, SEGUEDR6 targets • Need to have “collab” privilege set in user profile

  11. CAS Data Access • SkyServer • Web browser-based synchronous access • Meant to support several levels of users • From casual to moderately advanced queries • From simple form-based to direct SQL queries • From cone (radial) search to crossid type searches • Visual tools to browse image and catalog data • API access, e.g. emacs interface, sqlcl (command-line) • Strict limits on execution time and output size • Fair use for everyone, robots/crawlers discouraged • ImgCutout • Finding Chart and JPEG image browser • Accessible from SkyServer (Visual Tools)
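
A cone (radial) search, as mentioned on the slide above, returns every object within a given angular radius of a sky position. The following is only a minimal client-side sketch of that idea, not how SkyServer implements it: the catalog rows, the function names and the 1 arcmin radius are all made up for illustration.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: numerically stable for the small separations typical of cone searches.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(objects, ra0, dec0, radius_arcmin):
    """Return the objects lying within radius_arcmin of the position (ra0, dec0)."""
    radius_deg = radius_arcmin / 60.0
    return [o for o in objects
            if angular_separation_deg(ra0, dec0, o["ra"], o["dec"]) <= radius_deg]

# Hypothetical catalog rows (ids and coordinates are invented).
catalog = [
    {"id": 1, "ra": 180.000, "dec": 0.000},
    {"id": 2, "ra": 180.010, "dec": 0.005},
    {"id": 3, "ra": 181.000, "dec": 1.000},
]
hits = cone_search(catalog, 180.0, 0.0, radius_arcmin=1.0)
print([o["id"] for o in hits])
```

A linear scan like this is only reasonable for small uploaded lists; for catalog-scale searches the server-side radial search form or SQL is the appropriate tool.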

  12. CasJobs • Link in SkyServer (http://cas.sdss.org/casjobs) • Batch Query Workbench, personal user DB (MyDB) • Quick mode: 1 minute cutoff • Submit mode: up to 8 hours in “long” queue • 24-hr queue for collab members • Preferred method for serious queries • MyDB database to save results of your queries • Define your own functions, procedures too • Share your tables with collaborators (groups) • Job history, plotting, FITS/CSV/VOTable output • Table Import (upload) for your own data • Groups to share your results with collaborators • Command-line access Java tool also downloadable

  13. Using CasJobs • Every query has a default “target” • The database that it will operate on • e.g., MyDB, DR4, DR5, CollabDR4, DR5QA, BESTRUNS etc. • Each target is hosted on a separate server • Provides load balancing and performance • Some quirks/restrictions due to distributed execution • Help page and FAQ explain these • Ability to do distributed joins between different datasets • e.g., between DR4 and DR5 or RUNS and DR5

  14. Collab-only access • collabdrx SkyServer sites • IP-restricted access to collabdrx URL • Password access to collabdrxpw URL from other IPs • Larger query limits (e.g., 1 hour/500k rows) • “collab” privilege in CasJobs • Gives you access to restricted data, additional longer queues (e.g. DR5QA, DR6QA 24-hr queues) • If you have collab priv set, you will see these queues • If you don’t have it, email sdss-helpdesk@fnal.gov

  15. Data available only to Collab - RUNS • RUNS DB • SkyServer: http://cas.sdss.org/runs • Also http://cas.sdss.org/runspw for pwd access • CasJobs: BESTRUNS context • Mostly SEGUE (half) runs and stripe 82 (most) • DRx runs still being added over the next few months • Imaging only, no spectra • May be possible to link to BEST spectra with join • Use match tables to match up repeat observations in multiple runs

  16. SkyServer Help Resources • Help menu option on top right of SkyServer • Start with Archive Intro • Next look at Query Limits and How To pages • Then Introduction to SQL and Sample Queries • Look at Optimizing Queries page (esp. bookmark bug) • Try out some of the sample queries • Cut and paste to SQL search page (Tools → Search → SQL) • Browse FAQ and Schema Browser • Data release and technical papers • Glossary, Table Descriptions and Algorithms • Searchable, dynamically loaded from DB, interlinked • The icon symbols next to entries are links to Glossary, Algorithm and Table Description entries

  17. CasJobs Help • Shares SQL Intro, Schema Browser with SkyServer • Has its own FAQ page • Lists differences between CasJobs and SkyServer due to distributed query execution • Advanced CasJobs Queries page • Neighbor searches with fixed and variable search radii • Cursors • Compound queries

  18. Sample Queries • 50+ sample queries from simple to complex • Available in SkyServer and CasJobs • Clean photometry meta-flags sample • INNER/OUTER JOIN samples • Sector/Region tables usage sample • Variability queries from Robert and Zeljko • CasJobs Advanced Queries Help page • Has examples of neighbor searches, cursors etc.

  19. SkyServer General Tips • Use astro or collab sites • Less “frills”, more direct access to tools • More generous query limits (timeouts, row limits) • See Help → Query Limits page • Collab site is restricted access, largest query limits • Some extra features • e.g. Imaging/Spectro form query • Each release has separate sites • http://cas.sdss.org/collab/ (the public release) • http://cas.sdss.org/collabdr6/ (not yet public) • Use Contact link when emailing help-desk

  20. Find the right tool for the job • Visual exploration: Tools → Visual Tools • Browse objects one at a time: Explore page • Shows all parameters for object, also its image and spectrum • Browse and find objects on a frame: Finding Chart • Navigate image frames: Navigate • View multiple objects with query: Image List • Browse images: Tools → Get Images • Frames: Fields browser • Spectroscopic plates: Plates browser • View individual spectra: Spectra browser

  21. Finding the right tool (contd.) • SQL search: Tools → Search • Cone (radial) search: Radial search form • Region (rectangle) search: Rectangular search form • Imaging form query: Imaging Query form • Spectroscopic form query: Spectro Query form • All other searches: SQL search page • Cross-matching: Tools → Object Crossid • Imaging crossid: Upload • Spectro crossid: SpecList • Advanced, unrestricted SQL queries: CasJobs • Your own personal DB • Retrieve results when you are ready

  22. CAS Dos and Don’ts • Do not submit a query unless you have some idea how long it will take! • It could tie up the server for hours (sometimes days)! • Do a “count” query first if necessary • CasJobs also has a graphical query plan (Plan button) • Look at samples, query optimization pages • If not sure, use form queries at first • Use the predefined views for unique/primary objects • PhotoObj, PhotoPrimary for photometry • Consider using PhotoTag table if you only need popular fields • Makes better use of cache • SpecObj for spectra
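
The "count query first" advice above can be sketched as follows. This is an illustration under stated assumptions, not the CAS itself: an in-memory SQLite table stands in for the SQL Server database, the four rows are invented, and the helper name `count_then_fetch` is hypothetical. The table name PhotoTag and the r_petro < 17.77 cut come from the slides; the column names `objID`, `ra`, `dec` and `petroMag_r` are assumed to follow the SDSS schema, and the 500k row limit is the collab-site example quoted earlier.

```python
import sqlite3

# Stand-in database: SQLite in memory instead of the real CAS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PhotoTag (objID INTEGER, ra REAL, dec REAL, petroMag_r REAL)"
)
conn.executemany(
    "INSERT INTO PhotoTag VALUES (?, ?, ?, ?)",
    [
        (1, 180.01, 0.10, 17.2),   # invented rows
        (2, 180.05, 0.20, 18.5),
        (3, 180.09, -0.10, 16.9),
        (4, 190.00, 0.00, 17.0),
    ],
)

# Fixed, trusted predicate; never build this string from untrusted user input.
WHERE = "petroMag_r < 17.77 AND ra BETWEEN 180.0 AND 180.1"

def count_then_fetch(conn, where, max_rows):
    """Run a cheap COUNT(*) first; fetch the rows only if the result is small enough."""
    n = conn.execute(f"SELECT COUNT(*) FROM PhotoTag WHERE {where}").fetchone()[0]
    if n > max_rows:
        return n, None  # too large: tighten the cut or move to a batch queue
    rows = conn.execute(
        f"SELECT objID, ra, dec, petroMag_r FROM PhotoTag WHERE {where} ORDER BY objID"
    ).fetchall()
    return n, rows

n, rows = count_then_fetch(conn, WHERE, max_rows=500_000)
print(n, [r[0] for r in rows])
```

Selecting only the columns that are needed, from the narrower PhotoTag table, mirrors the slide's point about making better use of the cache.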

  23. Dos and Don’ts (contd.) • Use the Contact link to contact Help Desk • Fill the short form, which gives us necessary information • In CasJobs, press Contact after logging in • Automatically attaches your userid to the message • Will speed up response to your request • Do not contact Help Desk staff directly • Questions are answered by a pool of experts as available • More likely to get delayed or no response (unless you can bug them in person) • If you run out of MyDB space, ask for more!! • We’re pretty liberal about giving more space, but you have to ask (to avoid empty/unused MyDBs taking up space)

  24. SDSS and other datasets • GALEX • Has its own CasJobs page hosted by MAST • SDSS vs GALEX cross-matches • DR5 vs GR2 and DR4 vs GR2 available now • DR5 vs GR3 coming soon • Link table with IDs from both catalogs • SDSS parameters for GALEX matches also extracted • Some older datasets matched in BEST DB • FIRST, USNOB, ROSAT, USNOB proper motions • Open SkyQuery site for other datasets • Only small-area xmatches possible at the moment

  25. Virtual Observatory • JHU is one of the main participants • SDSS is one of the drivers for NVO • Co-PI (Szalay) and Project Manager (Hanisch) here • JHU VO services • Open SkyQuery (http://openskyquery.net/) • VO services (http://voservices.org/) • Spectrum services • Filter profiles • Footprint services (new!) • VO registry (with STScI) • Standard VO services: Cone Search, SIAP
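
The standard VO Cone Search service named above is queried with an HTTP GET request carrying a position and a search radius. The sketch below builds such a request URL, assuming the IVOA Simple Cone Search parameter names `RA`, `DEC` and `SR` (all in decimal degrees). The base URL is a placeholder, not a real JHU endpoint, and the function name is hypothetical.

```python
from urllib.parse import urlencode

def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search style GET URL (RA, DEC, SR in decimal degrees)."""
    params = {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg}
    # Append to an existing query string if the service URL already has one.
    sep = "&" if "?" in base_url else "?"
    return base_url + sep + urlencode(params)

# Placeholder endpoint for illustration only.
url = cone_search_url("http://example.org/conesearch", 180.0, 0.5, 0.1)
print(url)
```

A conforming service is expected to answer such a request with a VOTable; fetching and parsing that response is left out here.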

  26. Educational Resources • Extensive EPO content in SkyServer • Use Projects link on the menu at the top • K-12 and college level student exercises and teacher resources • Open SkyQuery / Cross-match EPO project • Jordan Raddick’s talk on May 1

  27. Coming Attractions (or not …) • FITS cutout service • VO-India is developing a service • There are technical difficulties with mosaicing multiple SDSS frames • How to handle different S/N, PSFs across frames? • Non-SDSS datasets in CasJobs • Merging of CasJobs and Open SkyQuery features • Ability to do large-scale cross-matches with other datasets within CasJobs environment

  28. Thanks!
      http://www.sdss.org (main site and DAS)
      http://cas.sdss.org/ (public CAS, redirected to latest public release)
      http://cas.sdss.org/drX (public CAS, release X)
      http://cas.sdss.org/astro (astronomers, latest pub release)
      http://cas.sdss.org/astrodrX (astronomers, release X)
      http://cas.sdss.org/collab (IP-restricted collab site, latest pub release)
      http://cas.sdss.org/collabdrX (IP-restricted collab site, release X)
      http://cas.sdss.org/collabpw (collab pwd access, latest pub release)
      http://cas.sdss.org/collabdrXpw (collab pwd access, release X)
      http://www.voservices.org/ (VO services @ JHU)
      http://www.openskyquery.net/ (Open SkyQuery)
      http://www.skyserver.org (support site): software downloads, mirror site resources, data download info