Big Data
Big Data • Steven Gollmer • Cedarville University
Working with Large Data • Accessing data • Collection and calibration assumptions • Selecting appropriate parameters • Formatting • Calculation • Testing hypotheses
Hipparcos Space Astrometry • Main Page • http://www.rssd.esa.int/index.php?project=HIPPARCOS • Data Catalogues • http://www.rssd.esa.int/index.php?project=HIPPARCOS&page=Overview • http://cdsweb.u-strasbg.fr/ • Software • Desktop - http://www.rssd.esa.int/index.php?project=HIPPARCOS&page=Celestia2000 • Search tool - http://www.rssd.esa.int/index.php?project=HIPPARCOS&page=multisearch2 • Data Format • Flexible Image Transport System (FITS) - http://fits.gsfc.nasa.gov/
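To make the FITS data format a little more concrete, here is a minimal sketch of how a FITS header is laid out: 80-character ASCII "cards" packed into 2880-byte blocks and terminated by an END card. The function name `parse_fits_header` and the synthetic header are assumptions made for illustration; they are not part of the Hipparcos tools. The parser is deliberately simplified: it keeps every value as a string and would mishandle a string value that itself contains "/".

```python
def parse_fits_header(raw: bytes) -> dict:
    """Parse keyword/value cards from the start of a FITS byte stream.

    Simplified sketch: values stay as strings, and inline comments are
    dropped by splitting on the first "/".
    """
    card_length = 80  # every FITS header card is 80 ASCII characters
    header = {}
    for start in range(0, len(raw), card_length):
        card = raw[start:start + card_length].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break  # the END card closes the header
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY, and blank cards carry no value
        value = card[10:].split("/", 1)[0].strip()
        header[keyword] = value.strip("'").strip()
    return header


# Hypothetical header built in memory, padded to one 2880-byte block.
cards = [
    "SIMPLE  =                    T / conforms to FITS standard",
    "BITPIX  =                   16",
    "NAXIS   =                    0",
    "END",
]
raw = "".join(c.ljust(80) for c in cards).encode("ascii").ljust(2880, b" ")
header = parse_fits_header(raw)
print(header)
```

For real catalogue files, an established library such as `astropy.io.fits` is a more robust choice than hand-parsing.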
Sloan Digital Sky Survey • Main Page • http://www.sdss.org/ • Data • 9th Data Release - http://www.sdss3.org/dr9/ • Archive Server - http://dr9.sdss3.org/ • Software • IDL - http://www.sdss3.org/dr9/software/
Weather Data • NOAA National Climatic Data Center • http://www.ncdc.noaa.gov/ • Popular Data - http://www.ncdc.noaa.gov/most-popular-data • Environmental Modeling Center • http://www.emc.ncep.noaa.gov/
TERRA/AQUA • http://terra.nasa.gov • http://aqua.nasa.gov • Data • LARC DAAC - http://eosweb.larc.nasa.gov/ • LAADS Web - http://ladsweb.nascom.nasa.gov/index.html • Format • NetCDF - http://www.unidata.ucar.edu/software/netcdf/ • HDF - http://www.hdfgroup.org/
Other Topics of Interest • Topics of Interest • Extra-Solar Planets • Asteroid Mapping and Near Earth Detection • Earthquakes • Agencies and Products • NASA - http://www.nasa.gov/home/index.html • ESA - http://www.esa.int/ESA • USGS - http://www.usgs.gov/ • GOES - http://www.goes.noaa.gov/ • Paleoclimatology - http://www.ncdc.noaa.gov/paleo/pubs/pcn/pcn-proxy.html
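Hypothesis Testing • P-value • Probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. • Usually reject the null hypothesis if p < 0.05 or p < 0.01 (5% or 1%). • Some fields use more stringent criteria for rejection. • t-test • Assumes the data are normally distributed • One-sample test • Two-sample test • Check significance using a t-distribution table • If the number of samples is large, a z-test can be used in place of the one-sample t-test • $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$ • One Tail: $P(Z \le x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(x/\sqrt{2}\right)\right]$ • Two Tail: $P(|Z| \le x) = \operatorname{erf}\left(x/\sqrt{2}\right)$ • The corresponding p-value is 1 minus the probability shown.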
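The erf-based expressions on the hypothesis testing slide can be turned into a short calculation. The sketch below assumes a large-sample, one-sample z-test in which the sample standard deviation stands in for the population value. The function names and the example measurements are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, stdev


def normal_cdf(z: float) -> float:
    """Standard normal cumulative probability P(Z <= z), via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_sample_z_test(sample, mu0):
    """Return (z, one-tailed p, two-tailed p) for H0: population mean == mu0.

    Uses the sample standard deviation, which is only a reasonable
    substitute for the population value when the sample is large.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    p_one = 1.0 - normal_cdf(abs(z))                  # area beyond |z| in one tail
    p_two = 1.0 - math.erf(abs(z) / math.sqrt(2.0))   # area beyond |z| in both tails
    return z, p_one, p_two


# Hypothetical measurements tested against a null-hypothesis mean of 10.0.
# Six values is far too few for a real z-test; the list is short for readability.
sample = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
z, p_one, p_two = one_sample_z_test(sample, 10.0)
print(f"z = {z:.3f}, one-tailed p = {p_one:.3f}, two-tailed p = {p_two:.3f}")
```

The null hypothesis would be rejected at the 5% level only if the chosen p-value falls below 0.05. For small samples, the t distribution should be used instead of the normal approximation.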