Python Crash Course File I/O

Python Crash Course File I/O. Sterrenkundig Practicum 2 V1.0 dd 08-01-2014 Hour 5. File I/O. Types of input/output available Interactive Keyboard Screen Files Ascii/text txt csv Binary Structured FITS > pyFITS, astropy.io.fits URL Pipes. Interactive I/O, fancy output.

kail

Presentation Transcript


  1. Python Crash Course: File I/O. Sterrenkundig Practicum 2, V1.0, dd 08-01-2014, Hour 5

  2. File I/O
     • Types of input/output available
       • Interactive
         • Keyboard
         • Screen
       • Files
         • Ascii/text
           • txt
           • csv
         • Binary
         • Structured
           • FITS > pyFITS, astropy.io.fits
       • URL
       • Pipes

  3. Interactive I/O, fancy output
     >>> s = 'Hello, world.'
     >>> str(s)
     'Hello, world.'
     >>> repr(s)
     "'Hello, world.'"
     >>> str(1.0/7.0)
     '0.142857142857'
     >>> repr(1.0/7.0)
     '0.14285714285714285'
     >>> x = 10 * 3.25
     >>> y = 200 * 200
     >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
     >>> print s
     The value of x is 32.5, and y is 40000...
     >>> # The repr() of a string adds string quotes and backslashes:
     ... hello = 'hello, world\n'
     >>> hellos = repr(hello)
     >>> print hellos
     'hello, world\n'
     >>> # The argument to repr() may be any Python object:
     ... repr((x, y, ('spam', 'eggs')))
     "(32.5, 40000, ('spam', 'eggs'))"
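
The slide above uses Python 2. As a hedged sketch, the same str()/repr() distinction in Python 3 syntax could look as follows. The names `readable`, `unambiguous` and `summary` are chosen for this sketch. Note that in Python 3 str() of a float no longer truncates digits, so str(1.0/7.0) and repr(1.0/7.0) give the same text.

```python
# Python 3 sketch of str() versus repr(); x, y and hello follow the slide.
hello = 'hello, world\n'

readable = str(hello)        # the text itself, newline included
unambiguous = repr(hello)    # adds quotes and escapes the newline

x = 10 * 3.25
y = 200 * 200
summary = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'

print(unambiguous)
print(summary)
print(repr((x, y, ('spam', 'eggs'))))
```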

  4. Interactive I/O, fancy output
     Old string formatting
     >>> import math
     >>> print 'The value of PI is approximately %5.3f.' % math.pi
     The value of PI is approximately 3.142.
     New string formatting
     >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
     >>> for name, phone in table.items():
     ...     print '{0:10} ==> {1:10d}'.format(name, phone)
     ...
     Jack       ==>       4098
     Dcab       ==>       7678
     Sjoerd     ==>       4127
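
A minimal Python 3 sketch of the two formatting styles from this slide. The names `old_style` and `rows` are introduced here for illustration, and the dictionary is sorted only to make the output order predictable.

```python
import math

# Old-style (%) formatting: width 5, three digits after the decimal point.
old_style = 'The value of PI is approximately %5.3f.' % math.pi

# New-style str.format(): the name is left-aligned in 10 columns,
# the phone number is right-aligned in 10 columns.
table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
rows = ['{0:10} ==> {1:10d}'.format(name, phone)
        for name, phone in sorted(table.items())]

print(old_style)
for row in rows:
    print(row)
```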

  5. Formatting I/O
     A conversion specifier contains two or more characters and has the following components, which must occur in this order:
     • The "%" character, which marks the start of the specifier.
     • Mapping key (optional), consisting of a parenthesised sequence of characters (for example, (somename)).
     • Conversion flags (optional), which affect the result of some conversion types.
     • Minimum field width (optional). If specified as an "*" (asterisk), the actual width is read from the next element of the tuple in values, and the object to convert comes after the minimum field width and optional precision.
     • Precision (optional), given as a "." (dot) followed by the precision. If specified as "*" (an asterisk), the actual precision is read from the next element of the tuple in values, and the value to convert comes after the precision.
     • Length modifier (optional).
     • Conversion type.
     >>> print '%(language)s has %(#)03d quote types.' % \
     ...       {'language': "Python", "#": 2}
     Python has 002 quote types.
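
The "*" rule for width and precision, and the mapping key, can be sketched as follows in Python 3. The key name `count` and the variable names are assumptions made for this example, not part of the slide.

```python
import math

# '*' takes the field width and then the precision from the values tuple,
# in that order, before the value that is actually converted.
padded = '%*.*f' % (8, 3, math.pi)

# A mapping key selects the value from a dictionary by name.
# The key 'count' is a hypothetical name chosen for this sketch.
sentence = '%(language)s has %(count)03d quote types.' % {
    'language': 'Python', 'count': 2}

print(repr(padded))
print(sentence)
```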

  6. The conversion types are:
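
The table from this slide is not reproduced in the transcript. As a partial illustration only, the sketch below exercises a handful of the printf-style conversion types that Python supports; the dictionary name `examples` is introduced here.

```python
# A few printf-style conversion types (not the full table from the slide).
examples = {
    'd': '%d' % 42,            # signed integer decimal
    'o': '%o' % 8,             # octal
    'x': '%x' % 255,           # hexadecimal, lowercase
    'e': '%e' % 12345.678,     # exponential notation
    'f': '%f' % 2.5,           # fixed-point decimal, six digits by default
    'g': '%g' % 0.00001234,    # general format, exponential for tiny values
    'c': '%c' % 65,            # single character from an integer code
    's': '%s' % 'text',        # converted with str()
    'r': '%r' % 'text',        # converted with repr()
    '%': '100%%' % (),         # literal percent sign
}
for code, result in examples.items():
    print(code, result)
```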

  7. Interactive I/O
     >>> print "Python is great,", "isn't it?"
     >>> str = raw_input("Enter your input: ")
     >>> print "Received input is: ", str
     Enter your input: Hello Python
     Received input is: Hello Python
     >>> str = input("Enter your input: ")  # input() evaluates the typed text as a Python expression
     >>> print "Received input is: ", str
     Enter your input: [x*5 for x in range(2,10,2)]
     Received input is: [10, 20, 30, 40]
     If the readline module was loaded, raw_input() will use it to provide elaborate line editing and history features.
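
In Python 3, input() behaves like the raw_input() shown above and always returns a string; the evaluating variant no longer exists. The sketch below swaps in a different technique, ast.literal_eval, which accepts only literals (so a list such as [10, 20, 30, 40] works, but a list comprehension does not). The helper name `parse_reply` is hypothetical.

```python
import ast


def parse_reply(text):
    """Turn typed text into a Python literal, or return it unchanged."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a literal (for example plain words): keep the raw string.
        return text


print(parse_reply('[10, 20, 30, 40]'))
print(parse_reply('Hello Python'))
```

In an interactive script this would typically be called as parse_reply(input("Enter your input: ")).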

  8. File I/O
     >>> fname = 'myfile.dat'
     >>> f = file(fname)
     >>> lines = f.readlines()
     >>> f.close()
     >>> f = file(fname)
     >>> firstline = f.readline()
     >>> secondline = f.readline()
     >>> f.close()
     >>> f = file(fname)
     >>> for l in f:
     ...     print l.split()[1]
     ...
     >>> f.close()
     >>> outfname = 'myoutput'
     >>> outf = file(outfname, 'w')  # second argument denotes writable
     >>> outf.write('My very own file\n')
     >>> outf.close()
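
The file() builtin used above exists only in Python 2. A hedged Python 3 sketch of the same steps with open() follows; the file contents and the temporary directory are invented for the example.

```python
import os
import tempfile

# Write a small two-column file into a temporary directory, then read it
# back. The file name follows the slide; its contents are illustrative.
tmpdir = tempfile.mkdtemp()
fname = os.path.join(tmpdir, 'myfile.dat')

with open(fname, 'w') as outf:          # 'w' opens the file for writing
    outf.write('1 10.5\n2 11.0\n3 12.25\n')

with open(fname) as f:                  # default mode is 'r' (read text)
    firstline = f.readline()            # one line, newline included
    rest = f.readlines()                # the remaining lines as a list

with open(fname) as f:
    # Iterating over the file yields one line at a time.
    second_column = [line.split()[1] for line in f]

print(firstline, rest, second_column)
```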

  9. Read File I/O
     >>> f = open("test.txt")
     >>> # Read everything into single string:
     >>> content = f.read()
     >>> len(content)
     >>> print content
     >>> f.read()  # At End Of File
     >>> f.close()
     >>> # f.read(20) reads (at most) 20 bytes
     Using with block:
     >>> with open('test.txt', 'r') as f:
     ...     content = f.read()
     >>> f.closed
     CSV file:
     >>> import csv
     >>> ifile = open('photoz.csv', "r")
     >>> reader = csv.reader(ifile)
     >>> for row in reader:
     ...     print row,
     >>> ifile.close()
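
The outputs are not shown on the slide. The following Python 3 sketch, using an invented two-line file, shows what read() returns at the end of the file and that the with block closes the file on exit.

```python
import os
import tempfile

# Create a small test file; its contents are made up for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, 'w') as f:
    f.write('first line\nsecond line\n')

with open(path, 'r') as f:
    head = f.read(5)        # at most 5 characters in text mode
    content = f.read()      # everything that is left
    at_eof = f.read()       # empty string once the end is reached

# Leaving the with block closes the file automatically.
print(repr(head), repr(content), repr(at_eof), f.closed)
```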

  10. Read and write text file
     >>> from numpy import *
     >>> data = loadtxt("myfile.txt")  # myfile.txt contains 4 columns of numbers
     >>> t,z = data[:,0], data[:,3]  # data is a 2D numpy array, t is 1st col, z is 4th col
     >>> t,x,y,z = loadtxt("myfile.txt", unpack=True)  # to automatically unpack all columns
     >>> t,z = loadtxt("myfile.txt", usecols = (0,3), unpack=True)  # to select just a few columns
     >>> data = loadtxt("myfile.txt", skiprows = 7)  # to skip 7 rows from top of file
     >>> data = loadtxt("myfile.txt", comments = '!')  # use '!' as comment char instead of '#'
     >>> data = loadtxt("myfile.txt", delimiter=';')  # use ';' as column separator instead of whitespace
     >>> data = loadtxt("myfile.txt", dtype = int)  # file contains integers instead of floats
     >>> from numpy import *
     >>> savetxt("myfile.txt", data)  # data is 2D array
     >>> savetxt("myfile.txt", x)  # if x is 1D array then get 1 column in file.
     >>> savetxt("myfile.txt", (x,y))  # x,y are 1D arrays. 2 rows in file.
     >>> savetxt("myfile.txt", transpose((x,y)))  # x,y are 1D arrays. 2 columns in file.
     >>> savetxt("myfile.txt", transpose((x,y)), fmt='%6.3f')  # use new format instead of '%.18e'
     >>> savetxt("myfile.txt", data, delimiter = ';')  # use ';' to separate columns instead of space

  11. String formatting for output
     >>> sigma = 6.76/2.354
     >>> print('sigma is %5.3f metres' % sigma)
     sigma is 2.872 metres
     >>> d = {'bob': 1.87, 'fred': 1.768}
     >>> for name, height in d.items():
     ...     print('%s is %.2f metres tall' % (name.capitalize(), height))
     ...
     Bob is 1.87 metres tall
     Fred is 1.77 metres tall
     >>> nsweets = range(100)
     >>> calories = [i * 2.345 for i in nsweets]
     >>> fout = file('sweetinfo.txt', 'w')
     >>> for i in nsweets:  # nsweets is already a sequence, so iterate over it directly
     ...     fout.write('%5i %8.3f\n' % (nsweets[i], calories[i]))
     ...
     >>> fout.close()
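
As a hedged Python 3 variant of the file-writing loop on this slide, the sketch below pairs the two sequences with zip() and writes to a temporary location. The shortened range and the path are choices made for the example.

```python
import os
import tempfile

nsweets = range(5)                          # shortened from 100 for the sketch
calories = [i * 2.345 for i in nsweets]

path = os.path.join(tempfile.mkdtemp(), 'sweetinfo.txt')
with open(path, 'w') as fout:
    # zip() walks both sequences together, so no index is needed.
    for n, cal in zip(nsweets, calories):
        fout.write('%5i %8.3f\n' % (n, cal))

with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```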

  12. File I/O, CSV files
     • CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases.
     • Functions
       • csv.reader
       • csv.writer
       • csv.register_dialect
       • csv.unregister_dialect
       • csv.get_dialect
       • csv.list_dialects
       • csv.field_size_limit

  13. File I/O, CSV files
     • Reading CSV files
     import csv                       # imports the csv module
     f = open('data1.csv', 'rb')      # opens the csv file
     try:
         reader = csv.reader(f)       # creates the reader object
         for row in reader:           # iterates over the rows of the file in order
             print row                # prints each row
     finally:
         f.close()                    # closes the file
     • Writing CSV files
     import csv
     ifile = open('test.csv', "rb")
     reader = csv.reader(ifile)
     ofile = open('ttest.csv', "wb")
     writer = csv.writer(ofile, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)
     for row in reader:
         writer.writerow(row)
     ifile.close()
     ofile.close()
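
The 'rb' and 'wb' modes above are the Python 2 convention. In Python 3 the csv module expects text-mode files opened with newline=''. A minimal sketch of the same copy operation follows; the sample rows and the temporary directory are invented for the example.

```python
import csv
import os
import tempfile

tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'test.csv')
dst = os.path.join(tmpdir, 'ttest.csv')

# Create a small comma-separated input file (sample data).
with open(src, 'w', newline='') as f:
    csv.writer(f).writerows([['name', 'mag'], ['NGC1001', '11.1']])

# Copy comma-separated input to tab-separated, fully quoted output.
with open(src, newline='') as ifile, open(dst, 'w', newline='') as ofile:
    reader = csv.reader(ifile)
    writer = csv.writer(ofile, delimiter='\t', quotechar='"',
                        quoting=csv.QUOTE_ALL)
    for row in reader:
        writer.writerow(row)

with open(dst, newline='') as f:
    copied = f.read()
print(copied)
```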

  14. File I/O, CSV files
     • The csv module contains the following quoting options.
       • csv.QUOTE_ALL: quote everything, regardless of type.
       • csv.QUOTE_MINIMAL: quote fields with special characters.
       • csv.QUOTE_NONNUMERIC: quote all fields that are not integers or floats.
       • csv.QUOTE_NONE: do not quote anything on output.
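
To compare the four quoting options side by side, the Python 3 sketch below writes the same row with each of them into an in-memory buffer. The helper `render` and the sample row are hypothetical.

```python
import csv
import io


def render(row, quoting):
    """Write one row with the given quoting mode and return the text."""
    buffer = io.StringIO()
    # escapechar lets QUOTE_NONE escape the embedded comma
    # instead of raising an error.
    writer = csv.writer(buffer, quoting=quoting, escapechar='\\',
                        lineterminator='\n')
    writer.writerow(row)
    return buffer.getvalue()


row = ['NGC 891', 12.5, 'edge-on, spiral']
for mode in (csv.QUOTE_ALL, csv.QUOTE_MINIMAL,
             csv.QUOTE_NONNUMERIC, csv.QUOTE_NONE):
    print(render(row, mode), end='')
```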

  15. Handling FITS files - PyFITS
     http://www.stsci.edu/resources/software_hardware/pyfits
     Read, write and manipulate all aspects of FITS files:
     • extensions
     • headers
     • images
     • tables
     Low-level interface for details
     High-level functions for quick and easy use

  16. PyFITS - reading
     >>> import pyfits
     >>> imgname = "testimage.fits"
     >>> img = pyfits.getdata(imgname)
     >>> img
     array([[2408, 2408, 1863, ..., 3660, 3660, 4749],
            [2952, 2408, 1863, ..., 3660, 3115, 4204],
            [2748, 2748, 2204, ..., 4000, 3455, 4000],
            ...,
            [2629, 2901, 2357, ..., 2261, 2806, 2261],
            [2629, 2901, 3446, ..., 1717, 2261, 1717],
            [2425, 2697, 3242, ..., 2942, 2125, 1581]], dtype=int16)
     >>> img.mean()
     4958.4371977768678
     >>> img[img > 2099].mean()
     4975.1730909593043
     >>> import numpy
     >>> numpy.median(img)
     4244.0

  17. PyFITS – reading FITS images
     >>> x = 348; y = 97
     >>> delta = 5
     >>> print img[y-delta:y+delta+1,
     ...           x-delta:x+delta+1].astype(int)
     [[5473 5473 3567 3023 3295 3295 3839 4384 4282 4282 3737]
      [3295 4384 3567 3023 3295 3295 3295 3839 3737 3737 4282]
      [2478 3567 4112 3023 3295 3295 3295 3295 3397 4486 4486]
      [3023 3023 3023 3023 2750 2750 3839 3839 3397 4486 3941]
      [3295 3295 3295 3295 3295 3295 3839 3839 3397 3941 3397]
      [3295 3295 2750 2750 3295 3295 2750 2750 2852 3397 4486]
      [2887 2887 2887 2887 3976 3431 3159 2614 3125 3669 4758]
      [2887 2887 3431 3431 3976 3431 3159 2614 3669 4214 4214]
      [3159 3703 3159 3703 3431 2887 3703 3159 3941 4486 3669]
      [3703 3159 2614 3159 3431 2887 3703 3159 3397 3941 3669]
      [3431 3431 2887 2887 3159 3703 3431 2887 3125 3669 3669]]
     row = y = first index
     column = x = second index
     Numbering runs as normal (e.g. in ds9) BUT is zero-indexed!

  18. PyFITS – reading FITS tables
     >>> tblname = 'data/N891PNdata.fits'
     >>> d = pyfits.getdata(tblname)
     >>> d.names
     ('x0', 'y0', 'rah', 'ram', 'ras', 'decd', 'decm', 'decs', 'wvl', 'vel', 'vhel',
      'dvel', 'dvel2', 'xL', 'yL', 'xR', 'yR', 'ID', 'radeg', 'decdeg', 'x', 'y')
     >>> d.x0
     array([ 928.7199707 , 532.61999512, 968.14001465, 519.38000488,…
            1838.18994141, 1888.26000977, 1516.2199707 ], dtype=float32)
     >>> d.field('x0')  # case-insensitive
     array([ 928.7199707 , 532.61999512, 968.14001465, 519.38000488,…
            1838.18994141, 1888.26000977, 1516.2199707 ], dtype=float32)
     >>> select = d.x0 < 200
     >>> dsel = d[select]  # can select rows all together
     >>> print dsel.x0
     [ 183.05000305 165.55000305 138.47999573 158.02999878 140.96000671
       192.58000183 157.02999878 160.1499939 161.1000061 136.58999634
       175.19000244]

  19. PyFITS – reading FITS headers
     >>> h = pyfits.getheader(imgname)
     >>> print h
     SIMPLE = T /FITS header
     BITPIX = 16 /No.Bits per pixel
     NAXIS = 2 /No.dimensions
     NAXIS1 = 1059 /Length X axis
     NAXIS2 = 1059 /Length Y axis
     EXTEND = T /
     DATE = '05/01/11 ' /Date of FITS file creation
     ORIGIN = 'CASB -- STScI ' /Origin of FITS image
     PLTLABEL= 'E30 ' /Observatory plate label
     PLATEID = '06UL ' /GSSS Plate ID
     REGION = 'XE295 ' /GSSS Region Name
     DATE-OBS= '22/12/49 ' /UT date of Observation
     UT = '03:09:00.00 ' /UT time of observation
     EPOCH = 2.0499729003906E+03 /Epoch of plate
     PLTRAH = 1 /Plate center RA
     PLTRAM = 26 /
     PLTRAS = 5.4441800000000E+00 /
     PLTDECSN= '+ ' /Plate center Dec
     PLTDECD = 30 /
     PLTDECM = 45 /
     >>> h['KMAGZP']
     >>> h['REGION']
     'XE295'
     # Use h.items() to iterate through all header entries

  20. PyFITS – writing FITS images
     >>> from numpy import sqrt
     >>> # sky, gain and rd_noise are assumed to be defined already
     >>> newimg = sqrt((sky+img)/gain + rd_noise**2) * gain
     >>> newimg[(sky+img) < 0.0] = 1e10
     >>> hdr = h.copy()  # copy header from original image
     >>> hdr.add_comment('Calculated noise image')
     >>> filename = 'sigma.fits'
     >>> pyfits.writeto(filename, newimg, hdr)  # create new file
     >>> pyfits.append(imgname, newimg, hdr)  # add a new FITS extension
     >>> pyfits.update(filename, newimg, hdr, ext)  # update a file
     # specifying a header is optional,
     # if omitted automatically adds minimum header

  21. PyFITS – writing FITS tables
     >>> import pyfits
     >>> import numpy as np
     >>> # create data
     >>> a1 = np.array(['NGC1001', 'NGC1002', 'NGC1003'])
     >>> a2 = np.array([11.1, 12.3, 15.2])
     >>> # make list of pyfits Columns
     >>> cols = []
     >>> cols.append(pyfits.Column(name='target', format='20A', array=a1))
     >>> cols.append(pyfits.Column(name='V_mag', format='E', array=a2))
     >>> # create HDU and write to file
     >>> tbhdu = pyfits.new_table(cols)
     >>> tbhdu.writeto('table.fits')
     # these examples are for a simple FITS file containing just one
     # table or image but with a couple more steps can create a file
     # with any combination of extensions (see the PyFITS manual online)

  22. Introduction to language End
