Retrieving Web Pages (HTTP), Topic 3, Chapter 6

Network Programming, Kansas State University at Salina

jackbrown

Presentation Transcript


  1. Network Programming Kansas State University at Salina Retrieving Web Pages (HTTP), Topic 3, Chapter 6

  2. First, some comments
     • Switch to application protocols
     • Client-side focus
     • Pre-built modules
     • A natural OO thing – a matter of productivity
     • Argh! Someone else’s code
     • Lots of choices, language-independent principles
     • Web-related network programming
     • Chapter 6 – retrieving web pages – easy
     • Chapter 7 – parsing HTML – hard
     • Chapter 8 – XML and XML-RPC – interesting

  3. HTTP Basics
     • Stateless, connectionless protocol
     • Basic GET …

         import socket

         s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
         s.connect(('www.sal.ksu.edu', 80))
         # Each header line ends with CRLF; an empty line ends the request
         request = ("GET /faculty/tim/index.html HTTP/1.0\r\n"
                    "From: tim@sal.ksu.edu\r\n"
                    "User-Agent: Python\r\n"
                    "\r\n")
         s.sendall(request)
         # Everything the server sends (status line, headers, body) goes to the file
         fp = open("index.html", "w")
         while 1:
             data = s.recv(1024)
             if not len(data):
                 break
             fp.write(data)
         s.close()
         fp.close()
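The raw-socket GET on this slide is written for Python 2. A minimal sketch of the same idea in Python 3 follows, where the request must be sent as bytes. The helper names `build_get_request`, `split_response`, and `fetch` are hypothetical, and the `Host` header is an addition not on the slide (HTTP/1.0 does not require it, but many virtual-hosted servers expect it).

```python
import socket


def build_get_request(host, path):
    """Return the bytes of a minimal HTTP/1.0 GET request (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.0".format(path),
        "Host: {}".format(host),      # assumption: not on the original slide
        "User-Agent: Python",
        "",                           # the two empty strings produce the
        "",                           # blank line that ends the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status_line, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, _, header_block = head.partition(b"\r\n")
    headers = {}
    for line in header_block.split(b"\r\n"):
        if not line:
            continue
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("latin-1")] = value.strip().decode("latin-1")
    return status_line.decode("latin-1"), headers, body


def fetch(host, path, port=80):
    """Send the GET request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = s.recv(1024)
            if not data:
                break
            chunks.append(data)
    return split_response(b"".join(chunks))
```

Calling `fetch('www.sal.ksu.edu', '/faculty/tim/index.html')` would perform the request from the slide. Unlike the slide's loop, this sketch separates the status line and headers from the body instead of writing everything to one file.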

  4. Now, for the easy way …

         import sys, urllib2

         page = "http://www.sal.ksu.edu/faculty/tim/"
         req = urllib2.Request(page)
         fd = urllib2.urlopen(req)
         # Read the response in 1 KB chunks and copy it to standard output
         while 1:
             data = fd.read(1024)
             if not len(data):
                 break
             sys.stdout.write(data)
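In Python 3 the `urllib2` module used on this slide was folded into `urllib.request`. The following is a rough equivalent of the slide's chunked read loop under that assumption; the function names `read_in_chunks` and `fetch_page` are illustrative and not from the slides.

```python
import urllib.request


def read_in_chunks(url, chunk_size=1024):
    """Yield the response body for url in pieces of at most chunk_size bytes."""
    with urllib.request.urlopen(url) as fd:
        while True:
            data = fd.read(chunk_size)
            if not data:      # an empty bytes object signals end of data
                break
            yield data


def fetch_page(url):
    """Return the whole response body as a single bytes object."""
    return b"".join(read_in_chunks(url))
```

Something like `sys.stdout.buffer.write(fetch_page("http://www.sal.ksu.edu/faculty/tim/"))` would then reproduce the slide's behavior, since the body arrives as bytes rather than text.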

  5. Submitting with GET

         >>> import urllib
         >>> encoding = urllib.urlencode([('activity', 'water ski'),
         ...                              ('lake', 'Milford'), ('code', 52)])
         >>> print encoding
         activity=water+ski&lake=Milford&code=52
         >>> url = "http://www.example.com" + '?' + encoding
         >>> print url
         http://www.example.com?activity=water+ski&lake=Milford&code=52
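For readers on Python 3, `urlencode` lives in `urllib.parse`. The sketch below repeats the slide's encoding step and then decodes the query string again to show the round trip. The trailing `/` before `?` is a small assumption added here, and note that `parse_qs` returns every value as a list of strings.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Same form fields as on the slide
params = [('activity', 'water ski'), ('lake', 'Milford'), ('code', 52)]

# Spaces become '+', pairs are joined with '&', non-string values are converted
query = urlencode(params)

# Append the query string to the URL after a '?'
url = "http://www.example.com/?" + query

# Decode again: each key maps to a list of string values
decoded = parse_qs(urlsplit(url).query)
```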

  6. Submitting with POST

         >>> import urllib, urllib2
         >>> encoding = urllib.urlencode([('activity', 'water ski'),
         ...                              ('lake', 'Milford'), ('code', 52)])
         >>> print encoding
         activity=water+ski&lake=Milford&code=52
         >>> # Passing data makes urllib2 send a POST instead of a GET
         >>> req = urllib2.Request("http://www.example.com", encoding)
         >>> fd = urllib2.urlopen(req)
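A possible Python 3 version of the POST submission is sketched below. The main difference from the Python 2 code is that the encoded form data must be converted to bytes before it is attached to the request. The helper name `build_post_request` is hypothetical, and the request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Build a Request whose body is the URL-encoded form fields."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data switches the request method from GET to POST
    return urllib.request.Request(url, data=body)


req = build_post_request(
    "http://www.example.com",
    [('activity', 'water ski'), ('lake', 'Milford'), ('code', 52)],
)
```

Passing `req` to `urllib.request.urlopen` would submit the form; the fields travel in the request body rather than in the URL, which is the distinction between this slide and the GET slide.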
