Python 02: Interacting with the Internet

About Protocols
Transmission Control Protocol (TCP) works at the transport layer.
TCP port numbers…
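
To make the idea of TCP and port numbers concrete, here is a minimal sketch that runs entirely on the local machine. It is written for Python 3 (the rest of this post uses Python 2), and the loopback address, the OS-chosen port and the message are my own assumptions for illustration, not part of the course material.

```python
import socket

# Server side: listen on the loopback address; port 0 asks the OS for any free port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))
server.listen(1)
host, port = server.getsockname()

# Client side: connect to that (host, port) pair
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
conn, client_addr = server.accept()

# Bytes sent by the client arrive on the server's side of the connection
client.sendall(b'hello over TCP')
received = conn.recv(1024)

print('server port:', port)
print('client port:', client_addr[1])
print('received:', received)

for s in (conn, client, server):
    s.close()
```

The server is identified by its port number, while the client gets its own, different port from the operating system. A web server normally listens on port 80 in the same way.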

Sockets in Python

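# mysock is a TCP socket that is already connected to the web server on port 80
mysock.send('GET HTTP/1.0\n\n')

while True:
    data = mysock.recv(512)    # read up to 512 bytes at a time
    if len(data) < 1:          # an empty read means the server closed the connection
        break
    print data

mysock.close()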


The result:

HTTP/1.1 200 OK
Date: Mon, 09 Nov 2015 21:19:27 GMT
Server: Apache
Last-Modified: Fri, 07 Aug 2015 16:39:14 GMT
ETag: "20a1817f-a7-51cbb46b621a7"
Accept-Ranges: bytes
Content-Length: 167
Cache-Control: max-age=604800, public
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: origin, x-requested-with, content-type
Connection: close
Content-Type: text/plain

But soft what light through yonder window breaks
It is the east and Juliet is the sun
Arise fai
r sun and kill the envious moon
Who is already sick and pale with grief

A socket is a low-level layer, so the output keeps the HTTP header info as well as the document itself.
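
Since the raw socket output mixes headers and body, here is a rough sketch of how the pieces could be separated. It is written for Python 3 and does not touch the network: a connected socket pair plays the role of the server, and the canned response, `read_all` and `split_response` are hypothetical names and data of my own.

```python
import socket

def read_all(sock, chunk_size=512):
    """Collect recv() chunks until the peer closes the connection."""
    chunks = []
    while True:
        data = sock.recv(chunk_size)
        if len(data) < 1:
            break
        chunks.append(data)
    return b''.join(chunks)

def split_response(raw):
    """Split a raw HTTP response into (status line, header dict, body)."""
    # Headers end at the first blank line, i.e. two CRLF pairs in a row
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body

# Stand-in for a server: a connected socket pair, fed a canned response
server_end, client_end = socket.socketpair()
server_end.sendall(
    b'HTTP/1.1 200 OK\r\n'
    b'Content-Type: text/plain\r\n'
    b'Connection: close\r\n'
    b'\r\n'
    b'But soft what light through yonder window breaks\n'
)
server_end.close()

# A small chunk size forces several recv() calls, as with a larger real response
status, headers, body = split_response(read_all(client_end, chunk_size=16))
client_end.close()

print(status)
print(headers['content-type'])
print(body.decode())
```

Joining the chunks before printing also avoids stray line breaks, because `print` adds a newline after every chunk it is given.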

Using urllib

import urllib
fhand = urllib.urlopen('')

for line in fhand:
    print line.strip()

urlopen returns a file-like object, so reading a web page is just like opening a file and looping over its lines.
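
For readers on Python 3, the same idea might look like the sketch below. There `urlopen` lives in `urllib.request`. The `data:` URL is only a stand-in I chose so the example runs without a network connection; a real `http://` address would be used in practice.

```python
from urllib.request import urlopen

# A data: URL stands in for a real web address so the sketch runs offline
url = 'data:text/plain,first%20line%0Asecond%20line%0A'

lines = []
with urlopen(url) as fhand:
    for line in fhand:
        # In Python 3 each line arrives as bytes, so decode before using it
        lines.append(line.decode().strip())

print(lines)
```

The loop is the same one you would write for a file handle; the only real difference is the bytes-to-text decoding step.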

Parsing HTML with the BeautifulSoup library
Regular expressions can be used to parse HTML, but the easier way is to use Beautiful Soup.
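
For comparison, here is roughly what the regular-expression approach could look like (Python 3). The HTML snippet and the example.com addresses are made up for illustration, and the pattern only handles double-quoted `href` attributes.

```python
import re

# Hypothetical page content, invented for this sketch
html = '''
<p>Some links:</p>
<a href="http://example.com/page1.html">First</a>
<a class="nav" href="http://example.com/page2.html">Second</a>
'''

# Non-greedy match of whatever sits between href=" and the next double quote
links = re.findall(r'href="(.+?)"', html)
print(links)
```

This works on tidy input, but real pages use single quotes, odd spacing and broken markup, which is where a proper parser becomes the easier option.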

Place the library in the same folder as your other Python code.
download here:
(I am using version 4.1)

Unzip the file, then install it from the command line:
>> Python install
If you are using PyDev in Eclipse, you will find that it detects the change automatically.

The code is as follows:

import urllib
from bs4 import BeautifulSoup

url = raw_input('Enter - ')

html = urllib.urlopen(url).read()
soup = BeautifulSoup(html,"html.parser")
#for older versions, it should be: soup = BeautifulSoup(html)

tags = soup('a')

for tag in tags:
    print tag.get('href',None)

This code finds all the hyperlink (a) tags in the page and prints the URL stored in each tag's href attribute.

The result:

Enter - 
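
If Beautiful Soup is not installed, the link extraction above can also be sketched with `html.parser` from the Python 3 standard library. This is a swapped-in technique, not what the course uses, and the `LinkCollector` class and the sample HTML are hypothetical names and data of my own.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            # attrs is a list of (name, value) pairs; a missing href gives None
            self.links.append(dict(attrs).get('href'))

# Hypothetical page content, invented for this sketch
html = '<p><a href="http://example.com/a.html">A</a> <a name="top">no href</a></p>'

collector = LinkCollector()
collector.feed(html)
print(collector.links)
```

As with `tag.get('href', None)` in the Beautiful Soup version, an anchor without an `href` shows up as `None`.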


Off topic:
This is the 20th post on my blog. I am thinking that I will not be too serious a blogger; well, I will not only talk about techniques.
I registered for another module, Regression Models, but have not had time to start studying it seriously. (Winter makes people lazy…)
I am busy preparing for a two-week business trip: visas (wtf, passport courier fees are killing me) and tickets. I will take one week of holiday in December and then go back to work. Hopefully I will survive the whole winter, with more and better posts.
Thank you all.

Published by Irene

Keep calm and update blog.
