Chow's Notes

urllib, urllib2, httplib, and urllib3

urllib: you cannot do without urllib when encoding request parameters; the function to use is urllib.urlencode.
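As a small illustration of the encoding step, the sketch below encodes a few query parameters. The values are illustrative only, and the fallback import is there because Python 3 moved the function to urllib.parse.urlencode.

```python
try:
    from urllib import urlencode        # Python 2, as used in these notes
except ImportError:
    from urllib.parse import urlencode  # Python 3 location of the same function

# A list of pairs keeps the parameter order predictable.
params = [('orderBy', 'tradeTotalFund'), ('orderType', 'desc'), ('currentPage', 1)]
query = urlencode(params)
print(query)  # orderBy=tradeTotalFund&orderType=desc&currentPage=1
```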

urllib.urlopen(url[, data])

POST is supported: the request is sent as POST when data is passed, and as GET otherwise.
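The Request object from urllib2 (urllib.request in Python 3) applies the same rule, so it can be used to check which method would be chosen without sending anything. This is only a sketch; the URL is a placeholder and not taken from the notes.

```python
try:
    from urllib import urlencode            # Python 2
    from urllib2 import Request
except ImportError:
    from urllib.parse import urlencode      # Python 3
    from urllib.request import Request

url = 'http://example.com/list'  # placeholder URL (assumption)

get_req = Request(url)                       # no data: GET
post_req = Request(url, data=urlencode({'currentPage': 1}).encode('ascii'))  # data: POST

print(get_req.get_method())   # GET
print(post_req.get_method())  # POST
```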

urllib2: sends URL requests and lets you add HTTP request header fields, but in my tests a manually added Cookie header field had no effect.
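A minimal sketch of adding request header fields with urllib2 (urllib.request in Python 3). The URL and header values are placeholders chosen for illustration; nothing is sent until the request is passed to urlopen.

```python
try:
    from urllib2 import Request          # Python 2
except ImportError:
    from urllib.request import Request   # Python 3

# Headers can be given up front or added afterwards.
req = Request('http://example.com/list',
              headers={'User-Agent': 'chow-notes-demo/0.1'})
req.add_header('Referer', 'http://example.com/')

# Request stores header names in capitalize()-d form, e.g. 'User-agent'.
print(req.get_header('User-agent'))
print(req.has_header('Referer'))
```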

httplib: can send the Cookie header field, for example:

import httplib
import urllib

# reg is a compiled regular expression defined elsewhere in the original script.

def getRead(page=1):
    url = "http://cn.ae.aliexpress.com/wssellercrm/ajax_ws_seller_crm_list.htm"
    # URL-encode the query-string parameters
    query_data = urllib.urlencode({'orderBy': 'tradeTotalFund',
                                   'orderType': 'desc',
                                   'currentPage': '%s' % page
                                   })
    # Session cookie sent along with the request
    headers = {
        'Cookie': 'acs_usuc_t=acs_rt=0552311d078047e49628c29bae7c5510; ali_apache_id=113.108.202.203.1411525480759.841347.4; xman_us_t=x_lid=cn1501352204&sign=y&x_user=o8anv3hmlnvnCLyPLqx730tq76bwCDtOxL56lzQjUvk=&need_popup=y; xman_f=J/g6ytFi9Zx/qPq2GKORwwSTfOyc11nLAZ5D2t0QcHLpWriunNxyySSUfp0TcTSc0XzOc93b/LaWDSeXmBT/S5oWGPl0c2YHSO1Ze8wNBwSHf0LQhXNA07nALRuAwChR9JqjEtGVKN2Xf/MFGK690sZtLSPXKyk5kk2uMkjn+n96YD1P6h0J3dv1bpr01gZmnIOUVbdNpCDZ+bTWIu3ZcapoLwh4SIZ6eibzNti48s6vs2UbzUmK7DMYrk8YVglLU1k2ky7sXPfn9o+SheNk/Odlvke+YRhWs4xqbUuW4rbrWQRMPjeGHEnMmjfm74SD+ihymHPTnhuq2YvyEdrxgW2nRheHtoXL; utma=3375712.1054749006.1411533102.1411533102.1411533102.1; utmc=3375712; __utmz=3375712.1411533102.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); aep_history=keywords%5E%0Akeywords%09%0A%0Aproduct_selloffer%5E%0Aproduct_selloffer%092044496155; xman_t=Q+lWZ5g2zA5zJ5yM2wr4KngA3sXe8EPhLC1g87NvKawi3WpwpddgJ9ElJWqoGlg4bdxj5A8E/OcAypHLVoEWaA15M+wL23Ip+2HKC3bSo/YJD0KpEEBhfUUIP6dIhLuoGaxb7sYFPB8Ft+U59SlhZxA/cUX/EhiW0ZybZP645cm/dmfuUv91sX48GNbW+5hxHTi3ZidjE3VFfA0jaS4cj2Db0MpQ+FcNxirCgAHqYZfVyx2q4dJDvUXjzeJ8Q6GYnriPRV638yWtCm+41TsAQeWU8wQkeiNzkS1PkWy2oF07VmoiBq/fN2gG27W12YwyFD0FErWTRhBg6G6+7iKkuzXqfdOhuv+zzZUpupVc4ol+DRgdLmLKuDundnKCJ8D+0oYfnV0fQI4sdyB746IrFLro9grrKP95J/RbrAdJtUYoDXEcUb7Dc2l3awf1UqGXU81CDIHxULXYD3XIIUX0qthV4NQ3b3AITuVGVNsURl4riMK0gK7o4lsYWA3N+IXhkj7iZCPffbgr7UmRAMrD1ybaJs2gY7tl3Pc6PqG0497LnzK6bp98z2vOiJ1sAjqK8WwtM4dSYxCB4TbPzMbfB+82qVAU73J6DBc34PJlgbW/If6U+ra3RNGldPsjwQH0gxEIt6iRi2zen0x9kXva2FAiwPcKvBvh6hW6f2opod8=; JSESSIONID=A003C292DF818CF827DCE0DB85DE80C0; ali_apache_track=mt=3|ms=|mid=cn1501352204; ali_apache_tracktmp=W_signed=Y; xman_us_f=x_l=1&x_locale=zh_CN&no_popup_today=n&x_user=CN|Aveen|Chow|cnfm|205829874&x_regin=CN&aep_site=glo&last_popup_time=1411525493024; intl_locale=zh_CN; aep_usuc_f=region=CN&site=glo&c_tp=USD; intl_common_forever=0MBh06hlQ3PAQrC0FHvbHnfPGdeIGm6rwfpHuYOZdNAmAhxKvGVqQA==; acs_t=qW7Yrx/HyH7fy1xJei1OVEY0QdTXxudiltRlU+AP6iYm+pgfaF/Guj2wkKf7SiZb'
    }
    client = httplib.HTTPConnection('cn.ae.aliexpress.com', 80)
    # Request path plus the encoded query string
    req_url = "?".join(('/wssellercrm/ajax_ws_seller_crm_list.htm', query_data))
    print req_url
    client.request('GET', req_url, headers=headers)
    response = client.getresponse()
    print response.status
    print response.reason
    cont = response.read()
    print cont
    return reg.findall(cont)
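To check that httplib really puts the Cookie header on the wire without touching a real site, one option is a throwaway local server that echoes the header back. The sketch below assumes exactly that: the handler class, the helper names, the path and the cookie value are all hypothetical, and the imports fall back to the Python 3 module names (http.client, http.server).

```python
import threading

try:  # Python 2 module names, as used in these notes
    import httplib
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # Python 3 renamed them
    import http.client as httplib
    from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoCookieHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: replies with the Cookie header it received."""

    def do_GET(self):
        body = self.headers.get('Cookie', '').encode('ascii')
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_with_cookie(host, port, path, cookie):
    """Send a GET request that carries an explicit Cookie header."""
    client = httplib.HTTPConnection(host, port, timeout=5)
    try:
        client.request('GET', path, headers={'Cookie': cookie})
        response = client.getresponse()
        return response.status, response.read().decode('ascii')
    finally:
        client.close()


def demo():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = HTTPServer(('127.0.0.1', 0), EchoCookieHandler)
    worker = threading.Thread(target=server.serve_forever)
    worker.daemon = True
    worker.start()
    try:
        return fetch_with_cookie('127.0.0.1', server.server_port,
                                 '/list?currentPage=1', 'session=abc123')
    finally:
        server.shutdown()
        server.server_close()


if __name__ == '__main__':
    print(demo())
```

The server echoing the same cookie string back shows that the header was transmitted as given.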

urllib3: can set up a connection pool, can POST files, and also supports adding any request header fields.

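import urllib3

# URL, data, TOKEN_URL and req_args are defined elsewhere; one PoolManager can be reused.
http = urllib3.PoolManager()
res = http.request('GET', URL, fields=data)
res = http.request('POST', TOKEN_URL, fields=req_args, encode_multipart=False)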

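If the response reports Bad Request / Bad content type,
set the parameter encode_multipart=False, so that the fields are sent as application/x-www-form-urlencoded instead of multipart/form-data.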
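The difference between the two encodings can be seen without sending a request. The sketch below uses urllib3's encode_multipart_formdata helper for the multipart case and urlencode for the form-encoded case; the field name and value are hypothetical.

```python
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

from urllib3.filepost import encode_multipart_formdata

req_args = {'grant_type': 'client_credentials'}  # hypothetical form fields

# Default for POST fields: a multipart/form-data body with a random boundary.
multipart_body, multipart_type = encode_multipart_formdata(req_args)

# With encode_multipart=False the fields are URL-encoded instead.
form_body = urlencode(req_args)
form_type = 'application/x-www-form-urlencoded'

print(multipart_type)
print(form_type + ': ' + form_body)
```

A server that only parses form-encoded bodies will typically reject the multipart variant, which matches the Bad content type symptom.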

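Requests to HTTPS URLs
In the urllib3 releases these notes were written against (before 1.25), HTTPS requests are not verified by default, which usually produces an InsecureRequestWarning; since 1.25, certificates are verified by default.
If you do want to make unverified HTTPS requests, you can disable the warning (not recommended).
Reference: urllib3 SSL Warnings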

import urllib3
urllib3.disable_warnings()

Using SSL certificate verification is recommended.
Reference: urllib3 Certificate verification
You can install certifi first:

pip install certifi

Once the certificates are available, the code can look like this:

>>> import certifi
>>> import urllib3
>>> http = urllib3.PoolManager(
...     cert_reqs='CERT_REQUIRED',
...     ca_certs=certifi.where())
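For the standard-library modules covered earlier in these notes, a comparable setup might look like the sketch below. It assumes Python 2.7.9 or later (or Python 3), where HTTPSConnection accepts an ssl context; the host name is a placeholder, and no connection is opened until a request is made.

```python
import ssl

try:
    import httplib                   # Python 2.7.9+
except ImportError:
    import http.client as httplib    # Python 3

# create_default_context() loads the default CA certificates and requires
# a valid certificate whose host name matches.
context = ssl.create_default_context()
conn = httplib.HTTPSConnection('example.com', 443, context=context)  # no traffic yet

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```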