Example 1: Crawling a JD product page
import requests
# interactive exploration: each bare expression simply displays its value in the console
url = "https://item.jd.com/2967929.html"
r = requests.get(url)        # JD accepts the default requests User-Agent
r.status_code                # 200 if the request succeeded
r.encoding
r.text[:1000]                # first 1000 characters of the page
import requests

url = "https://item.jd.com/2967929.html"
try:
    r = requests.get(url)
    r.raise_for_status()              # raise an exception for 4xx/5xx responses
    r.encoding = r.apparent_encoding  # guess the encoding from the page content
    print(r.text[:1000])
except:
    print("crawl failed")
Example 2: Crawling an Amazon product page
import requests
kv = {'User-Agent': "Mozilla/5.0"}
url = "https://www.amazon.cn/gp/product/B01M8L5Z3Y"
r = requests.get(url, headers=kv)
r.status_code
r.text[:1000]
import requests

url = "https://www.amazon.cn/gp/product/B01M8L5Z3Y"
try:
    kv = {"user-agent": "Mozilla/5.0"}    # Amazon rejects the default requests User-Agent
    r = requests.get(url, headers=kv)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    print(r.text[1000:2000])
except:
    print("crawl failed")
Example 3: Submitting a search keyword to Baidu / 360
import requests
kv = {"wd": "Python"}                 # Baidu passes the search keyword in the "wd" parameter
r = requests.get("http://www.baidu.com/s", params=kv)
r.status_code
r.request.url                         # the full URL that was actually requested
len(r.text)
import requests

url = "http://www.baidu.com/s"
keyword = "python"
try:
    kv = {"wd": keyword}
    r = requests.get(url, params=kv)
    print(r.request.url)
    r.raise_for_status()
    print(len(r.text))
except:
    print("crawl failed")
import requests

url = "http://www.so.com/s"
keyword = "python"
try:
    kv = {"q": keyword}               # 360 search uses "q" instead of "wd"
    r = requests.get(url, params=kv)
    print(r.request.url)
    r.raise_for_status()
    print(len(r.text))
except:
    print("crawl failed")
Example 4: Crawling an image from the web
import requests
import os

url = "http://image.nationalgeographic.com.cn/2017/0211/20170211061910157.jpg"
root = "E://pics//"                    # directory to save into (Windows path)
path = root + url.split("/")[-1]       # use the last segment of the URL as the file name
try:
    if not os.path.exists(root):
        os.mkdir(root)
    if not os.path.exists(path):
        r = requests.get(url)
        with open(path, "wb") as f:
            f.write(r.content)         # r.content is the binary body of the response
        print("file saved")
    else:
        print("file already exists")
except:
    print("crawl failed")
Example 5: Automatic lookup of an IP address's location
import requests

url = "http://m.ip138.com/ip.asp?ip="
try:
    r = requests.get(url + "202.204.80.112")   # append the IP to the query string
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    print(r.text[-500:])                        # the location information is near the end of the page
except:
    print("crawl failed")