Scraping ZOL Desktop Wallpaper Images with Python

mac2025-02-14 18

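I've recently gotten hooked on web scraping. After learning a little from a few videos, I started practicing on a real site. The images are scraped from: http://desk.zol.com.cn/

The code below loops over the pages of the "scenery" (fengjing) category and downloads the images: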

from urllib import request, error
import re

key_name = request.quote("fengjing")

# Append each crawled page URL to a local text file
def savefile(data):
    path = "C:\\Users\\Administrator\\Desktop\\fengjing_url.txt"
    with open(path, "a") as file:
        file.write(data + "\n")

# The outer loop controls how many pages are crawled; each page URL is saved locally
for p in range(0, 10):
    url = "http://desk.zol.com.cn/" + key_name + "/" + str(p) + ".html"
    # I tried a few times: without 'ignore', decoding raises an error
    data = request.urlopen(url).read().decode("utf-8", "ignore")
    savefile(url)
    # I'm new to regular expressions, so this pattern is long; suggestions for a better approach are welcome
    pat = '<a class="pic" href="/(.*?)" target="_blank" hidefocus="true"><img width="208px" height="130px" alt=(.*?) src="https://(.*?)"'
    img_url = re.compile(pat).findall(data)
    for j in range(len(img_url)):
        # The pattern has three groups, so each match is a tuple; the image address is at index 2
        this_img = img_url[j][2]
        this_img_url = "http://" + this_img
        print(this_img_url)
        # The fengjing folder must already exist; urlretrieve does not create it
        img_path = "C:\\Users\\Administrator\\Desktop\\fengjing\\" + str(p) + str(j) + ".jpg"
        request.urlretrieve(this_img_url, img_path)
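Since the long regular expression was the part I was least happy with, here is one possible alternative: a minimal sketch that uses the standard library's html.parser instead of a regex. It collects the src of every <img> nested inside an <a class="pic"> link. The names PicLinkParser and extract_image_urls are made up for this sketch, and the sample HTML is a hypothetical snippet modeled on the regex above, not a copy of the real page.

```python
from html.parser import HTMLParser


class PicLinkParser(HTMLParser):
    """Collect the src of every <img> found inside an <a class="pic"> link."""

    def __init__(self):
        super().__init__()
        self.in_pic_link = False   # True while we are inside <a class="pic">
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "pic":
            self.in_pic_link = True
        elif tag == "img" and self.in_pic_link and attrs.get("src"):
            self.image_urls.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_pic_link = False


def extract_image_urls(html):
    """Return the image URLs found in the given HTML text."""
    parser = PicLinkParser()
    parser.feed(html)
    return parser.image_urls


# Hypothetical sample modeled on the regex in this post (not real page content)
sample = (
    '<li><a class="pic" href="/bizhi/1.html" target="_blank" hidefocus="true">'
    '<img width="208px" height="130px" alt="demo" src="https://example.com/a.jpg"/>'
    '</a></li>'
    '<img src="https://example.com/logo.png">'
)
print(extract_image_urls(sample))  # ['https://example.com/a.jpg']
```

Because the parser returns the whole src attribute, the scheme is already part of each URL, so there is no need to index into a tuple or prepend "http://" afterwards. It also keeps working if the width, height, or attribute order in the tag changes.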

Crawl results: changing key_name to "dongman" (anime) downloads the anime wallpapers in the same way.
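Switching categories by editing key_name could also be done with a small helper. The following is a rough sketch, assuming the same URL layout as the script above; build_page_url and fetch_page are hypothetical names introduced here, and the 10-second timeout is an arbitrary choice. It also puts the imported error module to use, so one failed page does not stop the whole run.

```python
from urllib import request, error

BASE_URL = "http://desk.zol.com.cn/"


def build_page_url(category, page):
    """Build a listing-page URL using the same layout as the script above."""
    return BASE_URL + request.quote(category) + "/" + str(page) + ".html"


def fetch_page(url, timeout=10):
    """Return the page text, or None if the request fails."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", "ignore")
    except error.URLError as exc:  # HTTPError is a subclass of URLError
        print("skipping", url, "->", exc)
        return None


# Only builds and prints the URLs; call fetch_page(url) to actually download a page
for category in ("fengjing", "dongman"):
    print(build_page_url(category, 1))
```

With this in place, the crawl loop could call fetch_page(build_page_url(category, p)) and simply continue to the next page whenever the result is None.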
