Python Web Scraping in Practice (Image Scraping): I Want Every LOL Champion and Skin

mac 2025-02-12 32

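This post uses the requests library to scrape every League of Legends champion and skin. Beginners who are unsure about anything can read the introductory post first: Python Web Scraping in Practice (Getting Started).

Start by opening the page on the official League of Legends site that lists all champions, in order to find each champion's ID: https://lol.qq.com/data/info-heros.shtml

Right-click and choose "Inspect Element" (or press F12), click the "Network" tab, and press F5 to refresh so that no files are missing from the list. Scroll down to find a file named hero_list.js, which holds the information for all champions. Click that file, and the Headers panel on the right shows the request URL https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js. This is the URL we are looking for.

Opening https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js in the browser shows an unformatted jumble of code. Press Ctrl+A to select all of it, copy it, and paste it into the JSON parser at https://www.json.cn/ to format it for easier reading.

The formatted output shows that there are currently 145 champions. Expand the hero entry; the heroId inside it is what we need. Look closely and you will see that the heroId values do not run in order from 1 to 145 (watch out for this pitfall), so a simple loop over a numeric range will not work. The IDs have to be read from the list itself.

Open a single champion to see its skins and their names (the steps are the same as above). Annie, for example, has 13 skins. What remains is handling the details and writing the code.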
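The heroId pitfall described above can be illustrated without touching the network. The following is a minimal sketch, assuming the payload has the shape described above: a top-level hero list whose entries each carry a string heroId. The sample values and the helper names extract_hero_ids and build_detail_url are hypothetical and exist only for illustration; the URL pattern is the one used in the complete code in this post.

```python
# Illustrative sample only: the real hero_list.js is much larger and
# its entries carry more fields than heroId.
SAMPLE_PAYLOAD = {
    "hero": [
        {"heroId": "1"},
        {"heroId": "2"},
        {"heroId": "10"},
        {"heroId": "266"},
    ]
}

DETAIL_URL_TEMPLATE = "https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js"


def extract_hero_ids(payload):
    """Return the heroId of every entry in the payload's 'hero' list, in order."""
    return [hero["heroId"] for hero in payload["hero"]]


def build_detail_url(hero_id):
    """Build the per-champion detail URL from a heroId string."""
    return DETAIL_URL_TEMPLATE.format(hero_id)


hero_ids = extract_hero_ids(SAMPLE_PAYLOAD)

# A naive loop would guess the ids "1".."4" for four champions,
# which misses every id that is not in that range.
naive_ids = [str(i) for i in range(1, len(hero_ids) + 1)]
missed = [hero_id for hero_id in hero_ids if hero_id not in naive_ids]
print(missed)

for hero_id in hero_ids:
    print(build_detail_url(hero_id))
```

Reading the IDs from the list keeps the scraper correct even when IDs are skipped or new champions are added.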

Complete code for scraping all League of Legends champions and skins:

import os

import requests

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"}


def get_hero():
    url = "https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js"
    res = requests.get(url, headers=headers).json()
    for hero in res['hero']:
        hero_id = hero['heroId']  # champion ID
        detail_line = 'https://game.gtimg.cn/images/lol/act/img/js/hero/' + hero_id + '.js'  # string concatenation
        # detail_line = 'https://game.gtimg.cn/images/lol/act/img/js/hero/%s.js' % hero_id  # %-style formatting (older Python)
        # detail_line = f'https://game.gtimg.cn/images/lol/act/img/js/hero/{hero_id}.js'  # f-string (Python 3.6+)
        # detail_line = 'https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js'.format(hero_id)  # str.format()
        get_skin(detail_line)


def get_skin(url):
    res = requests.get(url, headers=headers).json()
    for skin in res["skins"]:
        if not skin["mainImg"]:
            continue  # skip entries that have no main image
        item = {}
        item["heroName"] = skin["heroName"]  # champion name
        item["skinName"] = skin["name"].replace("/", "_")  # skin name, with any slash replaced by an underscore
        item["skinImage"] = skin["mainImg"]  # URL of the skin image
        print(item)
        save(item)


def save(item):
    # Build one directory per champion
    hero_path = './images/' + item['heroName'] + '/'
    if not os.path.exists(hero_path):  # create the directory if it does not exist
        os.makedirs(hero_path)
    res = requests.get(item["skinImage"])  # request the image
    with open(hero_path + item["skinName"] + ".png", "wb") as f:
        f.write(res.content)


if __name__ == "__main__":
    get_hero()
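The code above only replaces the forward slash in skin names before using them as file names. As a hedged sketch of a slightly broader approach, the helper below also replaces the other characters that Windows does not allow in file names and builds the target path with os.path.join. The names safe_filename and skin_path are hypothetical helpers and are not part of the original code; the sample skin names are made up for illustration.

```python
import os
import re

# Characters that Windows does not allow in file names, including the slash.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def safe_filename(name):
    """Replace characters that are invalid in file names with underscores."""
    return _INVALID_CHARS.sub("_", name).strip()


def skin_path(base_dir, hero_name, skin_name, ext=".png"):
    """Build base_dir/<hero>/<skin><ext> from sanitized name components."""
    return os.path.join(base_dir, safe_filename(hero_name), safe_filename(skin_name) + ext)


# Made-up names, used only to show the substitution.
print(safe_filename("K/DA Example"))
print(skin_path("images", "Annie", "Sample: Skin?"))
```

Whether the extra characters ever appear in real skin names is an assumption. The sanitizer is cheap insurance, not a fix for a known failure.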
