Python crawler meets crypto encryption

mac2024-03-26  33


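When the crawler runs into crypto encryption

I've been working crazy overtime these past few days and had no time to blog. Today I wrapped up early, so here's a write-up on a pretty impressive site I stumbled on a while back. It looks like the protection was written by NetEase Yidun (网易易盾)? The target is the company data on the National Construction Market Supervision Public Service Platform (全国建筑市场监管公共服务平台). (This is all public data, so hopefully this doesn't count as "prison-oriented programming"?)

Company data and detail pages are encrypted

Before we start: if you don't know requests, GET/POST, or the rest of the basics yet, go learn those first and don't try to get there in one leap. The data that comes back looks like this: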

95780ba0943730051dccb5fe3918f9fe4c6612ab8a332ee7d1067088471faa62e290207d953d46d3a1225f20c0fde9f49ea47fe35a8\

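Look at the request URL: no matter how many pages you flip through, the first part of the URL stays the same. So do a global search for that URL in the Sources panel (you do know how to use F12, right?). Now for how to crack it.

Solution

Trim the URL down piece by piece until you're left with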

dataservice/query/comp/list

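and the search hits, hehe. That tells you which file it's in. Open that file, Ctrl+F to find the exact line, set a breakpoint there, then click to the next page and step through to see what happens to the data coming back. Step, step, step... found it. The decryption hook usually looks something like this: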

t.interceptors.response.use((function(t) { var e = JSON.parse(p(t.data)); return e }

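JSON.parse(p(t.data)): anyone who has done a little programming can see that it calls the function p and passes in t.data. Step into p. You can also print e in the console once execution has passed the call to p: e holds exactly the data we want, and it's even conveniently in JSON form.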

function p(t) { var e = o.a.enc.Hex.parse(t) , n = o.a.enc.Base64.stringify(e) , a = o.a.AES.decrypt(n, u, { iv: d, mode: o.a.mode.CBC, padding: o.a.pad.Pkcs7 }) , i = a.toString(o.a.enc.Utf8); return i.toString() }

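So: Hex.parse turns the hex text into bytes, Base64.stringify re-encodes those bytes as Base64, and then AES.decrypt runs in CBC mode with PKCS7 padding. If those terms are new to you, search for any online AES tool and you'll see the modes it offers. So is p encrypting the data or decrypting it? Decrypting, of course. The server AES-encrypts the JSON and ships the ciphertext as hex text; hex and Base64 are just two ways of writing the same bytes, and the conversion is only there because CryptoJS.AES.decrypt reads a string ciphertext as Base64. (I got a bit tangled up here myself at first, haha.)

Now that we know the scheme, it's time to write some JS to undo it. Not sure how, or dreading having to set up yet another IDE and environment just for JS? No need: believe it or not, PyCharm can write JS too. PyCharm really is awesome! Download Node.js from its website, pip whatever you need on the Python side (search for it), then open cmd and run npm install crypto-js (not nmp...), restart PyCharm, create a .js file, and start writing.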
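To make the "hex and Base64 are the same bytes" point concrete, here is a minimal Python sketch using only the standard library. The helper name hex_to_base64 is made up for this example; the sample value is the first 16 bytes (one AES block) of the response shown earlier.

```python
import base64
import binascii


def hex_to_base64(hex_text: str) -> str:
    """Re-encode hex text as Base64; the underlying bytes do not change."""
    raw = binascii.unhexlify(hex_text)
    return base64.b64encode(raw).decode("ascii")


# First 16 bytes (one AES block) of the sample response shown earlier.
sample_hex = "95780ba0943730051dccb5fe3918f9fe"
sample_b64 = hex_to_base64(sample_hex)

# Decoding the Base64 gives back exactly the bytes that the hex encoded.
assert base64.b64decode(sample_b64) == bytes.fromhex(sample_hex)
print(len(bytes.fromhex(sample_hex)))  # 16
```

Nothing secret happens in this step, which is why it is a re-encoding and not encryption. The only step that needs the key is the AES one.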

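Writing the JS

Oh right, to AES-encrypt or decrypt you also need the key and the IV (initialization vector). This site very considerately writes both out for you, haha: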

, u = o.a.enc.Utf8.parse("jo8j9wGw%6HbxfFn") , d = o.a.enc.Utf8.parse("0123456789ABCDEF");
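A quick sanity check on those two values, as a small Python sketch. Because both are passed through Utf8.parse, CryptoJS gets them as WordArrays and uses them as the raw key and IV, not as a passphrase. Their byte lengths therefore tell us the AES variant.

```python
# Key and IV exactly as they appear in the site's JS, encoded as UTF-8 bytes.
key = "jo8j9wGw%6HbxfFn".encode("utf-8")
iv = "0123456789ABCDEF".encode("utf-8")

assert len(key) == 16  # a 16-byte key means AES-128
assert len(iv) == 16   # a CBC IV is always one AES block (16 bytes)
print(len(key) * 8)    # 128
```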

Then just follow the way they wrote it (I don't really know JS yet; the source below is something I found online):

data='95780ba0943730051dccb5fe3918f9fe4c6612ab8a332ee7d1067088471faa620daa3298d163744eddd8f4d77d4af24f3cb52e773eff109ce79e8ca1922a84f53649f93dc7f5d12d4ecdacc7a54558bdc3178d7108f869c52e66c20b3dfa089835eff20697c088efffda7498ef0b49eca35c5c4bcbde5aec177fa72e9fa67064665dc813613ccdb1902cc350e29b84d63accd395cc07676fe88c697be3d7d2278b329f50e9ac253df85c20bd3d3c06e777411177df7c5c7cc9f7d0d2a42d476e0fd4c68beb591da9e6c9a65650b42d81697dd222b166f5a4fd97b742a20dc862032b63ecedc92b42e36677a13fbb355017bd230284616b990390d37e9ecaec7f7ba536172da105657349c9ad2aa242770942b8624f4e64f38251dd1be7d1b9941155da518389cb6174eaf3d86f99823b6d192302ad15f20496677ada6c4a082962358f7225e6568504ffb2a0cdb2f3b718696946a8b01d9d46e8dd50c59f23bb47f15bc2efb86e880c105d461313c2659e0d77850e6e94b990bdea56b02efaf3d8b4b6e590c86f584c591608d346f79ec369aeacc4b2a058cf28263a5669bb5cc0b6ac76e7d46ad5d543855b297d02f8884d7130edd73b4b69df55756775ad13dacdace905fc726af4d12d66707246d9f6190c2336af7d63a1235406612fde33e611a8a7f10fb92e41f50b77e6ffdc9601f6502057e4a9a41f586dae34c01fb529da51f19cf2e791b11626b1010524cc7f5926e3d42e2cffbc66f0a79de940f65b54cc7d7a527a0b721becfbed6672113bda5c80febcac8430cc106b39e3499914474ff0933072208a8a39a2c3ec336145c41a551a4fb0a330814231b4224d7ab9de758ae06d31a178340ba8fd749894f724332f4815857b2f40f756e04d3e4e11a7eac0d13518ef4df686b19d80467b1981bc422114a1d5a4e493b2a10a765b302d4045ee866ada76b1d5fd74a5db6655abfab855e2aff6862ecd75003d54465655769d70a74649ca7cd88a9cae670ae95ecb86cb25e63457b07556d726882556f12eca24887bd001c3e842746d24181de49959f7e3521876eef39914ab9b8e77cc5d4795e87f19104a7bf0143ce049904020233b50963235d462db8c3397e7364ad62fd6c2421979562614ebb30b0b6bff8055a1a4e705595ba9d4b2dc1c520a2e50464b77af8e07f200602e7aedd63edf00fe0273bc501b9fdab190a4332890877a692dc4d329f89219b7cc735ece01cc7b5f186a3b80d01d7c1aaaad92610b08bc222421add03c79f5e8de71ae0b6a19583685d215e54424b17cc6bc6dbea96b339c349126c2cd39d9e856cc63fc399d77a6f95ea523715827155995ea8828d47d501fdb454a05349dc2fb5e26ee0d252658cedb1edb2607f3cd0f5583166ac63bfa8f
1717a2b20a23a130aaa5b4fc095e17d59efb00abe856899707a2fdc7ae9bcaec14179b6e7fe846126c6328c0504c91a037c92ccae766a38e20b1b3036e4d203aac6b65c9b943d7d9d2efd4cada8bf3a748262a847ebebdda1f9d30446b84e5459003f435119661faa0512a3683951de859875ec0bda98ec741480449eeb41700ff72da314446975462858a4eb34ed123e23c48be94675c086ebe80e52934df89f3a0980e2816f8d61e4c52333b741a79ed6e72531f41e4099556dd61aeb3782e22779e246625861bf3ffe8a671f332befbb71aad6f6b8b99cfd499cfa893765a4bbf1379789ca8e06f6bc33a19fdfa373b635e01be19bd8b12f14067f340aedbc4ac87b58d4572bff15d14a6df2274a8f57a555b15f26b4742ea72755e9a8d81bb5a0d715299ecffd8f6b68eb1cbe50d6a21f4c9ff0eb5842f7ff591d7369630b01dca11a6a5b8700a2a3be954b0af208d7c32f6a0fe9b56f5f7177f0ecd7a101917be6abcf2cc6da59306f28b42fe803d8df7e8a2ae3a2a2d0efe2eb3e6d10ee845e41e8277fd9e2a111fdba46a45a962db410a39c7a35ab82eb630289353305d78cba89b6c2bd00980f9a1e097c473a88c20ca71cb1af77627353830c98dfe792ba65f035e57386be9206fa4083cf6901a842f0d206bb697b303f78c3347bcf9096b80a6b2eac801abdde8f54de33ab36b1a402bb8a5733f5d5dd8a1d6a987ba85f8c9913f1296ed03d26dddd8ad484e9f0312ecbc03ca6747a01edb4d6b510b26d9cc5225eb80e23195bd9e259dfff8d0e1caf2eae09915d9360468e7007f6bd51d72cb6de0489608362d99400fa1a2ad9f8f02553962be366cc60d6a5e637190331c0df06b2f224907031fea335a58ed9938342212cc94257954418539604731899623593478063cc8b666081ee0f0d1a23355ca41d798d19365764013c51d59bcbaef009867a87ca3ff6036d72b0ee6ee05de3c390642820e240ee46e7f74f72dad20b48404fff994b7dc0120a92b0d1b756fca230a04e89b6cd4e99d7245eb05814ced12eb150ad7b129516831055c87481c94f98461be24fa2facae5b9133b82a3c8a4c949fd42035c3b9b7d3ae2b1caef111311d839af6f7ede6a6de95f2341bb7707a3f3331e8bfa4f35d2b97c6b1c4de2993287d4f283c7f927925ddfe6615ce33aaffe9966a9d01c407f1a8ae43cc921d4d9b28018e0fc3d7fcb20afd5d11be5eb8aecdeb9e84e7e04c68f7871de5f1163be21068a048214e7fea457f396a860e3190cbef43791d4239320494b7f0ae7ba7951bdfea5f3e61e65b3d5e3f8823f3de7b36845ec0f36d9a5b6f7923a2f2c4b796d598a11ddcbe00e1c20e7f72f3260e3055354d4b0695eef50f5eba0fdfbb072f5ac8b6c88bb00290f1869d2607fae9b4
85b4c2b9214465672c72c4a56585a9a6b6f63ee44b73072fb9467e9528d52a0b388995f251caeec083db9bb82a7adb3c0630e26b3ca4b70d98c7ea91d5cf82636ef7e4ed0cca3f91f4dc1fab0fd4bc67535759d6359c9930df2d6b388714dbc55c0c368ad1060fbcd13e6d143801f639c094149b1273e2d499b14a0aa8fc3b6f1d55663ea708990f19e8e2532396162b4fa21599840528d6292dc16ade0c46605f619203164d49205f1eb6c8ebd2026efcae27c071877f5bec4d9c44ab3170b1cf6dd675e21be0feabedf4d0b2a752c66e991d58a33271b1a6fd9228adf05b6fe1bec1b63b0f77e03598eded56a461589225f11681f5d2f1370505fa21efaf450b2009fdbdb87e8a7f33dc8902daecc8f76282efebc3854cc6c346bbf09c1319fd742a2e4e3b64d007a01fdb263b39d36f4caa0bdc78b518506083c1bc3c5928a6d4406fc3193aa3968711d308b0c6b5024c5fceaa16c230cb5ddc2bbdcda28408fe1889b8ffc43ba2bbc329107d37e27676f51bc899de4dc09a75423eb98083f8403918c9307e07b0e53b7c99bfd5b12aed23759de635e33496ff8a2ed208ef9e3e2a30c619611d7f3f44d876d1c82b4cc28dd7f5546a55936f344019427172bba15aa0fb116478cfb92607c22586ebf28b394727430237403ff75f9bc8ebf919a1f053af196b85280b0cd12a31cf0296f32ae3b804e251627aa7e4d8fd4a51d18a040d5eeb517b42d536cdea1c4b3c19af4700239209ff6f8d2ac05994f584c44b44b744a9c85ac0cf3c74d80603b8f0754131ccb2ad766beeadbf1292d6b03f57eeaea8d06aeae4642c0bd594e466fd3cc3b04860b6f210c761a961070680984604c41404e602fe2c332640f3bd73cfb2707f2d198c206fb0830472ab7cb1050cf199f4e2cb00393d6467c9f5f35b2ebde24e3cbd203431ec7e0c702a0697871812dc9befebbcfa096759b7d38079d134a277bddc09d5b920440878cc80f070ef567c286d5b5c8957e802b8718db21f055905467d16b1d74698530959e577030482c8fdb2287dd7a5dce1864241173434bfad8b585de7d7014a79a31c24eea02e4c3440fd377e9a298aa835deb4185f83446b9bb076cc3ad0500ad12fc001edce1f5948b25128c62c36cebbf08009481994e61122a41e7f6d1e2f8329cfe5d34855179f7937eaa1631ed14989ea96668b7c5be713a66141fa4ba72689c2be116b6babefc7e892f9eebdbccaa8547af4184df76ff2bcce18b95e2ddf8eeb25684eef1e74f737d9e47945021b6395d4ebe6409a120da35556fc93b35826cd872229b2691c2a336e1c3c8258081e90676047ecefd125d2bdefdc53e9b82b332040c1e0b3de9d34451fcc8de2575255ab59393665ac1fac136c3d578a155cc120219ddd68708afcf0
ec117b070d6753412b3bb2c8a26ecf0932fedacad40d7e501adc6a2ba32b8011992c560c63e0b0d6cbebac0a577aec4b0466a4c2a5a73e6d7fdded939b9bc7124307bfeeccab7bd24170adc6a5f283278a946edfa205c2273eac435256e6f39fd354390b9ab64e01bcd26d2ef6b9385c71e0a75f586f94d675ba817fa1931931523beced6860fedaf9fe366b79edcba2d3f0595a02ac7e4a2514bb18c88542d296d86455b62eabb1f81749335eccfb958f4cbacbf8862bbf686f324b04358b4774cc72fe759e2e64976677aa7a1b69f8597de8f54c74abf39ad99e9a5fd0eec5b8ce360e89af1d437e2ca28928507aaea78883c672888466792dff05d6e235668d2fcaad3cd47cfab2dc16fcd28d5d0bb12c0a54b4208d0349cfcab545762ce2ee669f24f1eb005a959eff6fcf6ecb1eb3ba882fe9d537a32d2130501f6be4e5f7f6ce129dcf331b1bb018e7ae8c9d8608fa73c990007c5dc9ecb5ac9999aafe74a0f7ac4ef2058e70e458c2a44edddbc93743da378f639cda649386b4d31b3cae55db9da7aaa497b5ca2ff140a953f45acac2c031f70385c208a4182240bc3bb3d55057c08c0296678d5249ac7c9ff1a46ea726df06089b02d194733de839e10bc816342f6506adb8ec59f54a7898b79640ec226ba652dffd93dc22aa3c1013b0bc48bb89987e6c8cefca04b479751342af198a26af9839a0f00c32f8a4aa0fabf8ed631b2ae5cf6731429ca8b4ed5a2ec9903e41a87ccd1c917a66a2cd016760a99f5eb9dac059588ff4a72f7cf48c2c4f642abcb14cb88ddf6b0777052f9e82309ddeb3af004d8486b31760e8080339ee144c681171c2a67723b0d3e6d15de98b156ce732d44a1c1978a28c76a1ea60399f0ab8cfa714c9ff202efd6ddaea53fcbb557542a7052a848979d7ae144a257b14a7740ecea2940b2197bc9b0309cdc7f32976106e0e88aefb0a53098e9175051d7f43b5b69a3b79a6f3ebc2e3d61f267b586e484afaf816625057f28ca30ab536bbc0b4c50725d853820429db3cd60d4f1f36b919068405dab64a449a09f1b586a413085c6924774c9e57afde6f101b4efb08d793554bd418094fcaad65cbe83bb262c445cfb958e5e38b476e6a763933d10bbfa04c716119fa1c65dc1f70027cfcc8235dcee6c8ed635e1e30ff96458e885e1428d3918d0ec7580bef5758edfc839b0205807abaac2f08dce873ad8af80a8e5f90afe3272e8b37cb5b6c8697d81bc84742c40b81de783515ac2e1a18e6d5a57ae10049777354303248773eeb86482e1fce23e97633c03712794afa85ce00784f4d44b392848cb17c673ba2a1d122fc5b05516cbd0f4135f1853447dd744d4ef56debf7ebab32e77fe777f2c17abc864feb90c845133c109630a107057d3e31102ba6
31fc65330d91d470cdadd66911a67c1c2273078a3805302fb86e262f8858c879c62381fff16f8042f55f9e1f9e11425820aa3d893fe7939a8904f4fb138c52098288377bd8fb4f42c10e9d9afa16fb7b3e814e4074bc02a48d4e4b800c65ce4874979ceead01aa16632a2641a1fcb8f0f7fec3fdb5fbd444a5c357da2cd8bfc40f5400becce658f49bb3d071d612391bb27736ee7e6ed338c5a6b3e614e581add9f9e0bfa6b9e9ac7453bceabae8b43c4704177bcba12e46eb52547ec43d97de0a2bfffcc587078d0f7e1bbce34ef85a56d9c9b5a9adce3af44cadf146adb2bed948b08bf1ab29be7b91e180430cea0fb471a506ef138e46a9ffd7d35d52453c7a8d2ef872f86535aba316aae37598f1e7001123540b79b34cc093dc1e22ce8cd4fb661602e16446a613d8627665281128faf675c29f487f8972449cb2af2b0d9ea18f6c81a6028470146c89660e6fd1e0fb2ae0b86ea38b165eaa8c81169bbe7014224ee4be69b5f90e4137ed9b5ebb05bd7d624b2fd390086b411552cc169386edaab6a71b7f4d5ca29b179b216e6fd3c35c30324eb352b2e70b7f2c500a437f9cd7890603078ea3600cce2a270730f0e4693f0d1fe971cc7b5c7c7914cfd2c4db288dc72cf5b306ab8c35ceb7875a95cdcc9de9a25fb464b8fbee00690027b204e705f98ef7cb2480b086dd9fffd6cdfa5d5d3d4be1a6591da87c4c939f20e14f6c0addf4999346f2fff3703a2754f2f19549f0353fadbe5bd07fe4e75c408f992e347e02a5a3a09a0885cc1f860739fb862df90f43c62284218bb1f4cb9ca4e9cb9455d06955875c4531dc45402dcf074738f95686712788b54bc79cc6d94b60e00eaaa4e8bc04fbccf2b0e22543669de90cd5d8099af1cb563aed7e9e99cfc867dfa0085f9a39a3ceaf45012dbafd7c543f9496f0af0eccb61634052f077e3e55dc00cebf8eb5438623518fa7967ae4e0e6577704e11e680371dc02b2f8d6a64ccb277386e9b182e2127fc0cd64fa31cb09ccf8ff1af401b89b38c162869cd3dfcadf9d48b636fd8b4363517840cbb8f4269290a38790cc288a07bf7abcdba5bb6219dad63087ef1ebc27ca2601c8ed638943868de4efb7aa21d07928116bea30070e278047c8178594744a9a04479f50c379e023416a89b07144ba9dbe15101ac494f16e1bdc77563d19f0300b3b976fa3573e0626b29cd7a89463f25497fa1e' /** * @return {string} */ function Aes(data) { module.paths.push("C:\\Users\\Administrator\\AppData\\Roaming\\npm\\node_modules"); let CryptoJS=require('crypto-js'); var u= CryptoJS.enc.Utf8.parse("jo8j9wGw%6HbxfFn"), d = 
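CryptoJS.enc.Utf8.parse("0123456789ABCDEF");
    // hex text -> bytes -> Base64, the string form CryptoJS.AES.decrypt accepts
    var e = CryptoJS.enc.Hex.parse(data);
    var n = CryptoJS.enc.Base64.stringify(e);
    // AES-CBC with PKCS7 padding, same options as the site's own p()
    var return_data = CryptoJS.AES.decrypt(n, u, {
        iv: d,
        mode: CryptoJS.mode.CBC,
        padding: CryptoJS.pad.Pkcs7
    }).toString(CryptoJS.enc.Utf8);
    return return_data;
}
result = Aes(data);
console.info(result);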

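The data above is one response pulled from the page. require is roughly the equivalent of import, and the module.paths.push(...) line tells Node where the installed crypto-js (or rather, all the globally installed modules) lives. Run it. I won't paste the whole output since there's way too much of it; it's the same thing you saw in the console: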

{"code":200,"data":{"list":[{"RN":1,"QY_ID":"0D0D0C0B0D0A0F0F0D0D080A0C0409090F09","QY_ORG_CODE":"9113010060118700X3","QY_NAME"

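Wrap this up as a function and call it to decrypt after each page request. You can then json.loads the result, which makes things even handier, right?

Wrapping up

Plenty of pages use JS encryption. The Tianjin public resource trading site (天津公共资源交易) I looked at earlier is even weirder: the encryption is all written as anonymous functions, and it doesn't hand you the IV and key the way this page does. It feels like to get good at crawling you have to learn JS as well, and probably some Android if you want to crawl apps, haha; otherwise you never really get there.

Of course there are plenty of other ways to get at this site. Selenium makes a lot of headaches go away in minutes. The requests docs claim it solves every problem, which I think is bragging, though the claim fits Selenium just fine. The catch is that Selenium is really inefficient, and click() dies on you all the time. If you're reading this to learn, like I am, go for the harder approach instead of Selenium. The company records on this construction-market site alone run to at least the millions, never mind the detail pages and personnel data behind them, so crawling it all with Selenium is not realistic (truth is, Selenium didn't work here at all... I was helpless too).

As for requests versus Scrapy, I think it depends on the case. Some sites are nasty, really nasty, to the point that I want to strangle whoever wrote the page: tags that are never closed (and, infuriatingly, the tags you see in the browser differ from the ones you scrape, because the browser quietly closes and normalizes them for you). That a page like that even renders is something else; picking on me because I don't know better? Then there are pages where everything looks like it sits in the same spot, but F12 shows it's all different: a li here, a ul there, then a span, then an a, the same information in different tags from page to page. And pages that need a whole series of actions just to get to the next page. For very cookie-cutter sites, use Scrapy; for anything that needs custom handling, use requests. If requests feels slow, no problem: multiprocessing plus coroutines (coroutines are weird to write, so don't try them lightly; going distributed is easier than coroutines). One last thing: Scrapy is fast, really really fast; being built on Twisted rocks.

Also, I hope I can land a decent crawler job after I graduate, sob. Half a year to go and I'm a bit nervous.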
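If you would rather stay in Python than shell out to Node, the same steps can be sketched with pycryptodome (pip install pycryptodome). This is a swapped-in technique, not the method used above, and it makes some assumptions: the function names decrypt_payload and company_rows are made up for this example, and treating code == 200 as "success" is a guess based on the sample output.

```python
import json

KEY = b"jo8j9wGw%6HbxfFn"  # from the site's JS: 16 bytes, so AES-128
IV = b"0123456789ABCDEF"   # from the site's JS: one 16-byte block


def decrypt_payload(hex_text: str) -> str:
    """AES-CBC decrypt a hex-encoded response body and strip PKCS7 padding."""
    # Imported lazily so the JSON helper below works without pycryptodome.
    from Crypto.Cipher import AES
    from Crypto.Util.Padding import unpad

    cipher = AES.new(KEY, AES.MODE_CBC, iv=IV)
    padded = cipher.decrypt(bytes.fromhex(hex_text))
    return unpad(padded, AES.block_size).decode("utf-8")


def company_rows(plaintext: str) -> list:
    """Parse the decrypted JSON and return the rows under data.list."""
    body = json.loads(plaintext)
    # Assumption: 200 is the success code, as in the sample output.
    if body.get("code") != 200:
        raise ValueError(f"unexpected code: {body.get('code')}")
    return body["data"]["list"]
```

With a response object from requests, usage would look like rows = company_rows(decrypt_payload(response.text)). No Hex-to-Base64 step is needed here because pycryptodome takes the ciphertext as raw bytes.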
