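Baidu search result links are now encrypted: the href in the results page is a Baidu redirect link, not the real URL. Take a search for Douban (豆瓣) as an example.
Copying the link address of the result gives the following URL: https://www.baidu.com/link?url=vsdsl04PUGwYT-udMGNDBSgQ4D62grmcfm8fM4LVjYLVVMoaXT6EoDxqw0FKxHcy&wd=&eqid=979239ad000511ed0000000463453c3e
Requesting this encrypted link and capturing the traffic gives the following response: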
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta content="always" name="referrer">
<script>
try{if(window.opener&&window.opener.bds&&window.opener.bds.pdc&&window.opener.bds.pdc.sendLinkLog){window.opener.bds.pdc.sendLinkLog();}}catch(e) {};var timeout = 0;if(/bdlksmp/.test(window.location.href)){var reg = /bdlksmp=([^=&]+)/,matches = window.location.href.match(reg);timeout = matches[1] ? matches[1] : 0};setTimeout(function(){window.location.replace("https://www.douban.com/")},timeout);window.opener=null;
</script>
<noscript>
<META http-equiv="refresh" content="0;URL='https://www.douban.com/'"></noscript>
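I'm using Python. The real URL appears in the meta refresh tag, so re.findall can extract it from the response body:
import re
innerurl = re.findall(r"0;URL='(.*?)'", text)[0]  # text: the response body shown above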
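The extraction step can also be written as a small helper that does not raise when nothing matches. This is only a minimal sketch, assuming the response body has the same shape as the one captured above. The function name `extract_real_url` and both patterns are my own, not part of any Baidu API. It checks the meta refresh tag first and falls back to the `window.location.replace(...)` call in the script.

```python
import re
from typing import Optional

# Hypothetical patterns for the two redirect forms seen in the captured response.
META_REFRESH = re.compile(r"""URL=['"]?([^'">\s]+)""", re.IGNORECASE)
JS_REPLACE = re.compile(r"""window\.location\.replace\(\s*["']([^"']+)["']\s*\)""")


def extract_real_url(html: str) -> Optional[str]:
    """Return the redirect target found in the page, or None if nothing matches."""
    for pattern in (META_REFRESH, JS_REPLACE):
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None


# Shortened copy of the captured response, used here instead of a live request.
sample = """<script>setTimeout(function(){window.location.replace("https://www.douban.com/")},0);</script>
<noscript><META http-equiv="refresh" content="0;URL='https://www.douban.com/'"></noscript>"""

print(extract_real_url(sample))  # prints https://www.douban.com/
```

Returning None instead of indexing `[0]` lets the caller decide what to do when Baidu returns a different page, for example a verification page, which would otherwise surface as an IndexError.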