How can I avoid the HTTPError


After a few successful requests, I always get this error:

HTTP Error 503: Service Unavailable

Is there a simple way to fix this?

from urllib.request import urlopen
from bs4 import BeautifulSoup as soup
import pandas as pd

url = "https://www.amazon.in/s?k=smart+watch&page=1"

original_price =[]

amazon_data = urlopen(url)
amazon_html = amazon_data.read()
a_soup = soup(amazon_html,'html.parser')
all_original_price = a_soup.findAll('span',{'class':'a-price a-text-price'})
all_original_price = [o.find('span', {'class': 'a-offscreen'}).text.split('>') for o in all_original_price]
for item in all_original_price:
    original_price.append(item)
print(original_price)

Original link: https://stackoverflow.com//questions/71508951/how-can-i-avoid-the-httperror

Replies

  • HedgeHog answered:

    Adding a User-Agent header to your request should solve your problem:

    header = { 'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)' }     
    req = urllib.request.Request(url, headers=header)
    amazon_html = urllib.request.urlopen(req).read()
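
To see why the header matters, it can help to compare what urllib sends by default with what the request above sends. The following is a minimal sketch that runs without any network access; `build_request` is a hypothetical helper introduced only for illustration and is not part of the original answer.

```python
import urllib.request

# Hypothetical helper (an assumption for illustration, not from the answer).
def build_request(url, user_agent="Mozilla/5.0 (Windows NT 6.1; Win64; x64)"):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# Without an explicit header, urllib identifies itself as "Python-urllib/<version>".
default_agent = urllib.request.build_opener().addheaders[0][1]
print("default:", default_agent)

req = build_request("https://www.amazon.in/s?k=smart+watch&page=1")

# urllib stores header names capitalized, so the lookup key is "User-agent".
print("custom: ", req.get_header("User-agent"))

# Nothing is sent until urllib.request.urlopen(req) is called.
```

A server that filters on the default `Python-urllib/...` agent would plausibly reject the first form and accept the second, which is consistent with the fix suggested here, though the site's actual filtering rules are not known.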
    

    Example

    import urllib.request
    from bs4 import BeautifulSoup as soup
    
    url = 'https://www.amazon.in/s?k=smart+watch&page=1'
    # Send a browser-like User-Agent instead of urllib's default one.
    header = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'}
    
    req = urllib.request.Request(url, headers=header)
    amazon_html = urllib.request.urlopen(req).read()
    
    original_price = []
    
    a_soup = soup(amazon_html, 'html.parser')
    # Match spans whose class attribute is exactly 'a-price a-text-price'.
    all_original_price = a_soup.find_all('span', {'class': 'a-price a-text-price'})
    # split('>') wraps each price string in a one-element list (see the output).
    all_original_price = [o.find('span', {'class': 'a-offscreen'}).text.split('>') for o in all_original_price]
    for item in all_original_price:
        original_price.append(item)
    print(original_price)
    

    Output

    [['₹3,999.00'], ['₹3,999.00'], ['₹12,999'], ['₹7,999'], ['₹4,999'], ['₹4,999'], ['₹5,999'], ['₹6,400'], ['₹3,999'], ['₹6,990'], ['₹7,999'], ['₹1,599'], ['₹7,999'], ['₹6,990'], ['₹5,999'], ['₹4,999'], ['₹5,999'], ['₹6,999'], ['₹4,999'], ['₹4,999'], ['₹9,999'], ['₹5,999'], ['₹6,990']]
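
If a 503 still shows up occasionally after setting the User-Agent, one common pattern is to catch `urllib.error.HTTPError` and retry after a pause. The sketch below is an assumption-laden illustration rather than part of the answer: `fetch_with_retry`, its parameter names, and its default values are all hypothetical, and the `opener` and `sleep` parameters exist only so the function can be exercised without network access.

```python
import time
import urllib.error
import urllib.request

USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"

# Hypothetical helper; names and defaults are illustrative assumptions.
def fetch_with_retry(url, retries=3, delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, retrying on HTTP 503 with a wait that doubles each time."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    for attempt in range(retries + 1):
        try:
            with opener(req) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Re-raise anything other than 503, or a 503 on the last attempt.
            if err.code != 503 or attempt == retries:
                raise
            # Wait delay, 2*delay, 4*delay, ... before the next attempt.
            sleep(delay * 2 ** attempt)

# Example call (commented out because it performs a real request):
# amazon_html = fetch_with_retry('https://www.amazon.in/s?k=smart+watch&page=1')
```

The returned bytes can be passed to `soup(amazon_html, 'html.parser')` exactly as in the example above. Retrying only works around transient refusals; it does not guarantee the server will accept further requests.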
    