Python: How do I get the names of all hotels on a page with Python?

lyyyyyyy · March 13, 2019 · Last reply by lyyyyyyy on March 15, 2019 · 2735 views

https://www.millenniumhotels.com/en/hotels/

Help needed, everyone: after every release I now have to verify that all hotels appear in the list.

9 replies in total
import requests

# Region names as they appear in the query string ('+' stands for a space).
regions = ['Asia', 'Europe', 'MEA', 'New+Zealand', 'United+States']
hotels = []
for region in regions:
    url = "https://www.millenniumhotels.com/api/search/destinations?keywords=&regionName=%s" % region
    response = requests.get(url)
    if response.status_code == 200:
        result = response.json()
        # Hotel entries are read from data -> hotels; fall back to an empty list if either is missing.
        hotel_msgs = (result.get('data') or {}).get('hotels') or []
        for hotel_msg in hotel_msgs:
            hotels.append(hotel_msg.get('name'))

# Print one hotel name per line.
for hotel in hotels:
    print(hotel)

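@煎饼果子 Thanks a lot. I feel silly for not thinking of going through the API. One more question: if I wrap this code in a function, can I use it in my UI automation scripts?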
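A minimal sketch of how the snippet from the first reply might be wrapped for reuse in a UI automation suite. The function names `extract_hotel_names` and `fetch_hotel_names` and the injectable `fetch_json` parameter are made up for illustration. The endpoint and the `data` / `hotels` / `name` keys are taken from that snippet and have not been verified beyond it.

```python
API_URL = ("https://www.millenniumhotels.com/api/search/destinations"
           "?keywords=&regionName=%s")
REGIONS = ('Asia', 'Europe', 'MEA', 'New+Zealand', 'United+States')


def extract_hotel_names(payload):
    """Pull hotel names out of one decoded API response (assumed shape)."""
    hotels = (payload.get('data') or {}).get('hotels') or []
    return [hotel.get('name') for hotel in hotels]


def _fetch_json(url):
    """Default fetcher. requests is imported here so the parsing helper
    above can be exercised without the library or a network connection."""
    import requests
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()


def fetch_hotel_names(regions=REGIONS, fetch_json=_fetch_json):
    """Return the hotel names of all regions as one flat list."""
    names = []
    for region in regions:
        names.extend(extract_hotel_names(fetch_json(API_URL % region)))
    return names
```

A UI test could then call `fetch_hotel_names()` once in its setup and compare the result with what the page shows. Passing a fake `fetch_json` keeps the test suite itself testable offline.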

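The API check in reply #1 is one approach; it lets you validate at the API level. However, I see two issues:

  • The backend API and the frontend display are separate pieces of functionality, so correct API data does not guarantee that the frontend displays it correctly.
  • You also need to verify that the API data itself is complete and correct, so you need a full baseline list, and then compare the parsed API response against that baseline.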
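The baseline comparison in the second point could look roughly like this. It is only a sketch: `diff_against_baseline` is a hypothetical helper name, and the two short lists are a made-up excerpt, not real API output.

```python
def diff_against_baseline(actual_names, expected_names):
    """Return (missing, unexpected) hotel names, each sorted."""
    actual, expected = set(actual_names), set(expected_names)
    missing = sorted(expected - actual)      # in the baseline, not returned
    unexpected = sorted(actual - expected)   # returned, not in the baseline
    return missing, unexpected


# Made-up excerpt for illustration; a real baseline would hold every hotel.
expected = ['Millennium Hotel Chengdu', 'Millennium Hotel Fuqing']
actual = ['Millennium Hotel Chengdu', 'Millennium Hotel Wuxi']

missing, unexpected = diff_against_baseline(actual, expected)
print('missing:', missing)
print('unexpected:', unexpected)
```

Reporting both directions tells you whether a hotel was dropped or whether an unknown one appeared, which a plain equality check would not.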

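If you test from the UI side, I'd suggest one of the following approaches:

  • Again start from the baseline list, then check on the current page that every entry matches. A complete solution requires all four levels (region, country, city, hotel) to match; only then do you know both that every hotel is listed and that the hierarchy and display order are correct.
  • Another approach is to take screenshots and compare the images. If the match ratio of the two images reaches a threshold (for example 99%), that largely shows the displayed data is the same. I've seen similar image-comparison solutions on this forum before; they're worth looking up.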
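To make the "match ratio" in the screenshot approach concrete, here is a rough pure-Python sketch. It assumes both screenshots have already been decoded into equally long sequences of pixel tuples (for example through an imaging library); `match_ratio` and `looks_same` are hypothetical names, and exact pixel equality is only one of several possible similarity measures.

```python
def match_ratio(pixels_a, pixels_b):
    """Fraction of positions at which two pixel sequences are identical."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError('screenshots must have the same dimensions')
    if not pixels_a:
        return 1.0
    same = sum(1 for a, b in zip(pixels_a, pixels_b) if a == b)
    return same / len(pixels_a)


def looks_same(pixels_a, pixels_b, threshold=0.99):
    """True if the match ratio reaches the threshold (99% by default)."""
    return match_ratio(pixels_a, pixels_b) >= threshold


# Toy data: 200 white pixels, one of which differs in the second image.
baseline = [(255, 255, 255)] * 200
current = list(baseline)
current[0] = (0, 0, 0)

print(looks_same(baseline, current))
```

Note that a single missing hotel name changes only a small share of the pixels, so a threshold like 99% can hide exactly the kind of defect being tested for. The threshold would need tuning against known-bad screenshots.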

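Yesterday I wrote the test script below, roughly following the first approach. I then found that some special cases on the page are displayed in a different order from the data I had collected (for example, HK and TW are listed separately), so the script does not run through completely. Take it as a reference only: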

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome()
driver.maximize_window()
driver.get('https://www.millenniumhotels.com/en/hotels/')

region_list = ["Asia", "Europe", "Middle East", "New Zealand", "United States"]

target_country = [['China', 'Indonesia', 'Japan', 'Malaysia', 'Philippines', 'Singapore', 'Thailand'], ['France', 'Georgia', 'Italy', 'United Kingdom'], ['Iraq', 'Jordan', 'Kuwait', 'Oman', 'Palestine', 'Qatar', 'Saudi Arabia', 'Turkey', 'UAE'], ['New Zealand'], ['United States']]
target_city = [[['Beijing', 'Chengdu', 'Fuqing', 'Hangzhou', 'Shanghai', 'Wuxi', 'Wuyishan', 'Xiamen', 'Zunyi', 'Hong Kong', 'Taichung', 'Hualien'], ['Jakarta'], ['Tokyo'], ['Cameron Highlands', 'Kuala Lumpur', 'Penang'], ['Manila'], [], ['Phuket']], [['Paris'], ['Tbilisi'], ['Rome'], ['Aberdeen', 'Birmingham', 'Cardiff', 'Dudley', 'Gatwick', 'Glasgow', 'Liverpool', 'London', 'Manchester', 'Newcastle', 'Plymouth', 'Reading', 'Sheffield', 'Slough']], [['Sulaimani'], ['Amman'], ['Al Jahra', 'Kuwait City'], ['Muscat', 'Mussanah', 'Salalah'], ['Ramallah'], ['Doha'], ['Hail', 'Madinah', 'Makkah', 'Riyadh'], ['Istanbul'], ['Abu Dhabi', 'Dubai', 'Sharjah']], [['Auckland', 'Bay of Islands', 'Dunedin', 'Greymouth', 'Hokianga', 'New Plymouth', 'Palmerston North', 'Queenstown', 'Rotorua', 'Taupo', 'Te Anau', 'Wairarapa', 'Wanganui', 'Wellington']], [['Anchorage', 'Boston', 'Boulder', 'Buffalo', 'Chicago', 'Cincinnati', 'Durham', 'Los Angeles', 'Minneapolis', 'Nashville', 'New York', 'Scottsdale']]]
target_hotel = [[[['Grand Millennium Beijing', 'Millennium Residences @ Beijing Fortune Plaza'], ['Millennium Hotel Chengdu'], ['Millennium Hotel Fuqing'], ['Millennium Resort Hangzhou'], ['New World Millennium Hong Kong Hotel'], ['Millennium Gaea Resort Hualien'], ['Grand Millennium Shanghai HongQiao'], ['Millennium Hotel Taichung'], ['Millennium Hotel Wuxi'], ['Millennium Resort Wuyishan'], ['Millennium Harbourview Hotel Xiamen'], ['Millennium Hotel Zunyi']], [['Millennium Hotel Sirih Jakarta']], [['Millennium Mitsui Garden Hotel Tokyo']], [['Copthorne Hotel Cameron Highlands'], ['Grand Millennium Kuala Lumpur'], ['Copthorne Orchid Hotel Penang']], [['The Heritage Hotel Manila']], [], [['Millennium Resort Patong Phuket']]], [[['Millennium Hotel Paris Charles De Gaulle', 'Millennium Hotel Paris Opera']], [['The Biltmore Hotel Tbilisi']], [['Grand Hotel Palace Rome']], [['Copthorne Hotel Aberdeen'], ['Copthorne Hotel Birmingham'], ['Copthorne Hotel Cardiff-Caerdydd'], ['Copthorne Hotel Merry Hill-Dudley'], ['Copthorne Hotel Effingham Gatwick', 'Copthorne Hotel London Gatwick'], ['Millennium Hotel Glasgow'], ['Hard Days Night Hotel Liverpool'], ['Copthorne Tara Hotel London Kensington', 'Millennium and Copthorne Hotels at Chelsea Football Club', 'Millennium Gloucester Hotel London Kensington', 'Millennium Hotel London Knightsbridge', "The Bailey's Hotel London", 'The Chelsea Harbour Hotel'], ['Copthorne Hotel Manchester'], ['Copthorne Hotel Newcastle'], ['Copthorne Hotel Plymouth'], ['Millennium Madejski Hotel Reading'], ['Copthorne Hotel Sheffield'], ['Copthorne Hotel Slough-Windsor']]], [[['Copthorne Hotel Baranan', 'Grand Millennium Hotel Sulaimani', 'Millennium Kurdistan Hotel and Spa']], [['Grand Millennium Amman']], [['Copthorne Al Jahra Hotel & Resort'], ['Copthorne Kuwait City', 'Millennium Hotel and Convention Centre Kuwait']], [['Grand Millennium Muscat', 'Millennium Executive Apartments Muscat'], ['Millennium Resort Mussanah'], ['Millennium Resort Salalah']], [['Millennium Palestine Ramallah']], [['Copthorne Hotel Doha', 'Kingsgate Hotel Doha', 'Millennium Hotel Doha', 'Millennium Plaza Doha']], [['Millennium Hail Hotel Saudi Arabia'], ['Millennium Al Aqeeq Hotel', 'Millennium Madinah Airport', 'Millennium Taiba Hotel'], ['Copthorne Makkah Al Naseem', 'M Hotel Makkah by Millennium', 'Makkah Millennium Hotel', 'Makkah Millennium Towers', 'Millennium Makkah Al Naseem'], ['Copthorne Hotel Riyadh']], [['Millennium Istanbul Golden Horn']], [['Bab Al Qasr Hotel', 'Grand Millennium Al Wahda', 'Kingsgate Hotel Abu Dhabi by Millennium'], ['Copthorne Hotel Dubai', 'Grand Millennium Business Bay', 'Grand Millennium Dubai', 'M Hotel Downtown by Millennium', 'Millennium Airport Hotel Dubai', 'Millennium Al Barsha', 'Millennium Atria Business Bay', 'Millennium Place Marina', 'Millennium Plaza Hotel Dubai', 'Studio M Arabian Plaza'], ['Copthorne Hotel Sharjah']]], [[['Copthorne Hotel Auckland City', 'Grand Millennium Auckland', 'M Social Auckland'], ['Copthorne Hotel and Resort Bay of Islands', 'Kingsgate Hotel Autolodge Paihia'], ['Kingsgate Hotel Dunedin'], ['Kingsgate Hotel Greymouth'], ['Copthorne Hotel and Resort Hokianga'], ['Copthorne Hotel Grand Central New Plymouth', 'Millennium Hotel New Plymouth Waterfront'], ['Copthorne Hotel Palmerston North'], ['Copthorne Hotel & Apartments Queenstown Lakeview', 'Copthorne Hotel and Resort Queenstown Lakefront ', 'Millennium Hotel Queenstown'], ['Copthorne Hotel Rotorua', 'Millennium Hotel Rotorua'], ['Millennium Hotel and Resort Manuels Taupo'], ['Kingsgate Hotel Te Anau'], ['Copthorne Hotel & Resort Solway Park Wairarapa'], ['Kingsgate Hotel The Avenue Wanganui'], ['Copthorne Hotel Wellington Oriental Bay']]], [[['The Lakefront Anchorage'], ['The Bostonian Boston'], ['Millennium Harvest House Boulder'], ['Millennium Buffalo'], ['Millennium Knickerbocker Chicago'], ['Millennium Cincinnati'], ['Millennium Durham'], ['Millennium Biltmore Los Angeles'], ['Millennium Minneapolis'], ['Millennium Maxwell House Nashville'], ['Millennium Broadway New York Times Square', 'Millennium Premier New York Times Square'], ['The McCormick Scottsdale']]]]


for i, region in enumerate(region_list):
    print('region is : %s' % region)
    # Selenium 4 removed find_element_by_link_text; use the By.LINK_TEXT locator.
    driver.find_element(by=By.LINK_TEXT, value=region).click()
    time.sleep(5)  # crude fixed wait for the region's list to load
    # Get the country list
    country_list = driver.find_elements(by=By.CLASS_NAME, value='nk2-waterfall-country')
    print('country length: %d' % len(country_list))
    assert len(country_list) == len(target_country[i])
    for j, country in enumerate(country_list):
        country_name = country.find_element(by=By.CLASS_NAME, value='nk2-waterfall-country-name').text
        print('country name is : %s' % country_name)
        assert country_name == target_country[i][j]

        # Get the city list
        city_list = country.find_elements(by=By.CLASS_NAME, value='nk2-waterfall-city')
        print('city length: %d' % len(city_list))
        assert len(city_list) == len(target_city[i][j])
        if city_list:
            for k, city in enumerate(city_list):
                city_name = city.find_element(by=By.CLASS_NAME, value='nk2-waterfall-city-name').text
                print('city name is : %s' % city_name)
                assert city_name == target_city[i][j][k]

                # Get the hotels listed under this city
                hotel_list = city.find_elements(by=By.CLASS_NAME, value='nk2-waterfall-hotel-name')
                print('hotel length in city %s : %d' % (city_name, len(hotel_list)))
                assert len(hotel_list) == len(target_hotel[i][j][k])
                for hotel, expected_name in zip(hotel_list, target_hotel[i][j][k]):
                    print(hotel.text)
                    assert hotel.text == expected_name
        else:
            # No city level (Singapore in the expected data): hotels hang directly off
            # the country, so target_hotel[i][j] is compared as a flat list of names.
            # The expected data leaves that entry empty; fill it in for this branch to pass.
            hotel_list = country.find_elements(by=By.CLASS_NAME, value='nk2-waterfall-hotel-name')
            print('hotel length in country %s : %d' % (country_name, len(hotel_list)))
            assert len(hotel_list) == len(target_hotel[i][j])
            for hotel, expected_name in zip(hotel_list, target_hotel[i][j]):
                print(hotel.text)
                assert hotel.text == expected_name

driver.quit()
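Since the display order is what tripped the script up, one possible alternative is to compare presence and hierarchy without order. The sketch below assumes the expected data is restructured as a nested dict (region, then country, then city, then hotel list); `flatten` is a hypothetical helper, and the `scraped` dict is a hand-written stand-in for what would be read from the page.

```python
def flatten(tree):
    """Turn {region: {country: {city: [hotel, ...]}}} into a set of 4-tuples."""
    return {
        (region, country, city, hotel)
        for region, countries in tree.items()
        for country, cities in countries.items()
        for city, hotels in cities.items()
        for hotel in hotels
    }


# Tiny excerpt of the expected data, for illustration only.
expected = {
    'Asia': {
        'China': {
            'Beijing': ['Grand Millennium Beijing'],
            'Chengdu': ['Millennium Hotel Chengdu'],
        },
        'Japan': {'Tokyo': ['Millennium Mitsui Garden Hotel Tokyo']},
    },
}

# Stand-in for scraped data: different order, and Chengdu is missing.
scraped = {
    'Asia': {
        'Japan': {'Tokyo': ['Millennium Mitsui Garden Hotel Tokyo']},
        'China': {'Beijing': ['Grand Millennium Beijing']},
    },
}

missing = flatten(expected) - flatten(scraped)
unexpected = flatten(scraped) - flatten(expected)
print('missing:', sorted(missing))
print('unexpected:', sorted(unexpected))
```

This checks that every hotel sits under the right region, country, and city, but it deliberately ignores display order. If order matters, it needs a separate check.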

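I have an idea, though it may not fit into your framework:

  1. Get all hotels from the API (as in @煎饼果子's approach)
  2. Scrape all hotels from the page (Requests-HTML)
  3. Compare the results of 1 and 2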
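A rough sketch of steps 2 and 3. To stay dependency-free it uses the standard library's `html.parser` instead of Requests-HTML, which is a swap made for illustration. The class name `nk2-waterfall-hotel-name` comes from the Selenium script earlier in the thread; the sample HTML and the API list are invented, and the sketch assumes the hotel names are present in the HTML that is parsed (a JavaScript-rendered page would need the rendered HTML).

```python
from html.parser import HTMLParser


class HotelNameParser(HTMLParser):
    """Collect the text of every element whose class list contains TARGET."""
    TARGET = 'nk2-waterfall-hotel-name'
    VOID = {'br', 'img', 'input', 'hr', 'meta', 'link'}  # tags without an end tag

    def __init__(self):
        super().__init__()
        self.names = []
        self._depth = 0     # > 0 while inside a matching element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        classes = (dict(attrs).get('class') or '').split()
        if self._depth or self.TARGET in classes:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.VOID or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            text = ' '.join(''.join(self._chunks).split())
            if text:
                self.names.append(text)
            self._chunks = []

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


# Invented markup; only the class name is taken from the thread.
sample_html = '''
<div class="nk2-waterfall-city">
  <a class="nk2-waterfall-hotel-name" href="#">Millennium Hotel Chengdu</a>
  <a class="nk2-waterfall-hotel-name" href="#">Millennium Hotel <b>Wuxi</b></a>
</div>
'''

parser = HotelNameParser()
parser.feed(sample_html)
parser.close()
page_names = parser.names

api_names = ['Millennium Hotel Chengdu', 'Millennium Hotel Fuqing']  # stand-in for step 1
only_in_api = sorted(set(api_names) - set(page_names))
only_on_page = sorted(set(page_names) - set(api_names))
print('only in API:', only_in_api)
print('only on page:', only_on_page)
```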
In reply to 果冻

Also, you probably need to check every language version as well.

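In reply to Jerry li

Thanks. I confirmed with the developers earlier that the error happened when the backend passed the data; the frontend was fine. So I think the approach from reply #1 works: under normal circumstances, if len(hotels) == 125, the frontend display should be fine too.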
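A count check alone would still pass if one hotel were missing and another appeared twice, so it may be worth pairing it with a duplicate check. A minimal sketch; `check_hotel_list` is a hypothetical helper, and 125 is the count quoted above.

```python
from collections import Counter


def check_hotel_list(names, expected_count=125):
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []
    if len(names) != expected_count:
        problems.append('expected %d hotels, got %d' % (expected_count, len(names)))
    duplicates = sorted(name for name, count in Counter(names).items() if count > 1)
    if duplicates:
        problems.append('duplicated: %s' % ', '.join(duplicates))
    return problems


# Toy input: the count matches, but one name is repeated.
print(check_hotel_list(['Hotel A', 'Hotel B', 'Hotel A'], expected_count=3))
```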

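In reply to 果冻

Right, after every release we need to check that language switching works. I'm adding this test case to the script now, but I've run into some difficulties that aren't solved yet: something is wrong with the assertions, and I'm still working out how to fix it.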
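For the per-language check, one option is to avoid asserting on translated names and compare only the number of hotels per language. The sketch below assumes the names for each language have already been collected into a dict; the helper name `languages_out_of_sync`, the language codes other than `en`, and the sample data are all hypothetical.

```python
def languages_out_of_sync(names_by_language, reference='en'):
    """Return {language: count} for languages whose hotel count differs from the reference."""
    expected = len(names_by_language[reference])
    return {
        language: len(names)
        for language, names in names_by_language.items()
        if len(names) != expected
    }


# Invented sample: the 'zh' list is one hotel short.
collected = {
    'en': ['Millennium Hotel Chengdu', 'Millennium Hotel Wuxi'],
    'zh': ['Millennium Hotel Chengdu'],
}

out_of_sync = languages_out_of_sync(collected)
assert not out_of_sync or 'zh' in out_of_sync  # toy data: only 'zh' can differ
print(out_of_sync)
```

In a real test the final assertion would simply be `assert not out_of_sync, out_of_sync`, so the failure message names the affected languages.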
