Zhihu crawler Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}} #517

Open
Once2gain opened this issue Dec 9, 2024 · 3 comments

Comments

@Once2gain

I'm not sure how to resolve this error. Could someone more experienced point me in the right direction?

2024-12-09 09:11:10 MediaCrawler INFO (core.py:289) - [ZhihuCrawler.launch_browser] Begin create browser context ...                                                                                                                                                   
2024-12-09 09:11:12 MediaCrawler INFO (core.py:258) - [ZhihuCrawler.create_zhihu_client] Begin create zhihu API client ...                                                                                                                                             
2024-12-09 09:11:12 MediaCrawler INFO (client.py:136) - [ZhiHuClient.pong] Begin to pong zhihu...                                                                                                                                                                      
2024-12-09 09:11:12 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/me?include=email%2Cis_active%2Cis_bind_phone, Request error: {"error":{"code":100,"name":"AuthenticationInvalidRequest","message":"请求头或参数封装错误"}}

2024-12-09 09:11:13 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/me?include=email%2Cis_active%2Cis_bind_phone, Request error: {"error":{"code":100,"name":"AuthenticationInvalidRequest","message":"请求头或参数封装错误"}}

2024-12-09 09:11:15 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/me?include=email%2Cis_active%2Cis_bind_phone, Request error: {"error":{"code":100,"name":"AuthenticationInvalidRequest","message":"请求头或参数封装错误"}}

2024-12-09 09:11:15 MediaCrawler ERROR (client.py:146) - [ZhiHuClient.pong] Ping zhihu failed: RetryError[<Future at 0x7f9f24a9ad90 state=finished raised DataFetchError>], and try to login again...                                                                  
2024-12-09 09:11:15 MediaCrawler INFO (login.py:58) - [ZhiHu.begin] Begin login zhihu ...                                                                                                                                                                              
2024-12-09 09:11:15 MediaCrawler INFO (login.py:108) - [ZhiHu.login_by_cookies] Begin login zhihu by cookie ...                                                                                                                                                        
2024-12-09 09:11:15 MediaCrawler INFO (core.py:89) - [ZhihuCrawler.start] Zhihu跳转到搜索页面获取搜索页面的Cookies,该过程需要5秒左右                                                                                                                                  
2024-12-09 09:11:21 MediaCrawler INFO (core.py:111) - [ZhihuCrawler.search] Begin search zhihu keywords                                                                                                                                                                
2024-12-09 09:11:21 MediaCrawler INFO (core.py:118) - [ZhihuCrawler.search] Current search keyword: 汽车                                                                                                                                                               
2024-12-09 09:11:21 MediaCrawler INFO (core.py:127) - [ZhihuCrawler.search] search zhihu keyword: 汽车, page: 1                                                                                                                                                        
2024-12-09 09:11:22 MediaCrawler INFO (client.py:212) - [ZhiHuClient.get_note_by_keyword] Search result: {'paging': {'is_end': False, 'next': 'https://api.zhihu.com/search_v3?advert_count=0&correction=1&filter_fields=&gk_version=gz-gaokao&lc_idx=0&limit=20&offset
=20&q=%E6%B1%BD%E8%BD%A6&search_hash_id=6cd8c97771a7c7d75956db132dbae18d&search_source=Filter&show_all_topics=0&sort=&t=general&time_interval=&vertical=&vertical_info=0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C0%2C2%2C0'}, 'data': [{'type': 'search_result', 'highlight': {'d
······
stion_id': '562563592', 'title': '电动汽车为什么一下爆发了?', 'desc': '人类早就有能力造电动车了啊,是什么契机,技术进步这两年爆发的?', 'created_time': 1709466282, 'updated_time': 1710330624, 'voteup_count': 15420, 'comment_count': 2457, 'source_keyword': '汽车', 'user_id': '51c6b8c755f776354a3e966e881bbbda', 'user_link': 'https://www.zhihu.com/people/di-ren-jie-57', 'user_nickname': '狄仁杰', 'user_avatar': 'https://pic1.zhimg.com/50/v2-2df28424e6e8e451a5454c091bf6c0ae_l.jpg?source=4e949a73', 'user_url_token': 'di-ren-jie-57', 'last_modify_ts': 1733735482784}
2024-12-09 09:11:22 MediaCrawler INFO (core.py:168) - [ZhihuCrawler.batch_get_content_comments] Crawling comment mode is not enabled
2024-12-09 09:11:22 MediaCrawler INFO (core.py:127) - [ZhihuCrawler.search] search zhihu keyword: 汽车, page: 2
2024-12-09 09:11:23 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/search_v3?gk_version=gz-gaokao&t=general&q=%E6%B1%BD%E8%BD%A6&correction=1&offset=20&limit=20&filter_fields=&lc_idx=20&show_all_topics=0&search_source=Filter&time_interval=&sort=&vertical=, Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}

2024-12-09 09:11:24 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/search_v3?gk_version=gz-gaokao&t=general&q=%E6%B1%BD%E8%BD%A6&correction=1&offset=20&limit=20&filter_fields=&lc_idx=20&show_all_topics=0&search_source=Filter&time_interval=&sort=&vertical=, Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}

2024-12-09 09:11:25 MediaCrawler ERROR (client.py:90) - [ZhiHuClient.request] Requset Url: https://www.zhihu.com/api/v4/search_v3?gk_version=gz-gaokao&t=general&q=%E6%B1%BD%E8%BD%A6&correction=1&offset=20&limit=20&filter_fields=&lc_idx=20&show_all_topics=0&search_source=Filter&time_interval=&sort=&vertical=, Request error: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}
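
The log above shows two different error payloads: code 100 (`AuthenticationInvalidRequest`) on the `/api/v4/me` check and code 101 (`ZERR_NOT_LOGIN`) on page 2 of the search. A minimal sketch of telling them apart from the raw response body follows. The helper names `parse_zhihu_error` and `is_not_login` are hypothetical and not part of MediaCrawler, and the payload shape is assumed to match the bodies printed in the log.

```python
import json


def parse_zhihu_error(body):
    """Return (code, name, message) from an error body, or None if the body
    is not JSON or does not have the {"error": {...}} shape seen in the log."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError.
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return error.get("code"), error.get("name"), error.get("message")


def is_not_login(body):
    """True when the body carries the ZERR_NOT_LOGIN message (assumed shape)."""
    parsed = parse_zhihu_error(body)
    return parsed is not None and parsed[2] == "ZERR_NOT_LOGIN"
```

Under these assumptions, a caller could check `is_not_login(response.text)` before deciding whether retrying is pointless and a fresh login is needed. That decision is a design choice, not something the log confirms.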
@Once2gain
Author

Also, what actually gets raised here is tenacity.RetryError, so the existing DataFetchError handler does not catch it:

except DataFetchError:

Traceback (most recent call last):                                                                                                                                                                                                                                     
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/_asyncio.py", line 50, in __call__                                                                                                                                                                  
    result = await fn(*args, **kwargs)                                                                                                                                                                                                                                 
  File "/data/media_platform/zhihu/client.py", line 96, in request                                                                                                                                                                                                     
    raise DataFetchError(response.text)                                                                                                                                                                                                                                
media_platform.zhihu.exception.DataFetchError: {"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}                                                                                                                                          
                                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                       
The above exception was the direct cause of the following exception:                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                       
Traceback (most recent call last):                                                                                                                                                                                                                                     
  File "/data/main.py", line 66, in <module>                                                                                                                                                                                                                           
    asyncio.get_event_loop().run_until_complete(main())                                                                                                                                                                                                                
  File "/opt/conda/envs/media/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete                                                                                                                                                                   
    return future.result()                                                                                                                                                                                                                                             
  File "/data/main.py", line 56, in main                                                                                                                                                                                                                               
    await crawler.start()                                                                                                                                                                                                                                              
  File "/data/media_platform/zhihu/core.py", line 96, in start                                                                                                                                                                                                         
    await self.search()                                                                                                                                                                                                                                                
  File "/data/media_platform/zhihu/core.py", line 127, in search                                                                                                                                                                                                       
    content_list: List[ZhihuContent]  = await self.zhihu_client.get_note_by_keyword(                                                                                                                                                                                   
  File "/data/media_platform/zhihu/client.py", line 211, in get_note_by_keyword                                                                                                                                                                                        
    search_res = await self.get(uri, params)                                                                                                                                                                                                                           
  File "/data/media_platform/zhihu/client.py", line 128, in get                                                                                                                                                                                                        
    return await self.request(method="GET", url=zhihu_constant.ZHIHU_URL + final_uri, headers=headers, **kwargs)                                                                                                                                                       
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/_asyncio.py", line 88, in async_wrapped                                                                                                                                                             
    return await fn(*args, **kwargs)                                                                                                                                                                                                                                   
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/_asyncio.py", line 47, in __call__                                                                                                                                                                  
    do = self.iter(retry_state=retry_state)                                                                                                                                                                                                                            
  File "/opt/conda/envs/media/lib/python3.9/site-packages/tenacity/__init__.py", line 326, in iter                                                                                                                                                                     
    raise retry_exc from fut.exception()                                                                                                                                                                                                                               
tenacity.RetryError: RetryError[<Future at 0x7f2fadc7fbb0 state=finished raised DataFetchError>]
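
The traceback shows tenacity raising the retry error with `raise retry_exc from fut.exception()`, so the original `DataFetchError` should be reachable through the `__cause__` attribute of the `RetryError`. A minimal stdlib-only sketch of recovering it is below. The two exception classes are stand-ins for `media_platform.zhihu.exception.DataFetchError` and `tenacity.RetryError`, and `find_cause`, `flaky_request`, and `call_and_unwrap` are hypothetical names used only for illustration.

```python
class DataFetchError(Exception):
    """Stand-in for media_platform.zhihu.exception.DataFetchError."""


class RetryError(Exception):
    """Stand-in for tenacity.RetryError, to keep this sketch stdlib-only."""


def find_cause(exc, wanted):
    """Walk the __cause__ chain and return the first exception that is an
    instance of `wanted`, or None if there is no such exception."""
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        if isinstance(current, wanted):
            return current
        seen.add(id(current))
        current = current.__cause__
    return None


def flaky_request():
    """Simulate the traceback: the retry error is raised *from* the original."""
    original = DataFetchError(
        '{"error":{"code":101,"name":"AuthenticationError","message":"ZERR_NOT_LOGIN"}}'
    )
    raise RetryError("retries exhausted") from original


def call_and_unwrap():
    """Catch either exception type and return the underlying DataFetchError."""
    try:
        flaky_request()
    except DataFetchError as err:
        return err
    except RetryError as err:
        return find_cause(err, DataFetchError)
```

An alternative, assuming the `request` method is decorated with tenacity's `@retry`, is to pass `reraise=True` to that decorator so the last underlying exception is raised instead of `RetryError`. Whether that fits the project is for the maintainers to decide.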

@souldjl

souldjl commented Dec 23, 2024

Is this Zhihu Q&A?

@Once2gain
Author

Zhihu, keyword search. I'm not sure whether it has anything to do with Q&A.
