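Py Crawler: scraping the hundreds of thousands of hot comments on 张震岳's "再见", the song featured in 刘若英's 2018 film 《后来的我们》, with the requests and json libraries, plus a word cloud: "再见" (goodbye) also means never meeting again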
[Figure: background image used as the word cloud mask]
- # -*- coding: utf-8 -*-
-
- # Py Crawler: scrape the hot comments on 张震岳's "再见", the song featured in 刘若英's 2018 film 《后来的我们》
-
- import requests
- import json
-
- url = 'http://music.163.com/weapi/v1/resource/comments/R_SO_4_185726?csrf_token='
-
- headers = {
-     'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36',
-     'Referer': 'http://music.163.com/song?id=185726',
-     'Origin': 'http://music.163.com',
-     'Host': 'music.163.com',
- }
-
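- # NOTE: user_data is not defined anywhere in this listing and must be set before this call.
- # It holds the form fields the browser posts to this endpoint (for the weapi interface these
- # appear to be the encrypted 'params' and 'encSecKey' values; an assumption, since the original omits them).
- response = requests.post(url, headers=headers, data=user_data)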
-
- # Parse the JSON response and keep only the fields needed from each hot comment
- data = json.loads(response.text)
- hotcomments = []
- for hot_comment in data['hotComments']:
-     item = {
-         'nickname': hot_comment['user']['nickname'],
-         'content': hot_comment['content'],
-         'likedCount': hot_comment['likedCount'],
-     }
-     hotcomments.append(item)
-
- # Collect the comment texts, nicknames, and like counts into parallel lists
- content_list = [content['content'] for content in hotcomments]
- nickname = [content['nickname'] for content in hotcomments]
- liked_count = [content['likedCount'] for content in hotcomments]
-
-
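- # Generate the bar chart and the word cloud
- from pyecharts import Bar  # pyecharts 0.5.x API (1.0+ moved Bar to pyecharts.charts and changed add())
-
- bar = Bar("Like counts of the latest hot comments on 《再见》, the song featured in 刘若英's 2018 film 《后来的我们》")
- bar.add("", nickname, liked_count, is_stack=True, mark_line=["min", "max"], mark_point=["average"])
- bar.render()  # writes render.html to the working directory by default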
-
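- from wordcloud import WordCloud
- import matplotlib.pyplot as plt
- import numpy as np
- from PIL import Image  # scipy.misc.imread was removed in SciPy 1.2, so the mask is loaded with Pillow
-
- content_text = " ".join(content_list)
-
- # Load the background picture that serves as the word cloud mask
- bg_pic = np.array(Image.open('F:/File_Python/Resources/heibai04.jpg'))
- # Alternative settings left commented out in the original: max_words=1200, width=1800, height=1200
- wordcloud = WordCloud(font_path=r"C:\Windows\Fonts\STXINGKA.TTF", max_words=2500, background_color="white", mask=bg_pic, scale=5).generate(content_text)
-
- wordcloud.to_file('zaijian.jpg')  # save the word cloud image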
-
- plt.figure()
- plt.imshow(wordcloud,interpolation='bilinear')
- plt.axis('off')
- plt.show()
-
-
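The parsing step in the listing above can be tried without touching the network. What follows is a minimal sketch, not the original author's code: the function name `extract_hot_comments` and the `sample` payload are hypothetical, and the payload only mimics the `hotComments`, `user`, `nickname`, `content`, and `likedCount` fields that the listing reads. The real response may contain more fields or a different shape.

```python
import json


def extract_hot_comments(payload_text):
    """Return a list of {'nickname', 'content', 'likedCount'} dicts.

    Assumes the response shape used in the listing above. Entries that lack
    any of the three fields are skipped instead of raising KeyError.
    """
    data = json.loads(payload_text)
    comments = []
    for entry in data.get('hotComments', []):
        nickname = entry.get('user', {}).get('nickname')
        content = entry.get('content')
        liked = entry.get('likedCount')
        if nickname is None or content is None or liked is None:
            continue
        comments.append({
            'nickname': nickname,
            'content': content,
            'likedCount': liked,
        })
    return comments


# Hypothetical sample data, made up for illustration only.
sample = json.dumps({
    'hotComments': [
        {'user': {'nickname': 'user_a'}, 'content': 'first comment', 'likedCount': 12},
        {'user': {'nickname': 'user_b'}, 'content': 'second comment', 'likedCount': 30},
        {'user': {}, 'content': 'entry without a nickname', 'likedCount': 1},
    ]
})

hot = extract_hot_comments(sample)
# Sort by like count so a bar chart would read from most to least liked.
hot.sort(key=lambda c: c['likedCount'], reverse=True)
nicknames = [c['nickname'] for c in hot]
liked_counts = [c['likedCount'] for c in hot]
print(nicknames, liked_counts)
```

Using `.get()` with defaults is one possible design choice here: a single malformed entry is skipped rather than aborting the run, at the cost of hiding such entries unless they are logged separately.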