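ML, RS, CF: a user-based CF algorithm that recommends movies to a new user, Jason, using a large dataset of users' movie ratings (given the few dozen movies Jason has already watched and rated)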
Contents
First, a look at the recommendation results
from math import sqrt


# Pearson correlation between two users' ratings
def pearson_dis(rating1, rating2):
    sum_xy = 0
    sum_x = 0
    sum_y = 0
    sum_x2 = 0
    sum_y2 = 0
    n = 0
    # Accumulate the sums over the items both users have rated
    for key in rating1:
        if key in rating2:
            n += 1
            x = rating1[key]
            y = rating2[key]
            sum_xy += x * y
            sum_x += x
            sum_y += y
            sum_x2 += pow(x, 2)
            sum_y2 += pow(y, 2)
    # No co-rated items: treat the pair as uncorrelated and avoid dividing by zero
    if n == 0:
        return 0
    # now compute denominator
    denominator = sqrt(sum_x2 - pow(sum_x, 2) / n) * sqrt(sum_y2 - pow(sum_y, 2) / n)
    if denominator == 0:
        return 0
    else:
        return (sum_xy - (sum_x * sum_y) / n) / denominator

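As a sanity check on the single-pass formula used in pearson_dis, the sketch below compares it with the textbook two-pass definition (covariance divided by the product of the spreads). The function names pearson_one_pass and pearson_two_pass and the three toy rating dictionaries are assumptions made for illustration; they are not part of the article's dataset.

```python
from math import sqrt, isclose


def pearson_one_pass(rating1, rating2):
    """Single-pass Pearson correlation over the items both users rated."""
    common = [k for k in rating1 if k in rating2]
    n = len(common)
    if n == 0:
        return 0
    sum_x = sum(rating1[k] for k in common)
    sum_y = sum(rating2[k] for k in common)
    sum_xy = sum(rating1[k] * rating2[k] for k in common)
    sum_x2 = sum(rating1[k] ** 2 for k in common)
    sum_y2 = sum(rating2[k] ** 2 for k in common)
    denominator = sqrt(sum_x2 - sum_x ** 2 / n) * sqrt(sum_y2 - sum_y ** 2 / n)
    if denominator == 0:
        return 0
    return (sum_xy - sum_x * sum_y / n) / denominator


def pearson_two_pass(rating1, rating2):
    """Textbook Pearson correlation: compute the means first, then the deviations."""
    common = [k for k in rating1 if k in rating2]
    n = len(common)
    if n == 0:
        return 0
    mean_x = sum(rating1[k] for k in common) / n
    mean_y = sum(rating2[k] for k in common) / n
    cov = sum((rating1[k] - mean_x) * (rating2[k] - mean_y) for k in common)
    var_x = sum((rating1[k] - mean_x) ** 2 for k in common)
    var_y = sum((rating2[k] - mean_y) ** 2 for k in common)
    denominator = sqrt(var_x) * sqrt(var_y)
    if denominator == 0:
        return 0
    return cov / denominator


# Hypothetical toy ratings; only the co-rated movies A, B and C are compared.
a = {"Movie A": 5, "Movie B": 3, "Movie C": 1}
b = {"Movie A": 4, "Movie B": 3, "Movie C": 2, "Movie D": 5}
c = {"Movie A": 1, "Movie B": 3, "Movie C": 5}

print(pearson_one_pass(a, b), pearson_two_pass(a, b))  # both close to 1 (same taste)
print(pearson_one_pass(a, c), pearson_two_pass(a, c))  # both close to -1 (opposite taste)
```

The two versions agree up to floating-point rounding. The single-pass form needs only one loop over the ratings, but it can lose precision when the ratings are large or nearly constant.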
# Find the nearest neighbours
def computeNearestNeighbor(username, users):
    """Given username, score every other user against it and sort the results."""
    distances = []
    for user in users:  # loop over all users, compute the Pearson score for each pair, and append it to the list
        if user != username:
            # distance = manhattan_dis(users[user], users[username])
            distance = pearson_dis(users[user], users[username])
            distances.append((distance, user))
    # Pearson is a similarity (higher means closer), so sort in descending order
    distances.sort(reverse=True)
    return distances

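The listing mentions both a Manhattan distance and a Pearson score. The two need opposite sort orders, because a smaller distance means a closer neighbour while a larger correlation does. The sketch below makes the direction explicit. The helper rank_neighbors, its higher_is_closer flag, the manhattan function (the article does not show its own manhattan_dis) and the toy users dictionary are all assumptions made for illustration.

```python
from math import sqrt


def manhattan(rating1, rating2):
    """Sum of absolute rating differences over co-rated items (smaller = closer)."""
    return sum(abs(rating1[k] - rating2[k]) for k in rating1 if k in rating2)


def pearson(rating1, rating2):
    """Pearson correlation over co-rated items (larger = more similar)."""
    common = [k for k in rating1 if k in rating2]
    n = len(common)
    if n == 0:
        return 0
    xs = [rating1[k] for k in common]
    ys = [rating2[k] for k in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mean_x) ** 2 for x in xs)) * sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / spread if spread else 0


def rank_neighbors(username, users, score, higher_is_closer):
    """Return (score, user) pairs with the closest neighbour first."""
    scored = [(score(users[u], users[username]), u) for u in users if u != username]
    return sorted(scored, reverse=higher_is_closer)


# Hypothetical toy ratings, not the article's dataset.
users = {
    "Jason": {"Movie A": 5, "Movie B": 3, "Movie C": 1},
    "Ann": {"Movie A": 4, "Movie B": 3, "Movie C": 2},
    "Bob": {"Movie A": 1, "Movie B": 3, "Movie C": 5},
}

by_pearson = rank_neighbors("Jason", users, pearson, higher_is_closer=True)
by_manhattan = rank_neighbors("Jason", users, manhattan, higher_is_closer=False)
print(by_pearson[0][1], by_manhattan[0][1])  # Ann Ann
```

Sorting the Pearson scores in ascending order instead would put Bob, whose taste is the opposite of Jason's, at the front of the list.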
# Make recommendations
def recommend(username, users):
    # Take the first entry of the sorted neighbour list
    nearest = computeNearestNeighbor(username, users)[0][1]

    recommendations = []
    neighborRatings = users[nearest]
    userRatings = users[username]
    # Collect what the neighbour rated that username has not rated yet
    for artist in neighborRatings:
        if artist not in userRatings:
            recommendations.append((artist, neighborRatings[artist]))
    # Highest-rated items first
    results = sorted(recommendations, key=lambda artistTuple: artistTuple[1], reverse=True)
    for result in results:
        print(result[0], result[1])


# users maps each user name to a {movie: rating} dict; the dataset is not included in this listing
recommend('Jason', users)

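Since the rating dataset is not included in the listing, the filtering and ranking step inside recommend can be tried on its own with a small stand-in. This is a minimal sketch: the function name recommend_from and the toy ratings are assumptions, and it returns the list instead of printing it so the result is easier to inspect.

```python
def recommend_from(neighbor_ratings, user_ratings):
    """Items the neighbour rated that the user has not, best-rated first."""
    unseen = [(item, rating) for item, rating in neighbor_ratings.items()
              if item not in user_ratings]
    return sorted(unseen, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy ratings, not the article's dataset.
jason = {"Movie A": 5, "Movie B": 3}
neighbor = {"Movie A": 4, "Movie B": 3, "Movie C": 2, "Movie D": 5, "Movie E": 4}

for title, rating in recommend_from(neighbor, jason):
    print(title, rating)
# Movie D 5
# Movie E 4
# Movie C 2
```

Because only the single nearest neighbour is consulted, the recommendations are simply that neighbour's ratings for movies Jason has not rated.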