[AI Creation Camp] Link Prediction on an E-commerce Knowledge Graph
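Knowledge graphs are a key technology in the AI era, but they are almost always incomplete. Link prediction on a knowledge graph predicts missing triples from learned representations of entities and relations. This task aims to improve knowledge graph embeddings in e-commerce scenarios, so that applications such as product recommendation can infer latent associations between products.
This project explains the principle of the TransE algorithm, uses PGL to train TransE and run inference on OpenBG500, and compares its training results with those of DistMult, ComplEx, RotatE, and OTE.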

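OpenBG Benchmark is a large-scale open benchmark for digital-commerce knowledge graphs, made up of several sub-dataset tasks. It is built on OpenBG [2], an open digital-commerce knowledge graph: a million-scale multimodal dataset organized under a unified schema that covers products and consumer needs. OpenBG is provided by the Alibaba Cangjingge (藏经阁) team and Zhejiang University. The goal of opening it is to use open business knowledge to uncover social and economic value, to promote interdisciplinary research in areas such as digital commerce and the digital economy, and to serve the national strategic need for the healthy development of the digital economy. The first release covers three categories of tasks; this project implements one of them, product relation reasoning and link prediction.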
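Product relation reasoning and link prediction: task description
Because knowledge graphs are almost always incomplete, relation reasoning and link prediction techniques are needed to predict the missing nodes of the graph. This task aims to improve knowledge graph embeddings in digital-commerce scenarios, so that applications such as product recommendation can infer latent associations between products.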
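Task details
A knowledge graph usually organizes data as triples (h, r, t), where h is the head entity, t is the tail entity, and r is the relation connecting the head and tail entities. In the example figure, (“cotton pads”, “brand”, “Watsons”) is one such triple. Link prediction on a knowledge graph means predicting the missing tail entity (or head entity) when the relation and the head entity (or tail entity) are known. In the same figure, (“cotton pads”, “target audience”, ?) is a link prediction query whose tail entity has to be predicted.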
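To make the task format concrete, here is a minimal sketch of how triples and a `(h, r, ?)` query could be represented. The `Triple` type and the `tail_index` helper are hypothetical names chosen for illustration, and the only data used is the example triple from the text.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One knowledge graph fact (h, r, t)."""
    head: str
    relation: str
    tail: str


def tail_index(triples):
    """Map each (head, relation) pair to the set of tails already known for it."""
    index = defaultdict(set)
    for tr in triples:
        index[(tr.head, tr.relation)].add(tr.tail)
    return index


# The example triple from the text.
known = [Triple("cotton pads", "brand", "Watsons")]
index = tail_index(known)

# A link prediction query (h, r, ?): the graph holds no tail for this pair yet,
# so a model has to rank candidate entities for the missing slot.
query = ("cotton pads", "target audience")
known_tails = index.get(query, set())
print(known_tails)
```

An index like this is also what a filtered evaluation needs: tails that are already known to be correct for a query can be excluded from the ranking of the tail under test.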
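Dataset
Unlike the product commonsense salience reasoning task and the same-product mining task, the link prediction task consists of three sub-task datasets: OpenBG500, OpenBG500-L, and OpenBG-IMG. OpenBG500 contains 500 relation types and million-scale graph data. OpenBG500-L scales OpenBG500 up to tens of millions, making it a large-scale knowledge graph for the e-commerce domain. OpenBG-IMG is a multimodal knowledge graph for e-commerce. All three datasets are built on top of OpenBG.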

    # Environment setup at project start
    # Run this once when the project starts
    %cd /home/aistudio/
    !unzip -q -d data/OpenBG500 data/data177429/OpenBG500.zip

    # # Fetch PGL
    # # 1. Clone the project from GitHub. GitHub may be unreachable, so retry a few times
    # #    or switch to Gitee (the Gitee copy is two years old and may have problems)
    # # Gitee
    # !git clone https://gitee.com/paddlepaddle/PGL.git
    # # GitHub
    # !git clone git://github.com/PaddlePaddle/PGL.git
    # # 2. If GitHub is unavailable, download the project locally, upload the archive, and unzip it
    # !unzip -d PGL 0121d96a5ffb385024f8ba13285da5880dd2753c.zip
    !unzip -q /home/aistudio/PGL.zip
    !mv /home/aistudio/home/aistudio/data/PGL /home/aistudio

    # # Enter the PGL directory and install the required dependencies
    # %cd PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c
    # !pip install -r requirements.txt
    # # # Enter the apps directory
    # %cd apps

    # Use networkx for plotting, to get an overview of the graph
    !pip install --upgrade numpy
    !pip install networkx
    # Install Paddle's graph learning package
    !pip install pgl
    /home/aistudio
    Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
    Requirement already satisfied: numpy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.19.5)
    Collecting numpy
      Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6d/ad/ff3b21ebfe79a4d25b4a4f8e5cf9fd44a204adb6b33c09010f566f51027a/numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
    Installing collected packages: numpy
      Attempting uninstall: numpy
        Found existing installation: numpy 1.19.5
        Uninstalling numpy-1.19.5:
          Successfully uninstalled numpy-1.19.5
    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    parl 1.4.1 requires pyzmq==18.1.1, but you have pyzmq 23.2.1 which is incompatible.
    Successfully installed numpy-1.21.6
    [notice] A new release of pip available: 22.1.2 -> 22.3.1
    [notice] To update, run: pip install --upgrade pip
    Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
    Requirement already satisfied: networkx in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (2.4)
    Requirement already satisfied: decorator>=4.3.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from networkx) (4.4.2)
    [notice] A new release of pip available: 22.1.2 -> 22.3.1
    [notice] To update, run: pip install --upgrade pip
    Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
    Collecting pgl
      Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e2/86/f32837dff03a494d6a3b3e9f578c3e12df32e05ddb389a47a02fbd1f9455/pgl-2.2.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.2 MB)
    Requirement already satisfied: numpy>=1.16.4 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pgl) (1.21.6)
    Requirement already satisfied: cython>=0.25.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pgl) (0.29)
    Installing collected packages: pgl
    Successfully installed pgl-2.2.4
    [notice] A new release of pip available: 22.1.2 -> 22.3.1
    [notice] To update, run: pip install --upgrade pip

A first look at the dataset
    # Take a quick look at the training set
    import pandas as pd

    bg_train = pd.read_csv(
        '/home/aistudio/data/OpenBG500/OpenBG500_train.tsv',
        sep='\t', index_col=False, header=None)
    print("======= First five rows of the training set =========")
    print(bg_train.head())
    print("\n\n======= Summary of the training set ========")
    print(bg_train.describe())

Output:

    ======= First five rows of the training set =========
                0         1           2
    0  ent_135492  rel_0352  ent_015651
    1  ent_020765  rel_0448  ent_214183
    2  ent_106905  rel_0418  ent_121073
    3  ent_098167  rel_0343  ent_090402
    4  ent_155261  rel_0225  ent_100806


    ======= Summary of the training set ========
                     0         1           2
    count      1242550   1242550     1242550
    unique      116721       500      133025
    top     ent_172515  rel_0418  ent_109153
    freq          2208    416742       42649
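The summary above shows that a single relation, rel_0418, accounts for 416742 of the 1242550 training triples. A minimal sketch of how the per-relation share could be inspected with pandas follows. The toy frame and its column names (`head`, `relation`, `tail`) are assumptions for illustration; the real frame uses the integer column labels 0, 1, 2.

```python
import pandas as pd

# Toy stand-in for the training set (hypothetical rows and column names).
toy = pd.DataFrame(
    [("ent_a", "rel_0418", "ent_x"),
     ("ent_b", "rel_0418", "ent_y"),
     ("ent_c", "rel_0352", "ent_x"),
     ("ent_a", "rel_0448", "ent_z")],
    columns=["head", "relation", "tail"],
)

# Fraction of triples per relation, most frequent first.
relation_share = toy["relation"].value_counts(normalize=True)
print(relation_share)

# The same ratio for the real data, using the counts from the describe() output.
top_share = 416742 / 1242550
print(f"rel_0418 share of training triples: {top_share:.3f}")
```

A skew of this size suggests that aggregate metrics may be dominated by a few frequent relations, so a per-relation breakdown can be worth checking alongside the overall numbers.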
    # Data preprocessing
    # PGL reads data in txt and dict formats, so a custom script is needed to convert
    # the OpenBG format into a format that PGL supports
    !python /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG/dataset/convertor.py
    file saved at /home/aistudio/data/OpenBG500/entities.dict
    file saved at /home/aistudio/data/OpenBG500/relations.dict
    file saved at /home/aistudio/data/OpenBG500/train.txt
    file saved at /home/aistudio/data/OpenBG500/test.txt
    file saved at /home/aistudio/data/OpenBG500/valid.txt
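The conversion script itself is not shown in this post, so the following is only a sketch of what such a conversion could look like, not the author's `convertor.py`. It assigns integer ids to entities and relations and writes them as `name<TAB>id` lines; that column order is an assumption inferred from how the dictionary files are read in the plotting cell that follows. The helper names are hypothetical, and the toy triples are the first two rows of the training set.

```python
import os
import tempfile


def build_dicts(triples):
    """Assign consecutive integer ids to entities and relations in first-seen order."""
    ent2id, rel2id = {}, {}
    for h, r, t in triples:
        for ent in (h, t):
            ent2id.setdefault(ent, len(ent2id))
        rel2id.setdefault(r, len(rel2id))
    return ent2id, rel2id


def save_dict(mapping, path):
    """Write one `name<TAB>id` pair per line (column order is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        for name, idx in mapping.items():
            f.write(f"{name}\t{idx}\n")


toy_triples = [
    ("ent_135492", "rel_0352", "ent_015651"),
    ("ent_020765", "rel_0448", "ent_214183"),
]
ent2id, rel2id = build_dicts(toy_triples)

# Write to a temporary directory so the sketch does not touch real data.
out_dir = tempfile.mkdtemp()
save_dict(ent2id, os.path.join(out_dir, "entities.dict"))
save_dict(rel2id, os.path.join(out_dir, "relations.dict"))
print(sorted(os.listdir(out_dir)))
```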
    import os

    import numpy as np
    import networkx as nx
    import matplotlib.pyplot as plt

    # Pick a random contiguous block of 4000 triples from the training set
    # (with too many, networkx may raise an error) to get an overview of the graph


    # Load the relation dictionary and the entity dictionary
    def get_dict():
        rel_dict = dict()
        ent_dict = dict()
        with open('/home/aistudio/data/OpenBG500/relations.dict', 'r') as f:
            lines = f.readlines()
            for line in lines:
                k, v = line.strip().split('\t')
                rel_dict[k] = v
        with open('/home/aistudio/data/OpenBG500/entities.dict', 'r') as f:
            lines = f.readlines()
            for line in lines:
                k, v = line.strip().split('\t')
                ent_dict[k] = v
        return rel_dict, ent_dict


    rel_dict, ent_dict = get_dict()

    # Select part of the data
    sample_num = 4000
    s = np.random.randint(0, 1242550 - sample_num)
    data = []
    with open('/home/aistudio/data/OpenBG500/train.txt', 'r') as f:
        data = f.readlines()
        data = data[s:s + sample_num]


    # Convert entities and relations to ids: (head id, tail id, relation id)
    def to_ids(data, rel_dict, ent_dict):
        data = data.strip().split('\t')
        return ent_dict[data[0]], ent_dict[data[2]], rel_dict[data[1]]


    data = [to_ids(i, rel_dict, ent_dict) for i in data]

    plt.figure(figsize=(10, 10), dpi=200)
    graph = nx.Graph()
    graph.add_weighted_edges_from(data)
    # One RGB color per node (the red channel ramps from 0 to 1). The array length
    # must match the number of nodes in the graph, not the number of sampled triples.
    num_nodes = graph.number_of_nodes()
    node_color = np.concatenate(
        [np.linspace(0, 1, num_nodes)[:, None], np.zeros([num_nodes, 2])], axis=1)
    # Plot options
    options = {
        'node_size': 10,
        'node_color': node_color,
        'width': 0.5,
    }
    nx.draw(graph, **options)
    # Make sure the output directory exists before saving
    os.makedirs('/home/aistudio/result/images', exist_ok=True)
    plt.savefig('/home/aistudio/result/images/graph.png')
    plt.show()
    <Figure size 2000x2000 with 1 Axes>

Training
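Representation learning aims to learn low-dimensional dense vectors that capture semantic information, and knowledge representation learning is representation learning for the entities and relations in a knowledge base. Today's large-scale knowledge bases (also called knowledge graphs) provide the foundation for many NLP tasks, but because they are both huge and incomplete, storing and completing them efficiently has become an important problem. Knowledge representation learning is what addresses it.
TransE is a classic knowledge representation learning algorithm that describes the triples in a knowledge base with distributed representations. This kind of representation avoids building large tree structures and lets semantic information be obtained through simple arithmetic, which is why it has become a foundation of current representation learning.
The TransE algorithm works as follows: each relation is modeled as a translation in the embedding space, so for a true triple (h, r, t) the embedding of h plus the embedding of r should lie close to the embedding of t, while corrupted triples should lie farther apart.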
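The translation idea behind TransE can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not PGL's implementation: the function names, the toy 8-dimensional embeddings, and the margin value are all made up for the example. The hinge loss shown is the margin-based ranking loss of the original TransE formulation; the training run later in this post logs `loss_type: Logsigmoid`, so it is not the loss used there.

```python
import numpy as np


def transe_distance(h, r, t, norm=1):
    """Distance ||h + r - t|| under the L1 or L2 norm; smaller means more plausible."""
    return np.linalg.norm(h + r - t, ord=norm, axis=-1)


def margin_ranking_loss(pos_dist, neg_dist, margin=1.0):
    """Hinge loss that pushes a true triple at least `margin` closer than a corrupted one."""
    return np.maximum(0.0, margin + pos_dist - neg_dist).mean()


rng = np.random.default_rng(0)
dim = 8  # toy embedding size (hypothetical)
h = rng.normal(size=dim)
r = rng.normal(size=dim)
t_true = h + r                     # a tail that satisfies h + r = t exactly
t_corrupt = rng.normal(size=dim)   # a randomly drawn, corrupted tail

pos = transe_distance(h, r, t_true)
neg = transe_distance(h, r, t_corrupt)
loss = margin_ranking_loss(pos, neg)
print(pos, neg, loss)
```

Training repeats this comparison over many true triples and sampled corruptions, updating the embeddings so that true triples end up with smaller distances.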

    # Train a model; switch to a different model by changing model_name
    # For model-specific parameters, see the sh files under
    # /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG/models
    # For the meaning of each parameter, see
    # /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG/config.py
    # Enter the project folder
    %cd /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG/
    /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG

Training results
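MRR: MRR stands for Mean Reciprocal Rank. It is computed as:
$$\mathrm{MRR} = \frac{1}{|S|}\sum_{i=1}^{|S|}\frac{1}{\mathrm{rank}_i}$$
Here $S$ is the set of evaluated triples, $|S|$ is its size, and $\mathrm{rank}_i$ is the rank given to the correct entity of the $i$-th triple. The larger the MRR, the better.
MR: MR stands for Mean Rank. It is computed as:
$$\mathrm{MR} = \frac{1}{|S|}\sum_{i=1}^{|S|}\mathrm{rank}_i$$
The idea is the same as for MRR, except that the ranks themselves are averaged instead of their reciprocals. The smaller the MR, the better.
HITS@n: this metric is the fraction of triples whose rank in link prediction is less than or equal to n. It is computed as:
$$\mathrm{HITS@}n = \frac{1}{|S|}\sum_{i=1}^{|S|}\mathbb{I}(\mathrm{rank}_i \le n)$$
The symbols are the same as above, and $\mathbb{I}(\cdot)$ is the indicator function (1 if the condition holds, 0 otherwise). n is usually 1, 3, or 10, and the larger HITS@n, the better.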
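The three metrics can be computed directly from a list of ranks. The sketch below follows the formulas above; the function name and the toy ranks are hypothetical, and ranks are assumed to start at 1.

```python
def ranking_metrics(ranks, ns=(1, 3, 10)):
    """Compute MRR, MR and HITS@n from 1-based ranks of the correct entities."""
    ranks = list(ranks)
    size = len(ranks)
    metrics = {
        "MRR": sum(1.0 / rank for rank in ranks) / size,
        "MR": sum(ranks) / size,
    }
    for n in ns:
        # Indicator I(rank <= n), averaged over all evaluated triples.
        metrics[f"HITS@{n}"] = sum(1 for rank in ranks if rank <= n) / size
    return metrics


# Four toy queries whose correct entities were ranked 1st, 2nd, 4th and 20th.
metrics = ranking_metrics([1, 2, 4, 20])
print(metrics)
```

In the validation logs that follow, the `average` row appears to be the mean of the two prediction directions: for the first TransE evaluation, (0.0114 + 0.8714) / 2 = 0.4414 for HITS@10.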
Training TransE

    # Train TransE
    !python -u train.py --model_name TransE \
        --data_name OpenBG500 \
        --data_path /home/aistudio/data/ \
        --save_path /home/aistudio/result/transe \
        --batch_size 1000 --test_batch_size 16 \
        --log_interval 1000 --eval_interval 24000 \
        --reg_coef 1e-9 --reg_norm 3 \
        --neg_sample_size 200 --neg_sample_type 'chunk' \
        --embed_dim 400 --gamma 19.9 -adv \
        --num_workers 8 --num_epoch 30 \
        --print_on_screen --filter_eval --lr 0.25 \
        --optimizer adagrad --valid
---------------------------------------- Device Setting ---------------------------------------- Entity embedding place: gpu Relation embedding place: gpu-------------------------------------------------------------------------------- Embedding Setting ---------------------------------------- Entity embedding dimension: 400 Relation embedding dimension: 400----------------------------------------2022-12-02 17:59:27,276 INFO seed :02022-12-02 17:59:27,276 INFO data_path :/home/aistudio/data/2022-12-02 17:59:27,276 INFO save_path :/home/aistudio/result/transe/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE2022-12-02 17:59:27,276 INFO init_from_ckpt :None2022-12-02 17:59:27,276 INFO data_name :OpenBG5002022-12-02 17:59:27,276 INFO use_dict :False2022-12-02 17:59:27,276 INFO kv_mode :False2022-12-02 17:59:27,276 INFO batch_size :10002022-12-02 17:59:27,276 INFO test_batch_size :162022-12-02 17:59:27,277 INFO neg_sample_size :2002022-12-02 17:59:27,277 INFO filter_eval :True2022-12-02 17:59:27,277 INFO model_name :transe2022-12-02 17:59:27,277 INFO embed_dim :4002022-12-02 17:59:27,277 INFO reg_coef :1e-092022-12-02 17:59:27,277 INFO loss_type :Logsigmoid2022-12-02 17:59:27,277 INFO max_steps :20000002022-12-02 17:59:27,277 INFO lr :0.252022-12-02 17:59:27,277 INFO optimizer :adagrad2022-12-02 17:59:27,277 INFO cpu_lr :0.12022-12-02 17:59:27,277 INFO cpu_optimizer :adagrad2022-12-02 17:59:27,277 INFO mix_cpu_gpu :False2022-12-02 17:59:27,277 INFO async_update :False2022-12-02 17:59:27,277 INFO valid :True2022-12-02 17:59:27,277 INFO test :False2022-12-02 17:59:27,277 INFO task_name :KGE2022-12-02 17:59:27,277 INFO num_workers :82022-12-02 17:59:27,277 INFO neg_sample_type :chunk2022-12-02 17:59:27,277 INFO neg_deg_sample :False2022-12-02 17:59:27,277 INFO neg_adversarial_sampling:True2022-12-02 17:59:27,277 INFO adversarial_temperature:1.02022-12-02 17:59:27,277 INFO filter_sample :False2022-12-02 17:59:27,277 INFO valid_percent :1.02022-12-02 
17:59:27,277 INFO use_feature :False2022-12-02 17:59:27,277 INFO reg_type :norm_er2022-12-02 17:59:27,278 INFO reg_norm :32022-12-02 17:59:27,278 INFO weighted_loss :False2022-12-02 17:59:27,278 INFO margin :1.02022-12-02 17:59:27,278 INFO pairwise :False2022-12-02 17:59:27,278 INFO gamma :19.92022-12-02 17:59:27,278 INFO ote_scale :02022-12-02 17:59:27,278 INFO ote_size :12022-12-02 17:59:27,278 INFO quate_lmbda1 :0.02022-12-02 17:59:27,278 INFO quate_lmbda2 :0.02022-12-02 17:59:27,278 INFO num_epoch :302022-12-02 17:59:27,278 INFO scheduler_interval :-12022-12-02 17:59:27,278 INFO num_process :12022-12-02 17:59:27,278 INFO print_on_screen :True2022-12-02 17:59:27,278 INFO log_interval :10002022-12-02 17:59:27,278 INFO save_interval :-12022-12-02 17:59:27,278 INFO eval_interval :240002022-12-02 17:59:27,278 INFO ent_emb_on_cpu :False2022-12-02 17:59:27,278 INFO rel_emb_on_cpu :False2022-12-02 17:59:27,278 INFO use_embedding_regularization:True2022-12-02 17:59:27,278 INFO ent_dim :4002022-12-02 17:59:27,278 INFO rel_dim :4002022-12-02 17:59:27,278 INFO num_chunks :5W1202 17:59:42.894878 950 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2W1202 17:59:42.897931 950 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2./opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:3983: DeprecationWarning: Op `adagrad` is executed through `append_op` under the dynamic mode, the corresponding API implementation needs to be upgraded to using `_C_ops` method. 
DeprecationWarning,2022-12-02 17:59:56,356 INFO step: 999, loss: 0.92951, reg: 1.1653e-03, speed: 83.88 steps/s, time: 11.92 s2022-12-02 17:59:56,356 INFO sample: 1.688623, forward: 4.919557, backward: 0.766675, update: 4.5369722022-12-02 18:00:07,572 INFO step: 1999, loss: 0.51847, reg: 1.6318e-03, speed: 89.16 steps/s, time: 11.22 s2022-12-02 18:00:07,572 INFO sample: 1.472965, forward: 4.039794, backward: 0.771991, update: 4.9217002022-12-02 18:00:18,513 INFO step: 2999, loss: 0.38514, reg: 1.7626e-03, speed: 91.40 steps/s, time: 10.94 s2022-12-02 18:00:18,513 INFO sample: 1.473510, forward: 4.081738, backward: 0.748087, update: 4.6280822022-12-02 18:00:29,354 INFO step: 3999, loss: 0.32582, reg: 1.8105e-03, speed: 92.25 steps/s, time: 10.84 s2022-12-02 18:00:29,354 INFO sample: 1.541132, forward: 4.029341, backward: 0.737082, update: 4.5242872022-12-02 18:00:40,563 INFO step: 4999, loss: 0.29788, reg: 1.8435e-03, speed: 89.21 steps/s, time: 11.21 s2022-12-02 18:00:40,564 INFO sample: 1.484886, forward: 4.247042, backward: 0.905907, update: 4.5616742022-12-02 18:00:50,743 INFO step: 5999, loss: 0.27639, reg: 1.8786e-03, speed: 98.24 steps/s, time: 10.18 s2022-12-02 18:00:50,743 INFO sample: 0.547168, forward: 4.085115, backward: 0.844820, update: 4.6909172022-12-02 18:01:01,927 INFO step: 6999, loss: 0.26605, reg: 1.9013e-03, speed: 89.41 steps/s, time: 11.18 s2022-12-02 18:01:01,927 INFO sample: 1.570992, forward: 4.085207, backward: 0.755627, update: 4.7629352022-12-02 18:01:13,348 INFO step: 7999, loss: 0.25838, reg: 1.9202e-03, speed: 87.56 steps/s, time: 11.42 s2022-12-02 18:01:13,349 INFO sample: 1.683276, forward: 3.915311, backward: 0.832062, update: 4.9807732022-12-02 18:01:24,567 INFO step: 8999, loss: 0.25231, reg: 1.9371e-03, speed: 89.14 steps/s, time: 11.22 s2022-12-02 18:01:24,568 INFO sample: 1.619009, forward: 3.994691, backward: 0.792235, update: 4.8030782022-12-02 18:01:35,494 INFO step: 9999, loss: 0.24707, reg: 1.9526e-03, speed: 91.53 
steps/s, time: 10.93 s2022-12-02 18:01:35,494 INFO sample: 1.559962, forward: 4.007938, backward: 0.758157, update: 4.5903302022-12-02 18:01:45,652 INFO step: 10999, loss: 0.23906, reg: 1.9714e-03, speed: 98.44 steps/s, time: 10.16 s2022-12-02 18:01:45,653 INFO sample: 0.566743, forward: 4.002622, backward: 0.798780, update: 4.7798582022-12-02 18:01:56,419 INFO step: 11999, loss: 0.23441, reg: 1.9833e-03, speed: 92.88 steps/s, time: 10.77 s2022-12-02 18:01:56,419 INFO sample: 1.455285, forward: 4.058579, backward: 0.742973, update: 4.5001602022-12-02 18:02:07,230 INFO step: 12999, loss: 0.23135, reg: 1.9933e-03, speed: 92.50 steps/s, time: 10.81 s2022-12-02 18:02:07,230 INFO sample: 1.447748, forward: 4.046167, backward: 0.805724, update: 4.5015692022-12-02 18:02:18,255 INFO step: 13999, loss: 0.22790, reg: 2.0011e-03, speed: 90.70 steps/s, time: 11.03 s2022-12-02 18:02:18,256 INFO sample: 1.513487, forward: 4.083624, backward: 0.746703, update: 4.6721632022-12-02 18:02:29,411 INFO step: 14999, loss: 0.22552, reg: 2.0106e-03, speed: 89.64 steps/s, time: 11.16 s2022-12-02 18:02:29,412 INFO sample: 1.557806, forward: 3.985478, backward: 0.776811, update: 4.8255412022-12-02 18:02:39,290 INFO step: 15999, loss: 0.22099, reg: 2.0221e-03, speed: 101.23 steps/s, time: 9.88 s2022-12-02 18:02:39,291 INFO sample: 0.551149, forward: 4.013759, backward: 0.778107, update: 4.5257842022-12-02 18:02:50,121 INFO step: 16999, loss: 0.21762, reg: 2.0290e-03, speed: 92.34 steps/s, time: 10.83 s2022-12-02 18:02:50,121 INFO sample: 1.527163, forward: 3.989723, backward: 0.741708, update: 4.5617252022-12-02 18:03:01,113 INFO step: 17999, loss: 0.21553, reg: 2.0352e-03, speed: 90.97 steps/s, time: 10.99 s2022-12-02 18:03:01,113 INFO sample: 1.498610, forward: 4.078822, backward: 0.780088, update: 4.6258802022-12-02 18:03:12,150 INFO step: 18999, loss: 0.21343, reg: 2.0391e-03, speed: 90.61 steps/s, time: 11.04 s2022-12-02 18:03:12,151 INFO sample: 1.405494, forward: 4.190880, backward: 
0.737650, update: 4.6942162022-12-02 18:03:23,375 INFO step: 19999, loss: 0.21156, reg: 2.0436e-03, speed: 89.09 steps/s, time: 11.22 s2022-12-02 18:03:23,375 INFO sample: 1.573719, forward: 4.092272, backward: 0.754539, update: 4.7949512022-12-02 18:03:33,439 INFO step: 20999, loss: 0.20898, reg: 2.0518e-03, speed: 99.37 steps/s, time: 10.06 s2022-12-02 18:03:33,439 INFO sample: 0.576577, forward: 3.965927, backward: 0.787168, update: 4.7231142022-12-02 18:03:44,508 INFO step: 21999, loss: 0.20621, reg: 2.0568e-03, speed: 90.35 steps/s, time: 11.07 s2022-12-02 18:03:44,508 INFO sample: 1.651846, forward: 3.960945, backward: 0.768354, update: 4.6773002022-12-02 18:03:55,515 INFO step: 22999, loss: 0.20442, reg: 2.0595e-03, speed: 90.85 steps/s, time: 11.01 s2022-12-02 18:03:55,515 INFO sample: 1.491682, forward: 4.063816, backward: 0.841389, update: 4.6007082022-12-02 18:04:06,237 INFO step: 23999, loss: 0.20425, reg: 2.0644e-03, speed: 93.27 steps/s, time: 10.72 s2022-12-02 18:04:06,237 INFO sample: 1.510683, forward: 3.982136, backward: 0.715879, update: 4.5041182022-12-02 18:04:06,237 INFO [evaluation] start...100%|█████████████████████████████████████████| 313/313 [01:01<00:00, 5.13it/s]2022-12-02 18:05:07,409 INFO -------------- valid result --------------2022-12-02 18:05:07,409 INFO t,r->h |MRR: 0.0061128707602620125 MR: 12959.571 HITS@1: 0.0006 HITS@3: 0.0024 HITS@10: 0.01142022-12-02 18:05:07,409 INFO h,r->t |MRR: 0.6111788749694824 MR: 324.1712 HITS@1: 0.4724 HITS@3: 0.6978 HITS@10: 0.87142022-12-02 18:05:07,410 INFO average |MRR: 0.3086458742618561 MR: 6641.8711 HITS@1: 0.2365 HITS@3: 0.35009999999999997 HITS@10: 0.441399999999999962022-12-02 18:05:07,410 INFO -----------------------------------------2022-12-02 18:05:07,428 INFO [evaluation] finished! 
It takes 61.1903 sec s2022-12-02 18:05:18,428 INFO step: 24999, loss: 0.20312, reg: 2.0670e-03, speed: 13.85 steps/s, time: 72.19 s2022-12-02 18:05:18,428 INFO sample: 1.509715, forward: 4.020733, backward: 0.737582, update: 4.7228282022-12-02 18:05:28,551 INFO step: 25999, loss: 0.20132, reg: 2.0727e-03, speed: 98.78 steps/s, time: 10.12 s2022-12-02 18:05:28,552 INFO sample: 0.561651, forward: 4.033424, backward: 0.810810, update: 4.7061332022-12-02 18:05:39,378 INFO step: 26999, loss: 0.19849, reg: 2.0765e-03, speed: 92.38 steps/s, time: 10.82 s2022-12-02 18:05:39,378 INFO sample: 1.559905, forward: 3.985834, backward: 0.729987, update: 4.5403522022-12-02 18:05:50,471 INFO step: 27999, loss: 0.19730, reg: 2.0781e-03, speed: 90.15 steps/s, time: 11.09 s2022-12-02 18:05:50,471 INFO sample: 1.559080, forward: 4.042983, backward: 0.734473, update: 4.7474512022-12-02 18:06:01,203 INFO step: 28999, loss: 0.19672, reg: 2.0815e-03, speed: 93.18 steps/s, time: 10.73 s2022-12-02 18:06:01,204 INFO sample: 1.507630, forward: 4.036844, backward: 0.734679, update: 4.4443772022-12-02 18:06:12,115 INFO step: 29999, loss: 0.19574, reg: 2.0834e-03, speed: 91.65 steps/s, time: 10.91 s2022-12-02 18:06:12,115 INFO sample: 1.571107, forward: 3.980559, backward: 0.763590, update: 4.5860562022-12-02 18:06:21,961 INFO step: 30999, loss: 0.19457, reg: 2.0877e-03, speed: 101.56 steps/s, time: 9.85 s2022-12-02 18:06:21,961 INFO sample: 0.569508, forward: 3.990863, backward: 0.762789, update: 4.5132252022-12-02 18:06:32,969 INFO step: 31999, loss: 0.19238, reg: 2.0913e-03, speed: 90.84 steps/s, time: 11.01 s2022-12-02 18:06:32,970 INFO sample: 1.488254, forward: 4.046757, backward: 0.736669, update: 4.7265102022-12-02 18:06:43,727 INFO step: 32999, loss: 0.19179, reg: 2.0939e-03, speed: 92.96 steps/s, time: 10.76 s2022-12-02 18:06:43,727 INFO sample: 1.450147, forward: 4.031254, backward: 0.713830, update: 4.5533352022-12-02 18:06:54,476 INFO step: 33999, loss: 0.19153, reg: 2.0957e-03, 
speed: 93.03 steps/s, time: 10.75 s
2022-12-02 18:06:54,477 INFO sample: 1.525651, forward: 3.943282, backward: 0.724953, update: 4.546392
2022-12-02 18:07:05,339 INFO step: 34999, loss: 0.19035, reg: 2.0975e-03, speed: 92.06 steps/s, time: 10.86 s
2022-12-02 18:07:05,339 INFO sample: 1.595187, forward: 3.962639, backward: 0.729714, update: 4.565783
2022-12-02 18:07:15,297 INFO step: 35999, loss: 0.18968, reg: 2.1017e-03, speed: 100.43 steps/s, time: 9.96 s
2022-12-02 18:07:15,297 INFO sample: 0.551351, forward: 3.950835, backward: 0.763969, update: 4.680900
2022-12-02 18:07:26,293 INFO step: 36999, loss: 0.18803, reg: 2.1060e-03, speed: 90.94 steps/s, time: 11.00 s
2022-12-02 18:07:26,293 INFO sample: 1.556007, forward: 3.950179, backward: 0.737953, update: 4.742873
2022-12-02 18:07:29,057 INFO [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:01<00:00, 5.11it/s]
2022-12-02 18:08:30,470 INFO -------------- valid result --------------
2022-12-02 18:08:30,470 INFO t,r->h |MRR: 0.008901488035917282 MR: 12475.6678 HITS@1: 0.0016 HITS@3: 0.0052 HITS@10: 0.0172
2022-12-02 18:08:30,470 INFO h,r->t |MRR: 0.6207280158996582 MR: 281.0964 HITS@1: 0.4752 HITS@3: 0.722 HITS@10: 0.8848
2022-12-02 18:08:30,470 INFO average |MRR: 0.3148147463798523 MR: 6378.3821 HITS@1: 0.2384 HITS@3: 0.3636 HITS@10: 0.451
2022-12-02 18:08:30,470 INFO -----------------------------------------
2022-12-02 18:08:30,489 INFO [evaluation] finished! It takes 61.4317 sec s

Training DistMult
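For orientation before the DistMult run: DistMult scores a triple with a trilinear product of the head, relation and tail embeddings. The NumPy sketch below only illustrates that scoring function; the function name and the toy vectors are hypothetical, and it is not PGL's implementation.

```python
import numpy as np


def distmult_score(h, r, t):
    """Trilinear product sum_k h_k * r_k * t_k; larger means more plausible."""
    return np.sum(h * r * t, axis=-1)


rng = np.random.default_rng(0)
h, r, t = rng.normal(size=(3, 8))  # three toy 8-dimensional embeddings

forward = distmult_score(h, r, t)   # score of (h, r, t)
backward = distmult_score(t, r, h)  # score of (t, r, h)
print(forward, backward)
```

Because the product is symmetric in h and t, both directions of a triple get the same score, so this scoring function cannot by itself tell a relation from its inverse.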
    # Train DistMult
    !python -u train.py --model_name DistMult \
        --data_name OpenBG500 \
        --data_path /home/aistudio/data/ \
        --save_path /home/aistudio/result/Distmult \
        --batch_size 1000 --test_batch_size 16 --log_interval 1000 --eval_interval 24000 --neg_sample_type 'chunk' \
        --num_workers 2 --neg_sample_size 200 --embed_dim 400 --gamma 143.0 --lr 0.08 --optimizer adagrad \
        -adv --num_epoch 30 --filter_eval --print_on_screen --reg_coef 2e-6 --reg_norm 3 --valid
---------------------------------------- Device Setting ---------------------------------------- Entity embedding place: gpu Relation embedding place: gpu-------------------------------------------------------------------------------- Embedding Setting ---------------------------------------- Entity embedding dimension: 400 Relation embedding dimension: 400----------------------------------------2022-12-02 18:08:50,186 INFO seed :02022-12-02 18:08:50,186 INFO data_path :/home/aistudio/data/2022-12-02 18:08:50,186 INFO save_path :/home/aistudio/result/Distmult/distmult_OpenBG500_d_400_g_143.0_e_gpu_r_gpu_l_Logsigmoid_lr_0.08_0.1_KGE2022-12-02 18:08:50,186 INFO init_from_ckpt :None2022-12-02 18:08:50,186 INFO data_name :OpenBG5002022-12-02 18:08:50,186 INFO use_dict :False2022-12-02 18:08:50,186 INFO kv_mode :False2022-12-02 18:08:50,187 INFO batch_size :10002022-12-02 18:08:50,187 INFO test_batch_size :162022-12-02 18:08:50,187 INFO neg_sample_size :2002022-12-02 18:08:50,187 INFO filter_eval :True2022-12-02 18:08:50,187 INFO model_name :distmult2022-12-02 18:08:50,187 INFO embed_dim :4002022-12-02 18:08:50,187 INFO reg_coef :2e-062022-12-02 18:08:50,187 INFO loss_type :Logsigmoid2022-12-02 18:08:50,187 INFO max_steps :20000002022-12-02 18:08:50,187 INFO lr :0.082022-12-02 18:08:50,187 INFO optimizer :adagrad2022-12-02 18:08:50,187 INFO cpu_lr :0.12022-12-02 18:08:50,187 INFO cpu_optimizer :adagrad2022-12-02 18:08:50,187 INFO mix_cpu_gpu :False2022-12-02 18:08:50,187 INFO async_update :False2022-12-02 18:08:50,187 INFO valid :True2022-12-02 18:08:50,187 INFO test :False2022-12-02 18:08:50,187 INFO task_name :KGE2022-12-02 18:08:50,187 INFO num_workers :22022-12-02 18:08:50,187 INFO neg_sample_type :chunk2022-12-02 18:08:50,187 INFO neg_deg_sample :False2022-12-02 18:08:50,187 INFO neg_adversarial_sampling:True2022-12-02 18:08:50,187 INFO adversarial_temperature:1.02022-12-02 18:08:50,187 INFO filter_sample :False2022-12-02 18:08:50,187 INFO valid_percent 
:1.02022-12-02 18:08:50,188 INFO use_feature :False2022-12-02 18:08:50,188 INFO reg_type :norm_er2022-12-02 18:08:50,188 INFO reg_norm :32022-12-02 18:08:50,188 INFO weighted_loss :False2022-12-02 18:08:50,188 INFO margin :1.02022-12-02 18:08:50,188 INFO pairwise :False2022-12-02 18:08:50,188 INFO gamma :143.02022-12-02 18:08:50,188 INFO ote_scale :02022-12-02 18:08:50,188 INFO ote_size :12022-12-02 18:08:50,188 INFO quate_lmbda1 :0.02022-12-02 18:08:50,188 INFO quate_lmbda2 :0.02022-12-02 18:08:50,188 INFO num_epoch :302022-12-02 18:08:50,188 INFO scheduler_interval :-12022-12-02 18:08:50,188 INFO num_process :12022-12-02 18:08:50,188 INFO print_on_screen :True2022-12-02 18:08:50,188 INFO log_interval :10002022-12-02 18:08:50,188 INFO save_interval :-12022-12-02 18:08:50,188 INFO eval_interval :240002022-12-02 18:08:50,188 INFO ent_emb_on_cpu :False2022-12-02 18:08:50,188 INFO rel_emb_on_cpu :False2022-12-02 18:08:50,188 INFO use_embedding_regularization:True2022-12-02 18:08:50,188 INFO ent_dim :4002022-12-02 18:08:50,188 INFO rel_dim :4002022-12-02 18:08:50,188 INFO num_chunks :5W1202 18:09:05.583375 4119 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2W1202 18:09:05.586427 4119 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2./opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:3983: DeprecationWarning: Op `adagrad` is executed through `append_op` under the dynamic mode, the corresponding API implementation needs to be upgraded to using `_C_ops` method. 
DeprecationWarning,2022-12-02 18:09:18,491 INFO step: 999, loss: 0.65767, reg: 3.0928e-02, speed: 88.19 steps/s, time: 11.34 s2022-12-02 18:09:18,491 INFO sample: 1.415885, forward: 4.589931, backward: 0.587078, update: 4.7364942022-12-02 18:09:28,727 INFO step: 1999, loss: 0.47731, reg: 5.1126e-02, speed: 97.69 steps/s, time: 10.24 s2022-12-02 18:09:28,727 INFO sample: 1.130671, forward: 3.708618, backward: 0.588126, update: 4.7989842022-12-02 18:09:38,812 INFO step: 2999, loss: 0.36043, reg: 7.8275e-02, speed: 99.18 steps/s, time: 10.08 s2022-12-02 18:09:38,812 INFO sample: 1.150921, forward: 3.723767, backward: 0.571187, update: 4.6285712022-12-02 18:09:48,997 INFO step: 3999, loss: 0.26451, reg: 1.0150e-01, speed: 98.19 steps/s, time: 10.18 s2022-12-02 18:09:48,997 INFO sample: 1.137154, forward: 3.755923, backward: 0.576677, update: 4.7061782022-12-02 18:09:59,035 INFO step: 4999, loss: 0.22269, reg: 1.0864e-01, speed: 99.62 steps/s, time: 10.04 s2022-12-02 18:09:59,036 INFO sample: 1.155761, forward: 3.717498, backward: 0.594281, update: 4.5608832022-12-02 18:10:08,577 INFO step: 5999, loss: 0.19857, reg: 1.1085e-01, speed: 104.81 steps/s, time: 9.54 s2022-12-02 18:10:08,578 INFO sample: 0.539480, forward: 3.752833, backward: 0.611382, update: 4.6272522022-12-02 18:10:18,732 INFO step: 6999, loss: 0.18926, reg: 1.1167e-01, speed: 98.48 steps/s, time: 10.15 s2022-12-02 18:10:18,732 INFO sample: 1.147535, forward: 3.698858, backward: 0.579680, update: 4.7189012022-12-02 18:10:29,036 INFO step: 7999, loss: 0.18307, reg: 1.1203e-01, speed: 97.06 steps/s, time: 10.30 s2022-12-02 18:10:29,036 INFO sample: 1.256027, forward: 3.665453, backward: 0.609613, update: 4.7615582022-12-02 18:10:39,387 INFO step: 8999, loss: 0.17884, reg: 1.1225e-01, speed: 96.61 steps/s, time: 10.35 s2022-12-02 18:10:39,387 INFO sample: 1.297724, forward: 3.694467, backward: 0.585683, update: 4.7635232022-12-02 18:10:49,545 INFO step: 9999, loss: 0.17628, reg: 1.1231e-01, speed: 98.45 
steps/s, time: 10.16 s2022-12-02 18:10:49,545 INFO sample: 1.148723, forward: 3.705680, backward: 0.581249, update: 4.7123432022-12-02 18:10:59,257 INFO step: 10999, loss: 0.17163, reg: 1.1230e-01, speed: 102.97 steps/s, time: 9.71 s2022-12-02 18:10:59,257 INFO sample: 0.656959, forward: 3.726715, backward: 0.609718, update: 4.7078002022-12-02 18:11:09,316 INFO step: 11999, loss: 0.16926, reg: 1.1216e-01, speed: 99.41 steps/s, time: 10.06 s2022-12-02 18:11:09,316 INFO sample: 1.168218, forward: 3.704518, backward: 0.582222, update: 4.5948222022-12-02 18:11:19,360 INFO step: 12999, loss: 0.16832, reg: 1.1201e-01, speed: 99.57 steps/s, time: 10.04 s2022-12-02 18:11:19,360 INFO sample: 1.115536, forward: 3.718422, backward: 0.570747, update: 4.6294762022-12-02 18:11:29,518 INFO step: 13999, loss: 0.16686, reg: 1.1190e-01, speed: 98.45 steps/s, time: 10.16 s2022-12-02 18:11:29,518 INFO sample: 1.128502, forward: 3.734195, backward: 0.591375, update: 4.6942442022-12-02 18:11:39,692 INFO step: 14999, loss: 0.16598, reg: 1.1179e-01, speed: 98.29 steps/s, time: 10.17 s2022-12-02 18:11:39,692 INFO sample: 1.130593, forward: 3.707155, backward: 0.588573, update: 4.7380012022-12-02 18:11:48,977 INFO step: 15999, loss: 0.16416, reg: 1.1161e-01, speed: 107.71 steps/s, time: 9.28 s2022-12-02 18:11:48,977 INFO sample: 0.525574, forward: 3.727848, backward: 0.585805, update: 4.4356222022-12-02 18:11:59,296 INFO step: 16999, loss: 0.16211, reg: 1.1149e-01, speed: 96.91 steps/s, time: 10.32 s2022-12-02 18:11:59,296 INFO sample: 1.236274, forward: 3.714505, backward: 0.597196, update: 4.7604562022-12-02 18:12:09,569 INFO step: 17999, loss: 0.16160, reg: 1.1128e-01, speed: 97.35 steps/s, time: 10.27 s2022-12-02 18:12:09,569 INFO sample: 1.280665, forward: 3.682961, backward: 0.573797, update: 4.7261572022-12-02 18:12:19,900 INFO step: 18999, loss: 0.16120, reg: 1.1113e-01, speed: 96.80 steps/s, time: 10.33 s2022-12-02 18:12:19,900 INFO sample: 1.334101, forward: 3.705495, backward: 
0.583138, update: 4.698704
2022-12-02 18:12:30,103 INFO step: 19999, loss: 0.16087, reg: 1.1097e-01, speed: 98.01 steps/s, time: 10.20 s
2022-12-02 18:12:30,104 INFO sample: 1.148571, forward: 3.696767, backward: 0.588128, update: 4.760586
2022-12-02 18:12:40,133 INFO step: 20999, loss: 0.15984, reg: 1.1080e-01, speed: 99.71 steps/s, time: 10.03 s
2022-12-02 18:12:40,133 INFO sample: 0.571947, forward: 3.864387, backward: 0.747324, update: 4.835415
2022-12-02 18:12:50,892 INFO step: 21999, loss: 0.15812, reg: 1.1065e-01, speed: 92.95 steps/s, time: 10.76 s
2022-12-02 18:12:50,892 INFO sample: 1.227075, forward: 3.768062, backward: 0.633664, update: 5.117799
2022-12-02 18:13:01,230 INFO step: 22999, loss: 0.15765, reg: 1.1051e-01, speed: 96.74 steps/s, time: 10.34 s
2022-12-02 18:13:01,230 INFO sample: 1.141957, forward: 3.835448, backward: 0.589297, update: 4.761678
2022-12-02 18:13:11,617 INFO step: 23999, loss: 0.15778, reg: 1.1034e-01, speed: 96.27 steps/s, time: 10.39 s
2022-12-02 18:13:11,618 INFO sample: 1.147387, forward: 3.803446, backward: 0.597208, update: 4.829601
2022-12-02 18:13:11,618 INFO [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [00:48<00:00, 6.45it/s]
2022-12-02 18:14:00,294 INFO -------------- valid result --------------
2022-12-02 18:14:00,294 INFO t,r->h |MRR: 0.02051205188035965 MR: 14726.6136 HITS@1: 0.0072 HITS@3: 0.0168 HITS@10: 0.0398
2022-12-02 18:14:00,294 INFO h,r->t |MRR: 0.37609413266181946 MR: 2597.6142 HITS@1: 0.2858 HITS@3: 0.4226 HITS@10: 0.5382
2022-12-02 18:14:00,294 INFO average |MRR: 0.19830308854579926 MR: 8662.1139 HITS@1: 0.1465 HITS@3: 0.21969999999999998 HITS@10: 0.28900000000000003
2022-12-02 18:14:00,294 INFO -----------------------------------------
2022-12-02 18:14:00,312 INFO [evaluation] finished! It takes 48.6946 sec s
2022-12-02 18:14:10,745 INFO step: 24999, loss: 0.15764, reg: 1.1025e-01, speed: 16.91 steps/s, time: 59.13 s
2022-12-02 18:14:10,745 INFO sample: 1.156561, forward: 3.777792, backward: 0.597010, update: 4.891100
2022-12-02 18:14:20,353 INFO step: 25999, loss: 0.15691, reg: 1.1009e-01, speed: 104.08 steps/s, time: 9.61 s
2022-12-02 18:14:20,353 INFO sample: 0.534036, forward: 3.759421, backward: 0.578114, update: 4.727532
2022-12-02 18:14:30,873 INFO step: 26999, loss: 0.15545, reg: 1.0995e-01, speed: 95.05 steps/s, time: 10.52 s
2022-12-02 18:14:30,874 INFO sample: 1.198932, forward: 3.849348, backward: 0.617739, update: 4.844565
2022-12-02 18:14:41,429 INFO step: 27999, loss: 0.15578, reg: 1.0980e-01, speed: 94.74 steps/s, time: 10.56 s
2022-12-02 18:14:41,430 INFO sample: 1.198625, forward: 3.796988, backward: 0.616503, update: 4.931934
2022-12-02 18:14:51,571 INFO step: 28999, loss: 0.15543, reg: 1.0968e-01, speed: 98.61 steps/s, time: 10.14 s
2022-12-02 18:14:51,571 INFO sample: 1.192123, forward: 3.727292, backward: 0.587189, update: 4.624893
2022-12-02 18:15:01,772 INFO step: 29999, loss: 0.15530, reg: 1.0959e-01, speed: 98.03 steps/s, time: 10.20 s
2022-12-02 18:15:01,772 INFO sample: 1.146616, forward: 3.724371, backward: 0.591486, update: 4.728866
2022-12-02 18:15:11,202 INFO step: 30999, loss: 0.15507, reg: 1.0945e-01, speed: 106.04 steps/s, time: 9.43 s
2022-12-02 18:15:11,203 INFO sample: 0.540366, forward: 3.709316, backward: 0.595906, update: 4.574884
2022-12-02 18:15:21,339 INFO step: 31999, loss: 0.15347, reg: 1.0933e-01, speed: 98.67 steps/s, time: 10.14 s
2022-12-02 18:15:21,339 INFO sample: 1.132487, forward: 3.742264, backward: 0.583816, update: 4.667446
2022-12-02 18:15:31,489 INFO step: 32999, loss: 0.15357, reg: 1.0922e-01, speed: 98.53 steps/s, time: 10.15 s
2022-12-02 18:15:31,489 INFO sample: 1.173653, forward: 3.757434, backward: 0.608012, update: 4.599703
2022-12-02 18:15:41,934 INFO step: 33999, loss: 0.15379, reg: 1.0911e-01, speed: 95.74 steps/s, time: 10.44 s
2022-12-02 18:15:41,935 INFO sample: 1.167904, forward: 3.821983, backward: 0.691112, update: 4.753360
2022-12-02 18:15:51,966 INFO step: 34999, loss: 0.15405, reg: 1.0901e-01, speed: 99.68 steps/s, time: 10.03 s
2022-12-02 18:15:51,967 INFO sample: 1.086866, forward: 3.739779, backward: 0.579035, update: 4.616517
2022-12-02 18:16:01,354 INFO step: 35999, loss: 0.15383, reg: 1.0892e-01, speed: 106.53 steps/s, time: 9.39 s
2022-12-02 18:16:01,354 INFO sample: 0.516522, forward: 3.756625, backward: 0.573577, update: 4.531823
2022-12-02 18:16:11,496 INFO step: 36999, loss: 0.15245, reg: 1.0879e-01, speed: 98.60 steps/s, time: 10.14 s
2022-12-02 18:16:11,497 INFO sample: 1.118747, forward: 3.701485, backward: 0.576978, update: 4.736408
2022-12-02 18:16:14,089 INFO [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [00:47<00:00, 6.55it/s]
2022-12-02 18:17:02,014 INFO -------------- valid result --------------
2022-12-02 18:17:02,014 INFO t,r->h |MRR: 0.021342920139431953 MR: 14117.3868 HITS@1: 0.0074 HITS@3: 0.0158 HITS@10: 0.0476
2022-12-02 18:17:02,014 INFO h,r->t |MRR: 0.40351319313049316 MR: 2034.8442 HITS@1: 0.3066 HITS@3: 0.453 HITS@10: 0.584
2022-12-02 18:17:02,014 INFO average |MRR: 0.21242806315422058 MR: 8076.1155 HITS@1: 0.157 HITS@3: 0.2344 HITS@10: 0.31579999999999997
2022-12-02 18:17:02,014 INFO -----------------------------------------
2022-12-02 18:17:02,033 INFO [evaluation] finished! It takes 47.9434 sec s

Training ComplEx
# ComplEx
!python -u train.py --model_name ComplEx \
    --data_name OpenBG500 \
    --data_path /home/aistudio/data/ \
    --save_path /home/aistudio/result/Complex \
    --batch_size 1000 --log_interval 1000 --test_batch_size 16 --neg_sample_type 'chunk' --num_workers 2 \
    --neg_sample_size 200 --embed_dim 400 --gamma 143.0 --lr 0.1 --optimizer adagrad --reg_coef 2e-6 \
    --valid -adv --num_epoch 30 --filter_eval --print_on_screen
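The command above requests `--embed_dim 400`, yet the log that follows reports an entity and relation embedding dimension of 800. A likely explanation is that ComplEx embeddings are complex-valued, so each one stores a real part and an imaginary part. The sketch below is a minimal pure-Python illustration of the ComplEx score Re(Σ h_k · r_k · conj(t_k)) from Trouillon et al. [5]. The helper names and the "real half first, imaginary half second" layout are assumptions made for illustration and are not taken from PGL's code.

```python
# Illustrative ComplEx scoring sketch (not PGL's implementation).
# Assumption: a flat vector of 2*d floats holds d real parts followed by
# d imaginary parts, which is why embed_dim = d needs 2*d numbers.


def split_complex(vec):
    """Interpret a flat list of 2*d floats as d complex numbers."""
    d = len(vec) // 2
    return [complex(vec[i], vec[d + i]) for i in range(d)]


def complex_score(head, rel, tail):
    """ComplEx score: real part of sum_k h_k * r_k * conj(t_k)."""
    h, r, t = split_complex(head), split_complex(rel), split_complex(tail)
    return sum((hk * rk * tk.conjugate()).real for hk, rk, tk in zip(h, r, t))


# Toy example with embed_dim = 2, so every vector holds 4 floats.
head = [1.0, 0.0, 0.0, 1.0]  # complex entries: 1+0j, 0+1j
rel = [0.0, 1.0, 1.0, 0.0]   # complex entries: 0+1j, 1+0j
tail = [0.0, 0.0, 1.0, 1.0]  # complex entries: 0+1j, 0+1j

forward = complex_score(head, rel, tail)   # score of (head, rel, tail)
backward = complex_score(tail, rel, head)  # score of (tail, rel, head)
print(forward, backward)
```

In this toy case `forward` is 2.0 while `backward` is 0.0: swapping head and tail changes the score, which is the property that lets ComplEx model asymmetric relations, whereas the DistMult score is symmetric in head and tail.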
---------------------------------------- Device Setting ----------------------------------------
Entity embedding place: gpu
Relation embedding place: gpu
----------------------------------------
---------------------------------------- Embedding Setting ----------------------------------------
Entity embedding dimension: 800
Relation embedding dimension: 800
----------------------------------------
2022-12-02 18:19:13,502 INFO seed :0
2022-12-02 18:19:13,502 INFO data_path :/home/aistudio/data/
2022-12-02 18:19:13,502 INFO save_path :/home/aistudio/result/Complex/complex_OpenBG500_d_400_g_143.0_e_gpu_r_gpu_l_Logsigmoid_lr_0.1_0.1_KGE
2022-12-02 18:19:13,502 INFO init_from_ckpt :None
2022-12-02 18:19:13,502 INFO data_name :OpenBG500
2022-12-02 18:19:13,502 INFO use_dict :False
2022-12-02 18:19:13,502 INFO kv_mode :False
2022-12-02 18:19:13,503 INFO batch_size :1000
2022-12-02 18:19:13,503 INFO test_batch_size :16
2022-12-02 18:19:13,503 INFO neg_sample_size :200
2022-12-02 18:19:13,503 INFO filter_eval :True
2022-12-02 18:19:13,503 INFO model_name :complex
2022-12-02 18:19:13,503 INFO embed_dim :400
2022-12-02 18:19:13,503 INFO reg_coef :2e-06
2022-12-02 18:19:13,503 INFO loss_type :Logsigmoid
2022-12-02 18:19:13,503 INFO max_steps :2000000
2022-12-02 18:19:13,503 INFO lr :0.1
2022-12-02 18:19:13,503 INFO optimizer :adagrad
2022-12-02 18:19:13,503 INFO cpu_lr :0.1
2022-12-02 18:19:13,503 INFO cpu_optimizer :adagrad
2022-12-02 18:19:13,503 INFO mix_cpu_gpu :False
2022-12-02 18:19:13,503 INFO async_update :False
2022-12-02 18:19:13,503 INFO valid :True
2022-12-02 18:19:13,503 INFO test :False
2022-12-02 18:19:13,503 INFO task_name :KGE
2022-12-02 18:19:13,503 INFO num_workers :2
2022-12-02 18:19:13,503 INFO neg_sample_type :chunk
2022-12-02 18:19:13,503 INFO neg_deg_sample :False
2022-12-02 18:19:13,503 INFO neg_adversarial_sampling:True
2022-12-02 18:19:13,503 INFO adversarial_temperature:1.0
2022-12-02 18:19:13,503 INFO filter_sample :False
2022-12-02 18:19:13,504 INFO valid_percent :1.0
2022-12-02 18:19:13,504 INFO use_feature :False
2022-12-02 18:19:13,504 INFO reg_type :norm_er
2022-12-02 18:19:13,504 INFO reg_norm :3
2022-12-02 18:19:13,504 INFO weighted_loss :False
2022-12-02 18:19:13,504 INFO margin :1.0
2022-12-02 18:19:13,504 INFO pairwise :False
2022-12-02 18:19:13,504 INFO gamma :143.0
2022-12-02 18:19:13,504 INFO ote_scale :0
2022-12-02 18:19:13,504 INFO ote_size :1
2022-12-02 18:19:13,504 INFO quate_lmbda1 :0.0
2022-12-02 18:19:13,504 INFO quate_lmbda2 :0.0
2022-12-02 18:19:13,504 INFO num_epoch :30
2022-12-02 18:19:13,504 INFO scheduler_interval :-1
2022-12-02 18:19:13,504 INFO num_process :1
2022-12-02 18:19:13,504 INFO print_on_screen :True
2022-12-02 18:19:13,504 INFO log_interval :1000
2022-12-02 18:19:13,504 INFO save_interval :-1
2022-12-02 18:19:13,504 INFO eval_interval :50000
2022-12-02 18:19:13,504 INFO ent_emb_on_cpu :False
2022-12-02 18:19:13,504 INFO rel_emb_on_cpu :False
2022-12-02 18:19:13,504 INFO use_embedding_regularization:True
2022-12-02 18:19:13,504 INFO ent_dim :800
2022-12-02 18:19:13,504 INFO rel_dim :800
2022-12-02 18:19:13,505 INFO num_chunks :5
W1202 18:19:30.590066 6301 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1202 18:19:30.593201 6301 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:3983: DeprecationWarning: Op `adagrad` is executed through `append_op` under the dynamic mode, the corresponding API implementation needs to be upgraded to using `_C_ops` method.
  DeprecationWarning,
2022-12-02 18:19:49,241 INFO step: 999, loss: 0.60153, reg: 4.4154e-02, speed: 58.71 steps/s, time: 17.03 s
2022-12-02 18:19:49,241 INFO sample: 1.430161, forward: 10.054655, backward: 0.936485, update: 4.602625
2022-12-02 18:20:05,297 INFO step: 1999, loss: 0.37613, reg: 7.2329e-02, speed: 62.28 steps/s, time: 16.06 s
2022-12-02 18:20:05,298 INFO sample: 1.160390, forward: 9.158856, backward: 0.912364, update: 4.814298
2022-12-02 18:20:21,512 INFO step: 2999, loss: 0.24694, reg: 9.4915e-02, speed: 61.67 steps/s, time: 16.21 s
2022-12-02 18:20:21,512 INFO sample: 1.209912, forward: 9.202023, backward: 1.047861, update: 4.743357
2022-12-02 18:20:37,306 INFO step: 3999, loss: 0.21125, reg: 9.8401e-02, speed: 63.32 steps/s, time: 15.79 s
2022-12-02 18:20:37,306 INFO sample: 1.128468, forward: 9.161491, backward: 0.928098, update: 4.564974
2022-12-02 18:20:52,975 INFO step: 4999, loss: 0.19485, reg: 9.8803e-02, speed: 63.82 steps/s, time: 15.67 s
2022-12-02 18:20:52,976 INFO sample: 1.157603, forward: 9.174937, backward: 0.910179, update: 4.416690
2022-12-02 18:21:08,161 INFO step: 5999, loss: 0.18083, reg: 9.8726e-02, speed: 65.85 steps/s, time: 15.19 s
2022-12-02 18:21:08,161 INFO sample: 0.545803, forward: 9.165393, backward: 0.943437, update: 4.519530
2022-12-02 18:21:23,969 INFO step: 6999, loss: 0.17592, reg: 9.8283e-02, speed: 63.26 steps/s, time: 15.81 s
2022-12-02 18:21:23,969 INFO sample: 1.130007, forward: 9.167275, backward: 0.914725, update: 4.585609
2022-12-02 18:21:39,742 INFO step: 7999, loss: 0.17271, reg: 9.7748e-02, speed: 63.41 steps/s, time: 15.77 s
2022-12-02 18:21:39,742 INFO sample: 1.149211, forward: 9.181113, backward: 0.915362, update: 4.515898
2022-12-02 18:21:55,700 INFO step: 8999, loss: 0.16979, reg: 9.7300e-02, speed: 62.66 steps/s, time: 15.96 s
2022-12-02 18:21:55,701 INFO sample: 1.251904, forward: 9.145716, backward: 0.920547, update: 4.629378
2022-12-02 18:22:11,576 INFO step: 9999, loss: 0.16876, reg: 9.6864e-02, speed: 62.99 steps/s, time: 15.87 s
2022-12-02 18:22:11,576 INFO sample: 1.144423, forward: 9.155142, backward: 0.940488, update: 4.623880
2022-12-02 18:22:26,903 INFO step: 10999, loss: 0.16426, reg: 9.6493e-02, speed: 65.24 steps/s, time: 15.33 s
2022-12-02 18:22:26,903 INFO sample: 0.700834, forward: 9.158820, backward: 0.920809, update: 4.536675
2022-12-02 18:22:42,639 INFO step: 11999, loss: 0.16180, reg: 9.6037e-02, speed: 63.55 steps/s, time: 15.73 s
2022-12-02 18:22:42,639 INFO sample: 1.182562, forward: 9.178500, backward: 0.911814, update: 4.452207
2022-12-02 18:22:58,673 INFO step: 12999, loss: 0.16086, reg: 9.5627e-02, speed: 62.37 steps/s, time: 16.03 s
2022-12-02 18:22:58,673 INFO sample: 1.191781, forward: 9.131900, backward: 0.937113, update: 4.761683
2022-12-02 18:23:14,506 INFO step: 13999, loss: 0.16056, reg: 9.5248e-02, speed: 63.16 steps/s, time: 15.83 s
2022-12-02 18:23:14,506 INFO sample: 1.133638, forward: 9.148603, backward: 0.923684, update: 4.616351
2022-12-02 18:23:30,298 INFO step: 14999, loss: 0.16007, reg: 9.4936e-02, speed: 63.33 steps/s, time: 15.79 s
2022-12-02 18:23:30,298 INFO sample: 1.127198, forward: 9.169777, backward: 0.923043, update: 4.561079
2022-12-02 18:23:45,355 INFO step: 15999, loss: 0.15787, reg: 9.4598e-02, speed: 66.41 steps/s, time: 15.06 s
2022-12-02 18:23:45,356 INFO sample: 0.539107, forward: 9.188465, backward: 0.938140, update: 4.380424
2022-12-02 18:24:01,272 INFO step: 16999, loss: 0.15570, reg: 9.4335e-02, speed: 62.83 steps/s, time: 15.92 s
2022-12-02 18:24:01,273 INFO sample: 1.231464, forward: 9.158331, backward: 0.924203, update: 4.592128
2022-12-02 18:24:17,703 INFO step: 17999, loss: 0.15567, reg: 9.3985e-02, speed: 60.86 steps/s, time: 16.43 s
2022-12-02 18:24:17,704 INFO sample: 1.202493, forward: 9.243131, backward: 1.140142, update: 4.834252
2022-12-02 18:24:33,842 INFO step: 18999, loss: 0.15539, reg: 9.3735e-02, speed: 61.97 steps/s, time: 16.14 s
2022-12-02 18:24:33,842 INFO sample: 1.403193, forward: 9.119840, backward: 0.937885, update: 4.666509
2022-12-02 18:24:49,843 INFO step: 19999, loss: 0.15579, reg: 9.3491e-02, speed: 62.50 steps/s, time: 16.00 s
2022-12-02 18:24:49,843 INFO sample: 1.191690, forward: 9.132360, backward: 0.938562, update: 4.726286
2022-12-02 18:25:04,912 INFO step: 20999, loss: 0.15388, reg: 9.3330e-02, speed: 66.36 steps/s, time: 15.07 s
2022-12-02 18:25:04,912 INFO sample: 0.530497, forward: 9.179391, backward: 0.917852, update: 4.431005
2022-12-02 18:25:20,711 INFO step: 21999, loss: 0.15223, reg: 9.3063e-02, speed: 63.30 steps/s, time: 15.80 s
2022-12-02 18:25:20,711 INFO sample: 1.115061, forward: 9.183473, backward: 0.901767, update: 4.589142
2022-12-02 18:25:36,459 INFO step: 22999, loss: 0.15214, reg: 9.2812e-02, speed: 63.50 steps/s, time: 15.75 s
2022-12-02 18:25:36,459 INFO sample: 1.134653, forward: 9.177601, backward: 0.925939, update: 4.499598
2022-12-02 18:25:52,546 INFO step: 23999, loss: 0.15249, reg: 9.2637e-02, speed: 62.16 steps/s, time: 16.09 s
2022-12-02 18:25:52,546 INFO sample: 1.135302, forward: 9.202430, backward: 0.945634, update: 4.793704
2022-12-02 18:26:08,196 INFO step: 24999, loss: 0.15223, reg: 9.2440e-02, speed: 63.90 steps/s, time: 15.65 s
2022-12-02 18:26:08,196 INFO sample: 1.055773, forward: 9.212041, backward: 0.885661, update: 4.487227
2022-12-02 18:26:23,154 INFO step: 25999, loss: 0.15170, reg: 9.2288e-02, speed: 66.86 steps/s, time: 14.96 s
2022-12-02 18:26:23,155 INFO sample: 0.503797, forward: 9.239100, backward: 0.877107, update: 4.329204
2022-12-02 18:26:38,797 INFO step: 26999, loss: 0.15009, reg: 9.2081e-02, speed: 63.93 steps/s, time: 15.64 s
2022-12-02 18:26:38,797 INFO sample: 1.093831, forward: 9.204090, backward: 0.905856, update: 4.429241
2022-12-02 18:26:54,377 INFO step: 27999, loss: 0.14990, reg: 9.1920e-02, speed: 64.19 steps/s, time: 15.58 s
2022-12-02 18:26:54,377 INFO sample: 1.045313, forward: 9.235479, backward: 0.879603, update: 4.411355
2022-12-02 18:27:09,898 INFO step: 28999, loss: 0.15004, reg: 9.1758e-02, speed: 64.43 steps/s, time: 15.52 s
2022-12-02 18:27:09,898 INFO sample: 1.020164, forward: 9.246398, backward: 0.882828, update: 4.363002
2022-12-02 18:27:25,464 INFO step: 29999, loss: 0.15046, reg: 9.1633e-02, speed: 64.24 steps/s, time: 15.57 s
2022-12-02 18:27:25,464 INFO sample: 1.031425, forward: 9.239507, backward: 0.881955, update: 4.404327
2022-12-02 18:27:40,455 INFO step: 30999, loss: 0.14995, reg: 9.1487e-02, speed: 66.71 steps/s, time: 14.99 s
2022-12-02 18:27:40,455 INFO sample: 0.510480, forward: 9.214767, backward: 0.893431, update: 4.363040
2022-12-02 18:27:56,214 INFO step: 31999, loss: 0.14854, reg: 9.1298e-02, speed: 63.46 steps/s, time: 15.76 s
2022-12-02 18:27:56,214 INFO sample: 1.067064, forward: 9.228900, backward: 0.917439, update: 4.536116
2022-12-02 18:28:11,800 INFO step: 32999, loss: 0.14825, reg: 9.1183e-02, speed: 64.16 steps/s, time: 15.59 s
2022-12-02 18:28:11,800 INFO sample: 1.041273, forward: 9.236031, backward: 0.882269, update: 4.417620
2022-12-02 18:28:27,995 INFO step: 33999, loss: 0.14832, reg: 9.1084e-02, speed: 61.75 steps/s, time: 16.19 s
2022-12-02 18:28:27,995 INFO sample: 1.062898, forward: 9.357672, backward: 1.267072, update: 4.496253
2022-12-02 18:28:44,399 INFO step: 34999, loss: 0.14902, reg: 9.0942e-02, speed: 60.96 steps/s, time: 16.40 s
2022-12-02 18:28:44,400 INFO sample: 1.095686, forward: 9.385500, backward: 1.307641, update: 4.605081
2022-12-02 18:29:00,103 INFO step: 35999, loss: 0.14900, reg: 9.0824e-02, speed: 63.68 steps/s, time: 15.70 s
2022-12-02 18:29:00,103 INFO sample: 0.518057, forward: 9.401174, backward: 1.304467, update: 4.469423
2022-12-02 18:29:16,446 INFO step: 36999, loss: 0.14712, reg: 9.0693e-02, speed: 61.19 steps/s, time: 16.34 s
2022-12-02 18:29:16,446 INFO sample: 1.114131, forward: 9.371427, backward: 1.286469, update: 4.561806
2022-12-02 18:29:20,440 INFO [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:30<00:00, 3.44it/s]
2022-12-02 18:30:51,460 INFO -------------- valid result --------------
2022-12-02 18:30:51,460 INFO t,r->h |MRR: 0.02636275626718998 MR: 12106.768 HITS@1: 0.011 HITS@3: 0.0218 HITS@10: 0.0512
2022-12-02 18:30:51,460 INFO h,r->t |MRR: 0.39752769470214844 MR: 1558.398 HITS@1: 0.2928 HITS@3: 0.4508 HITS@10: 0.6012
2022-12-02 18:30:51,461 INFO average |MRR: 0.21194522082805634 MR: 6832.583 HITS@1: 0.1519 HITS@3: 0.23629999999999998 HITS@10: 0.3262
2022-12-02 18:30:51,461 INFO -----------------------------------------
2022-12-02 18:30:51,476 INFO [evaluation] finished! It takes 91.0356 sec s

Training OTE
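The OTE command that follows passes `-adv -a 0.5`, which its log records as `neg_adversarial_sampling: True` with `adversarial_temperature: 0.5` and `loss_type: Logsigmoid` (the ComplEx run above used `-adv` with temperature 1.0). These settings presumably correspond to the self-adversarial negative sampling loss introduced with RotatE [6]. The sketch below shows that loss for a single positive triple in plain Python; the function names and scores are hypothetical, and PGL's own implementation may differ in details such as batching and averaging.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def self_adversarial_loss(pos_score, neg_scores, temperature):
    """Loss for one positive triple and its sampled negatives.

    Negatives are weighted by softmax(temperature * score), so negatives the
    model currently scores as plausible contribute more to the loss.
    """
    scaled = [temperature * s for s in neg_scores]
    peak = max(scaled)  # subtract the max before exponentiating, for stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    weights = [e / total for e in exps]  # treated as constants during training

    pos_loss = -log_sigmoid(pos_score)
    neg_loss = -sum(w * log_sigmoid(-s) for w, s in zip(weights, neg_scores))
    return (pos_loss + neg_loss) / 2, weights


# Hypothetical scores: one positive triple and three negatives.
neg_scores = [1.0, -1.0, -3.0]
loss, weights = self_adversarial_loss(2.0, neg_scores, temperature=0.5)
uniform_loss, uniform_weights = self_adversarial_loss(2.0, neg_scores, temperature=0.0)
print(weights, loss, uniform_loss)
```

With temperature 0 every negative gets the same weight, which reduces to plain uniform negative sampling; a larger temperature concentrates the loss on the hardest negatives.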
# OTE
!python -u train.py --model_name OTE \
    --data_name OpenBG500 \
    --data_path /home/aistudio/data/ \
    --save_path /home/aistudio/result/transe \
    --batch_size 512 --log_interval 1000 --neg_sample_type 'chunk' --neg_sample_size 256 --max_steps 10000 \
    --embed_dim 400 --gamma 15.0 -adv -a 0.5 --ote_scale 2 --ote_size 20 --print_on_screen --test --filter_eval \
    --lr 0.002 --optimizer adam --scheduler_interval 25000 --valid
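With `--valid`, `--test` and `--filter_eval`, the run ends with evaluation blocks like the ones in the log that follows, reporting MR, MRR and HITS@k for both prediction directions (`t,r->h` and `h,r->t`). The `average` row is consistent with the mean of the two directions: in the OTE validation block, (0.0278 + 0.7332) / 2 = 0.3805 for HITS@10. Since the test split has no labels (as noted in the prediction section), the near-zero test-set numbers are presumably not meaningful. The sketch below shows how these metrics are commonly computed from filtered ranks; the function names are hypothetical and the strict tie-breaking rule is an assumption, not PGL's evaluation code.

```python
def filtered_rank(scores, target, known_true):
    """1-based rank of `target`, ignoring other candidates known to be correct."""
    target_score = scores[target]
    rank = 1
    for cand, score in enumerate(scores):
        if cand == target or cand in known_true:
            continue  # filtered setting: other true answers are not counted
        if score > target_score:
            rank += 1
    return rank


def summarize(ranks, ks=(1, 3, 10)):
    """Aggregate a list of ranks into MR, MRR and HITS@k."""
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                     # mean rank (lower is better)
        "MRR": sum(1.0 / r for r in ranks) / n,   # mean reciprocal rank
    }
    for k in ks:
        metrics[f"HITS@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


# Four candidate entities; entity 2 is the answer, entity 0 is another known true tail.
scores = [0.9, 0.8, 0.7, 0.1]
raw = filtered_rank(scores, target=2, known_true=set())
filtered = filtered_rank(scores, target=2, known_true={0})

metrics = summarize([1, 2, 4, 20])
print(raw, filtered, metrics)
```

Here the unfiltered rank is 3 and the filtered rank is 2, because the other known-true entity no longer counts against the target.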
---------------------------------------- Device Setting ----------------------------------------
Entity embedding place: gpu
Relation embedding place: gpu
----------------------------------------
---------------------------------------- Embedding Setting ----------------------------------------
Entity embedding dimension: 400
Relation embedding dimension: 8400
----------------------------------------
2022-12-03 20:17:21,327 INFO seed :0
2022-12-03 20:17:21,327 INFO data_path :/home/aistudio/data/
2022-12-03 20:17:21,327 INFO save_path :/home/aistudio/result/transe/ote_OpenBG500_d_400_g_15.0_e_gpu_r_gpu_l_Logsigmoid_lr_0.002_0.1_KGE
2022-12-03 20:17:21,327 INFO init_from_ckpt :None
2022-12-03 20:17:21,328 INFO data_name :OpenBG500
2022-12-03 20:17:21,328 INFO use_dict :False
2022-12-03 20:17:21,328 INFO kv_mode :False
2022-12-03 20:17:21,328 INFO batch_size :512
2022-12-03 20:17:21,328 INFO test_batch_size :16
2022-12-03 20:17:21,328 INFO neg_sample_size :256
2022-12-03 20:17:21,328 INFO filter_eval :True
2022-12-03 20:17:21,328 INFO model_name :ote
2022-12-03 20:17:21,328 INFO embed_dim :400
2022-12-03 20:17:21,328 INFO reg_coef :0
2022-12-03 20:17:21,328 INFO loss_type :Logsigmoid
2022-12-03 20:17:21,328 INFO max_steps :10000
2022-12-03 20:17:21,328 INFO lr :0.002
2022-12-03 20:17:21,328 INFO optimizer :adam
2022-12-03 20:17:21,328 INFO cpu_lr :0.1
2022-12-03 20:17:21,328 INFO cpu_optimizer :adagrad
2022-12-03 20:17:21,328 INFO mix_cpu_gpu :False
2022-12-03 20:17:21,329 INFO async_update :False
2022-12-03 20:17:21,329 INFO valid :True
2022-12-03 20:17:21,329 INFO test :True
2022-12-03 20:17:21,329 INFO task_name :KGE
2022-12-03 20:17:21,329 INFO num_workers :0
2022-12-03 20:17:21,329 INFO neg_sample_type :chunk
2022-12-03 20:17:21,329 INFO neg_deg_sample :False
2022-12-03 20:17:21,329 INFO neg_adversarial_sampling:True
2022-12-03 20:17:21,329 INFO adversarial_temperature:0.5
2022-12-03 20:17:21,329 INFO filter_sample :False
2022-12-03 20:17:21,329 INFO valid_percent :1.0
2022-12-03 20:17:21,329 INFO use_feature :False
2022-12-03 20:17:21,329 INFO reg_type :norm_er
2022-12-03 20:17:21,329 INFO reg_norm :3
2022-12-03 20:17:21,329 INFO weighted_loss :False
2022-12-03 20:17:21,329 INFO margin :1.0
2022-12-03 20:17:21,330 INFO pairwise :False
2022-12-03 20:17:21,330 INFO gamma :15.0
2022-12-03 20:17:21,330 INFO ote_scale :2
2022-12-03 20:17:21,330 INFO ote_size :20
2022-12-03 20:17:21,330 INFO quate_lmbda1 :0.0
2022-12-03 20:17:21,330 INFO quate_lmbda2 :0.0
2022-12-03 20:17:21,330 INFO num_epoch :1000000
2022-12-03 20:17:21,330 INFO scheduler_interval :25000
2022-12-03 20:17:21,330 INFO num_process :1
2022-12-03 20:17:21,330 INFO print_on_screen :True
2022-12-03 20:17:21,330 INFO log_interval :1000
2022-12-03 20:17:21,330 INFO save_interval :-1
2022-12-03 20:17:21,330 INFO eval_interval :50000
2022-12-03 20:17:21,330 INFO ent_emb_on_cpu :False
2022-12-03 20:17:21,330 INFO rel_emb_on_cpu :False
2022-12-03 20:17:21,330 INFO use_embedding_regularization:False
2022-12-03 20:17:21,331 INFO ent_dim :400
2022-12-03 20:17:21,331 INFO rel_dim :8400
2022-12-03 20:17:21,331 INFO num_chunks :2
W1203 20:17:40.313774 15093 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1203 20:17:40.317301 15093 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
Epoch 0: StepDecay set learning rate to 0.002.
2022-12-03 20:18:30,970 INFO step: 999, loss: 3.82042, reg: 0.0000e+00, speed: 20.39 steps/s, time: 49.06 s
2022-12-03 20:18:30,970 INFO sample: 2.537456, forward: 35.948929, backward: 10.029451, update: 0.519803
2022-12-03 20:19:18,248 INFO step: 1999, loss: 1.52916, reg: 0.0000e+00, speed: 21.15 steps/s, time: 47.28 s
2022-12-03 20:19:18,248 INFO sample: 1.869959, forward: 34.982310, backward: 9.909149, update: 0.498044
2022-12-03 20:20:06,206 INFO step: 2999, loss: 0.70538, reg: 0.0000e+00, speed: 20.85 steps/s, time: 47.96 s
2022-12-03 20:20:06,206 INFO sample: 2.374590, forward: 35.117019, backward: 9.945933, update: 0.500709
2022-12-03 20:20:53,649 INFO step: 3999, loss: 0.45315, reg: 0.0000e+00, speed: 21.08 steps/s, time: 47.44 s
2022-12-03 20:20:53,650 INFO sample: 1.874641, forward: 35.024741, backward: 10.005383, update: 0.518727
2022-12-03 20:21:42,188 INFO step: 4999, loss: 0.38411, reg: 0.0000e+00, speed: 20.60 steps/s, time: 48.54 s
2022-12-03 20:21:42,189 INFO sample: 2.392149, forward: 35.182606, backward: 10.389102, update: 0.553313
2022-12-03 20:22:30,258 INFO step: 5999, loss: 0.22264, reg: 0.0000e+00, speed: 20.80 steps/s, time: 48.07 s
2022-12-03 20:22:30,258 INFO sample: 1.858560, forward: 35.172825, backward: 10.468420, update: 0.547993
2022-12-03 20:23:18,059 INFO step: 6999, loss: 0.24304, reg: 0.0000e+00, speed: 20.92 steps/s, time: 47.80 s
2022-12-03 20:23:18,060 INFO sample: 1.789395, forward: 35.161546, backward: 10.283572, update: 0.545424
2022-12-03 20:24:06,737 INFO step: 7999, loss: 0.19999, reg: 0.0000e+00, speed: 20.54 steps/s, time: 48.68 s
2022-12-03 20:24:06,738 INFO sample: 2.542719, forward: 35.021526, backward: 10.546183, update: 0.546523
2022-12-03 20:24:55,473 INFO step: 8999, loss: 0.19842, reg: 0.0000e+00, speed: 20.52 steps/s, time: 48.74 s
2022-12-03 20:24:55,474 INFO sample: 2.101255, forward: 34.948719, backward: 11.074480, update: 0.587884
2022-12-03 20:25:44,587 INFO step: 9999, loss: 0.20394, reg: 0.0000e+00, speed: 20.36 steps/s, time: 49.11 s
2022-12-03 20:25:44,588 INFO sample: 2.529099, forward: 35.144207, backward: 10.839553, update: 0.578573
2022-12-03 20:25:44,588 INFO [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:06<00:00, 4.74it/s]
2022-12-03 20:26:50,790 INFO -------------- test result --------------
2022-12-03 20:26:50,791 INFO t,r->h |MRR: 6.202812073752284e-05 MR: 102160.9966 HITS@1: 0.0 HITS@3: 0.0 HITS@10: 0.0
2022-12-03 20:26:50,791 INFO h,r->t |MRR: 4.467178769118618e-06 MR: 226216.9044 HITS@1: 0.0 HITS@3: 0.0 HITS@10: 0.0
2022-12-03 20:26:50,791 INFO average |MRR: 3.324764838907868e-05 MR: 164188.9505 HITS@1: 0.0 HITS@3: 0.0 HITS@10: 0.0
2022-12-03 20:26:50,791 INFO -----------------------------------------
2022-12-03 20:26:50,819 INFO [evaluation] finished! It takes 66.2304 sec s
2022-12-03 20:26:50,819 INFO [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:05<00:00, 4.77it/s]
2022-12-03 20:27:56,598 INFO -------------- valid result --------------
2022-12-03 20:27:56,598 INFO t,r->h |MRR: 0.014652694575488567 MR: 14285.9508 HITS@1: 0.0046 HITS@3: 0.0124 HITS@10: 0.0278
2022-12-03 20:27:56,598 INFO h,r->t |MRR: 0.5156063437461853 MR: 2179.2114 HITS@1: 0.396 HITS@3: 0.596 HITS@10: 0.7332
2022-12-03 20:27:56,598 INFO average |MRR: 0.2651295065879822 MR: 8232.5811 HITS@1: 0.2003 HITS@3: 0.30419999999999997 HITS@10: 0.3805
2022-12-03 20:27:56,598 INFO -----------------------------------------
2022-12-03 20:27:56,624 INFO [evaluation] finished! It takes 65.8045 sec s

Prediction
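Predicting the tail of a query (h, r, ?) amounts to scoring every entity as a candidate tail and keeping the ten best. The sketch below illustrates this for TransE, assuming the common score γ − ‖h + r − t‖ with an L2 norm and γ = 19.9 as in the TransE checkpoint name used below. The toy embeddings and helper names are made up for illustration, and PGL's actual norm and implementation may differ.

```python
import math


def transe_score(h, r, t, gamma):
    """TransE plausibility: gamma minus the L2 distance between h + r and t."""
    dist = math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))
    return gamma - dist


def top_k_tails(h, r, entity_emb, gamma, k=10):
    """Score every entity as a candidate tail and return the k best entity ids."""
    scored = [(transe_score(h, r, t, gamma), idx) for idx, t in enumerate(entity_emb)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-dimensional embeddings for four entities (hypothetical values).
entity_emb = [
    [0.0, 0.0],  # entity 0, used as the head
    [1.0, 1.0],  # entity 1, exactly head + relation
    [2.0, 0.0],  # entity 2
    [1.0, 0.9],  # entity 3, close to head + relation
]
relation = [1.0, 1.0]
gamma = 19.9

best = top_k_tails(entity_emb[0], relation, entity_emb, gamma, k=2)
print(best)
```

Entity 1 ranks first because it lies exactly at head + relation, followed by entity 3, which is only 0.1 away.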
%cd /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG/
# The test set has no labels, so we only save the predictions (the top 10 per query).
# Remember to set embed_dim: the model trained above uses 400, so it must match here (the default is 200).
# Load the trained model via init_from_ckpt.
# Switch the mode to test.
!python predict.py --seed 1107 --data_name OpenBG500 \
    --data_path /home/aistudio/data/ \
    --model_name TransE \
    --save_path /home/aistudio/result/pred \
    --embed_dim 400 \
    --init_from_ckpt /home/aistudio/result/transe_fb_sgpu/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE \
    --test
# Given a head entity and a relation, predict the tail entity
%cd /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG/
/home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph4KG
import os

import paddle
import paddle.distributed as dist
import pgl

from dataset.reader import read_trigraph
from models.ke_model import KGEModel


class Args:
    """Stand-in for the command-line arguments that the model builder expects."""

    def __init__(self):
        self.seed = 0
        self.data_path = '/home/aistudio/data/'
        self.save_path = '/home/aistudio/result/transe/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE'
        self.init_from_ckpt = '/home/aistudio/result/transe/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE'
        self.data_name = 'OpenBG500'
        self.use_dict = False
        self.kv_mode = 'kv'
        self.valid_percent = 1.
        self.filter_sample = False
        self.filter_eval = True
        self.weighted_loss = False
        self.model_name = 'transe'
        # Must match the embedding dimension used for training (400 above).
        self.ent_dim = 400
        self.rel_dim = 400
        self.ent_emb_on_cpu = False
        self.rel_emb_on_cpu = False
        self.num_chunks = 5
        self.cpu_lr = 0.1
        self.mix_cpu_gpu = False
        self.gamma = 19.9
        self.ote_size = 0
        self.ote_scale = 1
        self.use_feature = False


def build_model(args):
    """Read the graph, build the KGE model and restore the trained weights."""
    trigraph = read_trigraph(args.data_path, args.data_name, args.use_dict,
                             args.kv_mode)
    if args.valid_percent < 1:
        trigraph.sampled_subgraph(args.valid_percent, dataset='valid')

    use_filter_set = args.filter_sample or args.filter_eval or args.weighted_loss
    if use_filter_set:
        filter_dict = {
            'head': trigraph.true_heads_for_tail_rel,
            'tail': trigraph.true_tails_for_head_rel
        }
    else:
        filter_dict = None

    if dist.get_world_size() > 1:
        dist.init_parallel_env()

    model = KGEModel(args.model_name, trigraph, args)
    if args.init_from_ckpt:
        state_dict = paddle.load(
            os.path.join(args.init_from_ckpt, 'params.pdparams'))
        # Apply the checkpoint; without this call the model keeps its random initialization.
        model.set_state_dict(state_dict)
    return model
def get_dict():
    """Load the name -> integer id mappings for relations and entities."""
    rel_dict = dict()
    ent_dict = dict()
    with open('/home/aistudio/data/OpenBG500/relations.dict', 'r') as f:
        for line in f:
            k, v = line.strip().split('\t')
            rel_dict[k] = int(v)
    with open('/home/aistudio/data/OpenBG500/entities.dict', 'r') as f:
        for line in f:
            k, v = line.strip().split('\t')
            ent_dict[k] = int(v)
    return rel_dict, ent_dict


def do_predict(model, head, relation):
    """Return the ids of the 10 highest-scoring tail entities for (head, relation, ?)."""
    model.eval()
    rel_dict, ent_dict = get_dict()
    h = paddle.to_tensor([ent_dict[head]], 'int64')
    r = paddle.to_tensor([rel_dict[relation]], 'int64')
    with paddle.no_grad():
        t_score = model.predict(h, r, mode='tail')
        t_score = t_score.argsort(descending=True)
    return t_score[:, :10][0]


def get_key(id_):
    """Map an integer entity id back to its entity name."""
    _, ent_dict = get_dict()
    for name, idx in ent_dict.items():
        if idx == id_:
            return name


# =========== Example ==============
args = Args()
model = build_model(args)
head = 'ent_238303'
relation = 'rel_0320'
tails = do_predict(model, head, relation)
# Prints "the ids of the ten highest-scoring entities are: ..."
print(f"得分最高的前十个实体id为: {tails.numpy()}")
print(f"{head} - {relation} --> {get_key(int(tails[0]))}")
得分最高的前十个实体id为: [ 138 146677 221006 180932 27258 187664 145176 190357 94183 41276]
ent_238303 - rel_0320 --> ent_122958

References
[1] Qu, Yincen, et al. "Commonsense Knowledge Salience Evaluation with a Benchmark Dataset in E-commerce." Findings of EMNLP 2022.
[2] Xie, Xin, et al. "From Discrimination to Generation: Knowledge Graph Completion with Generative Transformer." WWW 2022 (Poster).
[3] Deng, Shumin, et al. "Construction and Applications of Billion-Scale Multimodal Pre-trained Business Knowledge Graph." arXiv preprint arXiv:2209.15214 2022.
[4] Kadlec, Rudolf, Ondrej Bajgar, and Jan Kleindienst. "Knowledge base completion: Baselines strike back." arXiv preprint arXiv:1705.10744 (2017).
[5] Trouillon, Théo, et al. "Complex embeddings for simple link prediction." International conference on machine learning. PMLR, 2016.
[6] Sun, Zhiqing, et al. "Rotate: Knowledge graph embedding by relational rotation in complex space." arXiv preprint arXiv:1902.10197 (2019).
[7] Tang, Yun, et al. "Orthogonal relation transforms with graph context modeling for knowledge graph embedding." arXiv preprint arXiv:1911.04910 (2019).
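Reference blogs
[1] TransE (Translating Embedding) explained in detail, plus a simple Python implementation https://blog.csdn.net/shunaoxi2313/article/details/89766467
[2] A large-scale open digital-commerce knowledge graph benchmark is here: OpenBG launches on Tianchi https://m.thepaper.cn/baijiahao_20744274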