microsoft/DialogRPT-human-vs-rand
microsoft/DialogRPT-human-vs-rand is a dialog response ranking model from the DialogRPT family.
About microsoft/DialogRPT-human-vs-rand
DialogRPT is a set of dialog response ranking models proposed by the Microsoft Research NLP Group and trained on more than 100 million human feedback data points. It can be used to improve an existing dialog generation model (e.g., DialoGPT) by re-ranking the response candidates that model generates. The human_vs_rand score estimates how likely a response is to be a genuine human reply to the given context rather than a randomly chosen response. The authors considered several ranking tasks and released a pretrained model for each; this model covers the human_vs_rand task. They also provide a Colab notebook demo, together with training and evaluation tools. The models and their results are described in the group's EMNLP 2020 paper on training dialog response rankers with large-scale human feedback data.
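The re-ranking use described above can be sketched roughly as follows. This is a minimal illustration, not the authors' reference code. It assumes the checkpoint loads through the Hugging Face `transformers` sequence-classification classes, that context and response are joined with the `<|endoftext|>` token, and that a sigmoid over the single output logit gives the score. The helper names `build_input`, `make_scorer`, and `rerank` are hypothetical and introduced only for this example.

```python
from typing import Callable, List, Tuple

# Assumption: context and response are joined with GPT-2's end-of-text token.
SEPARATOR = "<|endoftext|>"


def build_input(context: str, response: str) -> str:
    """Concatenate context and candidate response into one model input string."""
    return context + SEPARATOR + response


def make_scorer(
    model_name: str = "microsoft/DialogRPT-human-vs-rand",
) -> Callable[[str, str], float]:
    """Return a function scoring (context, response) pairs in [0, 1].

    torch and transformers are imported lazily so that rerank() below
    can be used and tested without them (or the model weights) present.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    def score(context: str, response: str) -> float:
        input_ids = tokenizer.encode(
            build_input(context, response), return_tensors="pt"
        )
        with torch.no_grad():
            logits = model(input_ids, return_dict=True).logits
        # Assumption: one logit per input; sigmoid maps it to a probability-like score.
        return torch.sigmoid(logits).item()

    return score


def rerank(
    context: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate against the context and sort best-first."""
    scored = [(candidate, score_fn(context, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice the candidates would come from a generator such as DialoGPT, and `rerank(context, candidates, make_scorer())` would place the responses judged most relevant to the context first. Passing the scorer in as an argument keeps the ranking logic independent of any particular model.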