News

Databricks trained a model to predict which best-of-N result human testers would prefer, based on examples. The Databricks reward model, or DBRM, can then be used to improve the performance of ...