Coleridge Initiative

Rich Context Project

Building policy on evidence and science is at the center of new legislation and presidential executive orders intended to restore trust in government. But how can agencies demonstrate the impact of their data? The Rich Context project, which a former US chief statistician has called a game changer, uses AI and machine learning to answer the question: what data are being used, by whom, and for what purpose? We have shown how to do this in two machine learning competitions, a workshop, and a book. Our goal is to develop an open source platform for dataset discovery from research publications, a catalog of datasets produced with agency funding, and a scorecard for those data that shows the relative value of each dataset based on 

The approach is to apply machine learning and natural language processing techniques that search publications to:

  • Find what datasets are in the publications
  • Show how they’ve been used
  • Find other experts who have used the data
  • Identify other related datasets
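
To make the steps in the list above concrete, here is a minimal Python sketch. It is not the project's method: it swaps the machine-learning models for plain string matching against a list of known dataset names, and every name in it (the dataset names, publications, authors, and the functions `find_mentions` and `summarize`) is hypothetical and made up for illustration. It only shows the shape of the output: which datasets appear in which publications, the sentence showing how each was used, who has used each dataset, and which datasets appear together.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical inputs, invented for this sketch.
KNOWN_DATASETS = ["Example Household Survey", "Example Farm Census"]

PUBLICATIONS = [
    {
        "title": "Household income trends",
        "authors": ["A. Author"],
        "text": "We analyze the Example Household Survey to study income. "
                "Results are robust.",
    },
    {
        "title": "Farm size and household income",
        "authors": ["A. Author", "B. Author"],
        "text": "We link the Example Farm Census with the Example Household "
                "Survey. Farm size matters.",
    },
]


def find_mentions(publications, dataset_names):
    """Return one record per (publication sentence, dataset name) match.

    Naive stand-in for a trained model: case-insensitive exact matching
    of known names, so it cannot discover datasets it has never seen.
    """
    mentions = []
    for pub in publications:
        # Rough sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", pub["text"]):
            for name in dataset_names:
                if re.search(re.escape(name), sentence, flags=re.IGNORECASE):
                    mentions.append({
                        "dataset": name,
                        "title": pub["title"],
                        "authors": pub["authors"],
                        "context": sentence,  # how the data were used
                    })
    return mentions


def summarize(mentions):
    """Group mentions into users per dataset and co-mentioned datasets."""
    users = defaultdict(set)    # dataset -> authors who used it
    by_pub = defaultdict(set)   # publication title -> datasets it mentions
    for m in mentions:
        users[m["dataset"]].update(m["authors"])
        by_pub[m["title"]].add(m["dataset"])

    related = defaultdict(set)  # dataset -> datasets cited alongside it
    for names in by_pub.values():
        for a, b in combinations(sorted(names), 2):
            related[a].add(b)
            related[b].add(a)
    return users, related


if __name__ == "__main__":
    found = find_mentions(PUBLICATIONS, KNOWN_DATASETS)
    users, related = summarize(found)
    for m in found:
        print(m["dataset"], "|", m["title"], "|", m["context"])
    print(dict(users))
    print(dict(related))
```

A dictionary lookup like this only finds names it already knows and misses abbreviations or paraphrased references, which is the gap the machine-learning approach is meant to close.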

Follow our Kaggle competition here.
