Administrative Data Research Facility
The Administrative Data Research Facility (ADRF) is a secure computing platform designed to support federal, state and local government agencies that wish to share and analyze confidential datasets. It was established by the Census Bureau with funding from the Office of Management and Budget to inform the decision-making of the Commission on Evidence-Based Policymaking.
The ADRF has provided secure access to over 100 confidential government datasets from 50 different agencies at all levels of government. The ADRF has received Authorization to Operate from the Census Bureau and the US Department of Agriculture, has achieved a FedRAMP Moderate approval and is listed on the FedRAMP Marketplace. It won a 2018 Government Innovation Award.
Safe Data Strategy
The platform implements the Five Safes framework to ensure the safe and responsible use of data:
- Safe people: approved, trained researchers
- Safe projects: approved projects consistent with agency mission
- Safe settings: data are accessed only in a secure environment
- Safe data: only the minimum data required for a project is made available
- Safe outputs: review to limit disclosure before data are released
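As an illustration of the last item, one widely used disclosure-limitation check is minimum cell size suppression: counts below a threshold are masked before a table leaves the secure environment. The sketch below is a minimal example of that general technique, not the ADRF's actual review procedure; the threshold value and the names `MIN_CELL_SIZE` and `suppress_small_cells` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical threshold for illustration only; not a documented ADRF rule.
MIN_CELL_SIZE = 10


def suppress_small_cells(records, group_key, min_cell_size=MIN_CELL_SIZE):
    """Count records per group and mask any count below the threshold."""
    counts = Counter(record[group_key] for record in records)
    return {
        group: (count if count >= min_cell_size else None)
        for group, count in counts.items()
    }


# Made-up records: 12 people in county A, 3 in county B.
records = [{"county": "A"}] * 12 + [{"county": "B"}] * 3
table = suppress_small_cells(records, "county")
# County B falls below the threshold, so its count is masked as None.
```

In practice, an automated check like this would only support a human reviewer, who also has to consider complementary disclosure risks such as differencing between tables.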
The ADRF is a cloud-based platform operating inside the AWS GovCloud (US) Regions, which are built to host sensitive Controlled Unclassified Information. The ADRF’s container-based infrastructure provides the flexibility to scale workloads as necessary based on specific agency demand and allows access to best-of-breed technology to meet changing needs.
The ADRF is designed to provide secure methods for transferring agency microdata, especially where Personally Identifiable Information (PII) is included in the data set. Only agency-identified and authorized personnel have access to the Secure Data Transfer system. Data may only be transferred into the ADRF platform. Project data (results) may be exported only after the ADRF Privacy Officer reviews and approves the content to be transferred.
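The transfer rules above amount to a small, asymmetric policy: agency data flows in only, and results flow out only after review. The following sketch expresses that policy as a single check. It is an illustrative model under stated assumptions, not the Secure Data Transfer system itself; the names `Payload`, `Direction`, `TransferRequest` and `is_transfer_allowed` are hypothetical, and denying any combination the text does not mention is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    INBOUND = "inbound"    # into the ADRF
    OUTBOUND = "outbound"  # out of the ADRF


class Payload(Enum):
    AGENCY_DATA = "agency_data"          # microdata supplied by an agency
    PROJECT_RESULTS = "project_results"  # outputs produced by a project


@dataclass(frozen=True)
class TransferRequest:
    requester: str
    payload: Payload
    direction: Direction
    privacy_officer_approved: bool = False


def is_transfer_allowed(request, authorized_personnel):
    """Return True only for the two flows described in the text."""
    if request.payload is Payload.AGENCY_DATA:
        # Agency data may only move into the platform, and only when
        # sent by agency-identified, authorized personnel.
        return (
            request.direction is Direction.INBOUND
            and request.requester in authorized_personnel
        )
    if request.payload is Payload.PROJECT_RESULTS:
        # Results may leave only after Privacy Officer review and approval.
        return (
            request.direction is Direction.OUTBOUND
            and request.privacy_officer_approved
        )
    return False  # assumption: deny anything not explicitly described


# Made-up user name for the example.
authorized = {"steward@agency.example"}
upload = TransferRequest("steward@agency.example", Payload.AGENCY_DATA, Direction.INBOUND)
unreviewed_export = TransferRequest("researcher", Payload.PROJECT_RESULTS, Direction.OUTBOUND)
```

Here `is_transfer_allowed(upload, authorized)` is true, while `is_transfer_allowed(unreviewed_export, authorized)` is false until the request carries an approval.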
The ADRF also uses open-source container-based infrastructure (Docker and Kubernetes) to ensure that no environment is shared between projects, providing complete isolation and controlled access to resources.
The data management process begins with the identification of an authorized agency Data Steward. This person (or persons) is responsible for managing access permissions to data sets provided by their agency. The Data Stewardship Module provides Data Stewards with information about how their data sets are being used. The module is accessible from outside the ADRF secure environment and is protected by user login credentials and two-factor authentication.
Access to data is managed by individual user identity via LDAP (Lightweight Directory Access Protocol) and Access Control Lists. Based on the access granted by the Data Steward, users are automatically granted permissions to data set schemas and/or IAM roles to access the following platforms:
- PostgreSQL: an open-source object-relational database management platform designed to support an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions
- Athena: an interactive query service that is based on the Presto open-source distributed SQL query engine and is designed to analyze petabyte-scale data on Amazon’s cost-effective S3 storage platform. Its serverless design means there are no idle compute costs, and data requires little to no transformation before it can be queried by Athena.
If you are part of a training program, an account will be created once you have been accepted into the program.
If you are an independent researcher, please contact us; additionally, you can refer to our pricing structure.
Once you are on an approved research project, you will have access to a project workspace that contains analytical tools and datasets.