Collaborators in the ADRF
Today's biggest economic and social issues can only be addressed by using confidential data from multiple organizations. For example, examining how access to jobs and neighborhood characteristics affects the earnings and employment outcomes of ex-offenders and social benefit recipients, and in turn their subsequent recidivism or retention on welfare, requires data from at least four different agencies (Corrections, Human Services, Labor, and Housing). The Coleridge Initiative and its collaborators have developed technical and human approaches that enable community access to and use of data on human subjects, making this kind of research possible. Join our collaboration to help address pressing social problems.
Collaborating agencies can:
- Have secure remote access to their administrative data in the ADRF
- Make use of the Data Stewardship web application that manages data workflows, including reports on access and use
- Participate in class training in data science methods
- Securely combine data with data from other agencies for approved projects
Become a Collaborator
Joining this effort will allow your agency to help lead the federal data strategy, along with its practical implementation and value, at both the state and national levels. You will host your data through the ADRF, which guarantees secure access to administrative records and related data. You will analyze your data on the ADRF to generate the evidence required to develop policies, put procedures in place, share derived data, and take other actions to enable secure data sharing. You can use the reporting tool to support credible outreach to your various constituencies.
The primary ADRF contact is Orande Peoples.
Data transfer is handled through the Secure File Transfer Protocol (SFTP). Our Data Transfer Team will generate login credentials specific to your agency. You will receive a welcome email and will then be required to change the password. If you prefer a graphical interface to initiate and monitor file transfers, you can download a tool such as WinSCP or Cyberduck, both of which are free and publicly available. Otherwise, you can initiate the transfer with any SFTP command-line tool installed with your operating system. Once your transfer is complete, please inform our Data Transfer Team.
When transferring data to the ADRF, please complete and return the Data Transfer Form (note that it has two worksheets). On the Dataset Core Description worksheet, fill in all relevant and mandatory metadata fields in the Value column. On the Data Fields worksheet, describe the fields in your data. Information from the metadata form will be made available in the Data Explorer on the ADRF.
Yes, data is safe during the transfer process.
Data will be encrypted with GPG for transmission to the ADRF, using a unique public-private key pair for each transfer. This ensures that the data stays encrypted in transit and at rest until it is decrypted on the ADRF. To encrypt data, you may download the GPG software (available for Mac and Windows). This software has a graphical user interface (GUI) that lets you select the public key received from the Coleridge Initiative, along with the file you wish to encrypt. Once encrypted, the data is ready for transfer.
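For agencies that prefer the command line to the GUI, the encryption step might look roughly like the sketch below. It is an illustration only, not the Coleridge Initiative's own procedure: it assumes the `gpg` command-line tool is installed, and the file names and recipient address are hypothetical placeholders. The code only builds and prints the commands; the lines that would run them are left commented out.

```python
import shlex
from pathlib import Path


def gpg_import_command(public_key_file: Path) -> list[str]:
    """Build the command that imports a received public key into the local keyring."""
    return ["gpg", "--import", str(public_key_file)]


def gpg_encrypt_command(source: Path, recipient: str) -> list[str]:
    """Build the command that encrypts `source` for `recipient`, writing `<source>.gpg`."""
    target = source.with_name(source.name + ".gpg")
    return [
        "gpg",
        "--output", str(target),
        "--encrypt",
        "--recipient", recipient,
        str(source),
    ]


# Hypothetical placeholders: substitute the key file and recipient you were given.
import_cmd = gpg_import_command(Path("transfer_public_key.asc"))
encrypt_cmd = gpg_encrypt_command(Path("wages_2020.csv"), "data-transfer@example.org")

print(shlex.join(import_cmd))
print(shlex.join(encrypt_cmd))

# To actually run the commands (requires gpg on the PATH):
# import subprocess
# subprocess.run(import_cmd, check=True)
# subprocess.run(encrypt_cmd, check=True)
```

Only the resulting `.gpg` file would then be uploaded over SFTP; the unencrypted source file stays on the agency's side.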
The data provided will be de-identified by applying an HMAC (Hash-based Message Authentication Code) algorithm to key variables, such as first name, middle name, last name, and social security number.
The ADRF HMAC first uses a “salt” to create an encryption key, which is then used to encrypt the value of the variable that needs to be hashed. The salt is created by generating 32 random hexadecimal digits, which are converted to integers and then hashed using SHA256.
Afterwards, the encrypted value is hashed using SHA256. The hashing is one-way and cannot be ‘decrypted’; however, a given input value will always produce the same hash value. This allows hashed values in two different tables (for example, data from different agencies) to be joined. The Coleridge Initiative will provide you with the hashing program, unit tests, an end-to-end test harness, and a sample data file used in the tests before hashing.
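The general idea of keyed hashing for record linkage can be sketched with Python's standard `hmac` and `hashlib` modules. This is not the hashing program supplied by the Coleridge Initiative; it is a minimal illustration that assumes the key is derived by hashing a 32-hex-digit salt with SHA-256, that both agencies use the same key, and that values are normalized (trimmed and upper-cased) before hashing. The function names and sample values are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_key(salt_hex: str) -> bytes:
    """Derive a 32-byte key by hashing the salt bytes with SHA-256."""
    return hashlib.sha256(bytes.fromhex(salt_hex)).digest()


def hash_value(key: bytes, value: str) -> str:
    """Return the HMAC-SHA256 of a normalized value as a hex string."""
    normalized = value.strip().upper()  # assumed normalization step
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


salt = secrets.token_hex(16)  # 16 random bytes = 32 hexadecimal digits
key = make_key(salt)

# Two hypothetical tables from different agencies, keyed by the hashed identifier.
agency_a = {hash_value(key, "123-45-6789"): "corrections record"}
agency_b = {hash_value(key, "123-45-6789"): "labor record"}

# Equal inputs give equal hashes, so the tables can be joined without raw identifiers.
matches = sorted(set(agency_a) & set(agency_b))
print(len(matches))
```

The join works only because both tables were hashed with the same key; hashing the same identifier under two different keys would produce unrelated values.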
By default, you and all Coleridge staff outlined in the agreement will have access to the data. If your data is used for teaching purposes, class participants will have access to the data for the time period of the class once they’ve signed the respective agreements. If you allow researchers to access your data you will have full control over who can access your data. Your dataset will be assigned a data steward within your organization who will be the point of contact for all access requests. Access to data will only be granted after approval from the data steward.
The ADRF is FedRAMP certified and follows an extensive security protocol. For more information, view the description of the data management plan, and you can request the technical FedRAMP documentation on all the security protocols in place.
General requirements for data stewardship are specified in Title III of the Evidence-Based Policymaking Act of 2018 (“PART D – ACCESS TO DATA FOR EVIDENCE, § 3583. Application to access data assets for developing evidence, (a) Standard Application Process”) and in the NIST Special Publication 800-53 Revision 4 standards document, “Security and Privacy Controls for Federal Information Systems and Organizations”. These documents state that agencies shall follow a defined process to ensure full transparency, guided by an agency official with statutory or operational authority for specified information and responsibility for establishing the controls for its generation, collection, processing, dissemination, and disposal.
The ADRF is designed to address the core data stewardship functionalities: meeting the information security requirements and operational responsibilities of data stewards, streamlining the data request and approval process, and monitoring and reporting on the usage of sensitive data. The initial step in implementing a data governance framework involves defining the owners or custodians of data assets within an agency, in a process called data stewardship. Processes and workflows are defined to formalize how the data will be stored, archived, backed up, and protected from mishaps, theft, or attacks. A set of standards and procedures is developed that defines how data is to be used by authorized personnel. Controls and audit procedures are put into place to ensure ongoing compliance with internal data policies and external government regulations, and to guarantee that data is used in a consistent manner across multiple applications.
The Data Stewardship module is implemented as a web portal that can be accessed by approved users. A user submits a project proposal using the Project Request workflow; the proposal includes the datasets to be used, the project members, and other information such as start and end dates. Before a request is submitted to data stewards, members of the project must sign and upload any required non-disclosure agreements (NDAs) for their requested datasets. The request is then routed to the designated data stewards for evaluation. If it is approved, ADRF staff ensure that each user has completed the required security training before activating the project. Once a project is active, the Data Stewardship module provides an additional workflow for Monitoring and Reporting. These monitoring tools give data providers visibility into how their data is being used. Currently, the ADRF platform logs all data access so that data owners can request to see how many people, on which projects, have accessed their data over a given period of time. Please refer to the official documentation for further details.
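The kind of usage report described above (how many people, on which projects, accessed a dataset in a given period) can be illustrated with a small sketch. The log schema, field names, and sample records here are hypothetical and do not reflect the ADRF's actual logging format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessEvent:
    """One logged data access (hypothetical schema)."""
    user: str
    project: str
    dataset: str
    day: date


def users_per_project(events, dataset, start, end):
    """Count distinct users per project who accessed `dataset` between start and end, inclusive."""
    users = defaultdict(set)
    for event in events:
        if event.dataset == dataset and start <= event.day <= end:
            users[event.project].add(event.user)
    return {project: len(members) for project, members in users.items()}


events = [
    AccessEvent("alice", "proj_a", "wage_records", date(2021, 1, 15)),
    AccessEvent("alice", "proj_a", "wage_records", date(2021, 2, 1)),     # repeat access, counted once
    AccessEvent("bob", "proj_a", "wage_records", date(2021, 2, 10)),
    AccessEvent("carol", "proj_b", "wage_records", date(2021, 3, 5)),
    AccessEvent("dave", "proj_b", "wage_records", date(2021, 6, 1)),      # outside the period
    AccessEvent("erin", "proj_b", "housing_records", date(2021, 2, 2)),   # different dataset
]

summary = users_per_project(events, "wage_records", date(2021, 1, 1), date(2021, 3, 31))
print(summary)
```

Counting distinct users rather than raw events keeps repeated access by the same person from inflating the report.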