
Secure Data Commons - Getting Data to SDC

Getting Data to SDC

Upload, store, and access your datasets quickly, easily, and securely.

The Secure Data Commons (SDC) enables collaborative but controlled integration and analysis of research data at the moderate sensitivity level, including personally identifiable information (PII) and confidential business information (CBI).

Within the SDC, you can upload data in near real time throughout your project. You can develop common data formats and fix issues as data starts to be generated.

As a data provider, you define the terms of access and grant/deny access to specific SDC users or groups. You can also control the type of derived data users can export or copy from the system.
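
The provider-defined access terms described above can be pictured with a small sketch. It is illustrative only and does not represent the SDC's actual implementation or API. The class name `DatasetPolicy`, its fields, the user and group names, and the rule that an explicit deny overrides any grant are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetPolicy:
    """Hypothetical access terms a data provider might define for one dataset."""
    name: str
    allowed_users: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)
    denied_users: set = field(default_factory=set)
    exportable_types: set = field(default_factory=set)

    def can_access(self, user: str, groups: set) -> bool:
        # Assumption for this sketch: an explicit deny wins over any grant.
        if user in self.denied_users:
            return False
        # Access is granted to named users or to members of an allowed group.
        return user in self.allowed_users or bool(groups & self.allowed_groups)

    def can_export(self, user: str, groups: set, derived_type: str) -> bool:
        # Export requires dataset access AND a derived-data type the provider allows.
        return self.can_access(user, groups) and derived_type in self.exportable_types


# Hypothetical dataset, users, and groups
policy = DatasetPolicy(
    name="example_dataset",
    allowed_users={"alice"},
    allowed_groups={"team-a"},
    denied_users={"bob"},
    exportable_types={"aggregate_table"},
)

alice_access = policy.can_access("alice", set())
bob_access = policy.can_access("bob", {"team-a"})
carol_access = policy.can_access("carol", {"team-a"})
carol_raw_export = policy.can_export("carol", {"team-a"}, "raw_extract")
carol_agg_export = policy.can_export("carol", {"team-a"}, "aggregate_table")
```

In this sketch, access and export are checked separately, mirroring the two controls named above: who may use the data, and which types of derived data may leave the system.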

Features That Matter

Frequent data transfers

  • Real-time (e.g., streaming) data ingestion
  • Batch (e.g., daily or weekly) data ingests
  • Ad hoc (occasional) uploads

Cloud-based data management 

  • Strong security controls on the export of derived data
  • Set levels of data curation and warehousing to support analysis
  • Validation of incoming datasets in near real time
  • Real-time data ingestion streams and batch uploads

Strong access controls

  • Multifactor authentication and personal identity verification (PIV) card integration
  • Secure workflow for data import and export
  • Controlled access to specific datasets and metadata by individual users and teams

 

Built-in data analysis tools

  • Preestablished workstations with open source tools
  • Ability to import and share code between researchers

Effective team collaboration

  • Project-level controls and teams
  • Shared team internal code repositories

Multiple data formats

  • Raw
  • Curated
  • Published
 

What's Next?

Use the steps below to bring your data into the SDC:

Contact the SDC team for a discovery meeting to discuss data restrictions, the Data Provider agreement, and user access levels for your data within the SDC. Review the Data Provider User Guide.

Once the data is in the SDC, work with the SDC team to monitor the quality and quantity of your data.

Our Enablement Services team offers custom upgrades to help support your research mission needs along the way.

Last updated October 2023

Secure Data Commons - Enablement Services Program

Enablement Services Program

The Enablement Services Program provides options for you (project owners, data providers, data analysts) and your project team to optimize use of the Secure Data Commons (SDC). Project teams start with five offerings at the baseline category. You may select upgrades (silver or gold) for each offering to meet your project needs.

Still new to the SDC? You can learn more about what the SDC is.

Program Offerings

  • Project Onboarding and Training

  • Uploading Data

  • Data Cleanup

  • Analyst Setup and Collaboration

  • Data Analysis Consulting

Service Category Overview

Every project starts with baseline services for each offering. You may wish to upgrade to silver or gold services for additional assistance.

Baseline Services ($)

Every project begins here, which includes:

  • Overview web session
  • Access to training materials
  • Help for uploading data
  • Analyst workstation setup
  • Assistance with basic queries
  • Cross-project collaboration

Silver Services ($$)

Silver services include all baseline services, plus:

  • Consultations with data analysts
  • Pre-planning report
  • Sessions about project costs
  • Data preparation for queries
  • Analytical tool installation and support
  • Database optimization

Gold Services ($$$)

With gold services, the premier option, you receive all silver services, plus:

  • On-site consultations
  • Building analytical tools
  • In-person onboarding
  • Specialized training
  • Data documentation
  • Automating data upload scripts
  • Performance boost of analytical models

 

Available Program Offerings

See more information about each program offering below.

Contact the SDC team to get a quote, upgrade or change your services. If you are an independent analyst not on a project team, email us to find out how the offerings can work for you.

Email SDC Team

Project Onboarding and Training

Offering Details                                        Baseline ($)  Silver ($$)  Gold ($$$)
Overview web session with your team                     ✓             ✓            ✓
Documentation and training materials                    ✓             ✓            ✓
Discovery sessions about business needs and costs       ✓             ✓            ✓
Project pre-planning assessment report                  -             ✓            ✓
Multiple sessions to estimate and refine project costs  -             ✓            ✓
Dedicated onboarding support                            -             -            ✓
2-day training with your team                           -             -            ✓

Uploading Data

Working with data providers to safely upload data into the SDC, regardless of the size

Get Full Offering Details (PDF 270 KB)

Offering Details                                     Baseline ($)  Silver ($$)  Gold ($$$)
Process setup, provide user guide for secure upload  ✓             ✓            ✓
Refining or development of your external scripts     -             ✓            ✓
Build metadata, dictionaries for your project        -             -            ✓
Software development for automating data upload      -             -            ✓

Data Cleanup

Support to ensure quality, performance, and reliability of your data entering the SDC

Get Full Offering Details (PDF 270 KB)

Offering Details                                               Baseline ($)  Silver ($$)  Gold ($$$)
Setting up data for easy queries                               ✓             ✓            ✓
Making complex datasets available to query                     -             ✓            ✓
Error detection and reporting during data upload               -             -            ✓
Building automated tools to fix errors for the upload process  -             -            ✓

Analyst Setup and Collaboration

Help to organize and improve your analysts' research capabilities while using the SDC

Get Full Offering Details (PDF 270 KB)

Offering Details                                               Baseline ($)  Silver ($$)  Gold ($$$)
Analyst workstations that come with standard tools             ✓             ✓            ✓
Help with installation of other approved open-source software  ✓             ✓            ✓
Setting up collaboration tools based on individual needs       -             ✓            ✓
Assistance installing new analytical tools on the SDC          -             ✓            ✓
Create custom analyst software, up to two per year             -             -            ✓

Data Analysis Consulting

Tailored advice, including technical support and resources, to advance the performance of your models and analytical outputs

Get Full Offering Details (PDF 270 KB)

Offering Details                                            Baseline ($)  Silver ($$)  Gold ($$$)
Specific guidance on using SDC infrastructure               ✓             ✓            ✓
Regular consultations to review challenges and SDC updates  -             ✓            ✓
Database optimization for your project needs                -             ✓            ✓
Reviewing and improving data scripts and queries            -             ✓            ✓
Support to boost the performance of data analyst models     -             -            ✓

Last updated November 2021

Secure Data Commons - Current Data

Featured Data

The USDOT Secure Data Commons (SDC) platform features datasets for the following projects - check back soon as new projects (datasets) are added:

Waze for Cities Program

 

Federal Highway Administration (FHWA) Highway Safety Information System (HSIS)

Federal Aviation Administration Functional Genomics Research

 

 


Waze for Cities Program

The U.S. Department of Transportation's (USDOT) Safety Data Initiative (SDI) is a cross-cutting, collaborative effort within DOT, led by the Office of the Secretary of Transportation. The intent of SDI is to build on and enhance safety efforts related to data, analysis, and policy making. SDI is integrating Waze data with transportation data to develop rapid crash indicators.

 


Federal Aviation Administration (FAA) Functional Genomics Research 

The FAA Functional Genomics team uses the SDC to perform large-scale analyses of data generated from research on biological specimens, primarily derived from human subjects research. The data generated may include genetic sequence and other molecular data, physiology, demographics, study conditions, and performance metrics. This data is not available to the public through the SDC at this time. FAA Genomics data owners will make subsets of data accessible in alternative locations, to the extent compatible with subject consent and Institutional Review Board concurrence as appropriate, after review and consideration of proper privacy and data quality elements. For instance, sequence data, when ready, may be hosted in secure-access repositories appropriate to those types of data.

 

FHWA Highway Safety Information System (HSIS)

The Federal Highway Administration (FHWA) developed the Highway Safety Information System (HSIS) to support safety research programs and provide input for program policy decisions. HSIS is a roadway-based system that provides quality data on a large number of crash, roadway, and traffic variables. The crash, roadway inventory, and traffic volume data are acquired annually from a select group of States.

FHWA provides this data to researchers upon request through the HSIS webpage. Educators who wish to use HSIS data for instructional purposes in a road safety course should contact HSIS staff directly at Ana.Eigen@dot.gov. For more information, see https://highways.dot.gov/research/safety/hsis

Last updated December 2024

Secure Data Commons - Conducting Analysis

Conducting Analysis in SDC

As a data analyst, work within the USDOT Secure Data Commons (SDC) to share code and data, upload datasets, and export approved derived analyses. Through the SDC, you can:

  • Share code and data with other analysts
  • Upload your own datasets
  • Export approved derived analysis

We'll provide you with a cloud-based workstation with preloaded programming environments and software that grants you access to the data lake and data warehouse. The workstation also includes commercially available tools - no local software or tool installation needed!

Analytical Tools and Query Languages Supported

The SDC platform provides on-demand access to popular programming and statistical tool packages for cloud-based processing (for experienced analysts). Other, nonstandard software can be installed upon request, both individually and across user groups. For software requiring special licenses, analysts may provide their own existing licenses.

Analytical Tools 

 

 

Custom options available upon request

 

Types of Datasets

The SDC platform provides a data lake of transportation-related structured, semi-structured, and unstructured datasets that are stored in raw, curated, and published formats. Each dataset has different data agreements based on the complexity and sensitivity of the data. Access to specific data is approved by data providers - learn more about specific dataset formats below:

Raw Datasets

Raw datasets are unaltered data stored in their native/original "as-is" format. Uploads can be continuous through streaming sources (e.g., APIs or sensors) or through one-time uploads from external sources. This data can be structured (databases, logs, financial data), semi-structured (HTML, XML, RDF, CSV), or unstructured (images, PDFs, Word documents). Raw data cannot be copied or exported out of the SDC.

Curated

Data curation is the organization and integration of raw data collected from various sources. The curated data is annotated, so that the value of the data is maintained and made available for reuse and preservation. During the curation process, data is transformed from unstructured and semi-structured formats to structured formats; and data deduplication, obfuscation, and cleansing processes are conducted - resulting in high-quality data that enables researchers to elicit meaningful insights.
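
The curation steps named above (cleansing, obfuscation, and deduplication) can be illustrated with a minimal sketch. It is not the SDC's curation pipeline. The field names `user_id`, `road`, and `speed_mph`, the salted-hash obfuscation scheme, and the sample records are all hypothetical and chosen only to show the idea.

```python
import hashlib


def obfuscate(value: str, salt: str = "example-salt") -> str:
    """Replace an identifier with a truncated salted SHA-256 digest (illustrative scheme)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def curate(raw_records):
    """Cleanse, obfuscate, and deduplicate a list of raw record dicts."""
    seen = set()
    curated = []
    for rec in raw_records:
        # Cleansing: trim stray whitespace from string fields.
        cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        # Cleansing: drop records missing a required field (assumed required here).
        if not cleaned.get("user_id") or not cleaned.get("road"):
            continue
        # Obfuscation: replace the direct identifier with a digest.
        cleaned["user_id"] = obfuscate(cleaned["user_id"])
        # Deduplication: skip records identical to one already kept.
        key = tuple(sorted(cleaned.items()))
        if key in seen:
            continue
        seen.add(key)
        curated.append(cleaned)
    return curated


# Hypothetical raw records: one duplicate (after trimming) and one missing an identifier
raw = [
    {"user_id": " u1 ", "road": "I-95", "speed_mph": 61},
    {"user_id": "u1", "road": "I-95 ", "speed_mph": 61},
    {"user_id": "", "road": "I-10", "speed_mph": 55},
    {"user_id": "u2", "road": "US-1", "speed_mph": 40},
]
curated = curate(raw)
print(len(curated), "curated records from", len(raw), "raw records")
```

Cleansing runs before deduplication in this sketch so that records differing only in stray whitespace are recognized as duplicates.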

Published

Researchers create published datasets to disclose their research and allow other users to verify and reuse the data beyond their original purpose. Published datasets are a result of combining analyses on curated datasets in the SDC platform with other datasets or algorithms owned or created by a researcher or data scientist.

What's Next?

As a data analyst planning to do analysis in the SDC, use the steps below to get started.

Download the access request form, fill out the required details, and email it to sdc-support@dot.gov. Once your request is approved, we will send you an email with instructions for accessing the platform.

Follow the instructions in the Welcome Email from the SDC. Review the Research Analyst User Guide.

Our Enablement Services team offers custom upgrades to help your project team along the way.

Last updated October 2021

Secure Data Commons - About

What is the SDC?

The Secure Data Commons (SDC) is a cloud-based analytics platform that enables traffic engineers, researchers, and data scientists to access, analyze, and connect transportation-related datasets. The U.S. Department of Transportation (USDOT) created the SDC to provide a secure platform for sharing and collaborating on research, tools, algorithms, and analysis involving moderate sensitivity level (PII & CBI) datasets using commercially available tools, without needing to install tools or software locally.

The SDC offers a common platform for innovative data analysis and sharing of results that allows research to cut across the data silos.

 

How It Works

With a robust security architecture that secures datasets and user workstations, the SDC:

Leverages cloud capabilities to share complex (high volume, velocity, or variety) transportation datasets with the transportation research community

Provides authorized access to users through a data use agreement with revocable access terms to protect the sensitivity of the data

Enables scalable data storage, analysis, and user access protocols via cloud-based platforms

 

Ensures that sensitive data are protected through adherence to USDOT's security standards for information technology

Provides users with predefined tools for data analysis and encourages the creation of custom toolsets, open sharing of code, and addition of datasets among the user community

Provides users with capability to curate datasets from multiple data sources

 

SDC Capabilities

What differentiates the SDC from other platforms?

SDC Capabilities are Data and Visualization, Security, Open Source Collaboration, Input that Matters, and Support Team

 

SDC Platform vs Other Cloud Platforms

The Secure Data Commons differs from other cloud platforms that are readily available to you.

SDC Platform versus DOT Managed Cloud versus Cloud.gov

 

SDC Cloud Computing Model

These cloud infrastructure components enable your research needs.

SDC Research and Analytics User Support

Provide a workstation for users to query multiple datasets from the SDC web portal

  • Bring your own data and combine it with existing datasets from Waze and other third-party sources
  • Approve access to specific datasets
  • Preload analytical tools to encourage open sharing of code and datasets

SDC Platform Engineering Services

Provide a cloud platform for research and analysis involving moderate sensitivity level datasets

  • Maintain an Authority to Operate (ATO) in compliance with security policies and regulations
  • Monitor to rapidly detect data anomalies
  • Use cutting-edge AWS managed services to support innovative analytical data solutions

DOT Enterprise Cloud: Amazon Web Services (AWS)

Provide SDC team access to cloud computing resources to enhance and maintain the platform

  • Enable networking and active directory within DOT environment
  • Allow the SDC team to provision and control its own infrastructure and managed services

 

For more information on Secure Data Commons, please see SDC Capabilities (pdf).

Last updated March 2023