Secure Data Commons - Conducting Analysis

Conducting Analysis in SDC

As a data analyst, you can use the USDOT Secure Data Commons (SDC) to:

  • Share code and data with other analysts
  • Upload your own datasets
  • Export approved derived analyses

We provide a cloud-based workstation, preloaded with programming environments and software, that gives you access to the data lake and data warehouse. The workstation also includes commercially available tools, so no local software or tool installation is needed.

Analytical Tools and Query Languages Supported

The SDC platform gives experienced analysts on-demand access to popular programming and statistical tool packages for cloud-based processing. Nonstandard software can be installed upon request, for individual users or across user groups. For software that requires a special license, analysts may provide their own existing licenses.

Analytical Tools

Custom options available upon request

Types of Datasets

The SDC platform provides a data lake of transportation-related structured, semi-structured, and unstructured datasets that are stored in raw, curated, and published formats. Each dataset has its own data agreements, based on the complexity and sensitivity of the data. Access to specific data is approved by the data providers. The dataset formats are described below.

Raw Datasets

Raw datasets are unaltered data stored in their native, "as-is" format. Uploads can be continuous through streaming sources (e.g., APIs or sensors) or one-time uploads from external sources. This data can be structured (databases, logs, financial data), semi-structured (HTML, XML, RDF, CSV), or unstructured (images, PDFs, Word documents). Raw data cannot be copied or exported out of the SDC.

Curated Datasets

Data curation is the organization and integration of raw data collected from various sources. Curated data is annotated so that its value is maintained and it remains available for reuse and preservation. During the curation process, data is transformed from unstructured and semi-structured formats to structured formats, and deduplication, obfuscation, and cleansing processes are applied. The result is high-quality data from which researchers can draw meaningful insights.
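The three curation processes named above (cleansing, deduplication, and obfuscation) can be illustrated with a minimal Python sketch. This is not the SDC's actual curation pipeline. The function name `curate`, the field names `device_id` and `speed_mph`, and the salt value are all hypothetical, and salted hashing is just one simple way to obfuscate an identifier.

```python
import hashlib


def curate(records, id_field="device_id", salt="example-salt"):
    """Cleanse, deduplicate, and obfuscate a list of raw record dicts.

    Illustrative only: field names and the salt are assumptions, and
    record values are assumed to be simple scalars (str, int, float).
    """
    seen = set()
    curated = []
    for rec in records:
        # Cleansing: trim whitespace from string values and drop
        # records that have no usable identifier.
        clean = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        if not clean.get(id_field):
            continue

        # Deduplication: skip records identical to one already kept.
        key = tuple(sorted(clean.items()))
        if key in seen:
            continue
        seen.add(key)

        # Obfuscation: replace the identifier with a salted hash so the
        # original value does not appear in the curated output.
        digest = hashlib.sha256((salt + str(clean[id_field])).encode("utf-8"))
        clean[id_field] = digest.hexdigest()[:12]
        curated.append(clean)
    return curated
```

Calling `curate` on a list of dictionaries returns a new list in which duplicate and incomplete records are removed and each identifier is replaced by a 12-character hash prefix. Real curation would also cover format conversion and annotation, which this sketch leaves out.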

Published Datasets

Researchers create published datasets to disclose their research and allow other users to verify and reuse the data beyond their original purpose. Published datasets are a result of combining analyses on curated datasets in the SDC platform with other datasets or algorithms owned or created by a researcher or data scientist.

What's Next?

As a data analyst planning to do analysis in the SDC, use the steps below to get started.

Download the access request form, fill out the required details, and email it to sdc-support@dot.gov. Once your request is approved, we will email you instructions for accessing the platform.

Follow the instructions in the Welcome Email from the SDC. Review the Research Analyst User Guide.

Our Enablement Services team offers custom upgrades to help your project team along the way.

Last updated October 2021

Secure Data Commons - About

What is the SDC?

The Secure Data Commons (SDC) is a cloud-based analytics platform that enables traffic engineers, researchers, and data scientists to access, analyze, and connect transportation-related datasets. The U.S. Department of Transportation (USDOT) created the SDC as a secure platform for sharing and collaborating on research, tools, algorithms, and analysis involving moderate-sensitivity datasets (those containing PII and CBI). Users work with commercially available tools, without needing to install tools or software locally.

The SDC offers a common platform for innovative data analysis and sharing of results, allowing research to cut across data silos.

 

How It Works

With a robust security architecture that secures datasets and user workstations, the SDC:

  • Leverages cloud capabilities to share complex (high volume, velocity, or variety) transportation datasets with the transportation research community
  • Provides authorized access to users through a data use agreement with revocable access terms to protect the sensitivity of the data
  • Enables scalable data storage, analysis, and user access protocols via cloud-based platforms
  • Ensures that sensitive data are protected through adherence to USDOT's security standards for information technology
  • Provides users with predefined tools for data analysis and encourages the creation of custom toolsets, open sharing of code, and addition of datasets among the user community
  • Provides users with the capability to curate datasets from multiple data sources

 

SDC Capabilities

What differentiates the SDC from other platforms?

SDC capabilities are Data and Visualization, Security, Open Source Collaboration, Input that Matters, and Support Team.

 

SDC Platform vs Other Cloud Platforms

The Secure Data Commons differs from other cloud platforms that are readily available to you.

SDC Platform versus DOT Managed Cloud versus Cloud.gov

 

SDC Cloud Computing Model

These cloud infrastructure components support your research needs.

SDC Research and Analytics User Support

Provide a workstation for users to query multiple datasets from the SDC web portal

  • Bring your own data and combine it with existing datasets from Waze and other third-party sources
  • Approve access to specific datasets
  • Preload analytical tools to encourage open sharing of code and datasets

SDC Platform Engineering Services

Provide a cloud platform for research and analysis involving moderate sensitivity level datasets

  • Maintain Authority to Operate (ATO) in compliance with security policies and regulations
  • Monitor to rapidly detect data anomalies
  • Use cutting-edge AWS managed services to support innovative analytical data solutions

DOT Enterprise Cloud: Amazon Web Services (AWS)

Provide SDC team access to cloud computing resources to enhance and maintain the platform

  • Enable networking and active directory within DOT environment
  • Allow the SDC team to provision and control its own infrastructure and managed services

 

For more information on Secure Data Commons, please see SDC Capabilities (pdf).

Last updated March 2023
