Architecting Synthetic Data Pipelines for Enterprise AI

Hardware sensors fail manufacturers in edge cases, but AI systems can’t step in without sufficient training data. However, those same edge cases make effective training sets hard to build. This is where synthetic data can help.

I was the Lead Designer on the team incubating Synthetic Data at Microsoft.

Impacts:

Co-designed a high-quality synthetic data pipeline that was used by 5 B2B customers and drove millions of dollars in contracts
Created tooling used to augment real-world datasets in defect and equipment-status scenarios
Reduced training time and increased model reliability in factory-floor computer vision applications

Responsibilities:

Collaborated closely with Product Managers and customers to define the core problem and establish a focused direction for the project
Worked closely with engineers to map out a robust and efficient pipeline
Collaborated with researchers to define detailed personas with clear requirements
Mapped the ideal user journey to ensure a seamless, intuitive user experience and to identify potential pain points ahead of testing
Tested multiple prototypes to iterate and refine the product
Managed UI designers through iterative loops
Identified and clarified misalignments within the team and worked to resolve them
Worked closely with the launch team to ensure a seamless handoff from development to launch