Braden Hancock
Ph.D. Candidate - Machine Learning
I'm a third-year Computer Science Ph.D. student and NSF Fellow at Stanford University. My goal is to make it possible for anyone, regardless of programming ability or machine learning expertise, to create state-of-the-art machine learning systems in new domains in hours instead of months.
Research interests:
machine learning, weak supervision, information extraction, program synthesis

Education

Stanford University
Ph.D. Computer Science (Jun. 2020)
Advisor: Chris Ré
Machine Learning Emphasis (GPA 4.00)
Brigham Young University
B.S. Mechanical Engineering, Mathematics Minor
Advisor: Chris Mattson
Valedictorian, summa cum laude (GPA 4.00)

Experience

Stanford University
2015-Present
Groups: Stanford DAWN, Stanford NLP Group, Stanford InfoLab
Mentors: Chris Ré, Percy Liang
Topics: Weak supervision, natural language supervision, information extraction
Google
Summer 2017
Groups: Google Brain, Google Search
Mentors: Hongrae Lee, Cong Yu, Quoc Le
Topics: Abstractive summarization of semi-structured content, recursive neural networks
MIT Lincoln Laboratory
Summers 2014-2015
Group: Computing & Analytics
Mentors: Vijay Gadepally, Jeremy Kepner
Topics: Recommender systems for Department of Defense applications, cryptography
Johns Hopkins University
Summer 2013
Group: Human Language Technology Center of Excellence
Mentors: Mark Dredze, Glen Coppersmith
Topics: Public health trend extraction from social media, topic modeling
Brigham Young University
2011-2015
Group: Design Exploration Research Group
Mentor: Chris Mattson
Topics: Multi-objective optimization, design space exploration
Air Force Research Laboratory
Summer 2011
Group: Turbine Engine Division
Mentor: John Clark
Topics: Evolutionary algorithms for optimization, turbine engine simulation

Research

Babble Labble: Learning from Natural Language Explanations
We show that by collecting explanations for why annotators give the labels they do and parsing these into executable functions, we can generate large "good enough" training sets from unlabeled data and improve downstream discriminative model performance.
In progress
Fonduer: Knowledge Base Construction from Richly Formatted Data
We introduce an information extraction framework that utilizes multiple representations of the data (structural, tabular, visual, and textual) to achieve state-of-the-art performance in four real-world extraction tasks. Our framework is currently in use commercially at Alibaba and with law enforcement agencies fighting online human trafficking.
arXiv pre-print 2017
Snorkel: A System for Fast Training Data Creation
Snorkel is a system for rapidly creating, modeling, and managing training data. It is the flagship implementation of the new data programming paradigm for supporting weak supervision resources. Development is ongoing, with collaborators and active users at over a dozen major technical and medical organizations. As one of the core contributors to Snorkel, I have implemented many of my other research products as extensions to this framework.
A Machine-Compiled Database of Genome-Wide Association Studies
Using the multi-modal parsing and extraction tools from Fonduer and learning and inference tools from Snorkel, we construct a knowledge base of genotype/phenotype associations extracted from the text and tables in ~600 open-access papers from PubMed Central. Our system expands existing manually curated databases by approximately 20% with 92% precision.
Bio-Ontologies 2017
Collective Supervision of Topic Models for Predicting Surveys with Social Media
We use topic models to correlate social media messages with survey outcomes and to provide an interpretable representation of the data. Rather than rely on fully unsupervised topic models, we use existing aggregated survey data to inform the inferred topics, a class of topic model supervision referred to as collective supervision.
AAAI 2016
Recommender Systems for the Department of Defense and Intelligence Community
With an internal committee of 20 MIT and DoD researchers, I spearheaded the construction of this report, which formalizes the components and complexities of recommender systems and surveys their existing and potential uses in the Department of Defense and U.S. Intelligence Community.
MITLL Journal 2016
L-dominance: An approximate-domination mechanism for adaptive resolution of Pareto frontiers
We propose a mechanism called L-dominance (based on the Lamé curve) which promotes adaptive resolution of solutions on the Pareto frontier for evolutionary multi-objective optimization algorithms.
SMO Journal, AIAA ASM 2015, Honors Thesis
Best Student Paper
Reducing Shock Interactions in a High Pressure Turbine via 3D Aerodynamic Shaping
We show that the shock wave reflections inside a turbine engine can be approximated by calculating the 3D surface normal projections of the airfoils. Using a genetic algorithm, we produce superior airfoil geometries (with respect to high cycle fatigue failure) four orders of magnitude faster than the traditional CFD-based approach.
AIAA Journal, AIAA ASM 2014
Best Student Paper
The Smart Normal Constraint Method for Directly Generating a Smart Pareto Set
We introduce the Smart Normal Constraint (SNC) method, the first method capable of directly generating a smart Pareto set (a Pareto set in which the density of solutions varies such that regions of significant tradeoff have the greatest resolution). This is accomplished by iteratively updating an approximation of the design space geometry, which is used to guide subsequent searches in the design space.
SMO Journal, AIAA MDO 2013
Usage Scenarios for Design Space Exploration with a Dynamic Multiobjective Optimization Formulation
We investigate three usage scenarios for formulation space exploration, building on previous work that introduced a new way to formulate multi-objective problems, allowing a designer to change or update design objectives, constraints, and variables in a fluid manner that promotes exploration.
RiED Journal, ASME DETC 2012
Best Paper

Last updated on 24 Aug 2017.