Date

Speaker

Title

27 Sep 2016 
Le Song (Gatech) 
Discriminative Embedding of Latent Variable Models for Structured Data
Time:
4:00-5:00pm
Abstract:
Structured data, such as sequences, trees, graphs and hypergraphs, are prevalent in interdisciplinary areas such as network analysis, knowledge engineering, computational biology, drug design and materials science. The availability of large amounts of such structured data has posed great challenges for the machine learning community. How do we represent such data to capture their similarities or differences? How do we learn predictive models efficiently from large amounts of such data? How do we learn to generate structured data de novo with certain desired properties?
A common approach to tackling these challenges is to first design a similarity measure between two data points, called the kernel function, based either on statistics of the substructures or on probabilistic generative models; a machine learning algorithm then optimizes a predictive model based on this similarity measure. However, this elegant two-stage approach has difficulty scaling up, and discriminative information is not exploited during the design of the similarity measure.
In this talk, I will present Structure2Vec, an effective and scalable approach for representing structured data based on the idea of embedding latent variable models into a feature space, and learning this feature space using discriminative information. Interestingly, Structure2Vec extracts features by performing a sequence of nested nonlinear operations in a way similar to graphical model inference procedures, such as mean field and belief propagation. In applications involving genome and protein sequences, drug molecules and energy materials, Structure2Vec consistently produces state-of-the-art predictive performance. Furthermore, on a materials property prediction problem involving 2.3 million data points, Structure2Vec produces a more accurate model that is also 10,000 times smaller. In the end, I will also discuss potential improvements over the current work, possible extensions to network analysis and computer vision, and thoughts on the structured data design problem.
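As a rough illustration of the embedding idea (not the speaker's actual implementation), the sketch below runs a fixed number of mean-field-style rounds: each node's embedding is a nonlinear function of its own features and an aggregate of its neighbors' current embeddings, and the node embeddings are pooled into a graph-level feature vector. The function names, dimensions and the tanh nonlinearity are illustrative assumptions.

```python
import numpy as np

def embed_graph(adj, node_feats, W1, W2, T=3):
    """Iteratively update node embeddings by aggregating neighbor
    messages, in the spirit of mean-field inference on a latent
    variable model attached to the graph structure."""
    n = adj.shape[0]
    d = W1.shape[0]
    mu = np.zeros((n, d))              # latent embedding per node
    for _ in range(T):
        # each node combines its own features with the sum of its
        # neighbors' current embeddings, followed by a nonlinearity
        agg = adj @ mu                 # neighbor aggregation
        mu = np.tanh(node_feats @ W1.T + agg @ W2.T)
    return mu.sum(axis=0)              # pool nodes into a graph embedding
```

A downstream classifier or regressor would then be trained on these graph-level vectors, with the weights W1, W2 learned end-to-end from discriminative signal.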
BIO:
Le Song is an assistant professor in the Department of Computational Science and Engineering, College of Computing, Georgia Institute of Technology. He received his Ph.D. in Machine Learning from the University of Sydney and NICTA in 2008, and then conducted postdoctoral research in the Department of Machine Learning, Carnegie Mellon University, between 2008 and 2011. Before joining Georgia Institute of Technology, he was a research scientist at Google. His principal research direction is machine learning, especially kernel methods and probabilistic graphical models for large-scale and complex problems arising from artificial intelligence, network analysis, computational biology and other interdisciplinary domains. He is the recipient of the AISTATS'16 Best Student Paper Award, the IPDPS'15 Best Paper Award, the NSF CAREER Award in 2014, the NIPS'13 Outstanding Paper Award, and the ICML'10 Best Paper Award. He has also served as an area chair or senior program committee member for many leading machine learning and AI conferences, such as ICML, NIPS, AISTATS and AAAI, and as an action editor for JMLR.

06 Oct 2016 
Rong Ge (Duke) 
Avoid Spurious Local Optima: Homotopy Method for Tensor PCA
Time:
4:00-5:00pm
Abstract:
Recently, several nonconvex problems, such as tensor decomposition, phase retrieval and matrix completion, have been shown to have no spurious local minima, which allows them to be solved by very simple local search algorithms. However, more complicated nonconvex problems, such as tensor PCA, do have local optima that are not global, and previous results rely on techniques inspired by the Sum-of-Squares hierarchy. In this work we show that the commonly applied homotopy method, which tries to solve the optimization problem by considering different levels of "smoothing", can be applied to tensor PCA and achieves guarantees similar to the best known Sum-of-Squares algorithms. This is one of the first settings where local search algorithms are guaranteed to avoid spurious local optima even in high dimensions.
This is based on joint work with Yuan Deng (Duke University).
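To convey the homotopy idea on a toy problem (a 1-D quartic rather than tensor PCA, and not the paper's algorithm), the sketch below minimizes f(x) = x^4 - 3x^2 + x, which has a spurious local minimum near x ≈ 1.13 alongside the global minimum near x ≈ -1.30. Because f is a polynomial, its Gaussian smoothing E[f(x + σz)] has a closed-form gradient, so we can run gradient descent on a sequence of decreasingly smoothed objectives, warm-starting each stage from the previous one; the schedule and step size are illustrative choices.

```python
def grad(x, sigma):
    # Gradient of the smoothed objective f_sigma(x) = E[f(x + sigma*z)]
    # with f(x) = x^4 - 3x^2 + x and z ~ N(0, 1).  For a quartic,
    # Gaussian smoothing just shifts the quadratic coefficient.
    return 4 * x**3 + (12 * sigma**2 - 6) * x + 1

def homotopy_descent(x0, sigmas=(2.0, 1.0, 0.5, 0.0), steps=2000, lr=0.01):
    """Solve a sequence of smoothed problems, annealing sigma to 0,
    warm-starting each stage from the previous solution."""
    x = x0
    for sigma in sigmas:
        for _ in range(steps):
            x -= lr * grad(x, sigma)
    return x
```

Plain gradient descent from x0 = 1.5 falls into the spurious local minimum, while the annealed sequence tracks the minimizer of the heavily smoothed (convex) objective down to the global minimum.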
BIO:
Rong Ge is an assistant professor in the Computer Science Department at Duke University. He received his Ph.D. from Princeton University and was a postdoc at Microsoft Research New England before joining Duke. Rong Ge is broadly interested in theoretical computer science and machine learning. His research focuses on designing algorithms with provable guarantees for machine learning problems, with applications to topic models, sparse coding and computational biology.

27 Oct 2016

Sewoong Oh (UIUC) 
Fundamental Limits and Efficient Algorithms in Adaptive Crowdsourcing
Time:
4:00-5:00pm
Abstract:
Adaptive schemes, where tasks are assigned based on the data collected thus far, are
widely used in practical crowdsourcing systems to efficiently allocate the budget.
However, existing theoretical analyses of crowdsourcing systems suggest that the
gain of adaptive task assignments is minimal. To bridge this gap,
we propose a new model for representing practical crowdsourcing systems,
which strictly generalizes the popular Dawid-Skene model, and
characterize the fundamental tradeoff between budget and accuracy.
We introduce a novel adaptive scheme that matches this fundamental limit.
Our analysis develops new techniques for the spectral analysis of
non-backtracking operators, using density evolution techniques from coding theory.
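For intuition about the (non-adaptive, binary, one-coin) Dawid-Skene model mentioned above, here is a standard EM-style sketch, not the speaker's spectral method: alternate between estimating task labels by a reliability-weighted vote and re-estimating each worker's reliability from agreement with those labels. The data format, initialization and clamping constants are illustrative assumptions.

```python
import math

def em_dawid_skene(labels, iters=20):
    """labels[i] = dict {worker: answer in {-1, +1}} for task i.
    One-coin Dawid-Skene: worker w answers correctly w.p. p[w].
    Alternate between label estimates and reliability estimates."""
    workers = {w for row in labels for w in row}
    p = {w: 0.7 for w in workers}            # initial reliability guess
    est = []
    for _ in range(iters):
        # E-step: log-odds-weighted vote for each task's true label
        est = []
        for row in labels:
            s = sum(math.log(p[w] / (1 - p[w])) * a for w, a in row.items())
            est.append(1 if s >= 0 else -1)
        # M-step: reliability = fraction of answers agreeing with estimates
        for w in workers:
            agree = [row[w] == t for row, t in zip(labels, est) if w in row]
            p[w] = min(max(sum(agree) / len(agree), 0.05), 0.95)  # clamp
    return est, p
```

With enough answers per task this downweights unreliable workers automatically, so even an always-wrong worker's answers become informative once the algorithm learns to flip them.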
BIO:
Sewoong Oh is an Assistant Professor of Industrial and Enterprise Systems Engineering at UIUC. He received his PhD from the Department of Electrical Engineering at Stanford University. Following his PhD, he worked as a postdoctoral researcher at the Laboratory for Information and Decision Systems (LIDS) at MIT. He was co-awarded the Kenneth C. Sevcik Outstanding Student Paper Award at Sigmetrics 2010, the Best Paper Award at SIGMETRICS 2015, and the NSF CAREER Award in 2016.

08 Nov 2016
Location: RTH 526

Robert Nowak (UW–Madison) 
Learning Human Preferences and Perceptions From Data
Time:
11:00am-12:00pm
Abstract:
Modeling human perception has many applications in cognitive, social, and educational science, as well as in advertising and commerce. This talk discusses theory and methods for learning rankings and embeddings that represent perceptions from datasets of human judgments, such as ratings or comparisons. I will briefly describe an ongoing large-scale experiment with the New Yorker magazine that deals with ranking cartoon captions using our nextml.org system. Then I will discuss our recent work on ordinal embedding, also known as non-metric multidimensional scaling, which is the problem of representing items (e.g., images) as points in a low-dimensional Euclidean space given constraints of the form "item i is closer to item j than to item k." In other words, the goal is to find a geometric representation of data that is faithful to comparative similarity judgments. This classic problem is often used to gauge and visualize perceptual similarities. A variety of algorithms exist for learning metric embeddings from comparison data, but the accuracy and performance of these methods have been poorly understood. I will present a new theoretical framework that quantifies the accuracy of learned embeddings and indicates how many comparisons suffice as a function of the number of items and the dimension of the embedding. Furthermore, the theory points to new algorithms that outperform previously proposed methods. I will also describe a few applications of ordinal embedding.
This is joint work with Lalit Jain and Kevin Jamieson.
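A minimal sketch of ordinal embedding from triplet comparisons (illustrative only; not the speaker's algorithms or the nextml.org system): stochastic gradient descent on a hinge loss that penalizes any triplet (i, j, k) whose learned squared distances violate "i is closer to j than to k" by less than a margin. The learning rate, margin, and initialization scale are assumptions.

```python
import numpy as np

def ordinal_embed(n, triplets, d=2, lr=0.05, margin=0.1, epochs=300, seed=0):
    """Learn points X in R^d from triplets (i, j, k) meaning
    'item i is closer to item j than to item k'."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d)) * 0.1    # small random initialization
    for _ in range(epochs):
        for i, j, k in triplets:
            dij = X[i] - X[j]
            dik = X[i] - X[k]
            # hinge active: pull i toward j, push i and k apart
            if dij @ dij - dik @ dik + margin > 0:
                X[i] -= lr * 2 * (dij - dik)
                X[j] += lr * 2 * dij
                X[k] -= lr * 2 * dik
    return X
```

The learned configuration is only identifiable up to rotation, translation and scale, so success is measured by the fraction of comparisons the embedding reproduces rather than by the coordinates themselves.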
BIO:
Rob is the McFarland-Bascom Professor in Engineering at the University of Wisconsin-Madison, where his research focuses on signal processing, machine learning, optimization, and statistics. The BeerMapper and NEXT systems are recent applications of his research. Rob is a professor in Electrical and Computer Engineering, and is also affiliated with the departments of Computer Sciences, Statistics, and Biomedical Engineering at the University of Wisconsin. He is a Fellow of the IEEE and the Wisconsin Institute for Discovery, a member of the Wisconsin Optimization Research Consortium and Machine Learning @ Wisconsin, and organizer of the SILO seminar series. Rob is also an Adjunct Professor at the Toyota Technological Institute at Chicago.

14 Nov 2016
Location: SGM 123

Hal Daumé III (UMD) 
Learning Language through Interaction
Time:
12:00-1:00pm
Abstract:
Machine learning-based natural language processing systems are
amazingly effective when plentiful labeled training data exists for the
task/domain of interest. Unfortunately, for broad coverage (both in task
and domain) language understanding, we're unlikely to ever have
sufficient labeled data, and systems must find some other way to learn.
I'll describe a novel algorithm for learning from interactions, and
several problems of interest, most notably machine simultaneous
interpretation (translation while someone is still speaking). This is
all joint work with some amazing (former) students He He, Alvin Grissom
II, John Morgan, Mohit Iyyer, Sudha Rao and Leonardo Claudino, as well
as colleagues Jordan Boyd-Graber, Kai-Wei Chang, John Langford, Akshay
Krishnamurthy, Alekh Agarwal, Stéphane Ross, Alina Beygelzimer and Paul
Mineiro.
BIO: Hal Daumé III is an associate professor in Computer Science at the University of Maryland,
College Park. He holds joint appointments in UMIACS and Linguistics.
He was previously an assistant professor in the School of Computing
at the University of Utah. His primary research interest is in developing
new learning algorithms for prototypical problems that arise in the context
of language processing and artificial intelligence. This includes topics
like structured prediction, domain adaptation and unsupervised learning;
as well as multilingual modeling and affect analysis.
He associates himself most with conferences like ACL, ICML, NIPS and EMNLP. He earned his PhD at the University of Southern California with a thesis on structured prediction for language (his advisor was Daniel Marcu). He spent the summer of 2003 working with Eric Brill in the machine learning and applied statistics group at Microsoft Research. Prior to that, he studied math (mostly logic) at Carnegie Mellon University.

17 Nov 2016

Arindam Banerjee (UMN) 
Learning with Low Samples in High Dimensions: Estimators, Geometry, and Applications
Time:
4:00-5:00pm
Abstract:
Many machine learning problems, especially scientific problems in
areas such as ecology, climate science, and brain sciences, operate
in the so-called 'low samples, high dimensions' regime. Such
problems typically have numerous possible predictors or features,
but the number of training examples is small, often much smaller
than the number of features. In this talk, we will discuss recent
advances in general formulations and estimators for such problems.
These formulations generalize prior work such as the Lasso and the
Dantzig selector. We will discuss the geometry underlying such
formulations, and how the geometry helps in establishing finite
sample properties of the estimators. We will also discuss
applications of such results in structure learning in probabilistic
graphical models, along with real world applications in ecology and
climate science.
This is joint work with Soumyadeep Chatterjee, Sheng Chen, Farideh
Fazayeli, Andre Goncalves, Jens Kattge, Igor Melnyk, Peter Reich,
Franziska Schrodt, Hanhuai Shan, and Vidyashankar Sivakumar.
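As a concrete baseline for the 'low samples, high dimensions' regime discussed above, here is a standard ISTA (proximal gradient) solver for the Lasso, one of the estimators that the talk's formulations generalize; the step size and iteration count are illustrative choices, not the speaker's estimators.

```python
import numpy as np

def lasso_ista(X, y, lam, steps=1000):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by proximal
    gradient descent (ISTA) with soft-thresholding."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n      # Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(steps):
        g = X.T @ (X @ b - y) / n          # gradient of the smooth part
        z = b - g / L
        # soft-thresholding = prox of the l1 penalty
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return b
```

Even with p twice as large as n, the l1 penalty recovers a sparse coefficient vector when the true signal is sparse, which is exactly the finite-sample phenomenon the geometric analysis in the talk explains.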
BIO:
Arindam Banerjee is an Associate Professor in the Department of
Computer Science & Engineering and a Resident Fellow at the Institute on
the Environment at the University of Minnesota, Twin Cities. His
research interests are in statistical machine learning and data
mining, and applications in complex real-world problems including
climate science, ecology, recommendation systems, text analysis,
brain sciences, finance, and aviation safety. He has won several
awards, including the Adobe Research Award (2016), the IBM Faculty
Award (2013), the NSF CAREER award (2010), and six Best Paper
awards in top-tier conferences.

29 Nov 2016

Richard Samworth (U. Cambridge) 
High-dimensional changepoint estimation via sparse projection
Time:
4:00-5:00pm
Abstract:
Changepoints are a very common feature of Big Data that arrive in the form of a data stream. We study high-dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the coordinates. The challenge is to borrow strength across the coordinates in order to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called 'inspect' for estimation of the changepoints: first, we argue that a good projection direction can be obtained as the leading left singular vector of the matrix that solves a convex optimisation problem derived from the CUSUM transformation of the time series. We then apply an existing univariate changepoint detection algorithm to the projected series. Our theory provides strong guarantees on both the number of estimated changepoints and the rates of convergence of their locations, and our numerical studies validate its highly competitive empirical performance for a wide range of data generating mechanisms.
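To make the CUSUM-plus-projection pipeline concrete, here is a simplified sketch: it takes the leading left singular vector of the raw CUSUM matrix as the projection direction (a simpler stand-in for the direction that 'inspect' obtains from its convex optimisation problem), and then locates the maximum absolute CUSUM of the projected series.

```python
import numpy as np

def cusum(x):
    """CUSUM transformation of a 1-D series: entry t-1 measures the
    normalized difference in means before and after time t."""
    n = len(x)
    s = np.cumsum(x)
    t = np.arange(1, n)
    return np.sqrt(t * (n - t) / n) * (s[t - 1] / t - (s[-1] - s[t - 1]) / (n - t))

def estimate_changepoint(X):
    """X: p x n panel with a mean shift in a subset of the p coordinates.
    Project onto the leading left singular vector of the row-wise CUSUM
    matrix, then run univariate CUSUM on the projected series."""
    C = np.vstack([cusum(row) for row in X])          # p x (n-1) CUSUM matrix
    U, _, _ = np.linalg.svd(C, full_matrices=False)
    v = U[:, 0]                                       # projection direction
    proj = v @ X                                      # 1-D projected series
    return int(np.argmax(np.abs(cusum(proj)))) + 1    # estimated changepoint
```

The projection concentrates the signal spread across the sparse set of shifted coordinates into a single series, which is what allows smaller per-coordinate changes to be detected than any individual component would reveal.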
BIO:
Richard Samworth is Professor of Statistics in the Statistical Laboratory at the
University of Cambridge, and currently holds a GBP 1.2M Engineering and Physical
Sciences Research Council Early Career Fellowship. He received his PhD in Statistics,
also from the University of Cambridge, in 2004. Richard's main research interests
are in nonparametric and high-dimensional statistical inference. Particular
research topics include shape-constrained density and other nonparametric
function estimation problems, nonparametric classification, clustering
and regression, Independent Component Analysis, bagging, and high-dimensional variable selection problems. Richard was awarded the Royal Statistical Society (RSS) Research Prize (2008), the RSS Guy Medal in Bronze (2012) and a Philip Leverhulme Prize (2014). He has been elected a Fellow of the Institute of Mathematical Statistics (2014) and the American Statistical Association (2015).
