Keynote Speakers

Richard Szeliski

Affiliate Professor, University of Washington
Director of the Computational Photography Group at Facebook

About: Richard Szeliski led the Interactive Visual Media Group at Microsoft Research for 20 years. In late October 2015, he joined Facebook as a founding member of the Computational Photography group, and he also currently holds an Affiliate Professor appointment at the University of Washington. He has been affiliated with Digital Equipment Corporation’s Cambridge Research Laboratory, SRI International’s AI Center (Perception Program), and the Carnegie Mellon Vision and Autonomous Systems Center. In February 2015, he was elected to the National Academy of Engineering. He is also the author of the book Computer Vision: Algorithms and Applications, published in 2010 by Springer. His research interests are in using vision to automatically build 3-D models from images, computational photography, and image-based rendering. He has worked on both traditional 3-D volumetric and surface model reconstruction and on high-resolution image mosaic construction. He has additional research interests in geometric modeling, motion estimation, multiresolution algorithms and representations, and optimization algorithms.

Title: 3D Reconstruction for Image-Based Rendering

Abstract: The reconstruction of 3D scenes and their appearance from imagery is one of the longest-standing problems in computer vision. Originally developed to support robotics and artificial intelligence applications, it has found some of its most widespread use in the support of interactive 3D scene visualization. One of the keys to this success has been the melding of 3D geometric and photometric reconstruction with a heavy re-use of the original imagery, which produces more realistic rendering than a pure 3D model-driven approach. In this talk, I give a retrospective of two decades of research in this area, touching on topics such as sparse and dense 3D reconstruction, the fundamental concepts in image-based rendering and computational photography, applications to virtual reality, as well as ongoing research in the areas of layered decompositions and 3D-enabled video stabilization.


Tamara Berg

CEO, Shopagon Inc.
Associate Professor, UNC Chapel Hill

About: Tamara Berg received her B.S. in Mathematics and Computer Science from the University of Wisconsin, Madison in 2001. She then completed a PhD at the University of California, Berkeley in 2007 and spent a year as a research scientist at Yahoo! Research. From 2008 to 2013, Tamara was an Assistant Professor in the computer science department at Stony Brook University and a core member of the consortium for Digital Art, Culture, and Technology (cDACT). In 2013, Tamara joined the University of North Carolina at Chapel Hill as an Assistant Professor and was promoted to tenured Associate Professor in 2015. She is also co-founder of Shopagon Inc., a start-up in the computer vision and retail space that uses artificial intelligence algorithms to personalize the online clothing shopping experience. Tamara is a recipient of the NSF CAREER Award, the 2013 Marr Prize, and the 2016 Hettleman Award. Her research straddles the boundary between Computer Vision and Natural Language Processing, with applications to large-scale recognition, retrieval, fashion, and social network analysis.

Title: Image Description & Beyond…

Abstract: Much of everyday language and discourse concerns the visual world around us, making understanding the relationship between the physical world and language describing that world an important challenge problem for AI. Comprehending the complex and subtle interplay between the visual and linguistic domains will have broad applicability toward inferring human-like understanding of images, producing natural human-robot interactions, and grounding natural language. In computer vision, along with improvements in deep learning based visual recognition, there has been an explosion of recent interest in methods to automatically generate natural language outputs for images and videos. In this talk I will describe our group’s efforts to understand and produce relevant natural language about images, from developing early methods to generate complete and human-like image descriptions, to modeling how people interpret and describe image content, to moving beyond general image descriptions toward more focused natural language, such as referring expressions and question-answering.


Marc Pollefeys

Professor of Computer Science, ETH Zurich
Director of Science (Advanced Perception for HoloLens), Microsoft

About: Marc Pollefeys is best known for his work in 3D computer vision, having been the first to develop a software pipeline to automatically turn photographs into 3D models, but he also works on robotics, graphics, and machine learning problems. Other noteworthy projects he has worked on with collaborators at UNC Chapel Hill and ETH Zurich include real-time 3D scanning with mobile devices, a real-time pipeline for 3D reconstruction of cities from vehicle-mounted cameras, camera-based self-driving cars, and the first fully autonomous vision-based drone. Most recently, his academic research has focused on combining 3D reconstruction with semantic scene understanding. He has published over 250 peer-reviewed publications and holds several patents. His lab at ETH Zurich also developed the PixHawk autopilot, which can be found in over half a million drones, and he has co-founded several computer vision start-ups.

Title: Computer Vision for Mixed Reality

Abstract: This is a golden age for computer vision. Research breakthroughs are leaving the lab and getting into users’ hands in record time. Computer vision now plays a pivotal role in many advances benefiting society, such as autonomous vehicles, improved biometric security, and medical imaging. But out of all these innovations, one stands out as having the potential to completely upend how we access information and communicate with each other: mixed reality. Spurred by recent developments in SLAM, 3D reconstruction, gesture recognition, scene understanding, and power-efficient embedded computing, we’re already experiencing it in the form of groundbreaking products like Microsoft HoloLens. In this talk, I will present some of the key computer vision components that are essential for enabling compelling mixed reality experiences on HoloLens and also discuss some of the unique features that HoloLens offers as an experimental platform for computer vision researchers.