Monday June 8
8:30am-8:40am | Ballrooms A,B,C | Opening Remarks from Conference Chairs |
8:40am-10:10am | Ballrooms A,B,C and Rooms 302,304,306 | Oral Sessions |
10:10am-12:30pm | Exhibit Hall A | Poster Session 1A |
12:30pm-2:00pm | Exhibit Hall B | Lunch |
2:00pm-3:30pm | Ballrooms A,B,C and Rooms 302,304,306 | Oral Sessions |
3:30pm-6:00pm | Exhibit Hall A | Poster Session 1B |
6:00pm-7:30pm | Ballrooms A,B,C | Reception & Awards |
7:30pm-8:30pm | Rooms 302,304,306 | PAMI Technical Committee/Computer Vision Foundation Meeting |
Tuesday June 9
8:30am-10:00am | Ballrooms A,B,C and Rooms 302,304,306 | Oral Sessions |
10:00am-12:30pm | Exhibit Hall A | Poster Session 2A |
12:30pm-2:00pm | Exhibit Hall B | Lunch |
2:00pm-3:30pm | Ballrooms A,B,C and Rooms 302,304,306 | Oral Sessions |
3:30pm-6:00pm | Exhibit Hall A | Poster Session 2B |
6:00pm-9:00pm | Sheraton Grand Ballroom | Banquet Dinner |
Wednesday June 10
8:30am-10:00am | | Oral Session |
10:30am-11:25am | Ballrooms A,B,C | Plenary Speaker: |
11:30am-12:25pm | Ballrooms A,B,C | Plenary Speaker: |
12:30pm-2:00pm | Exhibit Hall B | Lunch |
2:00pm-3:30pm | | Oral Session |
3:30pm-6:00pm | Exhibit Hall A | Poster Session 3B |
Monday June 8, 8:40am-10:10am
CNN Architectures | Depth and 3D Surfaces |
---|---|
Ballrooms A,B,C | Rooms 302,304,306 |
Hypercolumns for Object Segmentation and Fine-Grained Localization | DynamicFusion: Reconstruction and Tracking of Non-Rigid Scenes in Real-Time |
Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection | 3D Scanning Deformable Objects With a Single RGBD Sensor |
Improving Object Detection With Deep Convolutional Networks via Bayesian Optimization and Structured Prediction | An Efficient Volumetric Framework for Shape Tracking |
Going Deeper With Convolutions | Part-Based Modelling of Compound Scenes From Images |
Understanding Image Representations by Measuring Their Equivariance and Equivalence | SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite |
Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images | Small-Variance Nonparametric Clustering on the Hypersphere |
Monday June 8, 10:10am-12:30pm
Poster Session 1A, Exhibit Hall A
Poster # | Title |
---|---|
1 | Going Deeper With Convolutions |
2 | Propagated Image Filtering |
3 | Web Scale Photo Hash Clustering on a Single Machine |
4 | Expanding Object Detector's Horizon: Incremental Learning Framework for Object Detection in Videos |
5 | Supervised Discrete Hashing |
6 | What do 15,000 Object Categories Tell Us About Classifying and Localizing Actions? |
7 | Landmarks-Based Kernelized Subspace Alignment for Unsupervised Domain Adaptation |
8 | Blur Kernel Estimation Using Normalized Color-Line Prior |
9 | A Light Transport Model for Mitigating Multipath Interference in Time-of-Flight Sensors |
10 | Traditional Saliency Reloaded: A Good Old Model in New Shape |
11 | Automatic Construction of Robust Spherical Harmonic Subspaces |
12 | Leveraging Stereo Matching With Learning-Based Confidence Measures |
13 | Saliency Detection via Cellular Automata |
14 | Efficient Sparse-to-Dense Optical Flow Estimation Using a Learned Basis and Layers |
15 | Learning Multiple Visual Tasks While Discovering Their Structure |
16 | Projection Metric Learning on Grassmann Manifold With Application to Video Based Face Recognition |
17 | Structural Sparse Tracking |
18 | Data-Driven Depth Map Refinement via Multi-Scale Sparse Representation |
19 | Uncalibrated Photometric Stereo Based on Elevation Angle Recovery From BRDF Symmetry of Isotropic Materials |
20 | Attributes and Categories for Generic Instance Search From One Example |
21 | Heat Diffusion Over Weighted Manifolds: A New Descriptor for Textured 3D Non-Rigid Shapes |
22 | A Dynamic Programming Approach for Fast and Robust Object Pose Recognition From Range Images |
23 | Beyond Gaussian Pyramid: Multi-Skip Feature Stacking for Action Recognition |
24 | A Geodesic-Preserving Method for Image Warping |
25 | Shape Driven Kernel Adaptation in Convolutional Neural Network for Robust Facial Traits Recognition |
26 | From Categories to Subcategories: Large-Scale Image Classification With Partial Class Label Refinement |
27 | Combination Features and Models for Human Detection |
28 | Improving Object Detection With Deep Convolutional Networks via Bayesian Optimization and Structured Prediction |
29 | A Metric Parametrization for Trifocal Tensors With Non-Colinear Pinholes |
30 | An Efficient Volumetric Framework for Shape Tracking |
31 | Structured Sparse Subspace Clustering: A Unified Optimization Framework |
32 | Delving Into Egocentric Actions |
33 | Latent Trees for Estimating Intensity of Facial Action Units |
34 | Robust Regression on Image Manifolds for Ordered Label Denoising |
35 | Privacy Preserving Optics for Miniature Vision Sensors |
36 | Deep Transfer Metric Learning |
37 | Small-Variance Nonparametric Clustering on the Hypersphere |
38 | DynamicFusion: Reconstruction and Tracking of Non-Rigid Scenes in Real-Time |
39 | Reliable Patch Trackers: Robust Visual Tracking by Exploiting Reliable Patches |
40 | Predicting Eye Fixations Using Convolutional Neural Networks |
41 | Kernel Fusion for Better Image Deblurring |
42 | Direction Matters: Depth Estimation With a Surface Normal Classifier |
43 | Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection |
44 | Grasp Type Revisited: A Modern Perspective on a Classical Feature for Vision |
45 | Learning Hypergraph-Regularized Attribute Predictors |
46 | A Coarse-to-Fine Model for 3D Pose Estimation and Sub-Category Recognition |
47 | Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images |
48 | Deformable Part Models are Convolutional Neural Networks |
49 | Hypercolumns for Object Segmentation and Fine-Grained Localization |
50 | Mapping Visual Features to Semantic Profiles for Retrieval in Medical Imaging |
51 | Event-Driven Stereo Matching for Real-Time 3D Panoramic Vision |
52 | Graph-Based Simplex Method for Pairwise Energy Minimization With Binary Variables |
53 | Image Denoising via Adaptive Soft-Thresholding Based on Non-Local Samples |
54 | 3D Scanning Deformable Objects With a Single RGBD Sensor |
55 | Nested Motion Descriptors |
56 | Efficient Minimal-Surface Regularization of Perspective Depth Maps in Variational Stereo |
57 | Maximum Persistency via Iterative Relaxed Inference With Graphical Models |
58 | Deep Hierarchical Parsing for Semantic Segmentation |
59 | Designing Deep Networks for Surface Normal Estimation |
60 | Layered RGBD Scene Flow Estimation |
61 | Hashing With Binary Autoencoders |
62 | SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite |
63 | Collaborative Feature Learning From Social Media |
64 | Diversity-Induced Multi-View Subspace Clustering |
65 | Building a Bird Recognition App and Large Scale Dataset With Citizen Scientists: The Fine Print in Fine-Grained Dataset Collection |
66 | Early Burst Detection for Memory-Efficient Image Retrieval |
67 | Indoor Scene Structure Analysis for Single Image Depth Estimation |
68 | Light Field Layer Matting |
69 | Depth Camera Tracking With Contour Cues |
70 | Radial Distortion Homography |
71 | Efficient Object Localization Using Convolutional Networks |
72 | Just Noticeable Defocus Blur Detection and Estimation |
73 | How Do We Use Our Hands? Discovering a Diverse Set of Common Grasps |
74 | Rotating Your Face Using Multi-Task Deep Neural Network |
75 | Is Object Localization for Free? - Weakly-Supervised Learning With Convolutional Neural Networks |
76 | Super-Resolution Person Re-Identification With Semi-Coupled Low-Rank Discriminant Dictionary Learning |
77 | Dual Domain Filters Based Texture and Structure Preserving Image Non-Blind Deconvolution |
78 | Region-Based Temporally Consistent Video Post-Processing |
79 | Global Refinement of Random Forest |
80 | Adaptive Region Pooling for Object Detection |
81 | Discriminative and Consistent Similarities in Instance-Level Multiple Instance Learning |
82 | MUlti-Store Tracker (MUSTer): A Cognitive Psychology Inspired Approach to Object Tracking |
83 | Finding Action Tubes |
84 | Learning a Convolutional Neural Network for Non-Uniform Motion Blur Removal |
85 | Complexity-Adaptive Distance Metric for Object Proposals Generation |
86 | High-Fidelity Pose and Expression Normalization for Face Recognition in the Wild |
87 | Transformation of Markov Random Fields for Marginal Distribution Estimation |
88 | Sparse Convolutional Neural Networks |
89 | FaceNet: A Unified Embedding for Face Recognition and Clustering |
90 | Cascaded Hand Pose Regression |
91 | Cross-Scene Crowd Counting via Deep Convolutional Neural Networks |
92 | The Application of Two-Level Attention Models in Deep Convolutional Neural Network for Fine-Grained Image Classification |
93 | End-to-End Integration of a Convolution Network, Deformable Parts Model and Non-Maximum Suppression |
94 | A Mixed Bag of Emotions: Model, Predict, and Transfer Emotion Distributions |
95 | Neuroaesthetics in Fashion: Modeling the Perception of Fashionability |
96 | Part-Based Modelling of Compound Scenes From Images |
97 | Efficient Parallel Optimization for Potts Energy With Hierarchical Fusion |
98 | Pooled Motion Features for First-Person Videos |
99 | Functional Correspondence by Matrix Completion |
100 | Elastic-Net Regularization of Singular Values for Robust Subspace Learning |
101 | Hardware Compliant Approximate Image Codes |
102 | Photometric Refinement of Depth Maps for Multi-Albedo Objects |
103 | Predicting the Future Behavior of a Time-Varying Probability Distribution |
104 | Classifier Based Graph Construction for Video Segmentation |
105 | ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding |
106 | Mid-Level Deep Pattern Mining |
107 | Prediction of Search Targets From Fixations in Open-World Settings |
108 | Understanding Image Representations by Measuring Their Equivariance and Equivalence |
109 | Effective Learning-Based Illuminant Estimation Using Simple Features |
110 | PAIGE: PAirwise Image Geometry Encoding for Improved Efficiency in Structure-From-Motion |
111 | Dense, Accurate Optical Flow Estimation With Piecewise Parametric Model |
112 | Single-Image Estimation of the Camera Response Function in Near-Lighting |
113 | Multispectral Pedestrian Detection: Benchmark Dataset and Baseline |
114 | A Low-Dimensional Step Pattern Analysis Algorithm With Application to Multimodal Retinal Image Registration |
115 | Bilinear Heterogeneous Information Machine for RGB-D Action Recognition |
116 | MRF Optimization by Graph Approximation |
117 | SALICON: Saliency in Context |
118 | Weakly Supervised Object Detection With Convex Clustering |
119 | Interleaved Text/Image Deep Mining on a Very Large-Scale Radiology Database |
120 | Learning Semantic Relationships for Better Action Retrieval in Images |
121 | Hierarchical Recurrent Neural Network for Skeleton Based Action Recognition |
Monday June 8, 2:00pm-3:30pm
Discovery and Dense Correspondences | 3D Shape: Matching, Recognition, Reconstruction |
---|---|
Ballrooms A,B,C | Rooms 302,304,306 |
Discovering States and Transformations in Image Collections | Category-Specific Object Reconstruction From a Single Image |
Unsupervised Object Discovery and Localization in the Wild: Part-Based Matching With Bottom-Up Region Proposals | Discriminative Shape From Shading in Uncalibrated Illumination |
FlowWeb: Joint Image Set Alignment by Weaving Consistent, Pixel-Wise Correspondences | Learning to Generate Chairs With Convolutional Neural Networks |
EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow | 3D ShapeNets: A Deep Representation for Volumetric Shapes |
Phase-Based Frame Interpolation for Video | Sketch-Based 3D Shape Retrieval Using Convolutional Neural Networks |
Towards Open World Recognition | Data-Driven 3D Voxel Patterns for Object Category Recognition |
Monday June 8, 3:30pm-6:00pm
Poster Session 1B, Exhibit Hall A
Poster # | Title |
---|---|
1 | Depth and Surface Normal Estimation From Monocular Images Using Regression on Deep Features and Hierarchical CRFs |
2 | Discriminative Shape From Shading in Uncalibrated Illumination |
3 | Multi-Manifold Deep Metric Learning for Image Set Classification |
4 | Target Identity-Aware Network Flow for Online Multiple Target Tracking |
5 | Adaptive As-Natural-As-Possible Image Stitching |
6 | EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow |
7 | Learning Coarse-to-Fine Sparselets for Efficient Object Detection and Scene Classification |
8 | Continuous Visibility Feature |
9 | FlowWeb: Joint Image Set Alignment by Weaving Consistent, Pixel-Wise Correspondences |
10 | Unsupervised Object Discovery and Localization in the Wild: Part-Based Matching With Bottom-Up Region Proposals |
11 | Supervised Descriptor Learning for Multi-Output Regression |
12 | A Statistical Model of Riemannian Metric Variation for Deformable Shape Analysis |
13 | Temporally Coherent Interpretations for Long Videos Using Pattern Theory |
14 | Line-Sweep: Cross-Ratio for Wide-Baseline Matching and 3D Reconstruction |
15 | Simplified Mirror-Based Camera Pose Computation via Rotation Averaging |
16 | On the Relationship Between Visual Attributes and Convolutional Networks |
17 | Saliency Detection by Multi-Context Deep Learning |
18 | DeepShape: Deep Learned Shape Descriptor for 3D Shape Matching and Retrieval |
19 | Bayesian Adaptive Matrix Factorization With Automatic Model Selection |
20 | Joint Action Recognition and Pose Estimation From Video |
21 | Fast Action Proposals for Human Action Detection and Search |
22 | Joint Multi-Feature Spatial Context for Scene Recognition on the Semantic Manifold |
23 | Large-Scale Damage Detection Using Satellite Imagery |
24 | A Novel Locally Linear KNN Model for Visual Recognition |
25 | Bilinear Random Projections for Locality-Sensitive Binary Codes |
26 | Combining Local Appearance and Holistic View: Dual-Source Deep Neural Networks for Human Pose Estimation |
27 | Superpixel Segmentation Using Linear Spectral Clustering |
28 | Person Count Localization in Videos From Noisy Foreground and Detections |
29 | Good Features to Track for Visual SLAM |
30 | Discovering States and Transformations in Image Collections |
31 | Generalized Deformable Spatial Pyramid: Geometry-Preserving Dense Correspondence Estimation |
32 | Classifier Adaptation at Prediction Time |
33 | Phase-Based Frame Interpolation for Video |
34 | Matching-CNN Meets KNN: Quasi-Parametric Human Parsing |
35 | Absolute Pose for Cameras Under Flat Refractive Interfaces |
36 | Protecting Against Screenshots: An Image Processing Approach |
37 | Pose-Conditioned Joint Angle Limits for 3D Human Pose Reconstruction |
38 | VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases |
39 | A Graphical Model Approach for Matching Partial Signatures |
40 | From Captions to Visual Concepts and Back |
41 | Semi-Supervised Low-Rank Mapping Learning for Multi-Label Classification |
42 | ConceptLearner: Discovering Visual Concepts From Weakly Labeled Image Collections |
43 | Computationally Bounded Retrieval |
44 | Viewpoints and Keypoints |
45 | Discrete Hyper-Graph Matching |
46 | Rolling Shutter Motion Deblurring |
47 | Learning to Generate Chairs With Convolutional Neural Networks |
48 | Accurate Depth Map Estimation From a Lenslet Light Field Camera |
49 | Deep Semantic Ranking Based Hashing for Multi-Label Image Retrieval |
50 | Similarity Learning on an Explicit Polynomial Kernel Feature Map for Person Re-Identification |
51 | Learning to Propose Objects |
52 | Basis Mapping Based Boosting for Object Detection |
53 | Computing the Stereo Matching Cost With a Convolutional Neural Network |
54 | Recognize Complex Events From Static Images by Fusing Deep Channels |
55 | Multi-Feature Max-Margin Hierarchical Bayesian Model for Action Recognition |
56 | Model Recommendation: Generating Object Detectors From Few Samples |
57 | A Linear Least-Squares Solution to Elastic Shape-From-Template |
58 | Robust Large Scale Monocular Visual SLAM |
59 | Membership Representation for Detecting Block-Diagonal Structure in Low-Rank or Sparse Subspace Clustering |
60 | Bayesian Inference for Neighborhood Filters With Application in Denoising |
61 | Deep LAC: Deep Localization, Alignment and Classification for Fine-Grained Recognition |
62 | Unconstrained Realtime Facial Performance Capture |
63 | Blind Optical Aberration Correction by Exploring Geometric and Visual Priors |
64 | Ontological Supervision for Fine Grained Classification of Street View Storefronts |
65 | Finding Distractors in Images |
66 | From Image-Level to Pixel-Level Labeling With Convolutional Networks |
67 | Semantic Alignment of LiDAR Data at City Scale |
68 | Oriented Edge Forests for Boundary Detection |
69 | Query-Adaptive Late Fusion for Image Search and Person Re-Identification |
70 | Filtered Feature Channels for Pedestrian Detection |
71 | GRSA: Generalized Range Swap Algorithm for the Efficient Optimization of MRFs |
72 | PatchCut: Data-Driven Object Segmentation via Local Shape Transfer |
73 | Illumination and Reflectance Spectra Separation of a Hyperspectral Image Meets Low-Rank Matrix Factorization |
74 | Semantic Part Segmentation Using Compositional Model Combining Shape and Appearance |
75 | A Discriminative CNN Video Representation for Event Detection |
76 | 24/7 Place Recognition by View Synthesis |
77 | Understanding Image Virality |
78 | Book2Movie: Aligning Video Scenes With Book Chapters |
79 | 3D Model-Based Continuous Emotion Recognition |
80 | Learning to Rank in Person Re-Identification With Metric Ensembles |
81 | Making Better Use of Edges via Perceptual Grouping |
82 | Real-Time Joint Estimation of Camera Orientation and Vanishing Points |
83 | Sketch-Based 3D Shape Retrieval Using Convolutional Neural Networks |
84 | Salient Object Detection via Bootstrap Learning |
85 | Towards Open World Recognition |
86 | Data-Driven 3D Voxel Patterns for Object Category Recognition |
87 | 3D ShapeNets: A Deep Representation for Volumetric Shapes |
88 | Robust Image Alignment With Multiple Feature Descriptors and Matching-Guided Neighborhoods |
89 | Pushing the Frontiers of Unconstrained Face Detection and Recognition: IARPA Janus Benchmark A |
90 | Depth From Shading, Defocus, and Correspondence Using Light-Field Angular Coherence |
91 | New Insights Into Laplacian Similarity Search |
92 | Feature-Independent Context Estimation for Automatic Image Annotation |
93 | Category-Specific Object Reconstruction From a Single Image |
94 | Active Sample Selection and Correction Propagation on a Gradually-Augmented Graph |
95 | Efficient and Accurate Approximations of Nonlinear Convolutional Networks |
96 | Ranking and Retrieval of Image Sequences From Multiple Paragraph Queries |
97 | Casual Stereoscopic Panorama Stitching |
98 | Superpixel Meshes for Fast Edge-Preserving Surface Reconstruction |
99 | Best-Buddies Similarity for Robust Template Matching |
100 | Superdifferential Cuts for Binary Energies |
101 | The S-Hock Dataset: Analyzing Crowds at the Stadium |
102 | Discriminant Analysis on Riemannian Manifold of Gaussian Distributions for Face Recognition With Image Sets |
103 | Texture Representations for Image and Video Synthesis |
104 | Shadow Optimization From Structured Deep Edge Detection |
105 | Total Variation Regularization of Shape Signals |
106 | Learning Similarity Metrics for Dynamic Scene Segmentation |
107 | Subspace Clustering by Mixture of Gaussian Regression |
108 | DASC: Dense Adaptive Self-Correlation Descriptor for Multi-Modal and Multi-Spectral Correspondence |
109 | In Defense of Color-Based Model-Free Tracking |
110 | Best of Both Worlds: Human-Machine Collaboration for Object Annotation |
111 | Robust Multiple Homography Estimation: An Ill-Solved Problem |
112 | Semi-Supervised Domain Adaptation With Subspace Learning for Visual Recognition |
113 | Articulated Motion Discovery Using Pairs of Trajectories |
114 | A Solution for Multi-Alignment by Transformation Synchronisation |
115 | A Convex Optimization Approach to Robust Fundamental Matrix Estimation |
116 | Simultaneous Pose and Non-Rigid Shape With Particle Dynamics |
117 | Semi-Supervised Learning With Explicit Relationship Regularization |
118 | Person Re-Identification by Local Maximal Occurrence Representation and Metric Learning |
119 | Joint Patch and Multi-Label Learning for Facial Action Unit Detection |
120 | Real-Time Visual Analysis of Microvascular Blood Flow for Critical Care |
Tuesday June 9, 8:30am-10:00am
Images and Language | Multiple View Geometry |
---|---|
Ballrooms A,B,C | Rooms 302,304,306 |
Show and Tell: A Neural Image Caption Generator | Reconstructing the World* in Six Days *(As Captured by the Yahoo 100 Million Image Dataset) |
Deep Visual-Semantic Alignments for Generating Image Descriptions | Joint Vanishing Point Extraction and Tracking |
Long-Term Recurrent Convolutional Networks for Visual Recognition and Description | Robust Camera Location Estimation by Convex Programming |
Image Specificity | Efficient Globally Optimal Consensus Maximisation With Tree Search |
Don't Just Listen, Use Your Imagination: Leveraging Visual Common Sense for Non-Visual Tasks | R6P - Rolling Shutter Absolute Camera Pose |
Becoming the Expert - Interactive Multi-Class Machine Teaching | Building Proteins in a Day: Efficient 3D Molecular Reconstruction |
Tuesday June 9, 10:00am-12:30pm
Poster Session 2A, Exhibit Hall A
Poster # | Title |
---|---|
1 | JOTS: Joint Online Tracking and Segmentation |
2 | Gaze-Enabled Egocentric Video Summarization via Constrained Submodular Maximization |
3 | Sparse Depth Super Resolution |
4 | Efficient Illuminant Estimation for Color Constancy Using Grey Pixels |
5 | Can Humans Fly? Action Understanding With Multiple Classes of Actors |
6 | Reweighted Laplace Prior Based Hyperspectral Compressive Sensing for Unknown Sparsity |
7 | Class Consistent Multi-Modal Fusion With Binary Features |
8 | R6P - Rolling Shutter Absolute Camera Pose |
9 | Embedded Phase Shifting: Robust Phase Shifting With Embedded Signals |
10 | Shape and Light Directions From Shading and Polarization |
11 | 3D Deep Shape Descriptor |
12 | Cross-Age Face Verification by Coordinating With Cross-Face Age Verification |
13 | Beyond Mahalanobis Metric: Cayley-Klein Metric Learning |
14 | From Dictionary of Visual Words to Subspaces: Locality-Constrained Affine Subspace Coding |
15 | FPA-CS: Focal Plane Array-Based Compressive Imaging in Short-Wave Infrared |
16 | BOLD - Binary Online Learned Descriptor for Efficient Image Matching |
17 | Defocus Deblurring and Superresolution for Time-of-Flight Depth Cameras |
18 | Burst Deblurring: Removing Camera Shake Through Fourier Burst Accumulation |
19 | SOM: Semantic Obviousness Metric for Image Quality Assessment |
20 | DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection |
21 | Efficient Globally Optimal Consensus Maximisation With Tree Search |
22 | Mind's Eye: A Recurrent Visual Representation for Image Caption Generation |
23 | Hierarchical Sparse Coding With Geometric Prior for Visual Geo-Location |
24 | P3.5P: Pose Estimation With Unknown Focal Length |
25 | Joint Vanishing Point Extraction and Tracking |
26 | Learning a Non-Linear Knowledge Transfer Model for Cross-View Action Recognition |
27 | Random Tree Walk Toward Instantaneous 3D Human Pose Estimation |
28 | Deep Hashing for Compact Binary Codes Learning |
29 | Completing 3D Object Shape From One Depth Image |
30 | Encoding Based Saliency Detection for Videos and Images |
31 | Online Sketching Hashing |
32 | Enriching Object Detection With 2D-3D Registration and Continuous Viewpoint Estimation |
33 | Representing 3D Texture on Mesh Manifolds for Retrieval and Recognition Applications |
34 | Saliency Propagation From Simple to Difficult |
35 | Learning an Efficient Model of Hand Shape Variation From Depth Images |
36 | On the Minimal Problems of Low-Rank Matrix Factorization |
37 | Symmetry-Based Text Line Detection in Natural Scenes |
38 | DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting |
39 | Learning to Detect Motion Boundaries |
40 | Improving Object Proposals With Multi-Thresholding Straddling Expansion |
41 | Visual Recognition by Counting Instances: A Multi-Instance Cardinality Potential Kernel |
42 | Unconstrained 3D Face Reconstruction |
43 | Becoming the Expert - Interactive Multi-Class Machine Teaching |
44 | Long-Term Recurrent Convolutional Networks for Visual Recognition and Description |
45 | Zero-Shot Object Recognition by Semantic Manifold Distance |
46 | Hyper-Class Augmented and Regularized Deep Learning for Fine-Grained Image Classification |
47 | Direct Structure Estimation for 3D Reconstruction |
48 | Global Supervised Descent Method |
49 | Robust Camera Location Estimation by Convex Programming |
50 | Practical Robust Two-View Translation Estimation |
51 | Learning From Massive Noisy Labeled Data for Image Classification |
52 | KL Divergence Based Agglomerative Clustering for Automated Vitiligo Grading |
53 | Robust Saliency Detection via Regularized Random Walks Ranking |
54 | Weakly Supervised Semantic Segmentation for Social Images |
55 | Image Specificity |
56 | A Multi-Plane Block-Coordinate Frank-Wolfe Algorithm for Training Structural SVMs With a Costly Max-Oracle |
57 | Web-Scale Training for Face Identification |
58 | Dynamically Encoded Actions Based on Spacetime Saliency |
59 | Three Viewpoints Toward Exemplar SVM |
60 | Visual Recognition by Learning From Web Data: A Weakly Supervised Domain Generalization Approach |
61 | Clustering of Static-Adaptive Correspondences for Deformable Object Tracking |
62 | Geo-Semantic Segmentation |
63 | Towards Unified Depth and Semantic Prediction From a Single Image |
64 | Towards Force Sensing From Vision: Observing Hand-Object Interactions to Infer Manipulation Forces |
65 | A MRF Shape Prior for Facade Parsing With Occlusions |
66 | Probability Occupancy Maps for Occluded Depth Images |
67 | Segment Based 3D Object Shape Priors |
68 | Shape-From-Template in Flatland |
69 | Understanding Tools: Task-Oriented Object Modeling, Learning and Recognition |
70 | Deep Roto-Translation Scattering for Object Classification |
71 | Non-Rigid Registration of Images With Geometric and Photometric Deformation by Using Local Affine Fourier-Moment Matching |
72 | Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning |
73 | Deeply Learned Face Representations Are Sparse, Selective, and Robust |
74 | Unsupervised Visual Alignment With Similarity Graphs |
75 | Video Anomaly Detection and Localization Using Hierarchical Feature Representation and Gaussian Process Regression |
76 | Inferring 3D Layout of Building Facades From a Single Image |
77 | Evaluation of Output Embeddings for Fine-Grained Image Classification |
78 | Virtual View Networks for Object Reconstruction |
79 | Real-Time Coarse-to-Fine Topologically Preserving Segmentation |
80 | Supervised Mid-Level Features for Word Image Representation |
81 | Learning Lightness From Human Judgement on Relative Reflectance |
82 | Scene Classification With Semantic Fisher Vectors |
83 | Don't Just Listen, Use Your Imagination: Leveraging Visual Common Sense for Non-Visual Tasks |
84 | Co-Saliency Detection via Looking Deep and Wide |
85 | Adopting an Unconstrained Ray Model in Light-Field Cameras for 3D Shape Reconstruction |
86 | Towards 3D Object Detection With Bimodal Deep Boltzmann Machines Over RGBD Imagery |
87 | An Active Search Strategy for Efficient Object Class Detection |
88 | Geodesic Exponential Kernels: When Curvature and Linearity Conflict |
89 | Transformation-Invariant Convolutional Jungles |
90 | Exemplar SVMs as Visual Feature Encoders |
91 | Object Scene Flow for Autonomous Vehicles |
92 | Reflectance Hashing for Material Recognition |
93 | Joint Photo Stream and Blog Post Summarization and Exploration |
94 | Video Summarization by Learning Submodular Mixtures of Objectives |
95 | Building Proteins in a Day: Efficient 3D Molecular Reconstruction |
96 | Learning Descriptors for Object Recognition and 3D Pose Estimation |
97 | Image Partitioning Into Convex Polygons |
98 | Deep Visual-Semantic Alignments for Generating Image Descriptions |
99 | Unsupervised Learning of Complex Articulated Kinematic Structures Combining Motion and Skeleton Information |
100 | Elastic Functional Coding of Human Actions: From Vector-Fields to Latent Variables |
101 | Show and Tell: A Neural Image Caption Generator |
102 | Descriptor Free Visual Indoor Localization With Line Segments |
103 | Fixation Bank: Learning to Reweight Fixation Candidates |
104 | Deep Networks for Saliency Detection via Local Estimation and Global Search |
105 | Reflection Removal Using Ghosting Cues |
106 | A Dataset for Movie Description |
107 | Fast and Robust Hand Tracking Using Detection-Guided Optimization |
108 | Efficient SDP Inference for Fully-Connected CRFs Based on Low-Rank Decomposition |
109 | Discriminative Learning of Iteration-Wise Priors for Blind Deconvolution |
110 | Eye Tracking Assisted Extraction of Attentionally Important Objects From Videos |
111 | Multi-View Feature Engineering and Learning |
112 | Self Scaled Regularized Robust Regression |
113 | Simultaneous Feature Learning and Hash Coding With Deep Neural Networks |
114 | MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching |
115 | Reconstructing the World* in Six Days *(As Captured by the Yahoo 100 Million Image Dataset) |
116 | Exact Bias Correction and Covariance Estimation for Stereo Vision |
117 | Computing Similarity Transformations From Only Image Correspondences |
118 | Image Segmentation in Twenty Questions |
119 | Interaction Part Mining: A Mid-Level Approach for Fine-Grained Action Recognition |
120 | Sparse Projections for High-Dimensional Binary Codes |
Tuesday June 9, 2:00pm-3:30pm
Segmentation in Images and Video | 3D Models and Images |
---|---|
Ballrooms A,B,C | Rooms 302,304,306 |
Causal Video Object Segmentation From Persistence of Occlusions | Picture: A Probabilistic Programming Language for Scene Perception |
Semantic Object Segmentation via Detection in Weakly Labeled Video | Rent3D: Floor-Plan Priors for Monocular Layout Estimation |
Fully Convolutional Networks for Semantic Segmentation | The Stitched Puppet: A Graphical Model of 3D Human Shape and Pose |
Shape-Tailored Local Descriptors and Their Application to Segmentation and Tracking | 3D Shape Estimation From 2D Landmarks: A Convex Relaxation Approach |
Deep Filter Banks for Texture Recognition and Segmentation | Holistic 3D Scene Understanding From a Single Geo-Tagged Image |
Active Learning for Structured Probabilistic Models With Histogram Approximation | Joint SFM and Detection Cues for Monocular 3D Localization in Road Scenes |
Tuesday June 9, 3:30pm-6:00pm
Poster Session 2B, Exhibit Hall A
Poster # | Title |
---|---|
1 | Hierarchically-Constrained Optical Flow |
2 | The k-Support Norm and Convex Envelopes of Cardinality and Rank |
3 | Matching Bags of Regions in RGBD Images |
4 | Recurrent Convolutional Neural Network for Object Recognition |
5 | Feedforward Semantic Segmentation With Zoom-Out Features |
6 | The Aperture Problem for Refractive Motion |
7 | Saliency-Aware Geodesic Video Object Segmentation |
8 | DEEP-CARVING: Discovering Visual Attributes by Carving Deep Neural Nets |
9 | Rent3D: Floor-Plan Priors for Monocular Layout Estimation |
10 | Learning a Sequential Search for Landmarks |
11 | Fully Convolutional Networks for Semantic Segmentation |
12 | Deep Correlation for Matching Images and Text |
13 | Multi-Objective Convolutional Learning for Face Labeling |
14 | Deep Multiple Instance Learning for Image Classification and Auto-Annotation |
15 | Multi-Instance Object Segmentation With Occlusion Handling |
16 | Material Recognition in the Wild With the Materials in Context Database |
17 | Understanding Pedestrian Behaviors From Stationary Crowd Groups |
18 | Depth From Focus With Your Mobile Phone |
19 | Fusion Moves for Correlation Clustering |
20 | Second-Order Constrained Parametric Proposals and Sequential Search-Based Structured Prediction for Semantic Segmentation in RGB-D Images |
21 | Metric Imitation by Manifold Transfer for Efficient Vision Applications |
22 | The Stitched Puppet: A Graphical Model of 3D Human Shape and Pose |
23 | Scene Labeling With LSTM Recurrent Neural Networks |
24 | FAemb: A Function Approximation-Based Embedding Method for Image Retrieval |
25 | Automatically Discovering Local Visual Material Attributes |
26 | Depth Image Enhancement Using Local Tangent Plane Approximations |
27 | Video Co-Summarization: Video Summarization by Visual Co-Occurrence |
28 | Watch and Learn: Semi-Supervised Learning for Object Detectors From Video |
29 | Generalized Tensor Total Variation Minimization for Visual Data Recovery |
30 | Active Learning for Structured Probabilistic Models With Histogram Approximation |
31 | Image Parsing With a Wide Range of Classes and Scene-Level Context |
32 | Bayesian Sparse Representation for Hyperspectral Image Super Resolution |
33 | Semantic Object Segmentation via Detection in Weakly Labeled Video |
34 | Learning With Dataset Bias in Latent Subcategory Models |
35 | Project-Out Cascaded Regression With an Application to Face Alignment |
36 | Image Retrieval Using Scene Graphs |
37 | Unifying Holistic and Parts-Based Deformable Model Fitting |
38 | Small Instance Detection by Integer Programming on Object Density Maps |
39 | Motion Part Regularization: Improving Action Recognition via Trajectory Selection |
40 | Multi-Task Deep Visual-Semantic Embedding for Video Thumbnail Selection |
41 | Fine-Grained Visual Categorization via Multi-Stage Metric Learning |
42 | Saturation-Preserving Specular Reflection Separation |
43 | Joint SFM and Detection Cues for Monocular 3D Localization in Road Scenes |
44 | Fisher Vectors Meet Neural Networks: A Hybrid Classification Architecture |
45 | UniHIST: A Unified Framework for Image Restoration With Marginal Histogram Constraints |
46 | Human Action Segmentation With Hierarchical Supervoxel Consistency |
47 | Robust Manhattan Frame Estimation From a Single RGB-D Image |
48 | Learning to Segment Under Various Forms of Weak Supervision |
49 | Fast and Accurate Image Upscaling With Super-Resolution Forests |
50 | Light Field From Micro-Baseline Image Pair |
51 | Efficient ConvNet-Based Marker-Less Motion Capture in General Scenes With a Low Number of Cameras |
52 | Learning Scene-Specific Pedestrian Detectors Without Real Data |
53 | Deep Filter Banks for Texture Recognition and Segmentation |
54 | Multiple Random Walkers and Their Application to Image Cosegmentation |
55 | Beyond the Shortest Path: Unsupervised Domain Adaptation by Sampling Subspaces Along the Spline Flow |
56 | Spherical Embedding of Inlier Silhouette Dissimilarities |
57 | Semantics-Preserving Hashing for Cross-View Retrieval |
58 | Object Proposal by Multi-Branch Hierarchical Segmentation |
59 | Ambient Occlusion via Compressive Visibility Estimation |
60 | Shape-Tailored Local Descriptors and Their Application to Segmentation and Tracking |
61 | Scalable Object Detection by Filter Compression With Regularized Sparse Coding |
62 | An Improved Deep Learning Architecture for Person Re-Identification |
63 | Understanding Classifier Errors by Examining Influential Neighbors |
64 | Riemannian Coding and Dictionary Learning: Kernels to the Rescue |
65 | Scalable Structure From Motion for Densely Sampled Videos |
66 | Parsing Occluded People by Flexible Compositions |
67 | Joint Calibration of Ensemble of Exemplar SVMs |
68 | Holistic 3D Scene Understanding From a Single Geo-Tagged Image |
69 | A Large-Scale Car Dataset for Fine-Grained Categorization and Verification |
70 | DeepContour: A Deep Convolutional Feature Learned by Positive-Sharing Loss for Contour Detection |
71 | Convolutional Feature Masking for Joint Object and Stuff Segmentation |
72 | A Fixed Viewpoint Approach for Dense Reconstruction of Transparent Objects |
73 | Low-Level Vision by Consensus in a Spatial Hierarchy of Regions |
74 | Line Drawing Interpretation in a Multi-View Context |
75 | Toward User-Specific Tracking by Detection of Human Shapes in Multi-Cameras |
76 | Intra-Frame Deblurring by Leveraging Inter-Frame Camera Motion |
77 | Salient Object Subitizing |
78 | Hierarchical-PEP Model for Real-World Face Recognition |
79 | The Common Self-Polar Triangle of Concentric Circles and Its Application to Camera Calibration |
80 | Taking a Deeper Look at Pedestrians |
81 | Learning to Segment Moving Objects in Videos |
82 | GMMCP Tracker: Globally Optimal Generalized Maximum Multi Clique Problem for Multiple Object Tracking |
83 | Learning Graph Structure for Multi-Label Image Classification via Clique Generation |
84 | Matrix Completion for Resolving Label Ambiguity |
85 | Video Magnification in Presence of Large Motions |
86 | Flying Objects Detection From a Single Moving Camera |
87 | Line-Based Multi-Label Energy Optimization for Fisheye Image Rectification and Calibration |
88 | Adaptive Eye-Camera Calibration for Head-Worn Devices |
89 | Modeling Object Appearance Using Context-Conditioned Component Analysis |
90 | Displets: Resolving Stereo Ambiguities Using Object Knowledge |
91 | Time-to-Contact From Image Intensity |
92 | Transferring a Semantic Representation for Person Re-Identification and Search |
93 | Robust Video Segment Proposals With Painless Occlusion Handling |
94 | Face Alignment Using Cascade Gaussian Process Regression Trees |
95 | Regularizing Max-Margin Exemplars by Reconstruction and Generative Models |
96 | A Fast Algorithm for Elastic Shape Distances Between Closed Planar Curves |
97 | Reflection Removal for In-Vehicle Black Box Videos |
98 | Tree Quantization for Large-Scale Similarity Search and Classification |
99 | Integrating Parametric and Non-Parametric Models for Scene Labeling |
100 | Mining Semantic Affordances of Visual Object Categories |
101 | Causal Video Object Segmentation From Persistence of Occlusions |
102 | Multiple Instance Learning for Soft Bags via Top Instances |
103 | Multiclass Semantic Video Segmentation With Object-Level Active Inference |
104 | Effective Face Frontalization in Unconstrained Images |
105 | Action Recognition With Trajectory-Pooled Deep-Convolutional Descriptors |
106 | Weakly Supervised Localization of Novel Objects Using Appearance Transfer |
107 | First-Person Pose Recognition Using Egocentric Workspaces |
108 | Simultaneous Time-of-Flight Sensing and Photometric Stereo With a Single ToF Sensor |
109 | Active Learning and Discovery of Object Categories in the Presence of Unnameable Instances |
110 | Learning to Compare Image Patches via Convolutional Neural Networks |
111 | Watch-n-Patch: Unsupervised Understanding of Actions and Relations |
112 | Optimal Graph Learning With Partial Tags and Multiple Features for Image and Video Annotation |
113 | DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection |
114 | Picture: A Probabilistic Programming Language for Scene Perception |
115 | Exploiting Uncertainty in Regression Forests for Accurate Camera Relocalization |
116 | Fusing Subcategory Probabilities for Texture Classification |
117 | Video Event Recognition With Deep Hierarchical Context Model |
118 | Object-Based RGBD Image Co-Segmentation With Mutex Constraint |
119 | Associating Neural Word Embeddings With Deep Image Representations Using Fisher Vectors |
120 | 3D Shape Estimation From 2D Landmarks: A Convex Relaxation Approach |
Wednesday June 10, 8:30am-10:00am
Action and Event Recognition | Computational Photography |
---|---|
Ballrooms A,B,C | Rooms 302,304,306 |
How Many Bits Does it Take for a Stimulus to Be Salient? | Visual Vibrometry: Estimating Material Properties From Small Motion in Video |
Deeply Learned Attributes for Crowded Scene Understanding | Recovering Inner Slices of Translucent Objects by Multi-Frequency Illumination |
Joint Inference of Groups, Events and Human Roles in Aerial Videos | Fast Bilateral-Space Stereo for Synthetic Defocus |
Modeling Video Evolution for Action Recognition | Simultaneous Video Defogging and Stereo Reconstruction |
Space-Time Tree Ensemble for Action Recognition | One-Day Outdoor Photometric Stereo via Skylight Estimation |
Social Saliency Prediction | |
Wednesday June 10, 10:30am-12:25pm
Plenary Speakers |
---|
Ballrooms A,B,C |
What's Wrong with Deep Learning? Yann LeCun, Facebook AI Research & New York University. Deep learning methods have had a profound impact on a number of areas in recent years, including natural image understanding and speech recognition. Other areas seem on the verge of being similarly impacted, notably natural language processing, biomedical image analysis, and the analysis of sequential signals in a variety of application domains. But deep learning systems, as they exist today, have many limitations. First, they lack mechanisms for reasoning, search, and inference. Complex and/or ambiguous inputs require deliberate reasoning to arrive at a consistent interpretation. Producing structured outputs, such as a long text, or a label map for image segmentation, requires sophisticated search and inference algorithms to satisfy complex sets of constraints. One approach to this problem is to marry deep learning with structured prediction (an idea first presented at CVPR 1997). While several deep learning systems augmented with structured prediction modules trained end to end have been proposed for OCR, body pose estimation, and semantic segmentation, new concepts are needed for tasks that require more complex reasoning. Second, they lack short-term memory. Many tasks in natural language understanding, such as question-answering, require a way to temporarily store isolated facts. Correctly interpreting events in a video and being able to answer questions about it requires remembering abstract representations of what happens in the video. Deep learning systems, including recurrent nets, are notoriously inefficient at storing temporary memories. This has led researchers to propose neural net systems augmented with separate memory modules, such as LSTM, Memory Networks, Neural Turing Machines, and Stack-Augmented RNN. While these proposals are interesting, new ideas are needed. Lastly, they lack the ability to perform unsupervised learning. 
Animals and humans learn most of the structure of the perceptual world in an unsupervised manner. While the interest of the ML community in neural nets was revived in the mid-2000s by progress in unsupervised learning, the vast majority of practical applications of deep learning have used purely supervised learning. There is little doubt that future progress in computer vision will require breakthroughs in unsupervised learning, particularly for video understanding. But what principles should unsupervised learning be based on? Preliminary works in each of these areas pave the way for future progress in image and video understanding. Biography: Yann LeCun is Director of AI Research at Facebook, and Silver Professor of Data Science, Computer Science, Neural Science, and Electrical Engineering at New York University, affiliated with the NYU Center for Data Science, the Courant Institute of Mathematical Sciences, the Center for Neural Science, and the Electrical and Computer Engineering Department. He received the Electrical Engineer Diploma from Ecole Superieure d'Ingenieurs en Electrotechnique et Electronique (ESIEE), Paris in 1983, and a PhD in Computer Science from Universite Pierre et Marie Curie (Paris) in 1987. After a postdoc at the University of Toronto, he joined AT&T Bell Laboratories in Holmdel, NJ in 1988. He became head of the Image Processing Research Department at AT&T Labs-Research in 1996, and joined NYU as a professor, after a brief period as a Fellow of the NEC Research Institute in Princeton. He directed NYU's initiative in data science and became the founding director of the NYU Center for Data Science. He was named Director of AI Research at Facebook in late and retains a part-time position on the NYU faculty. His current interests include AI, machine learning, computer perception, mobile robotics, and computational neuroscience. 
He has published over 180 technical papers and book chapters on these topics as well as on neural networks, handwriting recognition, image processing and compression, and on dedicated circuits and architectures for computer perception. The character recognition technology he developed at Bell Labs is used by several banks around the world to read checks and was reading between 10 and 20% of all the checks in the US in the early 2000s. His image compression technology, called DjVu, is used by hundreds of web sites and publishers and millions of users to access scanned documents on the Web. Since the late 1980s he has been working on deep learning methods, particularly the convolutional network model, which is the basis of many products and services deployed by companies such as Facebook, Google, Microsoft, Baidu, IBM, NEC, AT&T and others for image and video understanding, document recognition, human-computer interaction, and speech recognition. LeCun has been on the editorial board of IJCV, IEEE PAMI, and IEEE Trans. Neural Networks, was program chair of CVPR'06, and is chair of ICLR. He is on the science advisory board of the Institute for Pure and Applied Mathematics, and has advised many large and small companies about machine learning technology, including several startups he co-founded. He is the lead faculty at NYU for the Moore-Sloan Data Science Environment, a $36M initiative in collaboration with UC Berkeley and University of Washington to develop data-driven methods in the sciences. He is the recipient of the IEEE Neural Network Pioneer Award. |
Reverse Engineering the Human Visual System. Jack L. Gallant, University of California at Berkeley. The human brain is the most sophisticated image processing system known, capable of impressive feats of recognition and discrimination under challenging natural conditions. Reverse-engineering the brain might enable us to design artificial systems with the same capabilities. My laboratory uses a data-driven system identification approach to tackle this reverse-engineering problem. Our approach consists of four broad stages. First, we use functional MRI to measure brain activity while people watch naturalistic movies. We divide these data into two parts, one used to fit models and one for testing model predictions. Second, we use a system identification framework (based on multiple linearizing feature spaces) to model activity measured at each point in the brain. Third, we inspect the most accurate models to understand how the brain represents low-, mid- and high-level information in the movies. Finally, we use the estimated models to decode brain activity, reconstructing the structural and semantic content in the movies. Any effort to reverse-engineer the brain is inevitably limited by the spatial and temporal resolution of brain measurements, and at this time the resolution of human brain measurements is relatively poor. Still, as measurement technology progresses this framework could inform development of biologically-inspired computer vision systems, and it could aid in development of practical new brain reading technologies. Biography: Jack Gallant is Chancellor's Professor of Psychology at the University of California at Berkeley. He is affiliated with the graduate programs in Bioengineering, Biophysics, Neuroscience and Vision Science. He received his Ph.D. from Yale University and did post-doctoral work at the California Institute of Technology and Washington University Medical School. His research program focuses on computational modeling of the human brain. 
These models accurately describe how the brain encodes information during complex, naturalistic tasks, and they show how information about the external and internal world is mapped systematically across the surface of the cerebral cortex. These models can also be used to decode information in the brain in order to reconstruct mental experiences. Gallant's brain decoding algorithm was one of Time Magazine's Inventions of the Year, and he appears frequently on radio and television. Further information about ongoing work in the Gallant lab, links to talks and papers, and links to an online interactive brain viewer. |
Wednesday June 10, 2:00pm-3:30pm
Learning and Matching Local Features | Image and Video Processing and Restoration |
---|---|
Ballrooms A,B,C | Rooms 302,304,306 |
Domain-Size Pooling in Local Descriptors: DSP-SIFT | Generalized Video Deblurring for Dynamic Scenes |
Learning Deep Representations for Ground-to-Aerial Geolocalization | Approximate Nearest Neighbor Fields in Video |
Understanding Deep Image Representations by Inverting Them | Single Image Super-Resolution From Transformed Self-Exemplars |
Situational Object Boundary Detection | L0TV: A New Method for Image Restoration in the Presence of Impulse Noise |
Fast 2D Border Ownership Assignment | On Learning Optimized Reaction Diffusion Processes for Effective Image Restoration |
A Flexible Tensor Block Coordinate Ascent Scheme for Hypergraph Matching | Fast and Flexible Convolutional Sparse Coding |
Wednesday June 10, 3:30pm-6:00pm
Poster Session | |
---|---|
Session 3B, Exhibit Hall A | |
Poster # | Title and Authors |
1 | 3D All The Way: Semantic Segmentation of Urban Scenes From Start to End in 3D |
2 | Fast Bilateral-Space Stereo for Synthetic Defocus |
3 | Large-Scale and Drift-Free Surface Reconstruction Using Online Subvolume Registration |
4 | Fast Randomized Singular Value Thresholding for Nuclear Norm Minimization |
5 | LMI-Based 2D-3D Registration: From Uncalibrated Images to Euclidean Scene |
6 | Clique-Graph Matching by Preserving Global & Local Structure |
7 | Appearance-Based Gaze Estimation in the Wild |
8 | One-Day Outdoor Photometric Stereo via Skylight Estimation |
9 | A New Retraction for Accelerating the Riemannian Three-Factor Low-Rank Matrix Completion Algorithm |
10 | Heteroscedastic Max-Min Distance Analysis |
11 | Sparse Composite Quantization |
12 | Sparse Representation Classification With Manifold Constraints Transfer |
13 | CIDEr: Consensus-Based Image Description Evaluation |
14 | Joint Inference of Groups, Events and Human Roles in Aerial Videos |
15 | Photometric Stereo With Near Point Lighting: A Solution by Mesh Deformation |
16 | Efficient Label Collection for Unlabeled Image Datasets |
17 | Separating Objects and Clutter in Indoor Scenes |
18 | FaLRR: A Fast Low Rank Representation Solver |
19 | Simulating Makeup Through Physics-Based Manipulation of Intrinsic Image Layers |
20 | Correlation Filters With Limited Boundaries |
21 | Shape-Based Automatic Detection of a Large Number of 3D Facial Landmarks |
22 | Material Classification With Thermal Imagery |
23 | Deeply Learned Attributes for Crowded Scene Understanding |
24 | Learning to Look Up: Realtime Monocular Gaze Correction Using Machine Learning |
25 | Background Subtraction via Generalized Fused Lasso Foreground Modeling |
26 | Mirror, Mirror on the Wall, Tell Me, Is the Error Small? |
27 | Beyond Short Snippets: Deep Networks for Video Classification |
28 | segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection |
29 | Situational Object Boundary Detection |
30 | Real-Time 3D Head Pose and Facial Landmark Estimation From Depth Images Using Triangular Surface Patch Features |
31 | Aligning 3D Models to RGB-D Images of Cluttered Scenes |
32 | A Stable Multi-Scale Kernel for Topological Machine Learning |
33 | The Treasure Beneath Convolutional Layers: Cross-Convolutional-Layer Pooling for Image Classification |
34 | Face Video Retrieval With Image Query via Hashing Across Euclidean Space and Riemannian Manifold |
35 | EgoSampling: Fast-Forward and Stereo for Egocentric Videos |
36 | Social Saliency Prediction |
37 | Beyond Principal Components: Deep Boltzmann Machines for Face Modeling |
38 | Statistical Inference Models for Image Datasets With Systematic Variations |
39 | Beyond Frontal Faces: Improving Person Recognition Using Multiple Cues |
40 | Superpixel-Based Video Object Segmentation Using Perceptual Organization and Location Prior |
41 | Robust Image Filtering Using Joint Static and Dynamic Guidance |
42 | Solving Multiple Square Jigsaw Puzzles With Missing Pieces |
43 | A Dynamic Convolutional Layer for Short Range Weather Prediction |
44 | SWIFT: Sparse Withdrawal of Inliers in a First Trial |
45 | VIP: Finding Important People in Images |
46 | Dataset Fingerprints: Exploring Image Collections Through Data Mining |
47 | Transport-Based Single Frame Super Resolution of Very Low Resolution Face Images |
48 | 3D Reconstruction in the Presence of Glasses by Acoustic and Stereo Fusion |
49 | Deep Sparse Representation for Robust Image Registration |
50 | Real-Time Part-Based Visual Tracking via Adaptive Correlation Filters |
51 | Beyond Spatial Pooling: Fine-Grained Representation Learning in Multiple Domains |
52 | HC-Search for Structured Prediction in Computer Vision |
53 | Revisiting Kernelized Locality-Sensitive Hashing for Improved Large-Scale Image Retrieval |
54 | High-Speed Hyperspectral Video Acquisition With a Dual-Camera Architecture |
55 | More About VLAD: A Leap From Euclidean to Riemannian Manifolds |
56 | Camera Intrinsic Blur Kernel Estimation: A Reliable Framework |
57 | Classifier Learning With Hidden Information |
58 | Single Target Tracking Using Adaptive Clustered Decision Trees and Dynamic Multi-Level Appearance Models |
59 | Simultaneous Video Defogging and Stereo Reconstruction |
60 | Face Alignment by Coarse-to-Fine Shape Searching |
61 | Learning Deep Representations for Ground-to-Aerial Geolocalization |
62 | Unsupervised Simultaneous Orthogonal Basis Clustering Feature Selection |
63 | Space-Time Tree Ensemble for Action Recognition |
64 | Subgraph Decomposition for Multi-Target Tracking |
65 | Understanding Image Structure via Hierarchical Shape Parsing |
66 | Coarse-to-Fine Region Selection and Matching |
67 | Label Consistent Quadratic Surrogate Model for Visual Saliency Prediction |
68 | Subgraph Matching Using Compactness Prior for Robust Feature Correspondence |
69 | Pedestrian Detection Aided by Deep Learning Semantic Tasks |
70 | Multihypothesis Trajectory Analysis for Robust Visual Tracking |
71 | Domain-Size Pooling in Local Descriptors: DSP-SIFT |
72 | Object Detection by Labeling Superpixels |
73 | Fast 2D Border Ownership Assignment |
74 | From Single Image Query to Detailed 3D Reconstruction |
75 | Fast and Flexible Convolutional Sparse Coding |
76 | Iteratively Reweighted Graph Cut for Multi-Label MRFs With Non-Convex Priors |
77 | Pairwise Geometric Matching for Large-Scale Object Retrieval |
78 | Deep Convolutional Neural Fields for Depth Estimation From a Single Image |
79 | Data-Driven Sparsity-Based Restoration of JPEG-Compressed Images in Dual Transform-Pixel Domain |
80 | TVSum: Summarizing Web Videos Using Titles |
81 | Understanding Deep Image Representations by Inverting Them |
82 | Single Image Super-Resolution From Transformed Self-Exemplars |
83 | Constrained Planar Cuts - Object Partitioning for Point Clouds |
84 | A Weighted Sparse Coding Framework for Saliency Detection |
85 | Handling Motion Blur in Multi-Frame Super-Resolution |
86 | Approximate Nearest Neighbor Fields in Video |
87 | Inverting RANSAC: Global Model Detection via Inlier Rate Estimation |
88 | Robust Multi-Image Based Blind Face Hallucination |
89 | On Learning Optimized Reaction Diffusion Processes for Effective Image Restoration |
90 | A Flexible Tensor Block Coordinate Ascent Scheme for Hypergraph Matching |
91 | TILDE: A Temporally Invariant Learned DEtector |
92 | A Maximum Entropy Feature Descriptor for Age Invariant Face Recognition |
93 | Sense Discovery via Co-Clustering on Images and Text |
94 | An Approximate Shading Model for Object Relighting |
95 | Deep Domain Adaptation for Describing People Based on Fine-Grained Clothing Attributes |
96 | A Convolutional Neural Network Cascade for Face Detection |
97 | Visual Vibrometry: Estimating Material Properties From Small Motion in Video |
98 | Jointly Learning Heterogeneous Features for RGB-D Activity Recognition |
99 | Convolutional Neural Networks at Constrained Time Cost |
100 | Fine-Grained Histopathological Image Analysis via Robust Segmentation and Large-Scale Retrieval |
101 | L0TV: A New Method for Image Restoration in the Presence of Impulse Noise |
102 | Modeling Video Evolution for Action Recognition |
103 | Long-Term Correlation Tracking |
104 | Joint Tracking and Segmentation of Multiple Targets |
105 | RGBD-Fusion: Real-Time High Precision Depth Recovery |
106 | Modeling Deformable Gradient Compositions for Single-Image Super-Resolution |
107 | Generalized Video Deblurring for Dynamic Scenes |
108 | Active Pictorial Structures |
109 | Ego-Surfing First-Person Videos |
110 | Visual Saliency Based on Multiscale Deep Features |
111 | Recovering Inner Slices of Translucent Objects by Multi-Frequency Illumination |
112 | Local High-Order Regularization on Data Manifolds |
113 | Fine-Grained Classification of Pedestrians in Video: Benchmark and State of the Art |
114 | Curriculum Learning of Multiple Tasks |
115 | How Many Bits Does it Take for a Stimulus to Be Salient? |
116 | Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction |
117 | SOLD: Sub-Optimal Low-rank Decomposition for Efficient Video Segmentation |
118 | On the Appearance of Translucent Edges |
119 | On Pairwise Costs for Network Flow Multi-Object Tracking |
120 | Fine-Grained Recognition Without Part Annotations |
121 | Robust Reconstruction of Indoor Scenes |