|
|
[08:45-09:45] Oral Session 1A - Vision and Language |
|
| | Ask Your Neurons: A Neural-Based Approach to Answering Questions About Images |
| | Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing |
| | Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books |
| | Learning Query and Image Similarities With Ranking Canonical Correlation Analysis |
|
|
[09:45-12:15] Poster Session 1A - Recognition, Low-Level Vision, and Biomedical Image Analysis |
|
| 1 | Learning to See by Moving |
| 2 | Object Detection Using Generalization and Efficiency Balanced Co-Occurrence Features |
| 3 | Mining And-Or Graphs for Graph Matching and Object Discovery |
| 4 | Pose Induction for Novel Object Categories |
| 5 | Dynamic Texture Recognition via Orthogonal Tensor Dictionary Learning |
| 6 | Convolutional Channel Features |
| 7 | Local Convolutional Features With Unsupervised Training for Image Retrieval |
| 8 | RIDE: Reversal Invariant Descriptor Enhancement |
| 9 | Discrete Tabu Search for Graph Matching |
| 10 | Discriminative Learning of Deep Convolutional Feature Point Descriptors |
| 11 | Amodal Completion and Size Constancy in Natural Scenes |
| 12 | Learning Where to Position Parts in 3D |
| 13 | Query Adaptive Similarity Measure for RGB-D Object Recognition |
| 14 | Listening With Your Eyes: Towards a Practical Visual Speech Recognition System Using Deep Boltzmann Machines |
| 15 | Cluster-Based Point Set Saliency |
| 16 | A Comprehensive Multi-Illuminant Dataset for Benchmarking of the Intrinsic Image Algorithms |
| 17 | PatchMatch-Based Automatic Lattice Detection for Near-Regular Textures |
| 18 | A Data-Driven Metric for Comprehensive Evaluation of Saliency Models |
| 19 | A Matrix Decomposition Perspective to Multiple Graph Matching |
| 20 | Fast and Effective L0 Gradient Minimization by Region Fusion |
| 21 | Generic Promotion of Diffusion-Based Salient Object Detection |
| 22 | Nighttime Haze Removal With Glow and Multiple Light Colors |
| 23 | Conformal and Low-Rank Sparse Representation for Image Restoration |
| 24 | Patch Group Based Nonlocal Self-Similarity Prior Learning for Image Denoising |
| 25 | Automatic Thumbnail Generation Based on Visual Representativeness and Foreground Recognizability |
| 26 | SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks |
| 27 | A Novel Sparsity Measure for Tensor Recovery |
| 28 | Oriented Object Proposals |
| 29 | Learning Nonlinear Spectral Filters for Color Image Reconstruction |
| 30 | Beyond White: Ground Truth Colors for Color Constancy Correction |
| 31 | RGB-Guided Hyperspectral Image Upsampling |
| 32 | Projection Onto the Manifold of Elongated Structures for Accurate Extraction |
| 33 | Naive Bayes Super-Resolution Forest |
| 34 | POP Image Fusion - Derivative Domain Image Fusion Without Reintegration |
| 35 | Adaptive Spatial-Spectral Dictionary Learning for Hyperspectral Image Denoising |
| 36 | Fully Connected Guided Image Filtering |
| 37 | Segment Graph Based Image Filtering: Fast Structure-Preserving Smoothing |
| 38 | Deep Networks for Image Super-Resolution With Sparse Prior |
| 39 | Convolutional Color Constancy |
| 40 | Learning Ordinal Relationships for Mid-Level Vision |
| 41 | Thin Structure Estimation With Curvature Regularization |
| 42 | HARF: Hierarchy-Associated Rich Features for Salient Object Detection |
| 43 | Deep Colorization |
| 44 | Image Matting With KL-Divergence Based Sparse Sampling |
| 45 | Intrinsic Decomposition of Image Sequences From Local Temporal Variations |
| 46 | Low-Rank Tensor Approximation With Laplacian Scale Mixture Modeling for Multiframe Image Denoising |
| 47 | Learning Parametric Distributions for Image Super-Resolution: Where Patch Matching Meets Sparse Coding |
| 48 | Improving Image Restoration With Soft-Rounding |
| 49 | See the Difference: Direct Pre-Image Reconstruction and Pose Estimation by Differentiating HOG |
| 50 | An Efficient Statistical Method for Image Noise Level Estimation |
| 51 | Contour Detection and Characterization for Asynchronous Event Sensors |
| 52 | Class-Specific Image Deblurring |
| 53 | High-for-Low and Low-for-High: Efficient Boundary Detection From Deep Object Features and Its Applications to High-Level Vision |
| 54 | Variational Depth Superresolution Using Example-Based Edge Representations |
| 55 | Conditioned Regression Models for Non-Blind Single Image Super-Resolution |
| 56 | Video Super-Resolution via Deep Draft-Ensemble Learning |
| 57 | Pan-Sharpening With a Hyper-Laplacian Penalty |
| 58 | Video Restoration Against Yin-Yang Phasing |
| 59 | Rolling Shutter Super-Resolution |
| 60 | Learning Large-Scale Automatic Image Colorization |
| 61 | Compression Artifacts Reduction by a Deep Convolutional Network |
| 62 | Multiple-Hypothesis Affine Region Estimation With Anisotropic LoG Filters |
| 63 | A Self-Paced Multiple-Instance Learning Framework for Co-Saliency Detection |
| 64 | External Patch Prior Guided Internal Clustering for Image Denoising |
| 66 | Illumination Robust Color Naming via Label Propagation |
| 67 | Unsupervised Cross-Modal Synthesis of Subject-Specific Scans |
| 68 | Learning to Boost Filamentary Structure Segmentation |
| 69 | Weakly-Supervised Structured Output Learning With Flexible and Latent Graphs Using High-Order Loss Functions |
| 70 | Efficient Classifier Training to Minimize False Merges in Electron Microscopy Segmentation |
| 71 | On Statistical Analysis of Neuroimages With Imperfect Registration |
| 72 | [From Oral 1A] Ask Your Neurons: A Neural-Based Approach to Answering Questions About Images |
| 73 | [From Oral 1A] Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing |
| 74 | [From Oral 1A] Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books |
| 75 | [From Oral 1A] Learning Query and Image Similarities With Ranking Canonical Correlation Analysis |
|
|
[12:15-13:15] Special Session 1A - Plenary Session |
|
| | Convex Optimization With Abstract Linear Operators |
|
|
[14:45-17:15] Poster Session 1B - Recognition and 3D Computer Vision I |
|
| 1 | Building Dynamic Cloud Maps From the Ground Up |
| 2 | A Versatile Learning-Based 3D Temporal Tracker: Scalable, Robust, Online |
| 3 | Realtime Edge-Based Visual Odometry for a Monocular Camera |
| 4 | Fill and Transfer: A Simple Physics-Based Approach for Containability Reasoning |
| 5 | On Linear Structure From Motion for Light Field Cameras |
| 6 | 3D Object Reconstruction From Hand-Object Interactions |
| 7 | Minimal Solvers for 3D Geometry From Satellite Imagery |
| 8 | An Efficient Minimal Solution for Multi-Camera Motion |
| 9 | Learning Shape, Motion and Elastic Models in Force Space |
| 10 | A Versatile Scene Model With Differentiable Visibility Applied to Generative Pose Estimation |
| 11 | Semantic Pose Using Deep Networks Trained on Synthetic RGB-D |
| 12 | Exploiting High Level Scene Cues in Stereo Reconstruction |
| 13 | Point Triangulation Through Polyhedron Collapse Using the l∞ Norm |
| 14 | Optimizing the Viewing Graph for Structure-From-Motion |
| 15 | Intrinsic Scene Decomposition From RGB-D Images |
| 16 | 3D Hand Pose Estimation Using Randomized Decision Forest With Segmentation Index Points |
| 17 | Accurate Camera Calibration Robust to Defocus Using a Smartphone |
| 18 | High Quality Structure From Small Motion for Rolling Shutter Cameras |
| 19 | Photogeometric Scene Flow for High-Detail Dynamic 3D Reconstruction |
| 20 | Blur-Aware Disparity Estimation From Defocus Stereo Images |
| 21 | Global Structure-From-Motion by Similarity Averaging |
| 22 | Massively Parallel Multiview Stereopsis by Surface Normal Diffusion |
| 23 | Variational PatchMatch MultiView Reconstruction and Refinement |
| 24 | As-Rigid-As-Possible Volumetric Shape-From-Template |
| 25 | General Dynamic Scene Reconstruction From Multiple View Video |
| 26 | The Joint Image Handbook |
| 27 | Direct, Dense, and Deformable: Template-Based Non-Rigid 3D Reconstruction From RGB Video |
| 28 | Single Image Pop-Up From Discriminatively Learned Parts |
| 29 | Learning Informative Edge Maps for Indoor Scene Layout Prediction |
| 30 | Multi-View Convolutional Neural Networks for 3D Shape Recognition |
| 31 | Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images |
| 32 | 3D Surface Profilometry Using Phase Shifting of De Bruijn Pattern |
| 33 | A Deep Visual Correspondence Embedding Model for Stereo Matching Costs |
| 34 | Learning Concept Embeddings With Combined Human-Machine Expertise |
| 35 | Deep Multi-Patch Aggregation Network for Image Style, Aesthetics, and Quality Estimation |
| 36 | Towards Computational Baby Learning: A Weakly-Supervised Approach for Object Detection |
| 37 | Improving Image Classification With Location Context |
| 38 | HICO: A Benchmark for Recognizing Human-Object Interactions in Images |
| 39 | Delving Deep Into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification |
| 40 | Continuous Pose Estimation With a Spatial Ensemble of Fisher Regressors |
| 41 | Adaptive Hashing for Fast Similarity Search |
| 42 | Single Image 3D Without a Single 3D Image |
| 43 | Cross-Domain Image Retrieval With a Dual Attribute-Aware Ranking Network |
| 44 | Attribute-Graph: A Graph Based Approach to Image Ranking |
| 45 | Contextual Action Recognition With R*CNN |
| 46 | What Makes an Object Memorable? |
| 47 | kNN Hashing With Factorized Neighborhood Representation |
| 48 | Multi-View Complementary Hash Tables for Nearest Neighbor Search |
| 49 | Scalable Person Re-Identification: A Benchmark |
| 50 | MMSS: Multi-Modal Sharable and Specific Feature Learning for RGB-D Object Recognition |
| 51 | Object Detection via a Multi-Region and Semantic Segmentation-Aware CNN Model |
| 52 | Neural Activation Constellations: Unsupervised Part Model Discovery With Convolutional Networks |
| 53 | Cascaded Sparse Spatial Bins for Efficient and Effective Generic Object Detection |
| 54 | Probabilistic Label Relation Graphs With Ising Models |
| 55 | Predicting Good Features for Image Geo-Localization Using Per-Bundle VLAD |
| 56 | Task-Driven Feature Pooling for Image Classification |
| 57 | Cutting Edge: Soft Correspondences in Multimodal Scene Parsing |
| 58 | One Shot Learning via Compositions of Meaningful Patches |
| 59 | FASText: Efficient Unconstrained Scene Text Detector |
| 60 | Multi-Scale Recognition With DAG-CNNs |
| 61 | Relaxed Multiple-Instance SVM With Application to Object Discovery |
| 62 | Im2Calories: Towards an Automated Mobile Vision Food Diary |
| 63 | LEWIS: Latent Embeddings for Word Images and Their Semantics |
| 64 | Per-Sample Kernel Adaptation for Visual Recognition and Grouping |
| 65 | Fine-Grained Change Detection of Misaligned Scenes With Varied Illuminations |
| 66 | Aggregating Local Deep Features for Image Retrieval |
| 67 | Learning Deep Object Detectors From 3D Models |
| 68 | Harvesting Discriminative Meta Objects With Deep CNN Features for Scene Classification |
| 69 | Scalable Nonlinear Embeddings for Semantic Category-Based Image Retrieval |
| 70 | Person Re-Identification Ranking Optimisation by Discriminant Context Information Analysis |
| 71 | Unsupervised Generation of a Viewpoint Annotated Car Dataset From Videos |
| 72 | [From Oral 1B] Structured Indoor Modeling |
| 73 | [From Oral 1B] 3D Time-Lapse Reconstruction From Internet Photos |
| 74 | [From Oral 1B] Global, Dense Multiscale Reconstruction for a Billion Points |
| 75 | [From Oral 1B] On the Visibility of Point Clouds |
|
|
[17:15-18:15] Oral Session 1B - 3D Vision |
|
| | Structured Indoor Modeling |
| | 3D Time-Lapse Reconstruction From Internet Photos |
| | Global, Dense Multiscale Reconstruction for a Billion Points |
| | On the Visibility of Point Clouds |
|
|
[08:30-10:00] Oral Session 2A - Segmentation, Edges and Saliency |
|
| | Weakly Supervised Graph Based Semantic Segmentation by Learning Communities of Image-Parts |
| | Piecewise Flat Embedding for Image Segmentation |
| | Semantic Image Segmentation via Deep Parsing Network |
| | Human Parsing With Contextualized Convolutional Neural Network |
| | Holistically-Nested Edge Detection |
| | Minimum Barrier Salient Object Detection at 80 FPS |
|
|
[10:30-12:00] Oral Session 2B - Learning Representations and Attributes |
|
| | Learning Image Representations Tied to Ego-Motion |
| | Unsupervised Visual Representation Learning by Context Prediction |
| | Webly Supervised Learning of Convolutional Networks |
| | Fast R-CNN |
| | Bilinear CNN Models for Fine-Grained Visual Recognition |
| | Discovering the Spatial Extent of Relative Attributes |
|
|
[13:30-15:00] Oral Session 2C - Statistical Methods and Learning |
|
| | Deep Neural Decision Forests |
| | Deep Fried Convnets |
| | Semantic Component Analysis |
| | Low-Rank Matrix Factorization Under General Mixture Noise Distributions |
| | Web-Scale Image Clustering Revisited |
| | Learning Discriminative Reconstructions for Unsupervised Outlier Removal |
|
|
[15:00-17:30] Poster Session 2A - Optimization, Segmentation, and Recognition |
|
| 1 | Learning Deconvolution Network for Semantic Segmentation |
| 2 | Conditional Random Fields as Recurrent Neural Networks |
| 3 | The One Triangle Three Parallelograms Sampling Strategy and Its Application in Shape Regression |
| 4 | Boosting Object Proposals: From Pascal to COCO |
| 5 | Secrets of GrabCut and Kernel K-Means |
| 6 | Video Matting via Sparse and Low-Rank Representation |
| 7 | Joint Object and Part Segmentation Using Deep Learned Potentials |
| 8 | Low-Rank Tensor Constrained Multiview Subspace Clustering |
| 9 | BodyPrint: Pose Invariant 3D Shape Matching of Human Bodies |
| 10 | The Middle Child Problem: Revisiting Parametric Min-Cut and Seeds for Object Proposals |
| 11 | Contour Guided Hierarchical Model for Shape Matching |
| 12 | Robust Image Segmentation Using Contour-Guided Color Palettes |
| 13 | Joint Optimization of Segmentation and Color Clustering |
| 14 | BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation |
| 15 | Detection and Segmentation of 2D Curved Reflection Symmetric Structures |
| 16 | Unsupervised Tube Extraction Using Transductive Learning and Dense Trajectories |
| 17 | Compositional Hierarchical Representation of Shape Manifolds for Classification of Non-Manifold Shapes |
| 18 | Shell PCA: Statistical Shape Modelling in Shell Space |
| 19 | Learning to Combine Mid-Level Cues for Object Proposal Generation |
| 20 | Enhancing Road Maps by Parsing Aerial Images Around the World |
| 21 | Probabilistic Appearance Models for Segmentation and Classification |
| 22 | A Randomized Ensemble Approach to Industrial CT Segmentation |
| 23 | Semi-Supervised Normalized Cuts for Image Segmentation |
| 24 | StereoSnakes: Contour Based Consistent Object Extraction for Stereo Images |
| 25 | Semantic Segmentation of RGBD Images With Mutex Constraints |
| 26 | Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation |
| 27 | Efficient Decomposition of Image and Mesh Graphs by Lifted Multicuts |
| 28 | Parsimonious Labeling |
| 29 | Volumetric Bias in Segmentation and Reconstruction: Secrets and Solutions |
| 30 | Entropy Minimization for Convex Relaxation Approaches |
| 31 | Adaptively Unified Semi-Supervised Dictionary Learning With Active Points |
| 32 | Constrained Convolutional Neural Networks for Weakly Supervised Segmentation |
| 33 | A Multiscale Variable-Grouping Framework for MRF Energy Minimization |
| 34 | Inferring M-Best Diverse Labelings in a Single One |
| 35 | Convolutional Sparse Coding for Image Super-Resolution |
| 36 | A Wavefront Marching Method for Solving the Eikonal Equation on Cartesian Grids |
| 37 | A Projection Free Method for Generalized Eigenvalue Problem With a Nonsmooth Regularizer |
| 38 | Optimizing Expected Intersection-Over-Union With Candidate-Constrained CRFs |
| 39 | Higher-Order Inference for Multi-Class Log-Supermodular Models |
| 40 | Depth-Based Hand Pose Estimation: Data, Methods, and Challenges |
| 41 | Adaptive Dither Voting for Robust Spatial Verification |
| 42 | Alternating Co-Quantization for Cross-Modal Hashing |
| 43 | Learning Deep Representation With Large-Scale Attributes |
| 44 | Deep Learning Strong Parts for Pedestrian Detection |
| 45 | Flowing ConvNets for Human Pose Estimation in Videos |
| 46 | Top Rank Supervised Binary Coding for Visual Search |
| 47 | BubbLeNet: Foveated Imaging for Visual Discovery |
| 48 | PQTable: Fast Exact Asymmetric Distance Neighbor Search for Product Quantization Using Hash Tables |
| 49 | Lending a Hand: Detecting Hands and Recognizing Activities in Complex Egocentric Interactions |
| 50 | Fast and Accurate Head Pose Estimation via Random Projection Forests |
| 51 | An MRF-Poselets Model for Detecting Highly Articulated Humans |
| 52 | Beyond Tree Structure Models: A New Occlusion Aware Graphical Model for Human Pose Estimation |
| 53 | Relaxing From Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging |
| 54 | Visual Phrases for Exemplar Face Detection |
| 55 | Spatial Semantic Regularisation for Large Scale Object Detection |
| 56 | Human Pose Estimation in Videos |
| 57 | Contour Box: Rejecting Object Proposals Without Explicit Closed Contours |
| 58 | [From Oral 2A] Weakly Supervised Graph Based Semantic Segmentation by Learning Communities of Image-Parts |
| 59 | [From Oral 2A] Piecewise Flat Embedding for Image Segmentation |
| 60 | [From Oral 2A] Semantic Image Segmentation via Deep Parsing Network |
| 61 | [From Oral 2A] Human Parsing With Contextualized Convolutional Neural Network |
| 62 | [From Oral 2A] Holistically-Nested Edge Detection |
| 63 | [From Oral 2A] Minimum Barrier Salient Object Detection at 80 FPS |
| 64 | [From Oral 2B] Learning Image Representations Tied to Ego-Motion |
| 65 | [From Oral 2B] Unsupervised Visual Representation Learning by Context Prediction |
| 66 | [From Oral 2B] Webly Supervised Learning of Convolutional Networks |
| 67 | [From Oral 2B] Fast R-CNN |
| 68 | [From Oral 2B] Bilinear CNN Models for Fine-Grained Visual Recognition |
| 69 | [From Oral 2B] Discovering the Spatial Extent of Relative Attributes |
| 70 | [From Oral 2C] Deep Neural Decision Forests |
| 71 | [From Oral 2C] Deep Fried Convnets |
| 72 | [From Oral 2C] Semantic Component Analysis |
| 73 | [From Oral 2C] Low-Rank Matrix Factorization Under General Mixture Noise Distributions |
| 74 | [From Oral 2C] Web-Scale Image Clustering Revisited |
| 75 | [From Oral 2C] Learning Discriminative Reconstructions for Unsupervised Outlier Removal |
|
|
[08:30-09:45] Oral Session 3A - Registration, Alignment and Stereo |
|
| | Registering Images to Untextured Geometry Using Average Shading Gradients |
| | Robust Nonrigid Registration by Convex Optimization |
| | Robust and Optimal Sum-of-Squares-Based Point-to-Plane Registration of Image Sets and Structured Scenes |
| | MeshStereo: A Global Stereo Model With Mesh Alignment Regularization for View Interpolation |
| | CV-HAZOP: Introducing Test Data Validation for Computer Vision |
|
|
[09:45-12:15] Poster Session 3A - Recognition and 3D Computer Vision II |
|
| 1 | Structure From Motion Using Structure-Less Resection |
| 2 | Joint Camera Clustering and Surface Segmentation for Large-Scale Multi-View Stereo |
| 3 | Higher-Order CRF Structural Segmentation of 3D Reconstructed Surfaces |
| 4 | Hyperpoints and Fine Vocabularies for Large-Scale Location Recognition |
| 5 | Globally Optimal 2D-3D Registration From Points or Lines Without Correspondences |
| 6 | The HCI Stereo Metrics: Geometry-Aware Performance Analysis of Stereo Algorithms |
| 7 | Merging the Unmatchable: Stitching Visually Disconnected SfM Models |
| 8 | 3D Fragment Reassembly Using Integrated Template Guidance and Fracture-Region Matching |
| 9 | Procedural Editing of 3D Building Point Clouds |
| 10 | Semantically-Aware Aerial Reconstruction From Multi-Modal Data |
| 11 | Guaranteed Outlier Removal for Rotation Search |
| 12 | Peeking Template Matching for Depth Extension |
| 13 | Deformable 3D Fusion: From Partial Dynamic 3D Observations to Complete 4D Models |
| 14 | Non-Parametric Structure-Based Calibration of Radially Symmetric Cameras |
| 15 | Exploiting Object Similarity in 3D Reconstruction |
| 16 | You Are Here: Mimicking the Human Thinking Process in Reading Floor-Plans |
| 17 | MAP Disparity Estimation Using Hidden Markov Trees |
| 18 | Wide Baseline Stereo Matching With Convex Bounded Distortion Constraints |
| 19 | Interactive Visual Hull Refinement for Specular and Transparent Object Surface Reconstruction |
| 20 | Hierarchical Higher-Order Regression Forest Fields: An Application to 3D Indoor Scene Labelling |
| 21 | Classical Scaling Revisited |
| 22 | Dense Continuous-Time Tracking and Mapping With Rolling Shutter RGB-D Cameras |
| 23 | Dense Image Registration and Deformable Surface Reconstruction in Presence of Occlusions and Minimal Texture |
| 25 | Reflection Modeling for Passive Stereo |
| 26 | Detailed Full-Body Reconstructions of Moving People From Monocular RGB-D Sequences |
| 27 | Efficient Solution to the Epipolar Geometry for Radially Distorted Cameras |
| 28 | Learning a Descriptor-Specific 3D Keypoint Detector |
| 29 | Component-Wise Modeling of Articulated Objects |
| 30 | A Collaborative Filtering Approach to Real-Time Hand Pose Estimation |
| 31 | On the Equivalence of Moving Entrance Pupil and Radial Distortion for Camera Calibration |
| 32 | A Linear Generalized Camera Calibration From Three Intersecting Reference Planes |
| 33 | Towards Pointless Structure From Motion: 3D Reconstruction and Camera Parameters From General 3D Curves |
| 34 | Attributed Grammars for Joint Estimation of Human Attributes, Part and Pose |
| 35 | Real-Time Pose Estimation Piggybacked on Object Detection |
| 36 | Understanding and Predicting Image Memorability at a Large Scale |
| 37 | Multiple Granularity Descriptors for Fine-Grained Categorization |
| 38 | Guiding the Long-Short Term Memory Model for Image Caption Generation |
| 39 | Just Noticeable Differences in Visual Attributes |
| 40 | VQA: Visual Question Answering |
| 41 | Localize Me Anywhere, Anytime: A Multi-Task Point-Retrieval Approach |
| 42 | Dense Optical Flow Prediction From a Static Image |
| 43 | Unsupervised Domain Adaptation for Zero-Shot Learning |
| 44 | Visual Madlibs: Fill in the Blank Description Generation and Question Answering |
| 45 | Actions and Attributes From Wholes and Parts |
| 46 | DeepBox: Learning Objectness With Convolutional Networks |
| 47 | Active Object Localization With Deep Reinforcement Learning |
| 48 | Scene-Domain Active Part Models for Object Representation |
| 49 | A Unified Multiplicative Framework for Attribute Learning |
| 50 | Contractive Rectifier Networks for Nonlinear Maximum Margin Classification |
| 51 | Augmenting Strong Supervision Using Web Data for Fine-Grained Categorization |
| 52 | Learning Like a Child: Fast Novel Visual Concept Learning From Sentence Descriptions of Images |
| 53 | Learning Common Sense Through Visual Abstraction |
| 54 | Domain Generalization for Object Recognition With Multi-Task Autoencoders |
| 55 | Square Localization for Efficient and Accurate Object Detection |
| 56 | Box Aggregation for Proposal Decimation: Last Mile of Object Detection |
| 57 | DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers |
| 58 | Semantic Segmentation With Object Clique Potential |
| 59 | Automatic Concept Discovery From Parallel Text and Visual Corpora |
| 60 | Simpler Non-Parametric Methods Provide as Good or Better Results to Multiple-Instance Learning |
| 61 | Monocular Object Instance Segmentation and Depth Ordering With CNNs |
| 62 | Multimodal Convolutional Neural Networks for Matching Image and Sentence |
| 63 | Structural Kernel Learning for Large Scale Multiclass Object Co-Detection |
| 64 | Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models |
| 65 | Predicting Depth, Surface Normals and Semantic Labels With a Common Multi-Scale Convolutional Architecture |
| 66 | AttentionNet: Aggregating Weak Directions for Accurate Object Detection |
| 67 | Common Subspace for Model and Similarity: Phrase Learning for Caption Generation From Images |
| 68 | [From Oral 3A] Registering Images to Untextured Geometry Using Average Shading Gradients |
| 69 | [From Oral 3A] Robust Nonrigid Registration by Convex Optimization |
| 70 | [From Oral 3A] Robust and Optimal Sum-of-Squares-Based Point-to-Plane Registration of Image Sets and Structured Scenes |
| 71 | [From Oral 3A] MeshStereo: A Global Stereo Model With Mesh Alignment Regularization for View Interpolation |
| 72 | [From Oral 3A] CV-HAZOP: Introducing Test Data Validation for Computer Vision |
| 73 | [From Oral 3B] 3D-Assisted Feature Synthesis for Novel Views of an Object |
| 74 | [From Oral 3B] Lost Shopping! Monocular Localization in Large Indoor Spaces |
| 75 | [From Oral 3B] Camera Pose Voting for Large-Scale Image-Based Localization |
|
|
[12:15-13:15] Oral Session 3B - 3D Representations for Recognition and Localization |
|
| | 3D-Assisted Feature Synthesis for Novel Views of an Object |
| | Render for CNN: Viewpoint Estimation in Images Using CNNs Trained With Rendered 3D Model Views |
| | Lost Shopping! Monocular Localization in Large Indoor Spaces |
| | Camera Pose Voting for Large-Scale Image-Based Localization |
|
|
[14:45-17:15] Poster Session 3B - Statistical Methods and Learning, Motion and Tracking, and Video Analysis I |
|
| 1 | MANTRA: Minimum Maximum Latent Structural SVM for Image Classification and Ranking |
| 2 | DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving |
| 3 | Active Transfer Learning With Zero-Shot Priors: Reusing Past Datasets for Future Tasks |
| 4 | HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition |
| 5 | Learning the Structure of Deep Convolutional Networks |
| 6 | FlowNet: Learning Optical Flow With Convolutional Networks |
| 7 | Learning Semi-Supervised Representation Towards a Unified Optimization Framework for Semi-Supervised Learning |
| 8 | Context-Guided Diffusion for Label Propagation on Graphs |
| 9 | Learning to Rank Based on Subsequences |
| 10 | Unsupervised Learning of Visual Representations Using Videos |
| 11 | A Nonparametric Bayesian Approach Toward Stacked Convolutional Independent Component Analysis |
| 12 | Robust Principal Component Analysis on Graphs |
| 13 | Projection Bank: From High-Dimensional Data to Medium-Length Binary Codes |
| 14 | Robust Optimization for Deep Regression |
| 15 | Multi-Class Multi-Annotator Active Learning With Robust Gaussian Process for Visual Recognition |
| 16 | Maximum-Margin Structured Learning With Deep Networks for 3D Human Pose Estimation |
| 17 | An Exploration of Parameter Redundancy in Deep Networks With Circulant Projections |
| 18 | Additive Nearest Neighbor Feature Maps |
| 19 | Understanding Deep Features With Computer-Generated Imagery |
| 20 | Interpolation on the Manifold of K Component GMMs |
| 21 | Context-Aware CNNs for Person Head Detection |
| 22 | Mode-Seeking on Hypergraphs for Robust Geometric Model Fitting |
| 23 | Highly-Expressive Spaces of Well-Behaved Transformations: Keeping It Simple |
| 24 | Entropy-Based Latent Structured Output Prediction |
| 25 | Fast Orthogonal Projection Based on Kronecker Product |
| 26 | PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization |
| 27 | Predicting Multiple Structured Visual Interpretations |
| 28 | Look and Think Twice: Capturing Top-Down Visual Attention With Feedback Convolutional Neural Networks |
| 29 | Matrix Backpropagation for Deep Networks With Structured Layers |
| 30 | Introducing Geometry in Active Learning for Image Segmentation |
| 31 | Joint Fine-Tuning in Deep Neural Networks for Facial Expression Recognition |
| 32 | Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression |
| 33 | Face Flow |
| 34 | Discriminative Low-Rank Tracking |
| 35 | SOWP: Spatially Ordered and Weighted Patch Descriptor for Visual Tracking |
| 36 | Live Repetition Counting |
| 37 | Near-Online Multi-Target Tracking With Aggregated Local Flow Descriptor |
| 38 | Multi-Kernel Correlation Filter for Visual Tracking |
| 39 | Joint Probabilistic Data Association Revisited |
| 40 | Tracking-by-Segmentation With Online Gradient Boosting Decision Tree |
| 41 | Exploring Causal Relationships in Visual Object Tracking |
| 42 | Hierarchical Convolutional Features for Visual Tracking |
| 43 | Robust Non-Rigid Motion Tracking and Surface Reconstruction Using L0 Regularization |
| 44 | Online Object Tracking With Proposal Selection |
| 45 | Understanding and Diagnosing Visual Tracking Systems |
| 46 | Integrating Dashcam Views Through Inter-Video Mapping |
| 47 | Visual Tracking With Fully Convolutional Networks |
| 48 | Multiple Feature Fusion via Weighted Entropy for Visual Tracking |
| 49 | Pedestrian Travel Time Estimation in Crowded Scenes |
| 50 | Unsupervised Synchrony Discovery in Human Interaction |
| 51 | Efficient Video Segmentation Using Parametric Graph Partitioning |
| 52 | Learning to Track for Spatio-Temporal Action Localization |
| 53 | Unsupervised Object Discovery and Tracking in Video Collections |
| 54 | Car That Knows Before You Do: Anticipating Maneuvers via Learning Temporal Driving Models |
| 55 | Activity Auto-Completion: Predicting Human Activities From Partial Videos |
| 56 | Person Re-Identification With Correspondence Structure Learning |
| 57 | Adaptive Exponential Smoothing for Online Filtering of Pixel Prediction Maps |
| 58 | P-CNN: Pose-Based CNN Features for Action Recognition |
| 59 | Fully Connected Object Proposals for Video Segmentation |
| 60 | Video Segmentation With Just a Few Strokes |
| 61 | Actionness-Assisted Recognition of Actions |
| 62 | COUNT Forest: CO-Voting Uncertain Number of Targets Using Random Forest for Crowd Density Estimation |
| 63 | Multi-Cue Structure Preserving MRF for Unconstrained Video Segmentation |
| 64 | Motion Trajectory Segmentation via Minimum Cost Multicuts |
| 65 | Action Localization in Videos Through Context Walk |
| 66 | RGB-W: When Vision Meets Wireless |
| 67 | Action Detection by Implicit Intentional Motion Clustering |
| 68 | Simultaneous Foreground Detection and Classification With Hybrid Features |
| 69 | SpeDo: 6 DOF Ego-Motion Sensor Using Speckle Defocus Imaging |
| 70 | The Likelihood-Ratio Test and Efficient Robust Estimation |
| 71 | [From Oral 3B] Render for CNN: Viewpoint Estimation in Images Using CNNs Trained With Rendered 3D Model Views |
| 72 | [From Oral 3C] Training a Feedback Loop for Hand Pose Estimation |
| 73 | [From Oral 3C] Opening the Black Box: Hierarchical Sampling Optimization for Estimating Human Hand Pose |
| 74 | [From Oral 3C] Panoptic Studio: A Massively Multiview System for Social Motion Capture |
| 75 | [From Oral 3C] Where to Buy It: Matching Street Clothing Photos in Online Shops |
| 76 | [From Oral 3C] Multi-Task Recurrent Neural Network for Immediacy Prediction |
| 77 | [From Oral 3C] Learning Complexity-Aware Cascades for Deep Pedestrian Detection |
|
|
[17:15-18:45] Oral Session 3C - Vision and People |
|
| | Training a Feedback Loop for Hand Pose Estimation |
| | Opening the Black Box: Hierarchical Sampling Optimization for Estimating Human Hand Pose |
| | Panoptic Studio: A Massively Multiview System for Social Motion Capture |
| | Where to Buy It: Matching Street Clothing Photos in Online Shops |
| | Multi-Task Recurrent Neural Network for Immediacy Prediction |
| | Learning Complexity-Aware Cascades for Deep Pedestrian Detection |
|
|
[08:30-09:45] Oral Session 4A - Computational Photography and Image Enhancement |
|
| | Polarized 3D: High-Quality Depth Sensing With Polarization Cues |
| | Airborne Three-Dimensional Cloud Tomography |
| | Leave-One-Out Kernel Optimization for Shadow Detection |
| | Removing Rain From a Single Image via Discriminative Sparse Coding |
| | Mutual-Structure for Joint Filtering |
|
|
[09:45-12:15] Poster Session 4A - Computational Photography, Face and Gesture, and Vision for X |
|
| 1 | Photometric Stereo in a Scattering Medium |
| 2 | Resolving Scale Ambiguity Via XSlit Aspect Ratio Analysis |
| 3 | Single-Shot Specular Surface Reconstruction With Gonio-Plenoptic Imaging |
| 4 | TransCut: Transparent Object Segmentation From a Light-Field Image |
| 5 | Depth Recovery From Light Field Using Focal Stack Symmetry |
| 6 | Depth Map Estimation and Colorization of Anaglyph Images Using Local Color Prior and Reverse Intensity Distribution |
| 7 | Learning Data-Driven Reflectance Priors for Intrinsic Image Decomposition |
| 8 | Photometric Stereo With Small Angular Variations |
| 9 | Occlusion-Aware Depth Estimation Using Light-Field Cameras |
| 10 | Oriented Light-Field Windows for Scene Flow |
| 11 | Extended Depth of Field Catadioptric Imaging Using Focal Sweep |
| 12 | Intrinsic Depth: Improving Depth Transfer With Intrinsic Images |
| 13 | Separating Fluorescent and Reflective Components by Using a Single Hyperspectral Image |
| 14 | Frequency-Based Environment Matting by Compressive Sensing |
| 15 | Complementary Sets of Shutter Sequences for Motion Deblurring |
| 16 | Hyperspectral Compressive Sensing Using Manifold-Structured Sparsity Prior |
| 17 | A Gaussian Process Latent Variable Model for BRDF Inference |
| 18 | Active One-Shot Scan for Wide Depth Range Using a Light Field Projector Based on Coded Aperture |
| 19 | Model-Based Tracking at 300Hz Using Raw Time-of-Flight Observations |
| 20 | Hyperspectral Super-Resolution by Coupled Spectral Unmixing |
| 21 | Depth Selective Camera: A Direct, On-Chip, Programmable Technique for Depth Selectivity in Photography |
| 22 | A Groupwise Multilinear Correspondence Optimization for 3D Faces |
| 23 | Selective Encoding for Recognizing Unreliably Localized Faces |
| 24 | Confidence Preserving Machine for Facial Action Unit Detection |
| 25 | Learning Social Relation Traits From Face Images |
| 26 | Robust Heart Rate Measurement From Video Using Select Random Patches |
| 27 | Robust Model-Based 3D Head Pose Estimation |
| 28 | Robust Facial Landmark Detection Under Significant Head Poses and Occlusion |
| 29 | Conditional Convolutional Neural Network for Modality-Aware Face Recognition |
| 30 | From Facial Parts Responses to Face Detection: A Deep Learning Approach |
| 31 | Efficient PSD Constrained Asymmetric Metric Learning for Person Re-Identification |
| 32 | Pose-Invariant 3D Face Alignment |
| 33 | From Emotions to Action Units With Hidden and Semi-Hidden-Task Learning |
| 34 | Automated Facial Trait Judgment and Election Outcome Prediction: Social Dimensions of Face |
| 35 | Simultaneous Local Binary Feature Learning and Encoding for Face Recognition |
| 36 | Deep Learning Face Attributes in the Wild |
| 37 | Multi-Task Learning With Low Rank Attribute Embedding for Person Re-Identification |
| 38 | Regressing a 3D Face Shape From a Single Image |
| 39 | Rendering of Eyes for Eye-Shape Registration and Gaze Estimation |
| 40 | Multi-Scale Learning for Low-Resolution Person Re-Identification |
| 41 | Learning to Transfer: Transferring Latent Task Structures and Its Application to Person-Specific Facial Action Unit Detection |
| 42 | Pairwise Conditional Random Forests for Facial Expression Recognition |
| 43 | Multi-Conditional Latent Variable Model for Joint Facial Action Unit Detection |
| 44 | Leveraging Datasets With Varying Annotations for Face Alignment via Deep Regression Network |
| 45 | A Spatio-Temporal Appearance Representation for Video-Based Pedestrian Re-Identification |
| 46 | Two Birds, One Stone: Jointly Learning Binary Code for Large-Scale Face Image Retrieval and Attributes Prediction |
| 47 | An Accurate Iris Segmentation Framework Under Relaxed Imaging Constraints Using Total Variation Model |
| 48 | Discriminative Pose-Free Descriptors for Face and Object Matching |
| 49 | Bi-Shifting Auto-Encoder for Unsupervised Domain Adaptation |
| 50 | Regressive Tree Structured Model for Facial Landmark Localization |
| 51 | Person Recognition in Personal Photo Collections |
| 52 | Robust Statistical Face Frontalization |
| 53 | PIEFA: Personalized Incremental and Ensemble Face Alignment |
| 54 | Understanding Everyday Hands in Action From RGB-D Images |
| 55 | Example-Based Modeling of Facial Texture From Deficient Data |
| 56 | Learning to Predict Saliency on Face Images |
| 57 | Group Membership Prediction |
| 58 | Extraction of Virtual Baselines From Distorted Document Images Using Curvilinear Projection |
| 59 | Robust RGB-D Odometry Using Point and Line Features |
| 60 | Learning a Discriminative Model for the Perception of Realism in Composite Images |
| 61 | What Makes Tom Hanks Look Like Tom Hanks |
| 62 | Wide-Area Image Geolocalization With Aerial Reference Imagery |
| 63 | Personalized Age Progression With Aging Dictionary |
| 64 | FaceDirector: Continuous Control of Facial Performance in Video |
| 65 | Synthesizing Illumination Mosaics From Internet Photo-Collections |
| 66 | Hot or Not: Exploring Correlations Between Appearance and Temperature |
| 67 | Self-Calibration of Optical Lenses |
| 68 | [From Oral 4A] Polarized 3D: High-Quality Depth Sensing With Polarization Cues |
| 69 | [From Oral 4A] Airborne Three-Dimensional Cloud Tomography |
| 70 | [From Oral 4A] Leave-One-Out Kernel Optimization for Shadow Detection |
| 71 | [From Oral 4A] Removing Rain From a Single Image via Discriminative Sparse Coding |
| 72 | [From Oral 4A] Mutual-Structure for Joint Filtering |
| 73 | [From Oral 4B] SPM-BP: Sped-up PatchMatch Belief Propagation for Continuous MRFs |
| 74 | [From Oral 4B] Flow Fields: Dense Correspondence Fields for Highly Accurate Large Displacement Optical Flow Estimation |
| 75 | [From Oral 4B] Dense Semantic Correspondence Where Every Pixel is a Classifier |
| 76 | [From Oral 4B] Multi-Image Matching via Fast Alternating Minimization |
|
|
[12:15-13:15] Oral Session 4B - Motion and Correspondence |
|
| | SPM-BP: Sped-up PatchMatch Belief Propagation for Continuous MRFs |
| | Flow Fields: Dense Correspondence Fields for Highly Accurate Large Displacement Optical Flow Estimation |
| | Dense Semantic Correspondence Where Every Pixel is a Classifier |
| | Multi-Image Matching via Fast Alternating Minimization |
|
|
[14:45-17:15] Poster Session 4B - Statistical Methods and Learning, Motion and Tracking, and Video Analysis II |
|
| 1 | Differential Recurrent Neural Networks for Action Recognition |
| 2 | Similarity Gaussian Process Latent Variable Model for Multi-Modal Data Analysis |
| 3 | Learning Ensembles of Potential Functions for Structured Prediction With Latent Variables |
| 4 | Simultaneous Deep Transfer Across Domains and Tasks |
| 5 | Low Dimensional Explicit Feature Maps |
| 6 | Unsupervised Learning of Spatiotemporally Coherent Metrics |
| 7 | Multi-Label Cross-Modal Retrieval |
| 8 | Improving Ferns Ensembles by Sparsifying and Quantising Posterior Probabilities |
| 9 | Beyond Gauss: Image-Set Matching on the Riemannian Manifold of PDFs |
| 10 | Unsupervised Domain Adaptation With Imbalanced Cross-Domain Data |
| 11 | Secrets of Matrix Factorization: Approximations, Numerics, Manifold Optimization and Random Restarts |
| 12 | Geometry-Aware Deep Transform |
| 13 | Learning Binary Codes for Maximum Inner Product Search |
| 14 | ML-MG: Multi-Label Learning With Missing Labels Using a Mixed Graph |
| 15 | Zero-Shot Learning via Semantic Similarity Embedding |
| 16 | Bayesian Model Adaptation for Crowd Counts |
| 17 | An NMF Perspective on Binary Hashing |
| 18 | Multi-View Domain Generalization for Visual Recognition |
| 19 | Infinite Feature Selection |
| 20 | Semi-Supervised Zero-Shot Classification With Label Representation Learning |
| 21 | A Supervised Low-Rank Method for Learning Invariant Subspaces |
| 22 | Recursive Fréchet Mean Computation on the Grassmannian and its Applications to Computer Vision |
| 23 | Multi-View Subspace Clustering |
| 24 | Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions |
| 25 | Structured Feature Selection |
| 26 | Conditional High-Order Boltzmann Machine: A Supervised Learning Model for Relation Learning |
| 27 | Learning Image and User Features for Recommendation in Social Networks |
| 28 | Dual-Feature Warping-Based Motion Model Estimation |
| 29 | An Adaptive Data Representation for Robust Point-Set Registration and Merging |
| 30 | Local Subspace Collaborative Tracking |
| 31 | Learning Spatially Regularized Correlation Filters for Visual Tracking |
| 33 | Unsupervised Trajectory Clustering via Adaptive Multi-Kernel-Based Shrinkage |
| 34 | TRIC-track: Tracking by Regression With Incrementally Learned Cascades |
| 35 | Recurrent Network Models for Human Dynamics |
| 36 | Contour Flow: Middle-Level Motion Estimation by Combining Motion Segmentation and Contour Alignment |
| 37 | FollowMe: Efficient Online Min-Cost Flow Tracking With Bounded Memory and Computation |
| 38 | Learning to Divide and Conquer for Online Multi-Target Tracking |
| 39 | Minimizing Human Effort in Interactive Tracking by Incremental Learning of Model Parameters |
| 40 | A Novel Representation of Parts for Accurate 3D Object Detection and Tracking in Monocular Images |
| 41 | Linearization to Nonlinear Learning for Visual Tracking |
| 42 | Self-Occlusions and Disocclusions in Causal Video Object Segmentation |
| 43 | Large Displacement 3D Scene Flow With Occlusion Reasoning |
| 44 | Co-Interest Person Detection From Multiple Wearable Camera Videos |
| 45 | Sparse Dynamic 3D Reconstruction From Unsynchronized Videos |
| 46 | Category-Blind Human Action Recognition: A Practical Recognition System |
| 47 | Temporal Subspace Clustering for Human Motion Segmentation |
| 48 | Weakly-Supervised Alignment of Video With Text |
| 49 | Learning Temporal Embeddings for Complex Video Analysis |
| 50 | Unsupervised Semantic Parsing of Video Collections |
| 51 | Learning Spatiotemporal Features With 3D Convolutional Networks |
| 52 | Temporal Perception and Prediction in Ego-Centric Video |
| 53 | Describing Videos by Exploiting Temporal Structure |
| 54 | Person Re-Identification With Discriminatively Trained Viewpoint Invariant Dictionaries |
| 55 | Storyline Representation of Egocentric Videos With an Applications to Story-Based Search |
| 56 | Sequence to Sequence – Video to Text |
| 57 | Context Aware Active Learning of Activity Recognition Models |
| 58 | Action Recognition by Hierarchical Mid-Level Action Elements |
| 59 | Selecting Relevant Web Trained Concepts for Automated Event Retrieval |
| 60 | Beyond Covariance: Feature Representation With Nonlinear Kernel Matrices |
| 61 | Multiresolution Hierarchy Co-Clustering for Semantic Segmentation in Sequences With Small Variations |
| 62 | Objects2action: Classifying and Localizing Actions Without Any Video Example |
| 63 | Human Action Recognition Using Factorized Spatio-Temporal Convolutional Networks |
| 64 | Bayesian Non-Parametric Inference for Manifold Based MoCap Representation |
| 65 | Semantic Video Entity Linking Based on Visual Content and Metadata |
| 66 | Love Thy Neighbors: Image Annotation by Exploiting Image Metadata |
| 67 | Unsupervised Extraction of Video Highlights Via Robust Recurrent Auto-Encoders |
| 68 | Learning Visual Clothing Style With Heterogeneous Dyadic Co-Occurrences |
| 69 | Text Flow: A Unified Text Detection System in Natural Scene Images |
| 70 | [From Oral 4C] Uncovering Interactions and Interactors: Joint Estimation of Head, Body Orientation and F-Formations From Surveillance Videos |
| 71 | [From Oral 4C] Generating Notifications for Missing Actions: Don't Forget to Turn the Lights Off! |
| 72 | [From Oral 4C] Partial Person Re-Identification |
| 73 | [From Oral 4C] Shape Interaction Matrix Revisited and Robustified: Efficient Subspace Clustering With Corrupted and Incomplete Data |
| 74 | [From Oral 4C] Multiple Hypothesis Tracking Revisited |
| 75 | [From Oral 4C] Learning to Track: Online Multi-Object Tracking by Decision Making |
|
|
[17:15-18:45] Oral Session 4C - Video: Actions, Surveillance and Tracking |
|
| | Uncovering Interactions and Interactors: Joint Estimation of Head, Body Orientation and F-Formations From Surveillance Videos |
| | Generating Notifications for Missing Actions: Don't Forget to Turn the Lights Off! |
| | Partial Person Re-Identification |
| | Shape Interaction Matrix Revisited and Robustified: Efficient Subspace Clustering With Corrupted and Incomplete Data |
| | Multiple Hypothesis Tracking Revisited |
| | Learning to Track: Online Multi-Object Tracking by Decision Making |